javascriptdata / danfojs

Danfo.js is an open-source JavaScript library providing high-performance, intuitive, and easy-to-use data structures for manipulating and processing structured data.

Home Page: https://danfo.jsdata.org/

License: MIT License

JavaScript 26.36% TypeScript 73.64%
data-analytics data-science pandas data-manipulation tensors dataframe javascript data-analysis danfojs stream-data

danfojs's Introduction



Danfojs: powerful JavaScript data analysis toolkit

What is it?

Danfo.js is a JavaScript package that provides fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data both easy and intuitive. It is heavily inspired by the pandas library and provides a similar API, so users familiar with pandas can easily pick up Danfo.js.

Main Features

  • Danfo.js is fast and supports TensorFlow.js tensors out of the box. This means you can convert Danfo.js data structures to tensors.
  • Easy handling of missing-data (represented as NaN) in floating point as well as non-floating point data
  • Size mutability: columns can be inserted/deleted from DataFrame
  • Automatic and explicit alignment: objects can be explicitly aligned to a set of labels, or the user can simply ignore the labels and let Series, DataFrame, etc. automatically align the data for you in computations
  • Powerful, flexible groupby functionality to perform split-apply-combine operations on data sets, for both aggregating and transforming data
  • Makes it easy to convert arrays, JSON, lists or objects, tensors, and differently-indexed data structures into DataFrame objects
  • Intelligent label-based slicing, fancy indexing, and querying of large data sets
  • Intuitive merging and joining data sets
  • Robust IO tools for loading data from flat files (CSV, JSON, Excel).
  • Powerful, flexible, and intuitive API for plotting DataFrames and Series interactively.
  • Timeseries-specific functionality: date range generation and date and time properties.
  • Robust data preprocessing functions like OneHotEncoders, LabelEncoders, and scalers like StandardScaler and MinMaxScaler are supported on DataFrame and Series

Installation

There are three ways to install and use Danfo.js in your application:

  • For Node.js applications, you can install the danfojs-node version via package managers like yarn or npm:

npm install danfojs-node

or

yarn add danfojs-node

  • For client-side applications built with frameworks like React, Vue, Next.js, etc., you can install the danfojs version:

npm install danfojs

or

yarn add danfojs

  • For use directly in HTML files, you can add the latest script tag from jsDelivr to your HTML file:

    <script src="https://cdn.jsdelivr.net/npm/[email protected]/lib/bundle.js"></script>

See all available versions here

Quick Examples

Example Usage in the Browser

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <script src="https://cdn.jsdelivr.net/npm/[email protected]/lib/bundle.js"></script>

    <title>Document</title>
  </head>

  <body>
    <div id="div1"></div>
    <div id="div2"></div>
    <div id="div3"></div>

    <script>

      dfd.readCSV("https://raw.githubusercontent.com/plotly/datasets/master/finance-charts-apple.csv")
          .then(df => {

              df['AAPL.Open'].plot("div1").box() //makes a box plot

              df.plot("div2").table() //display csv as table

              const new_df = df.setIndex({ column: "Date", drop: true }); //sets the Date column as the index
              new_df.head().print() //prints the first five rows
              new_df.plot("div3").line({
                  config: {
                      columns: ["AAPL.Open", "AAPL.High"]
                  }
              })  //makes a timeseries plot

          }).catch(err => {
              console.log(err);
          })
    </script>
  </body>
</html>

Output in Browser:

Example usage in Node.js

const dfd = require("danfojs-node");

const file_url =
  "https://web.stanford.edu/class/archive/cs/cs109/cs109.1166/stuff/titanic.csv";
dfd
  .readCSV(file_url)
  .then((df) => {
    //prints the first five rows
    df.head().print();

    // Calculate descriptive statistics for all numerical columns
    df.describe().print();

    //prints the shape of the data
    console.log(df.shape);

    //prints all column names
    console.log(df.columns);

    //prints the inferred dtypes of each column
    df.ctypes.print();

    //selecting a column by subsetting
    df["Name"].print();

    //drop columns by names
    let cols_2_remove = ["Age", "Pclass"];
    let df_drop = df.drop({ columns: cols_2_remove, axis: 1 });
    df_drop.print();

    //select columns by dtypes
    let str_cols = df_drop.selectDtypes(["string"]);
    let num_cols = df_drop.selectDtypes(["int32", "float32"]);
    str_cols.print();
    num_cols.print();

    //add new column to Dataframe

    let new_vals = df["Fare"].round(1);
    df_drop.addColumn("fare_round", new_vals, { inplace: true });
    df_drop.print();

    df_drop["fare_round"].round(2).print(5);

    //prints the number of occurrences of each value in the column
    df_drop["Survived"].valueCounts().print();

    //prints the last ten rows of a DataFrame
    df_drop.tail(10).print();

    //prints the number of missing values in a DataFrame
    df_drop.isNa().sum().print();
  })
  .catch((err) => {
    console.log(err);
  });

Output in Node Console:

Notebook support

  • VS Code Node.js notebook extension now supports Danfo.js. See guide here
  • ObservableHQ Notebooks. See example notebook here

Documentation

The official documentation can be found here

Danfo.js Official Book

We published a book titled "Building Data Driven Applications with Danfo.js". Read more about it here

Discussion and Development

Development discussions take place here.

Contributing to Danfo

All contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome. A detailed overview on how to contribute can be found in the contributing guide.

Licence MIT

Created by Rising Odegua and Stephen Oni

danfojs's People

Contributors

adithyaakrishna, adzo261, ankitskvmdam, bowtiedaztec, callmekatootie, dcrescim, dependabot[bot], devwulf, gantman, geoextra, halflings, hodovani, igonro, jankaul, jhennertigreros, jpjagt, kgeis, merryman, mjarmoc, neonspork, opeyemibami, p0vidl0, pensono, prawiragenestonlia, rhnsharma, risenw, robertwalsh0, secretlyamble, steveoni, woosuk288

danfojs's Issues

Danfo.js benchmarks?

Hello!

I have a few quick questions!

Are there any performance benchmarks of danfo.js? I am trying to decide if Danfo would be faster than vanilla JS operations for groupings, getting distinct values by column, and joins. How does memory consumption compare?

Also, which TF.js backend is used when running in the browser (Chrome)? Does danfo.js block the main DOM thread when used in the browser?

Thanks!
-Alex

DataFrame rename does not work as specified

Setting inplace to "true" or "false" does not seem to work as expected:
{ df.rename({mapper:{'type':'wine_type'},inplace:true}) return print(df.head()) }
{ df.rename({mapper:{'type':'wine_type'},inplace:false}) return print(df.head()) }

Both code snippets output the same result, which should not be the case.

Feature Request: Read Excel file into DataFrame Object

Excel is a popular format for storing data, so we intend to support it. We need to be able to read .xls files into danfo data structures. This should live in the reader.js module and could look something like:

/**
 * Reads an Excel file from a local or remote address.
 *
 * @param {string} source URL or local file path of the Excel file.
 * @returns {Promise<DataFrame>} DataFrame built from the parsed Excel data.
 */
export const read_excel = async (source) => {
  // read_parse_file is a placeholder for the parsing step
  const data = await read_parse_file(source);
  return new DataFrame(data);
};

[read_csv] support File object in browser

I used read_csv to read a 130MB file in the browser, and using developer tools I saw the file get downloaded twice; then it just hung and eventually crashed. So clearly there are one or two bugs there.

It would also be nice if read_csv accepted a File object in the browser.

Feature request: New Query operator !=

Hey folks!

Firstly, I want to say this project is awesome. I've been playing with it for two weeks and I've had a lot of fun.

I have a small suggestion: when I'm trying to get a new data frame that excludes rows with a given column value, I'd like to use the query method with a new operator (!=). Example:

const data = {
  'countries': ['Venezuela', 'Argentina', 'Colombia', 'Spain'],
  'capital': ['Caracas', 'Buenos Aires', 'Bogota', 'Madrid']
}

let df = new dfd.DataFrame(data)
df = df.query({ "column": "countries", "is": "!=", "to": 'Spain' })

df.print()

// returns a countries table without the Spain row. 

It could be a powerful addition. I'd love to collab with a PR, please let me know your thoughts.
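
For illustration, the requested behaviour can be approximated on plain column-oriented data. This is a minimal sketch in plain JavaScript; `filterRows` is a hypothetical helper and not part of the Danfo.js API:

```javascript
// Hypothetical helper: keeps only the rows for which predicate(value)
// is true in the given column of a column-oriented object.
function filterRows(data, column, predicate) {
  const keep = data[column].map((value) => predicate(value));
  const result = {};
  for (const [name, values] of Object.entries(data)) {
    result[name] = values.filter((_, i) => keep[i]);
  }
  return result;
}

const data = {
  countries: ["Venezuela", "Argentina", "Colombia", "Spain"],
  capital: ["Caracas", "Buenos Aires", "Bogota", "Madrid"],
};

// Equivalent in spirit to the proposed "!=" operator.
const withoutSpain = filterRows(data, "countries", (c) => c !== "Spain");
console.log(withoutSpain);
```

Taking a predicate function rather than an operator string keeps the sketch free of `eval`, at the cost of a less declarative call.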

nodejs_kernel_backend.js

Running Win10 Pro, node v14.13.0, [email protected]

I think there are errors in this file (@tensorflow/tfjs-node/dist/nodejs_kernel_backend.js). I made some changes as shown below (as there is apparently no 'createTensor() function', I just changed those references to 'createOutputTensor', which was a blind guess), and stopped a few errors, but I'm not sure my changes are correct. I'm getting errors any time a 'slice' is required, e.g.:

C:\Users\me\Documents\Javascript\danfo>node dict.js
C:\Users\me\node_modules\@tensorflow\tfjs-node\node_modules\@tensorflow\tfjs-core\dist\tf-core.node.js:2567
            throw ex;
            ^

TypeError: Cannot read property 'slice' of undefined
    at new Tensor (C:\Users\HP6300\node_modules\@tensorflow\tfjs-node\node_modules\@tensorflow\tfjs-core\dist\tf-core.node.js:1808:28)
    at Engine.makeTensorFromDataId (C:\Users\HP6300\node_modules\@tensorflow\tfjs-node\node_modules\@tensorflow\tfjs-core\dist\tf-core.node.js:2817:17)
    at NodeJSKernelBackend.createOutputTensor (C:\Users\HP6300\node_modules\@tensorflow\tfjs-node\dist\nodejs_kernel_backend.js:139:28)
    at NodeJSKernelBackend.getInputTensorIds (C:\Users\HP6300\node_modules\@tensorflow\tfjs-node\dist\nodejs_kernel_backend.js:159:30)
    at NodeJSKernelBackend.executeSingleOutput (C:\Users\HP6300\node_modules\@tensorflow\tfjs-node\dist\nodejs_kernel_backend.js:192:73)
    at NodeJSKernelBackend.linspace (C:\Users\HP6300\node_modules\@tensorflow\tfjs-node\dist\nodejs_kernel_backend.js:1498:21)
    at C:\Users\HP6300\node_modules\@tensorflow\tfjs-node\node_modules\@tensorflow\tfjs-core\dist\tf-core.node.js:17877:69
    at C:\Users\HP6300\node_modules\@tensorflow\tfjs-node\node_modules\@tensorflow\tfjs-core\dist\tf-core.node.js:2705:55
    at C:\Users\HP6300\node_modules\@tensorflow\tfjs-node\node_modules\@tensorflow\tfjs-core\dist\tf-core.node.js:2551:22
    at Engine.scopedRun (C:\Users\HP6300\node_modules\@tensorflow\tfjs-node\node_modules\@tensorflow\tfjs-core\dist\tf-core.node.js:2561:23)

Here's 'dict.js' (code straight from 10 Minutes to Danfo-js, which produced the errors):

const dfd = require("danfojs-node")


dates = new dfd.date_range({ start: '2017-01-01', end: "2020-01-01", period: 4, freq: "Y" })

console.log(dates);

obj_data = {'A': dates,
            'B': ["bval1", "bval2", "bval3", "bval4"],
            'C': [10, 20, 30, 40],
            'D': [1.2, 3.45, 60.1, 45],
            'E': ["test", "train", "test", "train"]
            }

df = new dfd.DataFrame(obj_data)
df.print()

My edits to nodejs_kernel_backend.js:

    NodeJSKernelBackend.prototype.getInputTensorIds = function (tensors) {
        var ids = [];
        for (var i = 0; i < tensors.length; i++) {
            if (tensors[i] instanceof int64_tensors_1.Int64Scalar) {
                // Then `tensors[i]` is a Int64Scalar, which we currently represent
                // using an `Int32Array`.
                var value = tensors[i].valueArray;
                var id = this.createOutputTensor([], this.binding.TF_INT64, value); // var id = this.binding.createTensor([], this.binding.TF_INT64, value);
                ids.push(id);
            }
            else {
                var info = this.tensorMap.get(tensors[i].dataId);
                // TODO - what about ID in this case? Handle in write()??
                if (info.values != null) {
                    // Values were delayed to write into the TensorHandle. Do that before
                    // Op execution and clear stored values.
                    info.id =
                        this.createOutputTensor(info.shape, info.dtype, info.values); // this.binding.createTensor(info.shape, info.dtype, info.values);
                    info.values = null;
                }
                ids.push(info.id);
            }
        }
        return ids;
    };

and

    NodeJSKernelBackend.prototype.executeEncodeImageOp = function (name, opAttrs, imageData, imageShape) {
        var inputTensorId = this.binding.createOutputTensor(imageShape, this.binding.TF_UINT8, imageData); //         var inputTensorId = this.binding.createTensor(imageShape, this.binding.TF_UINT8, imageData);
        var outputMetadata = this.binding.executeOp(name, opAttrs, [inputTensorId], 1);
        var outputTensorInfo = outputMetadata[0];
        // prevent the tensor data from being converted to a UTF8 string, since
        // the encoded data is not valid UTF8
        outputTensorInfo.dtype = this.binding.TF_UINT8;
        return this.createOutputTensor(outputTensorInfo);
    };

All help and comments greatly appreciated.

TensorFlow.js Double includes

Should TensorFlow.js be a peer dependency? I'm having some issues trying to include Danfo.js: when I'm in Node, I get errors when I try to bring in tf.

I have similar issues when I try to bring it in on observablehq

My guess is that you might need the TF version to be a peer dependency. If that's a no-go, then you'll need to expose your included version of TFJS with something like const tf = dfd.tfjs

df["column"].pct_change()

Hi everyone,

thanks for doing this.

I just had a quick look at the docs and was missing the classic .pct_change() function.

Did I miss something?

I mostly do pandas, what would be easiest way to achieve this with danfo.js?

Thanks,
Christian
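
As a stopgap, the computation itself is small enough to write by hand. The following is a minimal sketch in plain JavaScript, assuming the column's raw values are available as an array of numbers; `pctChange` is a hypothetical helper, not a Danfo.js method:

```javascript
// Percent change between consecutive elements, mirroring the default
// behaviour of pandas' Series.pct_change (periods = 1): the first entry
// has no predecessor, so it is NaN.
function pctChange(values) {
  return values.map((value, i) =>
    i === 0 ? NaN : (value - values[i - 1]) / values[i - 1]
  );
}

const prices = [100, 110, 99];
const changes = pctChange(prices);
console.log(changes);
```

The resulting array could then be attached to a DataFrame as a new column.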

Error: Fail on fresh ubuntu 20.04 install

This could very well be user error but it would be great if someone could either validate that something has gone terribly wrong or not.

Using proxmox container image: ubuntu-20.04-standard_20.04-1_amd64.tar.gz

Steps:

  • apt update && apt dist-upgrade
  • apt install nodejs npm

At this point here are my versions:

  • python2 --version // Python 2.7.18rc1
  • python3 --version // Python 3.8.2
  • nodejs --version // v10.19.0
  • npm --version // 6.14.4

Now time for the fireworks.

npm install danfojs-node

> @tensorflow/[email protected] install /root/node_modules/@tensorflow/tfjs-node
> node scripts/install.js

CPU-linux-2.1.0.tar.gz
* Downloading libtensorflow
[==============================] 2774145/bps 100% 0.0s
* Building TensorFlow Node.js bindings

> [email protected] postinstall /root/node_modules/core-js
> node -e "try{require('./postinstall')}catch(e){}"

Thank you for using core-js ( https://github.com/zloirock/core-js ) for polyfilling JavaScript standard library!

The project needs your help! Please consider supporting of core-js on Open Collective or Patreon: 
> https://opencollective.com/core-js 
> https://www.patreon.com/zloirock 

Also, the author of core-js ( https://github.com/zloirock ) is looking for a good job -)

npm WARN saveError ENOENT: no such file or directory, open '/root/package.json'
npm notice created a lockfile as package-lock.json. You should commit this file.
npm WARN enoent ENOENT: no such file or directory, open '/root/package.json'
npm WARN @tensorflow/[email protected] requires a peer of @tensorflow/[email protected] but none is installed. You must install peer dependencies yourself.
npm WARN @tensorflow/[email protected] requires a peer of @tensorflow/[email protected] but none is installed. You must install peer dependencies yourself.
npm WARN @tensorflow/[email protected] requires a peer of @tensorflow/[email protected] but none is installed. You must install peer dependencies yourself.
npm WARN @tensorflow/[email protected] requires a peer of @tensorflow/[email protected] but none is installed. You must install peer dependencies yourself.
npm WARN @tensorflow/[email protected] requires a peer of @tensorflow/[email protected] but none is installed. You must install peer dependencies yourself.
npm WARN root No description
npm WARN root No repository field.
npm WARN root No README data
npm WARN root No license field.

+ [email protected]
added 128 packages from 121 contributors and audited 128 packages in 83.191s

3 packages are looking for funding
  run `npm fund` for details

found 0 vulnerabilities

npm init -y

index.js:

console.log("before");
const dfd = require("danfojs-node");
console.log("after");
root@tensor3:~# node index.js
before
Platform node has already been set. Overwriting the platform with [object Object].
Platform node has already been set. Overwriting the platform with [object Object].
node-pre-gyp info This Node instance does not support builds for N-API version 6
node-pre-gyp info This Node instance does not support builds for N-API version 6
Illegal instruction

For fun:

npm install @tensorflow/[email protected] @tensorflow/[email protected] @tensorflow/[email protected] @tensorflow/[email protected] @tensorflow/[email protected] @tensorflow/[email protected]

Packages installed fine. Same result.

nano node_modules/danfojs-node/dist/index.js

console.log('series danfo'); // This logs
var _series = require("./core/series");
console.log('frame danfo'); // This does not
var _frame = require("./core/frame");

nano node_modules/danfojs-node/dist/core/series.js

console.log('tf'); // This logs
var tf = _interopRequireWildcard(require("@tensorflow/tfjs-node"));
console.log('mathjs'); // This does not

Uh oh, seems to be a tensorflow issue.

I kept going down the rabbit hole and because I don't know how to use a debugger on the node side, it was painful.

nano node_modules/@tensorflow/tfjs-node/dist/index.js

It fails on this line: var bindings = require(bindingPath);

bindingPath is defined as:
var bindingPath = binary.find(path.resolve(path.join(__dirname, '/../package.json')));

logging bindingPath:
/root/node_modules/@tensorflow/tfjs-node/lib/napi-v5/tfjs_binding.node

Aha!

npm install @tensorflow/tfjs-node

Then I installed nvm and node v14.9.0

npm rebuild @tensorflow/tfjs-node build-addon-from-source

Now my index.js consists only of:
import * as tf from '@tensorflow/tfjs-node';

And I'm still getting that ILLEGAL INSTRUCTION. Clearly not a danfo issue, but still hoping someone here can point me in the right direction.

Feature Request: provide an ESM-conformant dist target

I noticed this while I was trying to import danfojs via the https://jspm.dev CDN.
Right now the distributed script of danfojs expects to be executed globally.
That is all fine and dandy as long as you either include the script tag directly in the DOM or load the script manually in the browser. However, it causes problems when you want to use danfojs in the context of an ES module based build system.
Since every variable declared inside an ES module is local to that module, the dfd var is not exposed and also cannot be accessed via the global window (since danfojs is run in its own context).
Is it possible to add an additional target in the webpack bundle that provides an ESM-conformant export of the dfd variable?
I think that should do the job.

Thank you for the hard work!

The query example does not work

It throws an error.
The example code is from https://danfo.jsdata.org/getting-started#selection,
in the section "Selecting values from a DataFrame works on string columns:"

let data = [{"A": ["Ng", "Yu", "Mo", "Ng"]},
{"B": [34, 4, 5, 6]},
{"C": [20, 20, 30, 40]}]
let df = new dfd.DataFrame(data)

df.print()

let query_df = df.query({ column: "A", is: "==", to: "Ng"})
query_df.print() //after query

index.min.js:10560
Shape: (3,1,4)

index.min.js:10560
╔═══╤═══════════════════╗
║   │ A                 ║
╟───┼───────────────────╢
║ 0 │ Ng,Yu,Mo,Ng       ║
╟───┼───────────────────╢
║ 1 │ 34,4,5,6          ║
╟───┼───────────────────╢
║ 2 │ 20,20,30,40       ║
╚═══╧═══════════════════╝

VM74817:1 Uncaught ReferenceError: Ng is not defined
at eval (eval at query (index.min.js:10277), :1:1)
at DataFrame.query (index.min.js:10277)
at :8:19

Support Apache Arrow input/output

Would you be open to supporting Apache Arrow as an input and export format? It would help with faster network transfers! They have a JS library available.

Thanks!

Dataframes Merge Issue

Joining with dfd.merge two dataframes fails whenever the left dataframe has more rows than the right dataframe (no matter if you do a left, right, inner or outer join).

Similarly, the output of a right join (with a longer dataframe as the right dataframe) is wrong. It outputs only the number of rows of the left dataframe.

A basic example that ends up failing on my machine:

const dfd = require("danfojs-node");

const arrayShort = [
    {
        label: "ABC",
        value: 1
    },
    {
        label: "DEF",
        value: 2
    },
    {
        label: "GHI",
        value: 3
    }
];
const dfShort = new dfd.DataFrame(arrayShort);

const arrayLong = [
    {
        label: "ABC",
        value2: 4
    },
    {
        label: "DEF",
        value2: 5
    },
    {
        label: "JKL",
        value2: 6
    },
    {
        label: "MNO",
        value2: 7
    }
];
const dfLong = new dfd.DataFrame(arrayLong);

console.log("Short: ");
dfShort.print();

console.log("Long: ");
dfLong.print();

const howStr = "left";
console.log("Long as right dataframe");
try {
    dfd.merge({
        left: dfShort,
        right:dfLong,
        on: ["label"],
        how: howStr
    }).print()
    console.log("Merge Succeeded");
} catch (err) {
    console.log("Merge failed");
    console.log(err);
}

console.log("Long as left dataframe");
try {
    dfd.merge({
        left: dfLong,
        right:dfShort,
        on: ["label"],
        how: howStr
    }).print()
    console.log("Merge Succeeded");
} catch (err) {
    console.log("Merge failed");
    console.log(err);
}

The second attempt to join fails.

Merge failed
TypeError: Cannot read property '0' of undefined
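
For reference, the result a left join is expected to produce when the longer frame is on the left can be sketched in plain JavaScript. This illustrates the semantics only and is not Danfo.js's implementation; `leftJoin` is a hypothetical helper that assumes the key is unique in the right-hand rows:

```javascript
// Plain-JavaScript left join on a single key column.
function leftJoin(left, right, key) {
  // Index the right rows by key (on duplicate keys the last row wins,
  // which is a simplification compared with a full relational join).
  const lookup = new Map(right.map((row) => [row[key], row]));
  // Collect right-side column names so unmatched rows get explicit nulls.
  const rightCols = [...new Set(right.flatMap((row) => Object.keys(row)))]
    .filter((col) => col !== key);
  return left.map((row) => {
    const match = lookup.get(row[key]);
    const out = { ...row };
    for (const col of rightCols) {
      out[col] = match ? match[col] : null;
    }
    return out;
  });
}

const shortRows = [
  { label: "ABC", value: 1 },
  { label: "DEF", value: 2 },
  { label: "GHI", value: 3 },
];
const longRows = [
  { label: "ABC", value2: 4 },
  { label: "DEF", value2: 5 },
  { label: "JKL", value2: 6 },
  { label: "MNO", value2: 7 },
];

// Longer frame on the left: every left row is kept, unmatched ones get null.
const joined = leftJoin(longRows, shortRows, "label");
console.log(joined);
```

Every row of the left input appears once in the output, so the result has as many rows as the left frame regardless of which side is longer.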

read_json method returns an error.

TypeError: source.startsWith is not a function
    at Object.read_json (/Users/<user>/Projects/ML/tfjs-trainer-server/node_modules/danfojs-node/dist/io/reader.js:48:16)

Inplace option for fillna

Currently fillna performs the filling and returns a new DataFrame. This will be quite expensive for large data. We need an inplace option here.
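
The difference between the two behaviours can be illustrated on a plain array. A minimal sketch; `fillNa` and `fillNaInplace` are hypothetical helpers, not Danfo.js functions:

```javascript
// Treat null, undefined and NaN as missing.
function isMissing(value) {
  return value === null || value === undefined || Number.isNaN(value);
}

// Copying variant: allocates a new array and leaves the input untouched.
function fillNa(values, fill) {
  return values.map((v) => (isMissing(v) ? fill : v));
}

// In-place variant: overwrites missing entries without allocating a copy.
function fillNaInplace(values, fill) {
  for (let i = 0; i < values.length; i++) {
    if (isMissing(values[i])) values[i] = fill;
  }
  return values;
}

const column = [1, NaN, 3, null];
const copy = fillNa(column, 0); // column is unchanged here
fillNaInplace(column, 0);       // column itself is now filled
console.log(copy, column);
```

The in-place variant avoids holding two full copies of the data in memory, which is the saving this request is after.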

Inplace drop column not working

I created a dataframe with a column called 'raw_timestamp'. Then I derived two additional columns out of it, 'raw_date' and 'raw_time', in the following way:
df = new dfd.DataFrame(raw_data, {columns:column_names});
df.addColumn({"column":"raw_date","value":df["raw_timestamp"].apply((x) => { return x.split(" ")[0]})});
df.addColumn({"column":"raw_time","value":df["raw_timestamp"].apply((x) => { return x.split(" ")[1]})});

After creating these columns, I decided to drop the column 'raw_timestamp' using the following line:
df.drop({columns:["raw_timestamp"],axis:1,inplace:true});

However, when I try to run this, I get an error saying:
TypeError: Cannot delete property 'raw_date' of [object Object]
    at /home/ghost/Desktop/Balanced/balanced-nodejs-server/node_modules/danfojs-node/dist/core/frame.js:1791:19
    at Array.forEach (<anonymous>)
    at DataFrame.__set_col_property (/home/ghost/Desktop/Balanced/balanced-nodejs-server/node_modules/danfojs-node/dist/core/frame.js:1790:19)
    at DataFrame.drop (/home/ghost/Desktop/Balanced/balanced-nodejs-server/node_modules/danfojs-node/dist/core/frame.js:129:14)
    at /home/ghost/Desktop/Balanced/balanced-nodejs-server/index.js:50:6
    at Layer.handle [as handle_request] (/home/ghost/Desktop/Balanced/balanced-nodejs-server/node_modules/express/lib/router/layer.js:95:5)
    at next (/home/ghost/Desktop/Balanced/balanced-nodejs-server/node_modules/express/lib/router/route.js:137:13)
    at Route.dispatch (/home/ghost/Desktop/Balanced/balanced-nodejs-server/node_modules/express/lib/router/route.js:112:3)
    at Layer.handle [as handle_request] (/home/ghost/Desktop/Balanced/balanced-nodejs-server/node_modules/express/lib/router/layer.js:95:5)
    at /home/ghost/Desktop/Balanced/balanced-nodejs-server/node_modules/express/lib/router/index.js:281:22

This issue occurs when I try to drop the column in place.
However, when I drop the column and save the result as a new dataframe, things work just fine.
df1 = df.drop({columns:["raw_timestamp"],axis:1});
Am I doing something wrong here?

Thank you.

Error dfd.merge

Hello, I have two DataFrames that I am trying to join on a common column, but I receive the following error:

let df_merge = dfd.merge({ left: df_surveys, right: df_stacked_row, on: ["survey"], how: "left" });
df_merge.print();

>>> Uncaught TypeError: n.filter is not a function

// df_surveys
╔═══╤═══════════════════╗
║   │ survey            ║
╟───┼───────────────────╢
║ 0 │ Padron Nominal    ║
╟───┼───────────────────╢
║ 1 │ Limpieza y de...  ║
╟───┼───────────────────╢
║ 2 │ Indicadores d...  ║
╟───┼───────────────────╢
║ 3 │ Visita domici...  ║
╟───┼───────────────────╢
║ 4 │ Georref. de c...  ║
╚═══╧═══════════════════╝

// df_stacked_row
╔════╤═══════════════════╤═══════════════════╤═══════════════════╗
║    │ survey            │ calc_depa         │ cod_eval_coun...  ║
╟────┼───────────────────┼───────────────────┼───────────────────╢
║ 2  │ Padron Nominal    │ 8                 │ 1682              ║
╟────┼───────────────────┼───────────────────┼───────────────────╢
║ 5  │ Limpieza y de...  │ 8                 │ 160               ║
╟────┼───────────────────┼───────────────────┼───────────────────╢
║ 8  │ Indicadores d...  │ 8                 │ 78                ║
╟────┼───────────────────┼───────────────────┼───────────────────╢
║ 14 │ Georref. de c...  │ 8                 │ 235               ║
╚════╧═══════════════════╧═══════════════════╧═══════════════════╝

Excuse my English.

Creating empty DataFrames as in pandas

newDF = pd.DataFrame()

In danfojs, the DataFrame class expects two parameters, the first being data of type object[] or array[]. I tried passing both, but I am getting an error.

1. core.js:4442 ERROR Error: Uncaught (in promise): TypeError: Cannot read property 'length' of undefined
TypeError: Cannot read property 'length' of undefined
    at Utils.__get_t (utils.js:245)

2. Uncaught (in promise): TypeError: Cannot read property 'length' of undefined
TypeError: Cannot read property 'length' of undefined

Using danfojs v 0.1.2 (client)

Not able to query data after reading from CSV

I am reading data from a csv file. I want to query data with a very simple statement:

dfd.read_csv(dataPath)
.then(df => {
  let query_df = df.query({ column: "A", is: "==", to: "foo"});
  query_df.print();
})

It gives me this error: ReferenceError: bar is not defined, where "bar" is one of the values in column "A".

StandardScaler results

Why don't Danfo.js StandardScaler results equal Python's sklearn StandardScaler results?
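
One possible cause, offered here as an assumption rather than something confirmed in this issue, is the divisor used for the standard deviation. scikit-learn's StandardScaler uses the population standard deviation (divisor n), whereas many libraries default to the sample standard deviation (divisor n - 1). The sketch below shows how much the two conventions differ; `standardize` is a hypothetical helper:

```javascript
// Standardize values to zero mean and unit variance.
// ddof = 0 divides by n (population std, as scikit-learn does);
// ddof = 1 divides by n - 1 (sample std).
function standardize(values, ddof = 0) {
  const n = values.length;
  const mean = values.reduce((acc, v) => acc + v, 0) / n;
  const variance =
    values.reduce((acc, v) => acc + (v - mean) ** 2, 0) / (n - ddof);
  const std = Math.sqrt(variance);
  return values.map((v) => (v - mean) / std);
}

const x = [1, 2, 3, 4, 5];
const population = standardize(x, 0);
const sample = standardize(x, 1);
console.log(population, sample);
```

For small inputs the gap is noticeable, and it shrinks as the number of rows grows, so comparing both variants against the observed output is a quick way to test this hypothesis.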

Integrate Danfo.js with data.js for efficient loading of more files.

data.js is a lightweight, standardized "stream-plus-metadata" interface for accessing files and datasets, especially tabular ones (CSV, Excel).

data.js follows the "Frictionless Data Lib Pattern".

  • Open it fast: simple open method for data on disk, online and inline
  • Data plus: data plus metadata (size, path, etc) in standardized way
  • Stream it: raw streams and object streams
  • Tabular: open CSV, Excel or arrays and get a row stream
  • Frictionless: compatible with Frictionless Data standards

Extending the reader function means we'll have a generic read function that uses data.js to load CSV files, Excel files, and Datasets from a path or a description following the Frictionless Data specs.

Acceptance Criteria

  • read method works in Node for CSV, Excel, and Datasets
  • read method works in the browser for CSV, Excel, and Datasets

Task

  • Add read function to load files as Stream with data.js
  • Add function to convert stream to array in node environment
  • Add function to convert stream to array in browser environment (Currently Here, HeLp !!! 😢)

Editing required to run latest version on Windows 10

In @tensorflow/tfjs-node/dist/index.js, line 58:
var bindings = bindingPath; // var bindings = require(bindingPath);
In @tensorflow/tfjs-node/dist/nodejs_kernel_backend.js, line 77:
_this.isUsingGpuDevice = false; // _this.isUsingGpuDevice = _this.binding.isUsingGpuDevice(); - function doesn't exist

In place query and editing

I wish there were something like pandas' 'eval'.

Currently df.query returns a new data frame, which is expensive when updating and cleaning data.

[read_csv] Support relative file paths

With pandas, we can pass a relative filepath into read_csv.

For example:

pd.read_csv('data.csv')  

Or:

pd.read_csv("../data_folder/data.csv")

In pandas, such paths are resolved relative to the current working directory of the running process.

Are there plans to support this, and would a pull-request be welcomed?

Typo in error message

Minor issue, but I thought I should raise it. This error comes up when you pass more values than the columns specified:


let tensor_arr = [[12, 34, 2.2, 2], [30, 30, 2.1, 7]]
let df = new dfd.DataFrame(tensor_arr, {columns: ["A", "B"]})


Column length mismatch. You provided a column of length 2 but data has **lenght** of 4

Creating DataFrame from json that contains "index" key/property causes error

The Following Code:

const jsonData = [{name:'test1',index: 0},{name:'test2',index: 1}]
const df = new dfd.DataFrame(jsonData);

This will cause the error: UnhandledPromiseRejectionWarning: TypeError: Cannot set property index of [object Object] which has only a getter

Looks like within the DataFrame class "index" is reserved


class DataFrame extends _generic.default {
  constructor(data, kwargs) {
    super(data, kwargs);

    this.__set_column_property();
  }
 __set_column_property() {
    let col_vals = this.col_data;
    let col_names = this.column_names;
    col_vals.forEach((col, i) => {
      this[col_names[i]] = null; // <<<< Error happens here
      Object.defineProperty(this, col_names[i], {
        get() {
          return new _series.Series(this.col_data[i], {
            columns: col_names[i],
            index: this.index
          });
        },

        set(value) {
          this.addColumn({
            column: col_names[i],
            value: value
          });
        }

      });
    });
  }
}
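
Until this is handled in the library, a possible workaround is to rename the clashing key before building the DataFrame. The sketch below assumes "index" is the only reserved name (it is the only one seen in the error above); renameReservedKeys and the "_col" suffix are hypothetical.

```javascript
// Assumption: only "index" clashes with a DataFrame property.
const RESERVED = new Set(["index"]);

// Return new row objects in which reserved keys get a suffix appended.
function renameReservedKeys(rows, suffix = "_col") {
  return rows.map((row) => {
    const out = {};
    for (const [key, value] of Object.entries(row)) {
      out[RESERVED.has(key) ? key + suffix : key] = value;
    }
    return out;
  });
}

const jsonData = [{ name: "test1", index: 0 }, { name: "test2", index: 1 }];
const safeData = renameReservedKeys(jsonData);
// safeData: [{ name: "test1", index_col: 0 }, { name: "test2", index_col: 1 }]
```

The renamed rows could then be passed to the DataFrame constructor in place of jsonData.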

Feature request: read netCDF / HDF5 file option.

Binary formats are heavily used in the Earth sciences and can provide an easy way to lazy load and manage large amounts of data.

I know it is likely out of scope, but it would be a very useful tool if it is at all possible.

iloc function for Series

The Series class should support integer-location indexing (iloc). This should be a simple slice of the values making up a Series, returned as a new Series.
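
Roughly, the slicing could look like the sketch below. It models a Series as a plain object with values and index arrays, which is an assumption for illustration; seriesIloc is not an existing Danfo.js function.

```javascript
// Slice a series-like object by integer position, keeping index labels aligned.
// `end` is exclusive, as with Array.prototype.slice.
function seriesIloc(series, start, end) {
  return {
    values: series.values.slice(start, end),
    index: series.index.slice(start, end),
  };
}

const s = { values: [10, 20, 30, 40], index: ["a", "b", "c", "d"] };
const sliced = seriesIloc(s, 1, 3);
// sliced: { values: [20, 30], index: ["b", "c"] }
```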

Feature : Load data from file

As of now we can only load from https:// URLs. I tried an http URL, and I tried reading the JSON file and passing its contents to the method; neither worked.

It would be great to support reading from local files and http URLs too.

Creating a DataFrame from JSON data that contains a null value fails

I tried to create a DataFrame from JSON data, but I got an error because the data contains a null value. Can you handle this?

Below is the exception stack from the browser:
index.min.js:10277 Uncaught TypeError: Cannot read property 'toString' of null
at index.min.js:10277
at Array.forEach (<anonymous>)
at index.min.js:10277
at Array.forEach (<anonymous>)
at a.__get_t (index.min.js:10277)
at DataFrame.__set_col_types (index.min.js:10560)
at DataFrame.__read_object (index.min.js:10560)
at new u (index.min.js:10560)
at new DataFrame (index.min.js:10277)
at <anonymous>:1:1
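
A possible stopgap is to replace null values before constructing the DataFrame. The sketch below swaps null and undefined for NaN, on the assumption that NaN is accepted as the missing-value marker in the affected version; nullsToNaN is a hypothetical helper.

```javascript
// Return new row objects in which null/undefined values are replaced by NaN.
function nullsToNaN(rows) {
  return rows.map((row) => {
    const out = {};
    for (const [key, value] of Object.entries(row)) {
      out[key] = value === null || value === undefined ? NaN : value;
    }
    return out;
  });
}

const cleaned = nullsToNaN([
  { A: 1, B: null },
  { A: null, B: "x" },
]);
// cleaned[0].B and cleaned[1].A are NaN; the other values are unchanged.
```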

Wrong exception thrown for empty dataframe

Using the getting-started query example:
https://danfo.jsdata.org/api-reference/dataframe/danfo.dataframe.query

let data = {"A": [30, 1, 2, 3],
           "B": [34, 4, 5, 6],
           "C": [20, 20, 30, 40]}
           
let cols = ["A", "B", "C"]
let df = new dfd.DataFrame(data, { columns: cols })
df.print() //before query

let query_df = df.query({ "column": "B", "is": ">", "to": 5 })
table(query_df)

I modified the condition to

let data = {"A": [30, 1, 2, 3],
           "B": [34, 4, 5, 6],
           "C": [20, 20, 30, 40]}
           
let cols = ["A", "B", "C"]
let df = new dfd.DataFrame(data, { columns: cols })
df.print() //before query

let query_df = df.query({ "column": "B", "is": ">", "to": 40 })
table(query_df)

Error message

File format not supported for now

Expected Output: Return empty result/array
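
To illustrate the expected behaviour, here is a small stand-alone filter over column arrays that returns empty columns when nothing matches. It borrows the shape of the query argument from the example above, but queryRows and the OPERATORS table are made up for this sketch and are not how df.query is implemented.

```javascript
// Comparison operators supported by this sketch.
const OPERATORS = {
  ">": (a, b) => a > b,
  "<": (a, b) => a < b,
  ">=": (a, b) => a >= b,
  "<=": (a, b) => a <= b,
  "==": (a, b) => a === b,
  "!=": (a, b) => a !== b,
};

// Keep only the rows where data[column] <is> to; never throws on zero matches.
function queryRows(data, { column, is, to }) {
  const compare = OPERATORS[is];
  if (!compare) throw new Error(`Unsupported operator: ${is}`);
  const names = Object.keys(data);
  const result = Object.fromEntries(names.map((name) => [name, []]));
  data[column].forEach((value, i) => {
    if (compare(value, to)) {
      for (const name of names) result[name].push(data[name][i]);
    }
  });
  return result;
}

const data = { A: [30, 1, 2, 3], B: [34, 4, 5, 6], C: [20, 20, 30, 40] };
const some = queryRows(data, { column: "B", is: ">", to: 5 });
const none = queryRows(data, { column: "B", is: ">", to: 40 });
// some: { A: [30, 3], B: [34, 6], C: [20, 40] }
// none: { A: [], B: [], C: [] }
```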

Sorting DataFrame on encoded String corrupts data

Hello,
A DataFrame that contains strings cannot be sorted directly, so I used the label encoder to create a new column filled with the encoded labels. Sorting on the encoded labels corrupts the DataFrame.

Here is a browser example:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/index.min.js"></script>
    <title>Document</title>
  </head>
  <body>
    <script>
      df = new dfd.DataFrame({
        X1: ["c", "a", "b", "c", "c", "a", "b"],
        X2: ["^", "%", "!", "#", "$", "&", "*"],
      });
      let encoder = new dfd.LabelEncoder();
      let x2 = df.X1.unique().values.sort();
      console.log(x2);
      encoder.fit(x2);
      df.addColumn({ column: "X1_encoded", value: encoder.transform(df.X1) });
      df2 = df.sort_values({ by: "X1_encoded" });

      df.print()
      df2.print()
    </script>
  </body>
</html>

The output is:

i X1 X2 X1_encoded
0 c ^ 2
1 a % 0
2 b ! 1
3 c # 2
4 c $ 2
5 a & 0
6 b * 1
i X1 X2 X1_encoded
1 a % 0
1 a % 0
2 b ! 1
2 b ! 1
0 c ^ 2
0 c ^ 2

Notice that the rows are all the same for each level of X1_encoded. The original rows 3-6 are lost.
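
For comparison, this is the result I would expect, sketched with plain arrays: compute a stable ordering of row positions from the encoded column, then apply that same ordering to every column. sortFrameBy is a hypothetical helper, not the sort_values implementation.

```javascript
// Sort all columns of a column-oriented object by the numeric column `by`.
// Ties keep their original relative order (stable), so no row is duplicated or lost.
function sortFrameBy(frame, by) {
  const keys = frame[by];
  const order = keys
    .map((_, i) => i)
    .sort((a, b) => keys[a] - keys[b] || a - b);
  const sorted = {};
  for (const name of Object.keys(frame)) {
    sorted[name] = order.map((i) => frame[name][i]);
  }
  return { order, sorted };
}

const frame = {
  X1: ["c", "a", "b", "c", "c", "a", "b"],
  X2: ["^", "%", "!", "#", "$", "&", "*"],
  X1_encoded: [2, 0, 1, 2, 2, 0, 1],
};
const { order, sorted } = sortFrameBy(frame, "X1_encoded");
// order:     [1, 5, 2, 6, 0, 3, 4]
// sorted.X2: ["%", "&", "!", "*", "^", "#", "$"]
```

Every original row appears exactly once in the output, which is what the sorted DataFrame above fails to do.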

Integrate with vue 3

This seems like a novice question, and yet I don't understand it. How do I import and use a third-party JS library in Vue 3? In this case I'm trying to use Danfo.js (https://danfo.jsdata.org/getting-started) by doing npm install danfojs. The docs only show the CDN for browser use, but I think this is the same thing; correct me if I'm wrong. I don't know whether I should import it in each file where I want to use it, or import it once in main.js so that it works globally. I tried this in main.js:

import { createApp } from 'vue'
import App from './App.vue'

import danfo from 'danfojs';

const app = createApp(App)
app.use(danfo)
app.mount('#app')

I don't know if that's correct, but if so, how do I call dfd from inside the setup() of a component?

function danfoTest(){
      console.log('idk', dfd)
      const json_data = [
        { A: 0.4612, B: 4.28283, C: -1.509, D: -1.1352 },
        { A: 0.5112, B: -0.22863, C: -3.39059, D: 1.1632 },
        { A: 0.6911, B: -0.82863, C: -1.5059, D: 2.1352 },
        { A: 0.4692, B: -1.28863, C: 4.5059, D: 4.1632 }]
                    
        const df = new dfd.DataFrame(json_data)
        console.log('here')
        df['A'].print()
    }

I don't know if this is a lack of Vue 3 understanding or a lack of Danfo.js understanding, but either way I would appreciate some help. Thanks!

Also, is it possible that the CDN is the only option? When I added the <script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/index.min.js"></script> tag to index.js, it did work, but I was getting errors in the terminal about dfd not being defined, even though calling dfd worked. I assume that is because the script loads later. Either way, I think I want the npm install route. I believe npm install danfojs-node is for the Node version, not the browser version, which is why I did npm install danfojs.

Also, a side question: is this a long-term supported project with tensor, or more of a side project?
