
dynamodbtocsv's Introduction

AWS DynamoDBtoCSV

Join the chat at https://gitter.im/edasque/DynamoDBtoCSV

This application will export the content of a DynamoDB table into CSV (comma-separated values) output. All you need to do is update config.json with your AWS credentials and region.
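For reference, config.json typically looks like the following (a minimal sketch; the key ID, secret, and region shown are placeholders):

{
  "accessKeyId": "YOUR_ACCESS_KEY_ID",
  "secretAccessKey": "YOUR_SECRET_ACCESS_KEY",
  "region": "us-east-1"
}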

The output is comma-separated, and each field is enclosed in double quotes ("). Double quotes within the data are escaped.

This software is governed by the Apache 2.0 license.

Usage

Typically, to use it you'd run:

node dynamoDBtoCSV.js -t Hourly_ZEDO_Impressions_by_IP > output.csv

or even:

node dynamoDBtoCSV.js -t Hourly_ZEDO_Impressions_by_IP -f output.csv

to export to CSV.

Use -d to describe the table first, so you have an idea of the number of rows you are going to export:

node dynamoDBtoCSV.js -t Hourly_ZEDO_Impressions_by_IP -d

to get some information about the table.

Full syntax is:

node dynamoDBtoCSV.js --help
	Usage: dynamoDBtoCSV.js [options]

Options:

  -h, --help                            output usage information
  -V, --version                         output the version number
  -t, --table [tablename]               Add the table you want to output to csv
  -i, --index [indexname]               Add the index you want to output to csv
  -k, --keyExpression [keyExpression]   The name of the partition key to filter results on
  -v, --KeyExpressionValues [value]     The expression for filtering on the primary key
  -S, --select [list of fields]         The list of fields to select on
  -c, --count                           Only get count, requires -pk flag
  -a, --stats [fieldname]               Gets the count of all occurrences by a specific field name
                                        (only string fields are supported presently)
  -e, --endpoint [url]                  Endpoint URL, can be used to dump from local DynamoDB
  -f, --file [file]                     Name of the file to be created
  -d, --describe                        Describe the table
  -p, --profile [profile]               Use profile from your credentials file
  -ec, --envcreds                       Load AWS Credentials using the AWS Credential Provider Chain

Pre-requisites

You'll need to install a few modules, including:

  • aws-sdk
  • commander
  • dynamodb-marshaler
  • papaparse

npm install

should do it.

Example output

"HashOf10","DateIPAdID","adcount"
"37693cfc748049e45d87b8c7d8b9aacd","2013011720024058205168000000010002","1"
"37693cfc748049e45d87b8c7d8b9aacd","2013011720050084232194000000010002","1"

Advanced queries

Output a selection of columns

node dynamoDBtoCSV.js -t my-table -i rule_type_id_index -k "rule_type_id = :v1" -v "{\":v1\": {\"S\": \"my_primary_key_value\"}}" -s "rule_type_id, created_by" -r us-west-2

Output stats

node dynamoDBtoCSV.js -t my-table -i rule_type_id_index -k "rule_type_id = :v1" -v "{\":v1\": {\"S\": \"my_primary_key_value\"}}" -s "rule_type_id, created_by" -r us-west-2 -a created_by

dynamodbtocsv's People

Contributors

andarilhoz, arfe, brianfu9, darthbear, dependabot[bot], edasque, ejwood79, gitter-badger, hannahapuan, jcn, kloppster, lvirginie, mporracindie, msambol, okofish, purohit, rhargraves, ricardclau, tcchau, trevor-scott


dynamodbtocsv's Issues

Unexpected key 'Limit' found in params

node dynamoDBtoCSV.js -d -t my_table

config.json:

{
    "accessKeyId": "XXXXX",
    "secretAccessKey": "XXX",
    "region": "us"
}

fails with:

{ [UnexpectedParameter: Unexpected key 'Limit' found in params]
  message: 'Unexpected key \'Limit\' found in params',
  code: 'UnexpectedParameter',
  time: Thu Jun 19 2014 12:19:10 GMT+0200 (CEST) }

Limit not working

Limit does not appear to be working.

It defaults to 1000, but several of my scans returned more than 1000.

@edasque

Add option to write a file

Currently the only way to save the data is the ">" redirection operator in bash. That just dumps everything to the target file, we get no kind of progress feedback, and in my case it even mangled the character encoding. It would be great to have an fs-based function that writes the output to a file name given in the options.

I could do a PR with this change.
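A minimal sketch of what that could look like, assuming the CSV string has already been produced by Papa.unparse; the writeOutput helper and its arguments are hypothetical, not part of the current script:

const fs = require("fs");

// Hypothetical helper: write the CSV to the file given via -f, or fall back
// to stdout when no file name was provided. Writing with an explicit UTF-8
// encoding also avoids the character-encoding problems seen with ">".
function writeOutput(csv, fileName) {
  if (fileName) {
    fs.writeFileSync(fileName, csv, { encoding: "utf8" });
    console.error("Wrote " + csv.length + " characters to " + fileName);
  } else {
    process.stdout.write(csv);
  }
}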

Export not working because of -v parameter

Hello,

Since this change:
f524ce4

exporting from DynamoDB no longer worked for me.

I was getting this exception:
SyntaxError: Unexpected token u in JSON at position 0
at JSON.parse ()
at Object. (/opt/DynamoDBtoCSV/dynamoDBtoCSV.js:89:35)
at Module._compile (internal/modules/cjs/loader.js:778:30)
at Object.Module._extensions..js (internal/modules/cjs/loader.js:789:10)
at Module.load (internal/modules/cjs/loader.js:653:32)
at tryModuleLoad (internal/modules/cjs/loader.js:593:12)
at Function.Module._load (internal/modules/cjs/loader.js:585:3)
at Function.Module.runMain (internal/modules/cjs/loader.js:831:12)
at startup (internal/bootstrap/node.js:283:19)
at bootstrapNodeJSCore (internal/bootstrap/node.js:623:3)

My command was this and it was working fine:
node /opt/DynamoDBtoCSV/dynamoDBtoCSV.js -t table_name -f /opt/DynamoDBtoCSV/csv/export.csv

After your pull request was merged, I had to change the command to this to make it work again:
node /opt/DynamoDBtoCSV/dynamoDBtoCSV.js -v "{}" -t table_name -f /opt/DynamoDBtoCSV/csv/export.csv

Will you be changing it back, or will the -v parameter be mandatory from now on?

Thanks in advance.
Slobodan
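The error comes from JSON.parse being called on an undefined -v value ("Unexpected token u" is the "u" in "undefined"). A possible defensive fix, sketched here rather than taken from the maintainer's actual patch, is to parse the value only when the option is supplied:

// Only parse the expression values when -v was actually passed; otherwise
// leave them undefined so a plain table export still works.
// (Property name assumed from the -v/--KeyExpressionValues option.)
const rawValues = program.KeyExpressionValues;
const keyExpressionValues = rawValues ? JSON.parse(rawValues) : undefined;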

Out of memory on a huge number of records

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory

It happens with a huge number of records; in my case, about 4 million.
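This is expected when every row is accumulated in memory before Papa.unparse is called. One possible direction, sketched below assuming aws-sdk v2 and papaparse (the scanToCsv helper and its use of AWS.DynamoDB.Converter instead of dynamodb-marshaler are illustrative, not the script's current code), is to write each scanned page to disk as it arrives:

const AWS = require("aws-sdk");
const fs = require("fs");
const Papa = require("papaparse");

// Sketch: stream each scan page straight to a file instead of holding all
// rows in memory, which helps with multi-million-record tables.
// Assumes every item shares the same attributes, so the header derived from
// the first page fits the later pages too.
async function scanToCsv(tableName, fileName) {
  const dynamodb = new AWS.DynamoDB();
  const out = fs.createWriteStream(fileName);
  let ExclusiveStartKey;
  let wroteHeader = false;
  do {
    const page = await dynamodb.scan({ TableName: tableName, ExclusiveStartKey }).promise();
    const rows = page.Items.map((item) => AWS.DynamoDB.Converter.unmarshall(item));
    out.write(Papa.unparse(rows, { header: !wroteHeader }) + "\n");
    wroteHeader = true;
    ExclusiveStartKey = page.LastEvaluatedKey;
  } while (ExclusiveStartKey);
  out.end();
}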

"You must specify a table"

I've filled out my config.json file and am attempting to run the script, but I get the recurring error "You must specify a table" no matter what I try. I've tried wrapping my table name in quotes, but no dice.

e.g.: node dynamoDBtoCSV.js -t "tableName" > output.csv

Cannot connect to DynamoDB

I get the error below when running the script:

{ [NetworkingError: connect ECONNREFUSED]
message: 'connect ECONNREFUSED',
code: 'NetworkingError',
errno: 'ECONNREFUSED',
syscall: 'connect',
region: 'eu-central-1',
hostname: 'dynamodb.eu-central-1.amazonaws.com',
retryable: true,
time: Fri Oct 14 2016 10:29:33 GMT+0000 (UTC) }

[Bug] Profiles aren't properly handled, returning wrong credentials

I won't make a PR because the existing ones seem dead and I don't want to waste my time, but be aware that profiles aren't handled correctly as things stand.

Current version (buggy):

if (program.profile) {
  var newCreds = AWS.config.credentials;
  newCreds.profile = program.profile;
  AWS.config.update({ credentials: newCreds });
}

This doesn't work as expected: it doesn't select the credentials for the requested profile, but takes them from the first defined profile. It may have worked with a very simple setup (a single profile), but it doesn't really handle multiple profiles.


Fixed version:

if (program.profile) {
  var newCreds = new AWS.SharedIniFileCredentials({ profile: program.profile });
  newCreds.profile = program.profile;
  AWS.config.update({ credentials: newCreds });
}

See https://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/loading-node-credentials-shared.html

TypeError: undefined is not a function

CentOS 6.5 64-bit. All pre-requisites installed.

[root@default DynamoDBtoCSV]# node dynamoDBtoCSV.js -t events_log_080414 > events_log_080414.csv

/var/www/html/DynamoDBtoCSV/dynamoDBtoCSV.js:4
var dynamoDB = new AWS.DynamoDB.Client();
^
TypeError: undefined is not a function
at Object. (/var/www/html/DynamoDBtoCSV/dynamoDBtoCSV.js:4:16)
at Module._compile (module.js:456:26)
at Object.Module._extensions..js (module.js:474:10)
at Module.load (module.js:356:32)
at Function.Module._load (module.js:312:12)
at Function.Module.runMain (module.js:497:10)
at startup (node.js:119:16)
at node.js:902:3

Assume Role

Can we assume a role and use the same script?
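There is no built-in flag for this in the options listed above. One hedged workaround with aws-sdk v2 is to point the SDK at temporary credentials for the role before the script creates its DynamoDB client; the role ARN and session name below are placeholders:

const AWS = require("aws-sdk");

// Assume the target role via STS; every AWS client created afterwards
// (including the script's DynamoDB client) will use these credentials.
AWS.config.credentials = new AWS.ChainableTemporaryCredentials({
  params: {
    RoleArn: "arn:aws:iam::123456789012:role/DynamoDBExportRole", // placeholder
    RoleSessionName: "dynamodb-to-csv",
  },
});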

Does the script really recognize CLI options?

I wonder whether some code is missing from dynamoDBtoCSV.js.

I cloned this repository, thinking "Wow, this is exactly what I wanted", and ran the script following the instructions in the README.

But running it produced a lot of "undefined" values. Inspecting the code, I think some pieces are missing. After adding the code below and adjusting the related parts, it finally ran properly:

const options = program.opts();

The official documentation for the "commander" library says we have to add the call above:
https://www.npmjs.com/package/commander

If the script is missing some components, I will make a pull request.

Please confirm the intended behavior of this script.

Thanks!
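For context, this is the pattern newer versions of commander expect (a minimal sketch using only one of the script's options; whether the repository targets that commander version is an assumption):

const { program } = require("commander");

program
  .version("0.0.1")
  .option("-t, --table [tablename]", "Add the table you want to output to csv");

program.parse(process.argv);

// In commander v7+, parsed options live on program.opts(), not on program itself.
const options = program.opts();
console.log(options.table); // program.table would be undefined here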

Cannot find module 'commander' - a more detailed description in readme would be nice.

I tried to run the script in bash on Mac OS X and had some issues. I'm not a developer and am just trying, as a "noob", to convert a DynamoDB JSON file into CSV to do some fancy Excel stuff with it.

Problem 1) Get the script running.

  • What is the prerequisite "commander"? A module? Where can I get it? I looked in "brew" but didn't find it. I have the latest Command Line Tools 8.3.2 for Xcode on my Mac.

Problem 2) Once I'm able to run the script, I guess a DynamoDB export file (~600 MB, via AWS Data Pipeline) might be a bit too much?

Current error:

sh-3.2# node dynamoDBtoCSV.js --help
module.js:472
throw err;
^

Error: Cannot find module 'commander'
at Function.Module._resolveFilename (module.js:470:15)
at Function.Module._load (module.js:418:25)
at Module.require (module.js:498:17)
at require (internal/module.js:20:19)
at Object. (/Users/Machine/Downloads/Data/dynamoDBtoCSV.js:1:77)
at Module._compile (module.js:571:32)
at Object.Module._extensions..js (module.js:580:10)
at Module.load (module.js:488:32)
at tryModuleLoad (module.js:447:12)
at Function.Module._load (module.js:439:3)
sh-3.2#

My question: what is "commander", where can I get it, and how do I install it? Thanks for your support.

papaparse and dynamodb-marshaler

Thanks for the tool!
Just a little update for your readme:

The following are also necessary:
npm install dynamodb-marshaler
npm install papaparse

Limit the request rate

It would be great (and hopefully not hard) to add a config option for how quickly it makes requests to DynamoDB, in order not to exceed the read limits.
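A rough sketch of what such a setting could look like, assuming the scan is paginated with LastEvaluatedKey; the scanPageDelayMs value, the onPage callback, and the throttledScan helper are hypothetical:

// Pause between scan pages so the export stays under the table's read capacity.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function throttledScan(dynamodb, params, scanPageDelayMs, onPage) {
  let ExclusiveStartKey;
  do {
    const page = await dynamodb.scan({ ...params, ExclusiveStartKey }).promise();
    onPage(page.Items); // hand each page to the existing CSV handling
    ExclusiveStartKey = page.LastEvaluatedKey;
    if (ExclusiveStartKey) await delay(scanPageDelayMs);
  } while (ExclusiveStartKey);
}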

Error: connect ETIMEDOUT

{ Error: connect ETIMEDOUT 52.94.17.76:443
at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1113:14)
message: 'connect ETIMEDOUT 52.94.17.76:443',
errno: 'ETIMEDOUT',
code: 'NetworkingError',
syscall: 'connect',
address: '52.94.17.76',
port: 443,
region: 'eu-central-1',
hostname: 'dynamodb.eu-central-1.amazonaws.com',
retryable: true,
time: 2019-01-09T14:44:42.333Z }

Question about 'region'

Could anyone tell me what goes in 'region'? How can I find the region for my AWS account? Can I see this parameter somewhere in my AWS account?
Thank you so much!

SyntaxError when running on Ubuntu

I'm getting this error when running on Ubuntu:

`/home/ubuntu/DynamoDBtoCSV/dynamoDBtoCSV.js:74
console.log(Papa.unparse( { fields: [ ...headers ], data: unMarshalledArray } ));
^^^

SyntaxError: Unexpected token ...
at exports.runInThisContext (vm.js:53:16)
at Module._compile (module.js:374:25)
at Object.Module._extensions..js (module.js:417:10)
at Module.load (module.js:344:32)
at Function.Module._load (module.js:301:12)
at Function.Module.runMain (module.js:442:10)
at startup (node.js:136:18)
at node.js:966:3`

Versions:
npm 3.5.2
node 4.2.6

Dummy line

I'm getting a 'dummy' line; how can I prevent this from happening?

Error on MacOS

mac:DynamoDBtoCSV-master mac$ node dynamoDBtoCSV.js 
/Users/mac/Documents/R2R/aws/DynamoDBtoCSV-master/dynamoDBtoCSV.js:55
        console.log(Papa.unparse( { fields: [ ...headers ], data: unMarshalled
                                              ^
SyntaxError: Unexpected token .
    at exports.runInThisContext (vm.js:73:16)
    at Module._compile (module.js:443:25)
    at Object.Module._extensions..js (module.js:478:10)
    at Module.load (module.js:355:32)
    at Function.Module._load (module.js:310:12)
    at Function.Module.runMain (module.js:501:10)
    at startup (node.js:129:16)
    at node.js:814:3

UnrecognizedClientException: The security token included in the request is invalid.

I use your application as described, but I see the following error:
{ UnrecognizedClientException: The security token included in the request is invalid.
at Request.extractError (/Users//Documents/DynamoDBtoCSV/node_modules/aws-sdk/lib/protocol/json.js:48:27)
at Request.callListeners (/Users//Documents/DynamoDBtoCSV/node_modules/aws-sdk/lib/sequential_executor.js:105:20)
at Request.emit (/Users//Documents/DynamoDBtoCSV/node_modules/aws-sdk/lib/sequential_executor.js:77:10)
at Request.emit (/Users//Documents/DynamoDBtoCSV/node_modules/aws-sdk/lib/request.js:683:14)
at Request.transition (/Users//Documents/DynamoDBtoCSV/node_modules/aws-sdk/lib/request.js:22:10)
at AcceptorStateMachine.runTo (/Users//Documents/DynamoDBtoCSV/node_modules/aws-sdk/lib/state_machine.js:14:12)
at /Users//Documents/DynamoDBtoCSV/node_modules/aws-sdk/lib/state_machine.js:26:10
at Request. (/Users//Documents/DynamoDBtoCSV/node_modules/aws-sdk/lib/request.js:38:9)
at Request. (/Users//Documents/DynamoDBtoCSV/node_modules/aws-sdk/lib/request.js:685:12)
at Request.callListeners (/Users//Documents/DynamoDBtoCSV/node_modules/aws-sdk/lib/sequential_executor.js:115:18)
message: 'The security token included in the request is invalid.',
code: 'UnrecognizedClientException',
time: 2018-04-20T04:38:47.883Z,
requestId: '6M8VSR0LJ87E9VFCRT8FTHAQ0FVV4KQNSO5AEMVJF66Q9ASUAAJG',
statusCode: 400,
retryable: false,
retryDelay: 15.516383183565807 }

Could you explain why this happens? The configuration file contains the same settings as my mobile application.

Nothing happened while executing the backup command

Hi,

I'm using Ubuntu 14.04 and did the following steps:

Git clone
cd DynamoDBtoCSV/
Edit config.json (IAM key with DynamoDB full access)
apt-get install node
apt-get install npm
npm install

Then,
My DynamoDB table name is table
node dynamoDBtoCSV.js -t table >table.csv

but nothing happened while executing the above command.
See the attached screenshot for reference.

Location of config.json

Dear Sir,
I am new to nodejs and DynamoDB.
Can you please advise where I should put the config.json file?

I tried to export a DynamoDB table to a CSV file with the following command:
node dynamoDBtoCSV.js -d -t test2

However, it kept on giving me the following error:
{ [UnexpectedParameter: Unexpected key 'Limit' found in params]
message: 'Unexpected key 'Limit' found in params',
code: 'UnexpectedParameter',
time: Mon Nov 10 2014 15:27:11 GMT+0800 (Malay Peninsula Standard Time) }

Here is the content of my config.json:
{
"accessKeyId": "XXXXXXXXXXXXXXXXX",
"secretAccessKey": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
"region": "ap-northeast-1"
}

Please kindly advise.

Many thanks,
Willie

Limit in conversion to CSV

Hi Erik,

I am new to DynamoDB and nodejs.

I tried to convert a DynamoDB table of 1426 rows * 204 columns into a csv file.
When I used the following command:
node dynamoDBtoCSV.js -t table1 -d
it returns the correct ItemCount: 1426.
It also returns TableSizeBytes: 1335111 bytes (which I am not sure how to verify for a DynamoDB table).

However, when I used the following command:
node dynamoDBtoCSV.js -t table1 > output.csv
The output.csv file created doesn't contain the correct table structure.
I have tried changing the "Limit" value to 1000000 (it was initially 1000) inside the dynamoDBtoCSV.js file, as follows:
var query = {
"TableName": program.table,
"Limit": 1000000,
};
but the produced output.csv is still wrong (as before).

Is there any limit on the number of rows/columns or the table size that can be converted?
Can you please advise how to handle this issue?

Many thanks,
Willie

Error: Cannot find module 'commander'

I tried running "node dynamoDBtoCSV.js --help" both on localhost and via a Docker Compose file. In both cases the same error occurred (I checked beforehand that all the dependencies were installed).

web_1 | Error: Cannot find module 'commander'
web_1 | at Function.Module._resolveFilename (module.js:485:15)
web_1 | at Function.Module._load (module.js:437:25)
web_1 | at Module.require (module.js:513:17)
web_1 | at require (internal/module.js:11:18)
web_1 | at Object. (/code/dynamoDBtoCSV.js:1:77)
web_1 | at Module._compile (module.js:569:30)
web_1 | at Object.Module._extensions..js (module.js:580:10)
web_1 | at Module.load (module.js:503:32)
web_1 | at tryModuleLoad (module.js:466:12)
web_1 | at Function.Module._load (module.js:458:3)

Invalid string length

My DynamoDB table has 70K records. When I run the command, I get the error below:
RangeError: Invalid string length
at serialize (/var/www/DynamoDBtoCSV/node_modules/papaparse/papaparse.js:397:13)
at Object.JsonToCsv [as unparse] (/var/www/DynamoDBtoCSV/node_modules/papaparse/papaparse.js:316:11)
at Response. (/var/www/DynamoDBtoCSV/dynamoDBtoCSV.js:74:26)
at Request. (/var/www/DynamoDBtoCSV/node_modules/aws-sdk/lib/request.js:364:18)
at Request.callListeners (/var/www/DynamoDBtoCSV/node_modules/aws-sdk/lib/sequential_executor.js:105:20)
at Request.emit (/var/www/DynamoDBtoCSV/node_modules/aws-sdk/lib/sequential_executor.js:77:10)
at Request.emit (/var/www/DynamoDBtoCSV/node_modules/aws-sdk/lib/request.js:683:14)
at Request.transition (/var/www/DynamoDBtoCSV/node_modules/aws-sdk/lib/request.js:22:10)
at AcceptorStateMachine.runTo (/var/www/DynamoDBtoCSV/node_modules/aws-sdk/lib/state_machine.js:14:12)
at /var/www/DynamoDBtoCSV/node_modules/aws-sdk/lib/state_machine.js:26:10

unable to verify the first certificate

$ node dynamoDBtoCSV.js -t swift-us-east-1-prod.table1 -ec

{ Error: unable to verify the first certificate
    at TLSSocket.onConnectSecure (_tls_wrap.js:1051:34)
    at TLSSocket.emit (events.js:189:13)
    at TLSSocket.EventEmitter.emit (domain.js:441:20)
    at TLSSocket._finishInit (_tls_wrap.js:633:8)
  message: 'unable to verify the first certificate',
  code: 'NetworkingError',
  region: 'us-east-1',
  hostname: 'c',
  retryable: true,
  time: 2019-10-28T21:49:37.197Z }
