
Export CSV To Influx

Export CSV To Influx: process CSV data and write it to InfluxDB

Support:

  • InfluxDB 0.x, 1.x
  • InfluxDB 2.x: supported since version 0.2.0

Important note: InfluxDB 2.x has a built-in, more powerful CSV write feature: https://docs.influxdata.com/influxdb/v2.1/write-data/developer-tools/csv/

Install

Install the library with pip; this also installs the export_csv_to_influx command-line tool.

pip install ExportCsvToInflux

Features

  1. [Highlight 🌟🎉😄] Use the export_csv_to_influx binary to run the exporter
  2. [Highlight 🌟🎉😄] Process dozens of CSV files in a folder in one run
  3. [Highlight 🌟🎉😄🎊🍀🎈] Auto-convert CSV data to int/float/string in Influx
  4. [Highlight 🌟🎉😄] Match or filter the data using strings or regexes
  5. [Highlight 🌟🎉😄] Count rows and generate a count measurement
  6. Limit string length in Influx
  7. Detect whether the CSV has new data
  8. Use the latest file modification time as the time column
  9. Auto-create the database if it does not exist
  10. Optionally drop the database before inserting data
  11. Optionally drop measurements before inserting data

Command Arguments

Run export_csv_to_influx -h to see the help guide.

Note:

  1. You can pass * to --field_columns to match all fields: --field_columns=*, --field_columns '*'
  2. CSV data is not inserted into Influx again if nothing changed. To force insertion (default: True): --force_insert_even_csv_no_update=True, --force_insert_even_csv_no_update True
  3. If a CSV cell is empty, it is auto-filled based on the column data type: int: -999, float: -999.0, string: -
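As an illustration of note 3, the fill behavior can be sketched as follows (the helper name and structure are made up for illustration; this is not the library's actual implementation):

```python
# Documented defaults for empty CSV cells, keyed by the column's
# detected type: int -> -999, float -> -999.0, string -> '-'.
FILL_DEFAULTS = {int: -999, float: -999.0, str: '-'}

def fill_empty(value, column_type):
    """Return the documented default for an empty cell, otherwise the
    value converted to the column's detected type."""
    if value == '':
        return FILL_DEFAULTS[column_type]
    return column_type(value)

print(fill_empty('', int))       # -999
print(fill_empty('1.5', float))  # 1.5
```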
| # | Option | Mandatory | Default | Description |
|---|--------|-----------|---------|-------------|
| 1 | -c, --csv | Yes | | CSV file path, or folder path |
| 2 | -db, --dbname | Yes (0.x/1.x only) | | InfluxDB database name |
| 3 | -u, --user | No (0.x/1.x only) | admin | InfluxDB user name |
| 4 | -p, --password | No (0.x/1.x only) | admin | InfluxDB password |
| 5 | -org, --org | No (2.x only) | my-org | InfluxDB organization (2.x only) |
| 6 | -bucket, --bucket | No (2.x only) | my-bucket | InfluxDB bucket (2.x only) |
| 7 | -http_schema, --http_schema | No (2.x only) | http | InfluxDB HTTP scheme, http or https (2.x only) |
| 8 | -token, --token | Yes (2.x only) | | InfluxDB token (2.x only) |
| 9 | -m, --measurement | Yes | | Measurement name |
| 10 | -fc, --field_columns | Yes | | List of CSV columns to use as fields, separated by commas |
| 11 | -tc, --tag_columns | No | None | List of CSV columns to use as tags, separated by commas |
| 12 | -d, --delimiter | No | , | CSV delimiter |
| 13 | -lt, --lineterminator | No | \n | CSV line terminator |
| 14 | -s, --server | No | localhost:8086 | InfluxDB server address |
| 15 | -t, --time_column | No | timestamp | Timestamp column name. If there is no timestamp column, the last file modification time is used for all CSV rows. Pure timestamps like 1517587275 are also supported and auto-detected |
| 16 | -tf, --time_format | No | %Y-%m-%d %H:%M:%S | Timestamp format; see https://strftime.org/ |
| 17 | -tz, --time_zone | No | UTC | Timezone of the supplied data |
| 18 | -b, --batch_size | No | 500 | Batch size when inserting data into Influx |
| 19 | -lslc, --limit_string_length_columns | No | None | Columns whose string length should be limited, separated by commas |
| 20 | -ls, --limit_length | No | 20 | Maximum string length |
| 21 | -dd, --drop_database | No (compatible with 2.x) | False | Drop the database or bucket before inserting data |
| 22 | -dm, --drop_measurement | No | False | Drop the measurement before inserting data |
| 23 | -mc, --match_columns | No | None | Columns to match against, separated by commas. Match rule: a row is kept only if all listed columns match |
| 24 | -mbs, --match_by_string | No | None | Match by string, separated by commas |
| 25 | -mbr, --match_by_regex | No | None | Match by regex, separated by commas |
| 26 | -fic, --filter_columns | No | None | Columns to filter on, separated by commas. Filter rule: a row is dropped if any listed column matches |
| 27 | -fibs, --filter_by_string | No | None | Filter by string, separated by commas |
| 28 | -fibr, --filter_by_regex | No | None | Filter by regex, separated by commas |
| 29 | -ecm, --enable_count_measurement | No | False | Generate the count measurement |
| 30 | -fi, --force_insert_even_csv_no_update | No | True | Force insert data into Influx even if the CSV has no update |
| 31 | -fsc, --force_string_columns | No | None | Force columns to string type, separated by commas |
| 32 | -fintc, --force_int_columns | No | None | Force columns to int type, separated by commas |
| 33 | -ffc, --force_float_columns | No | None | Force columns to float type, separated by commas |
| 34 | -uniq, --unique | No | False | Write duplicated points |
| 35 | --csv_charset | No | None | CSV charset; None auto-detects |
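The match and filter semantics above can be sketched as follows. This is an illustrative reimplementation under the assumed semantics (all match columns must match; any filter hit drops the row), not the library's code:

```python
import re

def keep_row(row, match_rules, filter_rules):
    """Keep a row only if ALL match regexes hit their columns, and drop
    it if ANY filter regex hits (assumed semantics, see the table)."""
    matched = all(re.search(pat, row[col]) for col, pat in match_rules.items())
    filtered = any(re.search(pat, row[col]) for col, pat in filter_rules.items())
    return matched and not filtered

row = {'timestamp': '2022-03-07 04:06:00', 'url': 'https://sample-7.org/'}
print(keep_row(row, {'timestamp': '2022-03-07', 'url': r'sample-\d+'}, {}))  # True
print(keep_row(row, {}, {'url': 'sample'}))                                  # False
```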

Programmatically

The exporter can also be run programmatically.

from ExportCsvToInflux import ExporterObject

exporter = ExporterObject()
exporter.export_csv_to_influx(...)

# You could get the export_csv_to_influx parameter details by:
print(exporter.export_csv_to_influx.__doc__)
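A fuller sketch of a programmatic call follows. The keyword names (csv_file, db_name, db_server_name, ...) are assumptions based on usage seen in the issues section below; verify them against the docstring before relying on them:

```python
# Sketch only: parameter names mirror the CLI flags but are assumptions;
# confirm with print(ExporterObject().export_csv_to_influx.__doc__).
params = dict(
    csv_file='demo.csv',
    db_server_name='127.0.0.1:8086',
    db_user='admin',
    db_password='admin',
    db_name='demo',
    db_measurement='demo',
    time_column='timestamp',
    time_format='%Y-%m-%d %H:%M:%S',
    tag_columns='url',
    field_columns='response_time',
)

try:
    from ExportCsvToInflux import ExporterObject
    ExporterObject().export_csv_to_influx(**params)
except Exception as exc:  # no library installed or no local InfluxDB
    print(f'skipped: {exc}')
```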

Sample

  1. Here is demo.csv:
timestamp,url,response_time
2022-03-08 02:04:05,https://jmeter.apache.org/,1.434
2022-03-08 02:04:06,https://jmeter.apache.org/,2.434
2022-03-08 02:04:07,https://jmeter.apache.org/,1.200
2022-03-08 02:04:08,https://jmeter.apache.org/,1.675
2022-03-08 02:04:09,https://jmeter.apache.org/,2.265
2022-03-08 02:04:10,https://sample-demo.org/,1.430
2022-03-08 03:54:13,https://sample-show.org/,1.300
2022-03-07 04:06:00,https://sample-7.org/,1.289
2022-03-07 05:45:34,https://sample-8.org/,2.876
  2. Command samples:
Sample 1: Write the whole CSV into Influx

Influx 0.x, 1.x:

export_csv_to_influx \
--csv demo.csv \
--dbname demo \
--measurement demo \
--tag_columns url \
--field_columns response_time \
--user admin \
--password admin \
--server 127.0.0.1:8086

Influx 2.x:

export_csv_to_influx \
--csv demo.csv \
--org my-org \
--bucket my-bucket \
--measurement demo \
--tag_columns url \
--field_columns response_time \
--token YourToken \
--server 127.0.0.1:8086

Sample 2: Write the whole CSV into Influx, but drop the database or bucket first

Influx 0.x, 1.x:

export_csv_to_influx \
--csv demo.csv \
--dbname demo \
--measurement demo \
--tag_columns url \
--field_columns response_time \
--user admin \
--password admin \
--server 127.0.0.1:8086 \
--drop_database=True

Influx 2.x:

// The Read/Write API token cannot create a bucket. Before using --drop_database, make sure your token has the required access.
// See the bug here: https://github.com/influxdata/influxdb/issues/23170
export_csv_to_influx \
--csv demo.csv \
--org my-org \
--bucket my-bucket \
--measurement demo \
--tag_columns url \
--field_columns response_time \
--token YourToken \
--server 127.0.0.1:8086 \
--drop_database=True

Sample 3: Write part of the data: timestamp matches 2022-03-07 and url matches sample-\d+

Influx 0.x, 1.x:

export_csv_to_influx \
--csv demo.csv \
--dbname demo \
--measurement demo \
--tag_columns url \
--field_columns response_time \
--user admin \
--password admin \
--server 127.0.0.1:8086 \
--drop_database=True \
--match_columns=timestamp,url \
--match_by_regex='2022-03-07,sample-\d+'

Influx 2.x:

export_csv_to_influx \
--csv demo.csv \
--org my-org \
--bucket my-bucket \
--measurement demo \
--tag_columns url \
--field_columns response_time \
--token YourToken \
--server 127.0.0.1:8086 \
--drop_measurement=True \
--match_columns=timestamp,url \
--match_by_regex='2022-03-07,sample-\d+'

Sample 4: Filter part of the data before writing into Influx: url filters on sample

Influx 0.x, 1.x:

export_csv_to_influx \
--csv demo.csv \
--dbname demo \
--measurement demo \
--tag_columns url \
--field_columns response_time \
--user admin \
--password admin \
--server 127.0.0.1:8086 \
--drop_database True \
--filter_columns url \
--filter_by_regex 'sample'

Influx 2.x:

export_csv_to_influx \
--csv demo.csv \
--org my-org \
--bucket my-bucket \
--measurement demo \
--tag_columns url \
--field_columns response_time \
--token YourToken \
--server 127.0.0.1:8086 \
--drop_measurement=True \
--filter_columns url \
--filter_by_regex 'sample'

Sample 5: Enable the count measurement: a new measurement named demo.count is generated, with match: timestamp matches 2022-03-07 and url matches sample-\d+

Influx 0.x, 1.x:

export_csv_to_influx \
--csv demo.csv \
--dbname demo \
--measurement demo \
--tag_columns url \
--field_columns response_time \
--user admin \
--password admin \
--server 127.0.0.1:8086 \
--drop_database True \
--match_columns timestamp,url \
--match_by_regex '2022-03-07,sample-\d+' \
--enable_count_measurement True

Influx 2.x:

export_csv_to_influx \
--csv demo.csv \
--org my-org \
--bucket my-bucket \
--measurement demo \
--tag_columns url \
--field_columns response_time \
--token YourToken \
--server 127.0.0.1:8086 \
--drop_measurement=True \
--match_columns=timestamp,url \
--match_by_regex='2022-03-07,sample-\d+' \
--enable_count_measurement True
  3. If the count measurement is enabled, it looks like this:

    // Influx 0.x, 1.x
    select * from "demo.count"
    
    name: demo.count
    time                match_timestamp match_url total
    ----                --------------- --------- -----
    1562957134000000000 3               2         9
    
    // Influx 2.x: For more info about Flux, see https://docs.influxdata.com/influxdb/v2.1/query-data/flux/
    influx query 'from(bucket:"my-bucket") |> range(start:-100h) |> filter(fn: (r) => r._measurement == "demo.count")' --raw
    
    #group,false,false,true,true,false,false,true,true
    #datatype,string,long,dateTime:RFC3339,dateTime:RFC3339,dateTime:RFC3339,long,string,string
    #default,_result,,,,,,,
    ,result,table,_start,_stop,_time,_value,_field,_measurement
    ,,2,2022-03-04T09:51:49.7425566Z,2022-03-08T13:51:49.7425566Z,2022-03-07T05:45:34Z,2,match_timestamp,demo.count
    ,,3,2022-03-04T09:51:49.7425566Z,2022-03-08T13:51:49.7425566Z,2022-03-07T05:45:34Z,2,match_url,demo.count
    ,,4,2022-03-04T09:51:49.7425566Z,2022-03-08T13:51:49.7425566Z,2022-03-07T05:45:34Z,9,total,demo.count
    

Special Thanks

This library was inspired by: https://github.com/fabio-miranda/csv-to-influxdb


export-csv-to-influx's Issues

When run from subprocess call gives an 'UnboundLocalError'

Hi.
For several years I've been using version 0.1.25 by calling it from within a Python script as subprocess.check_output("export_csv_to_influx...."); the script parses and collates a bunch of .dat files and pushes the datapoints to an InfluxDB 1.8 server. All works really well. Thanks!
My issue now is that we've upgraded to InfluxDB 2, so I had to upgrade your script to version 0.2.2. If I run your script by itself it works fine, but when running it from within my script it throws 'UnboundLocalError: local variable 'headers' referenced before assignment'. I renamed a variable I was using (also called 'headers') to 'heads', which I thought might be the problem, but it still throws the same error.
My script IS a bit clunky, and I'm wondering if there's a better way to call your module from within my script.
You mention your module can be run 'Programmatically', but I'm sorry, I really don't understand the example you gave.
Can you please give me some direction on how to use your module from within a script without calling it via subprocess.check_output? I believe it will then work well for me, as it has for many years on the old version.
Thank you for your script!

Error: This is not a valid influx http://localhost:8086/

I wanted to switch from 1.8 to 2.6, so I installed InfluxDB and client 2.6(.1) on Ubuntu 22.04 and export-csv-to-influx again. I had been using it on 1.8 for two years.

When I do the export I'm getting following error: "Error: This is not a valid influx http://localhost:8086/"

export_csv_to_influx \ --csv test.csv \ --org test \ --bucket bucket1 \ --measurement flow \ --tag_columns temp \ --field_columns value \ --token 'exxxxxxxxxxxxxxxxxxxxxxxxxxxxx==' \ --http_schema https

With or without https, or with the server parameter added, I get the same error.

Edit: I have now started InfluxDB without the TLS cert & key and it writes successfully. With TLS enabled, syslog shows: x509: cannot validate certificate for xxx.xxx.xx.xx because it doesn't contain any IP SANs. So there is an issue with the Let's Encrypt certificate, although the GUI works properly.

We can't export csv to influx by influxdb-relay

Hi,

We have the load balancer influxdb-relay, which only exposes the /write API endpoint.
Before sending the write with the data, your script issues a GET for the database to test whether it exists in InfluxDB, and of course this returns a 404 error.

How could I force the CSV to be sent without first checking whether the database exists in Influx?

I have not seen any parameter with which you can force it.

Cannot handle times before posix 1000000000

When processing 9-digit POSIX timestamps, an error is thrown and the times are not inserted into InfluxDB.
I haven't tested it, but I expect all timestamps shorter than 10 digits to fail as well.

I think the code expects a 10-digit timestamp and appends zeros to the end of it in an attempt to convert seconds to nanoseconds.

timestamp_float = float(row[time_column])
timestamp_remove_decimal = int(str(timestamp_float).replace('.', ''))
timestamp_influx = '{:<019d}'.format(timestamp_remove_decimal)
timestamp = int(timestamp_influx)
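If that is the cause, a fix would convert arithmetically instead of padding the digit string. A minimal sketch of such a conversion (hypothetical, not the library's code):

```python
from decimal import Decimal

def posix_seconds_to_ns(text):
    """Convert a POSIX timestamp in seconds (any digit count, with or
    without a decimal part) to integer nanoseconds, exactly."""
    return int(Decimal(text) * 1_000_000_000)

print(posix_seconds_to_ns('999999999'))     # 9 digits, still correct
print(posix_seconds_to_ns('1517587275.5'))
```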

time zone format for Date only?

Hi, I have some CSV data I'm trying to import into Influx with only a time value of MM/dd/yy. I tried a bunch of formats but keep getting the error "Unexpected time with format".

tried all of these formats:

--time_format %MM/%dd/%yy
--time_format '%MM/%dd/%yy'
--time_format %MM/%dd/%yyyy
--time_format '%MM/%dd/%yyyy'

All of these fail with: Unexpected time with format & Warning: Failed to force "xxx" to float, skip...
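For reference, strftime/strptime directives are single-letter, so a MM/dd/yy date corresponds to --time_format '%m/%d/%y'. A quick standard-library check of the directives (independent of this tool):

```python
from datetime import datetime

# %m = zero-padded month, %d = day of month, %y = two-digit year
parsed = datetime.strptime('03/08/22', '%m/%d/%y')
print(parsed)  # 2022-03-08 00:00:00
```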

Hello sir, I get an error when running the command.

Command:
export_csv_to_influx --csv /mnt/ssd/docker/influxdb/data/data/apcupsd.csv --dbname Ubuntu --measurement apcupsd -u admin -p admin --force_float_columns battery_charge_percent,battery_voltage,input_frequency,input_voltage,internal_temp,load_percent,nominal_battery_voltage,nominal_input_voltage,output_voltage --force_int_columns nominal_power,status_flags,time_left_ns,time_on_battery_ns --force_string_columns battery_date,firmware -fc battery_charge_percent,battery_date,battery_voltage,firmware,input_frequency,input_voltage,internal_temp,load_percent,nominal_battery_voltage,nominal_input_voltage,nominal_power,output_voltage,status_flags,time_left_ns,time_on_battery_ns -tc host,model,serial,status,ups_name -t time

I had to force all fields because otherwise columns with the same name get different types.

fieldKey, tagKey, csv, and error message: (screenshots omitted)

Did I miss something?

field type error

Hi @Bugazelle ,

While injecting my csv with your Python module, it works really well if I drop the measurement first.
But now I have the following problem: I want to inject a lot more points and don't want to drop the measurement each time. With drop_measurement=False, I get the following error while re-injecting the data:

raise InfluxDBClientError(response.content, response.status_code)
influxdb.exceptions.InfluxDBClientError: 400: {"error":"partial write: field type conflict: input field \"my_column\" on measurement \"my_data\" is type float, already exists as type integer dropped=9"}

The weird thing is, my .csv is float only, but it seems to be injected as integer, and then I get this error that the type is different.

Is this a bug or is it intentional? How can I set the type while injecting? I don't really care about the type; I just want all fields as float.

Thank you in advance!

TypeError: 'dict_keys' object is not subscriptable

Hi,

I'm using the exact example provided on the readme with the following inputs:

export_csv_to_influx \
--csv demo.csv \
--dbname demo \
--measurement demo \
--tag_columns url \
--field_columns response_time \
--user myuser \
--password mypass \
--force_insert_even_csv_no_update True \
--server 127.0.0.1:8086

and the following error is being returned:

Traceback (most recent call last):
  File "/home/xx/bitbucket/python/collector/venv/bin/export_csv_to_influx", line 10, in <module>
    sys.exit(export_csv_to_influx())
  File "/home/xx/bitbucket/python/collector/venv/lib/python3.7/site-packages/ExportCsvToInflux/exporter_object.py", line 496, in export_csv_to_influx
    force_insert_even_csv_no_update=args.force_insert_even_csv_no_update)
  File "/home/xx/bitbucket/python/collector/venv/lib/python3.7/site-packages/ExportCsvToInflux/exporter_object.py", line 279, in export_csv_to_influx
    csv_object.add_columns_to_csv(file_name=csv_file_item, target=new_csv_file, data=data)
  File "/home/xx/bitbucket/python/collector/venv/lib/python3.7/site-packages/ExportCsvToInflux/csv_object.py", line 271, in add_columns_to_csv
    new_headers = [x.keys()[0] for x in data]
  File "/home/xx/bitbucket/python/collector/venv/lib/python3.7/site-packages/ExportCsvToInflux/csv_object.py", line 271, in <listcomp>
    new_headers = [x.keys()[0] for x in data]
TypeError: 'dict_keys' object is not subscriptable

Thanks for your time on writing this useful tool!

argument error

It asks for --dbname while I am using InfluxDB v2.1 with a provided bucket, org, and token.


exporting 100+ tags into influxdb

Hi @Bugazelle ,

I'm trying to read a CSV into InfluxDB with 100+ tags. I read the tags into a variable and tried using that variable for tag_columns, but it's not working, and I'm not sure how to pass this many tags. Is there a way to pass them to tag_columns instead of entering them manually?

from ExportCsvToInflux import ExporterObject
from influxdb import InfluxDBClient
exporter = ExporterObject()
import pandas as pd
path = r'G:\data.csv'
df = pd.read_csv(path)
par=df.columns[2:104]
print (par)
exporter.export_csv_to_influx(csv_file=r'G:\data.csv',
                              db_name='mydb',
                              db_measurement='demo',
                              time_column='Time',
                              tag_columns=par,
                              field_columns='Blo',
                              db_user='admin',
                              db_password='',
                              force_insert_even_csv_no_update=True,
                              db_server_name='localhost:8086',
                              time_format='%H:%M')

The output is as follows:

Index(['dat1', 'dat2', 'dat3', 'dat4', 'dat5', 'dat6', 'dat7', 'dat8', 'dat9',
'dat10',
...
'dat93', 'dat94', 'dat95', 'dat96', 'dat97', 'dat98', 'dat99', 'dat100',
'dat101', 'dat102'],
dtype='object', length=102)
Traceback (most recent call last):
File "<pyshell#17>", line 11, in
time_format='%H:%M')
File "C:\Users\AppData\Local\Programs\Python\Python37-32\lib\site-packages\ExportCsvToInflux\exporter_object.py", line 175, in export_csv_to_influx
field_columns = base_object.str_to_list(field_columns)
File "C:\Users\AppData\Local\Programs\Python\Python37-32\lib\site-packages\ExportCsvToInflux\base_object.py", line 33, in str_to_list
elif bool(string) is False:
File "C:\Users\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\indexes\base.py", line 2387, in nonzero
self.class.name
ValueError: The truth value of a Index is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

My csv is as follows:

Time Blo dat1 dat2 dat3 dat4 dat5 dat6 dat7 dat8 dat9 dat10 dat11 dat12 dat13 dat14 dat15
06:08 rain 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
06:09 rain 0 0 0 0 0 0.001 0 0 0 0 0 0 0 0 0
06:10 rain 0 0 0 0 0 0.015 0 0 0 0 0 0.003 0 0 0
06:11 rain 0 0 0 0 0.013 0.068 0 0 0 0 0 0.024 0 0 0
06:12 rain 0 0 0 0 0.015 0.064 0 0.019 0.027 0 0 0.044 0 0 0.016
06:13 rain 0.022 0.001 0 0 0.047 0.074 0.002 0.023 0.048 0 0.046 0.064 0 0.016 0.036
06:14 rain 0.055 0.011 0 0 0.051 0.084 0.013 0.045 0.058 0.001 0.021 0.086 0.032 0.037 0.057
06:15 rain 0.065 0.041 0.014 0 0.076 0.074 0.039 0.075 0.081 0.036 0.058 0.111 0.053 0.068 0.082
06:16 rain 0.109 0.075 0.034 0.025 0.103 0.137 0.073 0.101 0.119 0.054 0.098 0.146 0.086 0.091 0.116

Thank you for your support.
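The ValueError above comes from passing a pandas Index where the exporter expects a comma-separated string (or a plain list). One hedged workaround is to join the column names first; the helper below is made up for illustration:

```python
def to_tag_columns(columns, start=None, end=None):
    """Join column names (e.g. a pandas Index such as df.columns) into
    the comma-separated string that tag_columns expects."""
    return ','.join(list(columns)[start:end])

# With pandas this would be: tag_columns=to_tag_columns(df.columns, 2, 104)
print(to_tag_columns(['Time', 'Blo', 'dat1', 'dat2', 'dat3'], 2, 5))
```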

Problem importing csv TypeError: strptime() argument 1 must be string, not int

Good morning, I'm trying to import a CSV but I'm having a problem.

This is my csv Structure:

id,ida,timestamp,latitude,longitude,altitude,speed,vspeed,quawk,track,ground,leg,date
0,fr24-34530E-1517587275,1517587275,44.224740000000004,9.47696,26725,462,-960,6463,78,False,ANE8786_34530E_1517580840_1517588845,2018-02-02

I don't know where I'm going wrong; I hope you can help me!
Thanks!

Timestamp issues (posix/unix): 19-digit POSIX

The tool seems unable to import 19-digit POSIX timestamps.

python3 csvtoinflux.py -i server1.csv -s server5:8086 --create --dbname NewTest --tagcolumns host --fieldcolumns CPU,GPU --metricname name --timecolumn time --timeformat posix
Deleting database NewTest
Creating database NewTest
Traceback (most recent call last):
File "csvtoinflux.py", line 200, in
args.timezone, args.ssl)
File "csvtoinflux.py", line 78, in loadCsv
datetime_naive = datetime.datetime.fromtimestamp(int(row[timecolumn]))
OverflowError: timestamp out of range for platform time_t

CSV looks like this :

name,time,CPU,GPU,host
temperature,1590335330162702714,53.5,53,server1
temperature,1590335335297033388,54,53,server1
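datetime.fromtimestamp expects seconds, so 19-digit nanosecond epochs overflow as shown in the traceback. A magnitude-based workaround sketch (an illustration, not part of either script):

```python
from datetime import datetime, timezone

def epoch_to_datetime(value):
    """Interpret very large epochs (>= 1e17) as nanoseconds and smaller
    ones as seconds; a heuristic for mixed 10- and 19-digit inputs."""
    value = int(value)
    if value >= 10**17:
        value //= 1_000_000_000  # nanoseconds -> seconds
    return datetime.fromtimestamp(value, tz=timezone.utc)

print(epoch_to_datetime(1590335330162702714))  # 2020-05-24 15:48:50+00:00
```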

Time zone format

Very simple question: what is the format for the time_zone parameter? My data is not taken in the UTC zone, but rather UTC-7.

I have tried a few options (UTC-7, GMT-7, MDT (Mountain Daylight Time in the US)) but none are recognized. What is the proper format, please?
By the way, so far your program is working very well for me.
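If the tool's time_zone parameter takes IANA timezone names, which the UTC default suggests but which is an assumption, US Mountain time would be America/Denver rather than an offset string. Checking a name's validity with the standard library:

```python
from zoneinfo import ZoneInfo  # Python 3.9+

# 'America/Denver' is the IANA name covering MST (UTC-7) / MDT (UTC-6)
tz = ZoneInfo('America/Denver')
print(tz)
```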

Unable to export csv to influxDB

Hi, I am new to Python and I'm trying to export a CSV file into InfluxDB. I am getting the following error; please help me.
-----------------This is what I wrote------------------
C:\Users\AppData\Local\Programs\Python\Python37-32\Scripts>export_csv_to_influx.exe -c C:\Users\AppData\Local\Programs\Python\Python37-32\Scripts\test1.csv -d ; -db exa1 -m demo -t time -tc value1 -fc value2,value3,value4,value5 -u admin -p -fi true -s localhost:8086

-----------------This is the error---------------------
Info: Database exa1 already exists
Traceback (most recent call last):
File "C:\Users\AppData\Local\Programs\Python\Python37-32\Scripts\export_csv_to_influx-script.py", line 11, in
load_entry_point('ExportCsvToInflux==0.1.16', 'console_scripts', 'export_csv_to_influx')()
File "c:\users\appdata\local\programs\python\python37-32\lib\site-packages\ExportCsvToInflux\exporter_object.py", line 497, in export_csv_to_influx
force_insert_even_csv_no_update=args.force_insert_even_csv_no_update)
File "c:\users\appdata\local\programs\python\python37-32\lib\site-packages\ExportCsvToInflux\exporter_object.py", line 287, in export_csv_to_influx
for row, int_type, float_type in convert_csv_data_to_int_float:
File "c:\users\appdata\local\programs\python\python37-32\lib\site-packages\ExportCsvToInflux\csv_object.py", line 211, in convert_csv_data_to_int_float
int_status = int_type[key]
KeyError: 'time;value1;value2;value3;value4;value5;md5'
-----------------This is how my csv file looks like----------------

test1.csv

time;value1;value2;value3;value4;value5
01/01/90 00:00;0;0;0;10.9;-167.06
01/01/90 01:00;0;0;0;10.7;-167.06
01/01/90 02:00;0;0;0;10.4;-167.06
01/01/90 03:00;0;0;0;10.2;-167.06
01/01/90 04:00;0;0;0;10;-167.06
01/01/90 05:00;0;0;0;9.0997;-167.06
01/01/90 06:00;0;0;0;7.6999;-167.06
01/01/90 07:00;31.983;8.7807;32.008;8.0002;3230.9
01/01/90 08:00;88.324;12.105;90.002;10;10711
01/01/90 09:00;114.16;15.716;118.01;13;14033
01/01/90 10:00;152.04;19.021;155;15.4;18964
01/01/90 11:00;294.84;24.542;277;17.5;37532
01/01/90 12:00;384.71;28.33;335.01;19;49312
01/01/90 13:00;632.51;35.63;510.99;20;79933
01/01/90 14:00;522.94;33.246;418;20.4;66567
01/01/90 15:00;155.93;24.37;140.01;20.6;19289

A separate CSV is created after I run my command, as follows; I'm not sure what it is for.
----------automatically created csv file--------------------
test1_influx.csv
time;value1;value2;value3;value4;value5;md5
01/01/90 00:00;0;0;0;10.9;-167.06;e0a1485ccf44c8c4859ff6d445a6fa3b
01/01/90 01:00;0;0;0;10.7;-167.06;e0a1485ccf44c8c4859ff6d445a6fa3b
01/01/90 02:00;0;0;0;10.4;-167.06;e0a1485ccf44c8c4859ff6d445a6fa3b
01/01/90 03:00;0;0;0;10.2;-167.06;e0a1485ccf44c8c4859ff6d445a6fa3b
01/01/90 04:00;0;0;0;10;-167.06;e0a1485ccf44c8c4859ff6d445a6fa3b
01/01/90 05:00;0;0;0;9.0997;-167.06;e0a1485ccf44c8c4859ff6d445a6fa3b
01/01/90 06:00;0;0;0;7.6999;-167.06;e0a1485ccf44c8c4859ff6d445a6fa3b
01/01/90 07:00;31.983;8.7807;32.008;8.0002;3230.9;e0a1485ccf44c8c4859ff6d445a6fa3b
01/01/90 08:00;88.324;12.105;90.002;10;10711;e0a1485ccf44c8c4859ff6d445a6fa3b
01/01/90 09:00;114.16;15.716;118.01;13;14033;e0a1485ccf44c8c4859ff6d445a6fa3b
01/01/90 10:00;152.04;19.021;155;15.4;18964;e0a1485ccf44c8c4859ff6d445a6fa3b
01/01/90 11:00;294.84;24.542;277;17.5;37532;e0a1485ccf44c8c4859ff6d445a6fa3b
01/01/90 12:00;384.71;28.33;335.01;19;49312;e0a1485ccf44c8c4859ff6d445a6fa3b

Please help me.

I also need some help using this as a library. Can you please elaborate with a small example? The example below is not clear to me. Sorry for the silly questions.

from ExportCsvToInflux import ExporterObject

exporter = ExporterObject()
exporter.export_csv_to_influx(...)

Thank you!!

insert all columns or none

Hi,

I would like to insert all columns of the CSV as field_columns (except timestamp, of course) and no columns as tag_columns. What would the syntax be?

(My CSV has too many columns to write in the main command... and doesn't have tag columns)

I tried:
--field_columns *
--field_columns all
--field_columns columns
--tag_columns none

but none of them work.

Any requirement for csv file encoding ?

Are there any requirements for the CSV file encoding? I used a CSV file generated by pandas DataFrame.to_csv with 'utf_8_sig' and got the errors below. The CSV file includes Chinese text. Many thanks.

File "C:/WorkPlace/Python_Project/JQData/Upload_JQ_data.py", line 15, in exporter.export_csv_to_influx(csv_file="beishang_sh_change_20210201231306.csv",db_server_name="192.168.78.136:8086",db_user="admin",db_password="!234qwer",db_name="BeiShang",db_measurement="BeiShang",time_column="day",tag_columns="link_name,code,name",time_format="%Y-%m-%d",field_columns="share_number,share_ratio",batch_size=5000)
File "C:\WorkPlace\Anaconda3\envs\Python_Project\lib\site-packages\ExportCsvToInflux\exporter_object.py", line 329, in export_csv_to_influx
csv_file_length = csv_object.get_csv_lines_count(csv_file_item)
File "C:\WorkPlace\Anaconda3\envs\Python_Project\lib\site-packages\ExportCsvToInflux\csv_object.py", line 136, in get_csv_lines_count
has_header = self.get_csv_header(file_name)
File "C:\WorkPlace\Anaconda3\envs\Python_Project\lib\site-packages\ExportCsvToInflux\csv_object.py", line 34, in get_csv_header
has_header = sniffer.has_header(f.read(40960))
File "C:\WorkPlace\Anaconda3\envs\Python_Project\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 118: character maps to

InfluxDB 2.0 Support

Does this project support InfluxDB 2.0?

I had to enable password auth for backward compatibility with 1.x and now CSV import fails with

influxdb.exceptions.InfluxDBClientError: not implemented: CREATE DATABASE
after
Database foo not found, trying to create it

Is this supposed to be working with 2.0?

don't write _influx.csv file with md5

Hi again @Bugazelle !
ExportCsvToInflux writes an _influx.csv with an md5 column for each processed file.
Now I have the following question: is it possible (e.g. with an argument) NOT to generate this _influx.csv file?

My CSVs are 100 MB each, so this doubles the used space on the hard drive.

As a workaround I could just delete them after injecting, but of course it would be much better not to generate them ;)

Thank you in advance!

Row with one empty column drops all columns.

I have a csv that resembles the following:

Time,mean,climatology,anomaly
1325548800,,,
1325808000,0.00011200000000000001,,
1593475200,6.1e-05,0.000165,-0.00010400000000000001

My fields are mean,climatology,anomaly.
In this example I expect row 1 to be dropped, row 2 to be partially submitted, and row 3 submitted for all fields.
Rows 1 and 3 work as expected.
In row 2 all fields are set to the NA value -999.
The mean field for row 2 should be 0.00011200000000000001, not -999.

empty columns throws cryptic error

I have a file with empty field columns; for example:

,Time,mean,climatology,anomaly,location,sensor
0,1029196800,,,,IFB,modis
1,1029283200,,,,IFB,modis

When I try csv-to-influx on this I get the following error:

# export_csv_to_influx --csv /tmp/FKdbv2_ABI_TS_MODA_daily_IFB.csv --dbname fwc_coral_disease --measurement modis_abi --field_columns mean,climatology,anomaly --tag_columns location,sensor --force_insert_even_csv_no_update True --server tylar-pc:8086 --time_column Time

Info: Database fwc_coral_disease already exists
Traceback (most recent call last):
  File "/usr/local/bin/export_csv_to_influx", line 8, in <module>
    sys.exit(export_csv_to_influx())
  File "/usr/local/lib/python3.7/site-packages/ExportCsvToInflux/exporter_object.py", line 643, in export_csv_to_influx
    force_float_columns=args.force_float_columns)
  File "/usr/local/lib/python3.7/site-packages/ExportCsvToInflux/exporter_object.py", line 402, in export_csv_to_influx
    for row, int_type, float_type in convert_csv_data_to_int_float:
  File "/usr/local/lib/python3.7/site-packages/ExportCsvToInflux/csv_object.py", line 244, in convert_csv_data_to_int_float
    int_status = int_type[key]
KeyError: 'anomaly'

A more accurate error message would be nice.

same time

Hello ,

I am trying to import data into InfluxDB using export-csv-to-influx. It is pretty amazing how fast it is and how easy it is to use. Thanks for your work.

I am facing a problem: in my data, a lot of entries have the same timestamp, so only one of them is recorded by InfluxDB.

I am wondering whether your script handles this case.

https://docs.influxdata.com/influxdb/v2.0/write-data/best-practices/duplicate-points/
"
Preserve duplicate points

To preserve both old and new field values in duplicate points, use one of the following strategies:

Add an arbitrary tag
Increment the timestamp

"
Do you plan to add something for this in your tool?

Thanks in advance for your reply.
Xavier
