Comments (1)
Hi Daniel,
It is kind of you to have done all of this research for the project. I didn't look at it at the time, but Pandas has since made some changes to its CSV parser, so I took another look today.
It is important to realize that speed depends on many factors. While I have no doubt the HDF5 format is very efficient, Pandas's interface to it does not appear to be as performant as its CSV reader.
import pandas as pd
def load_old():
    # Baseline: read the tab-separated CSV
    CRC_organic_data = pd.read_csv('/tmp/Physical Constants of Organic Compounds.csv', sep='\t', index_col=0)
%timeit load_old()
18.9 ms ± 329 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
CRC_organic_data = pd.read_csv('/tmp/Physical Constants of Organic Compounds.csv', sep='\t', index_col=0)
# 1. 'table' format, blosc:zstd compression at level 9
CRC_organic_data.to_hdf('/tmp/example.h5', key='my_hdf_table', mode='w', format='table', complib='blosc:zstd', complevel=9)
%timeit pd.read_hdf('/tmp/example.h5', key='my_hdf_table')
# 2. 'table' format, no compression
CRC_organic_data.to_hdf('/tmp/example.h5', key='my_hdf_table', mode='w', format='table')
%timeit pd.read_hdf('/tmp/example.h5', key='my_hdf_table')
# 3. 'fixed' format, blosc:zstd compression at level 9
CRC_organic_data.to_hdf('/tmp/example.h5', key='my_hdf_table', mode='w', format='fixed', complib='blosc:zstd', complevel=9)
%timeit pd.read_hdf('/tmp/example.h5', key='my_hdf_table')
# 4. 'fixed' format, no compression
CRC_organic_data.to_hdf('/tmp/example.h5', key='my_hdf_table', mode='w', format='fixed')
%timeit pd.read_hdf('/tmp/example.h5', key='my_hdf_table')
34.7 ms ± 874 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
32 ms ± 1.43 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
26.6 ms ± 3.82 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
24.2 ms ± 1.02 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
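The timings above come from IPython's %timeit magic. Outside of IPython, a comparable measurement could be sketched with the standard library's timeit module. The helper names time_call and report, and their repeat and number parameters, are made up for this illustration and are not part of Pandas or thermo; the output format only imitates %timeit.

```python
import statistics
import timeit


def time_call(func, repeat=7, number=10):
    """Return (mean_ms, std_ms) per call of `func`.

    Hypothetical helper: runs `func` `number` times per run and
    performs `repeat` runs, loosely mirroring what %timeit reports.
    """
    # timeit.repeat returns the total time in seconds for each run
    totals = timeit.repeat(func, repeat=repeat, number=number)
    # Convert each run's total into milliseconds per single call
    per_call_ms = [total / number * 1e3 for total in totals]
    return statistics.mean(per_call_ms), statistics.pstdev(per_call_ms)


def report(label, func, **kwargs):
    """Format one timing line in a style similar to the output above."""
    mean_ms, std_ms = time_call(func, **kwargs)
    return f"{label}: {mean_ms:.3g} ms ± {std_ms:.3g} ms per loop"
```

With this in place, print(report('csv', load_old)) would time the CSV reader, and wrapping the pd.read_hdf call in a lambda would time each HDF5 variant in the same way.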
No matter the settings, reading from HDF5 was slower than reading the CSV: 24.2 ms to 34.7 ms per read, versus 18.9 ms. The CSVs also have the benefit that changes to them can be tracked in git.
I am closing the issue because I am cleaning up, and it seems clear to me that reading CSV-style files through Pandas is good enough.
Sincerely,
Caleb
from thermo.