pgreaper's Issues

To Do/Improvements Pt. 2

Deploy First Non-Beta Version

  • 80%+ code coverage

Features

  • Add ability to use Python's csv sniffer to auto-detect CSV format
  • Add Excel support
  • Add LaTeX output
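A minimal sketch of the csv sniffer item above, using the standard library's csv.Sniffer. The helper names sniff_dialect and read_rows are hypothetical and not part of the package; the fallback to csv.excel when detection fails is an assumption about the desired behaviour.

```python
import csv
import io


def sniff_dialect(sample, delimiters=",;\t|"):
    """Guess the CSV dialect of a text sample; fall back to csv.excel."""
    try:
        return csv.Sniffer().sniff(sample, delimiters=delimiters)
    except csv.Error:
        # The sniffer could not decide, so assume plain comma-separated input
        return csv.excel


def read_rows(text, sample_size=4096):
    """Parse CSV text using a dialect detected from its first characters."""
    dialect = sniff_dialect(text[:sample_size])
    return list(csv.reader(io.StringIO(text), dialect))


sample = "name;age;city\nAda;36;London\nGrace;45;New York\n"
rows = read_rows(sample)
```

Restricting the candidate delimiters keeps the sniffer from picking a letter or a space that happens to occur the same number of times on every line.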

HTML Parsing

  • Add JSON output

Performance

  • Memory profiling for SQL uploads
  • Start benchmarking HTML parsing

User Friendliness

  • Add functions that let a user preview a file and see how it will look before uploading
  • Add aliases for keyword arguments (e.g. db --> database, user --> username)
    • Add unit tests for aliases
    • Document them
  • If a row has more columns than the others and this causes a DataError, report a more specific error message
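One possible shape for the keyword-argument aliases, as a sketch only. The two aliases come from the item above; resolve_aliases and the ALIASES table are hypothetical names, and raising TypeError when an alias and its canonical name are both given is an assumption that mirrors Python's own duplicate-argument error.

```python
# Alias -> canonical keyword name (only the two pairs named in the issue)
ALIASES = {"db": "database", "user": "username"}


def resolve_aliases(kwargs, aliases=ALIASES):
    """Return a copy of kwargs with aliased keys renamed to canonical ones."""
    resolved = {}
    for key, value in kwargs.items():
        canonical = aliases.get(key, key)
        if canonical in resolved:
            # e.g. both db= and database= were passed
            raise TypeError(
                "got multiple values for argument {!r}".format(canonical)
            )
        resolved[canonical] = value
    return resolved


params = resolve_aliases({"db": "sales", "user": "dave", "password": "x"})
```

A function that accepts **kwargs could call resolve_aliases first and then work only with the canonical names, which also keeps the unit tests for aliases in one place.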

To Do/Improvements

Deploy to PyPI

  • Complete bolded tasks below before releasing first version

Features

  • Add ability to use Python's csv sniffer to auto-detect CSV format
  • Add Excel support (Easy)
  • Add JSON support (Hard) (Probably trivial in PostgreSQL)
  • Add ability to connect to PostgreSQL databases not on localhost
  • Add ability to take subsets of columns
  • Add a mutate()-like function for Tables
  • SQLite to PostgreSQL conversion

HTML Parsing

  • Add ability to parse HTML <table>s
  • Write unit tests for basic <table> layouts
  • Write documentation

Jupyter Notebook Integration

  • Add _repr_html_() method for Table
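A rough sketch of what a _repr_html_() method could look like. Jupyter calls this method, when present, to render an object as HTML. The Table class below is a stand-in written for this example (a list of rows with name and col_names attributes); the real class may be structured differently.

```python
from html import escape


class Table(list):
    """Stand-in for the package's Table: a list of rows plus column names."""

    def __init__(self, name, col_names, rows=()):
        super().__init__(rows)
        self.name = name
        self.col_names = list(col_names)

    def _repr_html_(self, max_rows=10):
        # Escape every cell so stray markup in the data is shown literally
        head = "".join(
            "<th>{}</th>".format(escape(str(c))) for c in self.col_names
        )
        body = "".join(
            "<tr>{}</tr>".format(
                "".join("<td>{}</td>".format(escape(str(v))) for v in row)
            )
            for row in self[:max_rows]
        )
        return (
            "<table><caption>{}</caption>"
            "<thead><tr>{}</tr></thead><tbody>{}</tbody></table>"
        ).format(escape(self.name), head, body)


table = Table("people", ["name", "note"], [["Ada", "<b>first</b>"]])
html_output = table._repr_html_()
```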

Aesthetics

  • Make Table's repr() prettier and more useful, e.g. by showing a summary of the number of columns
  • Consistent text width throughout the package

Performance

  • If a file is already formatted "correctly", use Postgres COPY instead of bulk INSERTs for better performance
  • Look into more database specific optimizations
  • Implement faster column guessing algorithm
  • Start benchmarking SQL uploads
  • Memory profiling for SQL uploads
  • Start benchmarking HTML parsing
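A sketch of the COPY idea from the first item above, assuming a psycopg2 cursor (its copy_expert(sql, file) method streams a file-like object to the server). The helpers rows_to_csv_buffer and copy_rows are hypothetical, and the table and column names are interpolated as-is, so they are assumed to be trusted identifiers.

```python
import csv
import io


def rows_to_csv_buffer(rows):
    """Serialize rows into an in-memory CSV file positioned at the start."""
    buf = io.StringIO()
    # None is written as an empty field, which COPY's csv format
    # reads as NULL by default
    csv.writer(buf, lineterminator="\n").writerows(rows)
    buf.seek(0)
    return buf


def copy_rows(cursor, table, columns, rows):
    """Load rows with one COPY statement instead of many INSERTs."""
    sql = "COPY {} ({}) FROM STDIN WITH (FORMAT csv)".format(
        table, ", ".join(columns)
    )
    cursor.copy_expert(sql, rows_to_csv_buffer(rows))
```

With a real connection this would be called as copy_rows(conn.cursor(), "people", ["name", "age"], rows) followed by conn.commit(). A file on disk that is already clean could be passed to copy_expert directly, skipping the in-memory buffer.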

User Friendliness

  • Add functions that let a user preview a file and see how it will look before uploading
  • Add aliases for keyword arguments (e.g. db --> database, user --> username)
    • Add unit tests for aliases
    • Document them
  • If a row has more columns than the others and this causes a DataError, report a more specific error message
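For the last item, one option is to validate row lengths before the upload, so the user sees which row is wrong instead of a generic DataError from the database. This is a sketch under that assumption; RowLengthError and check_row_lengths are hypothetical names.

```python
class RowLengthError(ValueError):
    """Raised when a row's column count differs from the expected count."""


def check_row_lengths(rows, n_cols):
    """Raise RowLengthError naming the first row with the wrong length."""
    for row_number, row in enumerate(rows, start=1):
        if len(row) != n_cols:
            raise RowLengthError(
                "row {} has {} columns, expected {}: {!r}".format(
                    row_number, len(row), n_cols, row
                )
            )


good_rows = [["Ada", 36], ["Grace", 45]]
bad_rows = [["Ada", 36], ["Grace", 45, "extra"]]
check_row_lengths(good_rows, 2)  # passes silently
```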

Code

  • Add tests for N/A value replacement
  • Add tests using real world data sets
  • Add tests for PostgreSQL
  • Add tests for command line interface

Bugs

...

requests not installed, and NameError on ModuleNotFoundError with Python 3

(sqlify3) dave@asr:/tmp$ python -V
Python 3.5.2
(sqlify3) dave@asr:/tmp$ pip list
Package    Version
---------- -------
pip        9.0.1  
psycopg2   2.7.1  
setuptools 36.2.0 
sqlify     1.0.0b2
wheel      0.29.0 
(sqlify3) dave@asr:/tmp$ python
Python 3.5.2 (default, Nov 17 2016, 17:05:23) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sqlify
/home/dave/.virtualenvs/sqlify3/lib/python3.5/site-packages/sqlify/config.py:108: UserWarning: No default Postgres settings found. Use sqlify.settings(username='', password='', database='', hostname='') to set them.
  warnings.warn("No default Postgres settings found. Use sqlify.settings(username='', password='', database='', hostname='') to set them.")
Traceback (most recent call last):
  File "/home/dave/.virtualenvs/sqlify3/lib/python3.5/site-packages/sqlify/html/parser.py", line 2, in <module>
    import requests
ImportError: No module named 'requests'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/dave/.virtualenvs/sqlify3/lib/python3.5/site-packages/sqlify/__init__.py", line 55, in <module>
    from .html import from_file, from_url
  File "/home/dave/.virtualenvs/sqlify3/lib/python3.5/site-packages/sqlify/html/__init__.py", line 1, in <module>
    from .parser import get_tables_from_file as from_file, \
  File "/home/dave/.virtualenvs/sqlify3/lib/python3.5/site-packages/sqlify/html/parser.py", line 4, in <module>
    except ModuleNotFoundError:
NameError: name 'ModuleNotFoundError' is not defined
>>> quit()
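The NameError in the traceback above arises because ModuleNotFoundError was only added in Python 3.6, so the except clause cannot be evaluated on 3.5.2. It is a subclass of ImportError, so catching ImportError works on every Python 3 version. A possible workaround is sketched below; optional_import is a hypothetical helper, not the package's actual code, and the missing requests dependency would still need to be declared in the package's install requirements.

```python
import importlib


def optional_import(name):
    """Import a module by name, returning None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        # Also covers ModuleNotFoundError on Python 3.6+,
        # since that class derives from ImportError
        return None


# None when requests is missing; callers can check before using it
requests = optional_import("requests")
```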

ImportError: No module named settings with Python 2

(sqlify2) dave@asr:/tmp$ python -V
Python 2.7.12
(sqlify2) dave@asr:/tmp$ pip list
Package    Version
---------- -------
click      6.7    
pip        9.0.1  
psycopg2   2.7.1  
setuptools 36.2.0 
sqlify     1.0.0a1
wheel      0.29.0 
(sqlify2) dave@asr:/tmp$ python
Python 2.7.12 (default, Nov 19 2016, 06:48:10) 
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sqlify
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/dave/.virtualenvs/sqlify2/local/lib/python2.7/site-packages/sqlify/__init__.py", line 1, in <module>
    from sqlify.sqlify import *
  File "/home/dave/.virtualenvs/sqlify2/local/lib/python2.7/site-packages/sqlify/sqlify.py", line 1, in <module>
    from sqlify.settings import *
ImportError: No module named settings
>>>
