
arctic's Introduction

Development moved to ArcticDB GitHub Repository

This repository and project are now in maintenance mode. Development has migrated to ArcticDB.


Information on how to set up, install and use Arctic has been moved to README-arctic.md.

arctic's People

Contributors

achamayou, adrianteng, aflag, baibaihi, bashtage, bmoscon, chazkii, davidduenas3, dimosped, dunckerr, eeaston, egao1980, jamesblackburn, jjbmatthews, jonbannister, kraphtuos, llazzaro, mckelvin, mehertz, mildbyte, morotti, muhammadhamzasajjad, reasto, richardbounds, rmmancom, rob256, shashank88, smootoo, srf94, tomtaylorlondon


arctic's Issues

can't install on mac os x

clang -fno-strict-aliasing -fno-common -dynamic -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/usr/local/include -I/usr/local/opt/openssl/include -I/usr/local/opt/sqlite/include -I/usr/local/Cellar/python/2.7.10/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c src/_compress.c -o build/temp.macosx-10.10-x86_64-2.7/src/_compress.o -fopenmp
src/_compress.c:259:10: fatal error: 'omp.h' file not found
#include "omp.h"
         ^
1 error generated.
error: command 'clang' failed with exit status 1

I installed clang-omp, but the build still fails with the same error.
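Apple's clang does not ship OpenMP support, so `-fopenmp` is ignored and `omp.h` cannot be found. A hedged workaround, assuming Homebrew and the clang-omp package available at the time:

```shell
# Assumes Homebrew; clang-omp provides an OpenMP-capable clang binary.
brew install clang-omp

# Point the build at the OpenMP-capable compiler before installing.
export CC=clang-omp
pip install git+https://github.com/manahl/arctic.git
```

If clang-omp is not available, building with a real gcc (e.g. from Homebrew) is another route, since gcc supports `-fopenmp` natively.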

Setup.py install will fail if dependencies are not already installed

OS where problem was observed: Ubuntu 14.04 and OSX Yosemite
Setuptools versions: 18.0.1 and 17.1.1

Letting setup.py resolve and install Arctic's dependencies (such as PyMongo) causes the machine to run out of memory and crash.
On the other hand, if dependencies are installed before running

python setup.py install

everything is fine.

A snippet from the stacktrace:

Processing dependencies for arctic==1.0.0
Searching for pymongo>=3.0
Reading https://pypi.python.org/simple/pymongo/
Best match: pymongo 3.0.3
Downloading https://pypi.python.org/packages/source/p/pymongo/pymongo-3.0.3.tar.gz#md5=0425d99c2a453144b9c95cb37dbc46e9
Processing pymongo-3.0.3.tar.gz
Writing /tmp/easy_install-MK91Bw/pymongo-3.0.3/setup.cfg
Running pymongo-3.0.3/setup.py -q bdist_egg --dist-dir /tmp/easy_install-MK91Bw/pymongo-3.0.3/egg-dist-tmp-usQbt3
Traceback (most recent call last):
  File "setup.py", line 121, in <module>
    "Topic :: Software Development :: Libraries",
  File "/home/winterflower/anaconda/lib/python2.7/distutils/core.py", line 151, in setup
    dist.run_commands()
  File "/home/winterflower/anaconda/lib/python2.7/distutils/dist.py", line 953, in run_commands
    self.run_command(cmd)
  File "/home/winterflower/anaconda/lib/python2.7/distutils/dist.py", line 972, in run_command
    cmd_obj.run()
  File "/home/winterflower/anaconda/lib/python2.7/site-packages/setuptools-17.1.1-py2.7.egg/setuptools/command/install.py", line 67, in run
  File "/home/winterflower/anaconda/lib/python2.7/site-packages/setuptools-17.1.1-py2.7.egg/setuptools/command/install.py", line 117, in do_egg_install
  File "/home/winterflower/anaconda/lib/python2.7/site-packages/setuptools-17.1.1-py2.7.egg/setuptools/command/easy_install.py", line 380, in run
  File "/home/winterflower/anaconda/lib/python2.7/site-packages/setuptools-17.1.1-py2.7.egg/setuptools/command/easy_install.py", line 610, in easy_install
  File "/home/winterflower/anaconda/lib/python2.7/site-packages/setuptools-17.1.1-py2.7.egg/setuptools/command/easy_install.py", line 661, in install_item
  File "/home/winterflower/anaconda/lib/python2.7/site-packages/setuptools-17.1.1-py2.7.egg/setuptools/command/easy_install.py", line 709, in process_distribution
  File "/home/winterflower/anaconda/lib/python2.7/site-packages/setuptools-17.1.1-py2.7.egg/pkg_resources/__init__.py", line 836, in resolve

  File "/home/winterflower/anaconda/lib/python2.7/site-packages/setuptools-17.1.1-py2.7.egg/pkg_resources/__init__.py", line 1081, in best_match

  File "/home/winterflower/anaconda/lib/python2.7/site-packages/setuptools-17.1.1-py2.7.egg/pkg_resources/__init__.py", line 1093, in obtain

  File "/home/winterflower/anaconda/lib/python2.7/site-packages/setuptools-17.1.1-py2.7.egg/setuptools/command/easy_install.py", line 629, in easy_install
  File "/home/winterflower/anaconda/lib/python2.7/site-packages/setuptools-17.1.1-py2.7.egg/setuptools/command/easy_install.py", line 659, in install_item
  File "/home/winterflower/anaconda/lib/python2.7/site-packages/setuptools-17.1.1-py2.7.egg/setuptools/command/easy_install.py", line 842, in install_eggs
  File "/home/winterflower/anaconda/lib/python2.7/site-packages/setuptools-17.1.1-py2.7.egg/setuptools/command/easy_install.py", line 1070, in build_and_install
  File "/home/winterflower/anaconda/lib/python2.7/site-packages/setuptools-17.1.1-py2.7.egg/setuptools/command/easy_install.py", line 1056, in run_setup
  File "/home/winterflower/anaconda/lib/python2.7/site-packages/setuptools-17.1.1-py2.7.egg/setuptools/sandbox.py", line 240, in run_setup
  File "/home/winterflower/anaconda/lib/python2.7/contextlib.py", line 35, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/winterflower/anaconda/lib/python2.7/site-packages/setuptools-17.1.1-py2.7.egg/setuptools/sandbox.py", line 193, in setup_context
  File "/home/winterflower/anaconda/lib/python2.7/contextlib.py", line 35, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/winterflower/anaconda/lib/python2.7/site-packages/setuptools-17.1.1-py2.7.egg/setuptools/sandbox.py", line 152, in save_modules
  File "/home/winterflower/anaconda/lib/python2.7/site-packages/setuptools-17.1.1-py2.7.egg/setuptools/sandbox.py", line 126, in __exit__
  File "/home/winterflower/anaconda/lib/python2.7/site-packages/setuptools-17.1.1-py2.7.egg/setuptools/sandbox.py", line 110, in dump
  File "/home/winterflower/anaconda/lib/python2.7/site-packages/setuptools-17.1.1-py2.7.egg/setuptools/sandbox.py", line 110, in dump

I believe the problem is caused by a bug in setuptools. The trace I get from running setup.py for Arctic is identical to the one attached in this bugzilla report. Enthought/Enable has also had a similar issue with setuptools (see this issue).

I ran top while running the setup.py script, and the resident size (RES) jumps to 2.0G as soon as the script begins installing the first dependency. I believe memory leaks like this have been observed in setuptools before.
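Until the setuptools bug is fixed, the practical workaround is the one noted above: install the dependencies with pip first, so easy_install's resolver is never invoked (the package pin below is taken from the log):

```shell
# Install dependencies up front so `setup.py install` finds them already
# satisfied and never hands resolution off to easy_install.
pip install "pymongo>=3.0"
python setup.py install
```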

String/bytes issue in cython with Python3

The cython code has some issues with Python 3. It assumes input is in bytes, not str, and it returns bytes (not str). It's easy to fix: you can use this directive:

# cython: c_string_type=str, c_string_encoding=ascii

to tell it to expect str, and cast the return from decompress from bytes to a C string. However, this breaks compatibility with the lz4 code, which continues to return bytes.

So if you instead decide to cast everything to bytes and return bytes, you have a different issue: the unit tests all use strings, so everything fails.

There are several ways to fix/work around this, but I'm not sure what is best. Also, the
tests in tests/unit/test_compress seem to call directly into the cython code. When does the wrapper in arctic/_compression.py ever get used? That could be a place to address this since in there you can do all the UTF-8 encoding and decoding without needing to modify any other code.
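If the str/bytes handling does move into arctic/_compression.py, the wrapper could normalise at the boundary. A sketch using zlib as a stand-in for the cython/lz4 codec (function names are illustrative, not arctic's API):

```python
import zlib


def _to_bytes(data):
    # Accept either str or bytes; encode str as UTF-8 before compression.
    return data.encode('utf-8') if isinstance(data, str) else data


def compress(data):
    # zlib stands in here for the cython/lz4 backend, which only sees bytes.
    return zlib.compress(_to_bytes(data))


def decompress(blob, as_str=False):
    # The backend returns bytes; decode at this layer when str is wanted.
    raw = zlib.decompress(blob)
    return raw.decode('utf-8') if as_str else raw


print(decompress(compress('hello'), as_str=True))  # hello
```

Doing the encoding and decoding at this single choke point keeps the cython and lz4 layers bytes-only, so neither they nor the unit tests need to change.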

Let me know what you think @TomTaylorLondon @jamesblackburn

Install of 1.16.0 failed

pip install arctic==1.16.0

Traceback (most recent call last):
      File "<string>", line 20, in <module>
      File "/tmp/pip-build-OQyXkO/arctic/setup.py", line 31, in <module>
        long_description = open('README.md').read()
    IOError: [Errno 2] No such file or directory: 'README.md'
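The real fix is to ship README.md in the sdist (via MANIFEST.in), but setup.py can also be made defensive so a missing file degrades gracefully. A hedged sketch:

```python
import io
import os


def read_long_description(path='README.md'):
    # Guard the read so an sdist that lacks README.md still installs,
    # instead of dying with IOError inside setup.py.
    if os.path.exists(path):
        with io.open(path, encoding='utf-8') as f:
            return f.read()
    return ''


print(repr(read_long_description('no-such-README.md')))  # ''
```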

Continuing Timezone issues

@AdrianTeng Take a look at the test failure in Travis-CI:

https://travis-ci.org/manahl/arctic/jobs/99426650

specifically this one: test_read_ts_raw

it looks like another timezone issue. assert_frame_equal checks the dtypes on the index, and one index is timestamp, the other is timestamp with a timezone:

AssertionError: Index are different

       Attribute "dtype" are different
       [left]:  datetime64[ns, tzfile(u'/usr/share/zoneinfo/UTC')]
       [right]: datetime64[ns]

I believe this error isn't showing up in CircleCI because Travis is using different versions of the dependencies.
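The mismatch can be reproduced, and worked around, with plain pandas; either side of the comparison can be normalised first:

```python
import pandas as pd

idx_tz = pd.date_range('2016-01-01', periods=3, freq='D', tz='UTC')
idx_naive = pd.date_range('2016-01-01', periods=3, freq='D')

# The dtypes differ (datetime64[ns, UTC] vs datetime64[ns]), which is
# exactly what assert_frame_equal trips over. Normalise before comparing:
assert idx_tz.tz_localize(None).equals(idx_naive)   # strip the timezone
assert idx_naive.tz_localize('UTC').equals(idx_tz)  # or attach it
```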

running setup.py without dependencies pre-installed gives errors

Installed /home/bryant/arctic
Processing dependencies for arctic==1.19.0
Searching for tzlocal
Reading https://pypi.python.org/simple/tzlocal/
Best match: tzlocal 1.2
Downloading https://pypi.python.org/packages/source/t/tzlocal/tzlocal-1.2.tar.gz#md5=2e36ceb1260bf1233ed2f018a1df536e
Processing tzlocal-1.2.tar.gz
Writing /tmp/easy_install-971xQ2/tzlocal-1.2/setup.cfg
Running tzlocal-1.2/setup.py -q bdist_egg --dist-dir /tmp/easy_install-971xQ2/tzlocal-1.2/egg-dist-tmp-WFrc5P
Moving tzlocal-1.2-py2.7.egg to /home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages
Adding tzlocal 1.2 to easy-install.pth file

Installed /home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/tzlocal-1.2-py2.7.egg
Searching for pymongo>=3.0
Reading https://pypi.python.org/simple/pymongo/
Best match: pymongo 3.2
Downloading https://pypi.python.org/packages/source/p/pymongo/pymongo-3.2.tar.gz#md5=463b4d325d8fb4070c04f15391b457bf
Processing pymongo-3.2.tar.gz
Writing /tmp/easy_install-UxPSJi/pymongo-3.2/setup.cfg
Running pymongo-3.2/setup.py -q bdist_egg --dist-dir /tmp/easy_install-UxPSJi/pymongo-3.2/egg-dist-tmp-A3Utib
Traceback (most recent call last):
  File "setup.py", line 131, in <module>
    "Topic :: Software Development :: Libraries",
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/distutils/core.py", line 151, in setup
    dist.run_commands()
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/distutils/dist.py", line 953, in run_commands
    self.run_command(cmd)
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/distutils/dist.py", line 972, in run_command
    cmd_obj.run()
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/setuptools/command/develop.py", line 33, in run
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/setuptools/command/develop.py", line 132, in install_for_development
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/setuptools/command/easy_install.py", line 719, in process_distribution
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/pkg_resources/__init__.py", line 846, in resolve
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/pkg_resources/__init__.py", line 1091, in best_match
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/pkg_resources/__init__.py", line 1103, in obtain
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/setuptools/command/easy_install.py", line 639, in easy_install
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/setuptools/command/easy_install.py", line 669, in install_item
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/setuptools/command/easy_install.py", line 852, in install_eggs
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/setuptools/command/easy_install.py", line 1080, in build_and_install
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/setuptools/command/easy_install.py", line 1066, in run_setup
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/setuptools/sandbox.py", line 242, in run_setup
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/contextlib.py", line 35, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/setuptools/sandbox.py", line 195, in setup_context
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/contextlib.py", line 35, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/setuptools/sandbox.py", line 166, in save_modules
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/setuptools/sandbox.py", line 141, in resume
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/setuptools/sandbox.py", line 154, in save_modules
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/setuptools/sandbox.py", line 195, in setup_context
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/setuptools/sandbox.py", line 239, in run_setup
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/setuptools/sandbox.py", line 269, in run
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/setuptools/sandbox.py", line 238, in runner
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/setuptools/sandbox.py", line 46, in _execfile
  File "/tmp/easy_install-UxPSJi/pymongo-3.2/setup.py", line 309, in <module>

  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/distutils/core.py", line 151, in setup
    dist.run_commands()
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/distutils/dist.py", line 953, in run_commands
    self.run_command(cmd)
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/distutils/dist.py", line 972, in run_command
    cmd_obj.run()
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/setuptools/command/bdist_egg.py", line 151, in run
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/distutils/cmd.py", line 326, in run_command
    self.distribution.run_command(command)
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/distutils/dist.py", line 972, in run_command
    cmd_obj.run()
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/setuptools/command/egg_info.py", line 185, in run
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/setuptools/command/egg_info.py", line 208, in find_sources
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/setuptools/command/egg_info.py", line 292, in run
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/setuptools/command/egg_info.py", line 321, in add_defaults
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/site-packages/setuptools-19.1.1-py2.7.egg/setuptools/command/sdist.py", line 132, in add_defaults
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/distutils/command/build_ext.py", line 420, in get_source_files
    self.check_extensions_list(self.extensions)
  File "/home/bryant/anaconda2/envs/arctic/lib/python2.7/distutils/command/build_ext.py", line 362, in check_extensions_list
    ("each element of 'ext_modules' option must be an "
setuptools.sandbox.UnpickleableException: DistutilsSetupError("each element of 'ext_modules' option must be an Extension instance or 2-tuple",)

If I install pymongo with conda, and run it again, I get a similar issue with LZ4. If I install that and run again, it works without issue.

Add initial_image as optional parameter on tickstore write()

tickstore documents can contain initial images [IMAGE_DOC]. Add an optional initial_image=None parameter to write() to allow them to be populated. If an image is given and the document is split into smaller documents, then compute and write the new initial image for each new document. read() already has the flag to return the image data. For toplevel tickstore writes the image has to be updated for each new library.
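The carry-forward computation described above can be sketched as follows (a hypothetical helper, not arctic's API; field names are illustrative):

```python
def images_per_chunk(ticks, chunk_size, initial_image=None):
    """Given a stream of tick dicts, compute the initial image (the
    last-known value of every field) that each stored chunk would need,
    by carrying state forward across chunk boundaries."""
    image = dict(initial_image or {})
    images = []
    for start in range(0, len(ticks), chunk_size):
        images.append(dict(image))  # image as of the start of this chunk
        for tick in ticks[start:start + chunk_size]:
            image.update(tick)
    return images


ticks = [{'bid': 1}, {'ask': 2}, {'bid': 3}, {'ask': 4}]
print(images_per_chunk(ticks, 2, initial_image={'bid': 0, 'ask': 0}))
# [{'bid': 0, 'ask': 0}, {'bid': 1, 'ask': 2}]
```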

Trouble installing with Anaconda

Hi, I've heard a bit about arctic and I'm excited to give it a try. My system is Windows 7 32-bit and I've installed the Anaconda 2.4.0 distribution.

I have read Issue #6 and I think I am having a different problem. I checked C:\Anaconda\Lib\distutils\distutils.cfg and it says compiler=mingw32

Here is the output when I run pip install git+https://github.com/manahl/arctic.git

Collecting git+https://github.com/manahl/arctic.git
  Cloning https://github.com/manahl/arctic.git to c:\users\aclerc\appdata\local\temp\pip-kxthfn-build
Requirement already satisfied (use --upgrade to upgrade): decorator in c:\anaconda\lib\site-packages (from arctic==1.13.0)
Requirement already satisfied (use --upgrade to upgrade): enum34 in c:\anaconda\lib\site-packages (from arctic==1.13.0)
Requirement already satisfied (use --upgrade to upgrade): lz4 in c:\anaconda\lib\site-packages (from arctic==1.13.0)
Requirement already satisfied (use --upgrade to upgrade): mockextras in c:\anaconda\lib\site-packages (from arctic==1.13.0)
Requirement already satisfied (use --upgrade to upgrade): pandas in c:\anaconda\lib\site-packages (from arctic==1.13.0)
Requirement already satisfied (use --upgrade to upgrade): pymongo>=3.0 in c:\anaconda\lib\site-packages (from arctic==1.13.0)
Requirement already satisfied (use --upgrade to upgrade): python-dateutil in c:\anaconda\lib\site-packages (from arctic==1.13.0)
Requirement already satisfied (use --upgrade to upgrade): pytz in c:\anaconda\lib\site-packages (from arctic==1.13.0)
Requirement already satisfied (use --upgrade to upgrade): tzlocal in c:\anaconda\lib\site-packages (from arctic==1.13.0)
Requirement already satisfied (use --upgrade to upgrade): mock>=0.8.0 in c:\anaconda\lib\site-packages (from mockextras->arctic==1.13.0)
Requirement already satisfied (use --upgrade to upgrade): numpy>=1.7.0 in c:\anaconda\lib\site-packages (from pandas->arctic==1.13.0)
Installing collected packages: arctic
  Running setup.py install for arctic

... more install messages which look fine ...

running build_ext
    cythoning src/_compress.pyx to src\_compress.c
    building 'arctic._compress' extension
    creating build\temp.win32-2.7
    creating build\temp.win32-2.7\Release
    creating build\temp.win32-2.7\Release\src
    C:\Anaconda\Scripts\gcc.bat -mdll -O -Wall -IC:\Anaconda\include -IC:\Anaconda\PC -c src\_compress.c -o build\temp.win32-2.7\Release\src\_compress.o -fopenmp
    src\_compress.c:259:17: fatal error: omp.h: No such file or directory
    compilation terminated.
    warning: src\_compress.pyx:167:38: Use boundscheck(False) for faster access
    warning: src\_compress.pyx:177:19: Use boundscheck(False) for faster access
    warning: src\_compress.pyx:29:5: Cannot profile nogil function.
    warning: src\_compress.pyx:35:5: Cannot profile nogil function.
    warning: View.MemoryView:288:5: Cannot profile nogil function.

... more warnings related to nogil ...

    error: command 'C:\\Anaconda\\Scripts\\gcc.bat' failed with exit status 1

It seems gcc.bat is the problem; this script contains:

@echo off
"%~f0\..\..\MinGW\bin\gcc.exe" %*

'python setup.py install' leaves out several subpackages

Hi.

Running the current version of setup.py will leave out several of the subpackages, e.g. arctic.store and arctic.tickstore.

Instead of using "packages=['arctic', 'tests']" in setup.py, would it not be better to use "packages=find_packages()"? After applying this change it installs nicely.
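The find_packages() suggestion can be checked in isolation; the snippet below builds a throwaway tree mimicking the repo layout (directory names assumed) and shows the nested subpackages being discovered:

```python
import os
import tempfile

from setuptools import find_packages

# Throwaway tree mimicking the repo layout, to show that find_packages
# discovers nested subpackages such as arctic.store automatically.
root = tempfile.mkdtemp()
for pkg in ('arctic', 'arctic/store', 'arctic/tickstore'):
    os.makedirs(os.path.join(root, pkg))
    open(os.path.join(root, pkg, '__init__.py'), 'w').close()

print(sorted(find_packages(root)))
# ['arctic', 'arctic.store', 'arctic.tickstore']
```

In setup.py this would become something like `packages=find_packages(exclude=['tests', 'tests.*'])`, so new subpackages are picked up without editing the list by hand.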

I'm on OSX, so I don't have a pull request that fixes this: clang's lack of OpenMP compatibility forces other changes, which leads to lots of failing unit tests.

Update a dataset automatically

Hello,

It would be nice if you could show how to update a dataset automatically.

For example

Let's say that we ran this on 2015-01-01:

from arctic import Arctic

# Connect to Local MONGODB
store = Arctic('localhost')

# Create the library - defaults to VersionStore
store.initialize_library('CURRFX')

# Access the library
library = store['CURRFX']

# Load some data - maybe from Quandl
authtoken = "your token here"
eurusd = Quandl.get("CURRFX/EURUSD", authtoken=authtoken, trim_start='20140101', trim_end='20141231')

# Store the data in the library
library.write('EURUSD', eurusd, metadata={'source': 'Quandl'})

# Reading the data
item = library.read('EURUSD')
eurusd = item.data
metadata = item.metadata

Now (2015-12-21) we want to update the data (so from 2015-01-01 to today).

So we could query Quandl this way:

eurusd_new = Quandl.get("CURRFX/EURUSD", authtoken=authtoken, trim_start='20150101')

I don't think that reading the previously stored dataset, and loading the whole thing into memory just to get the max date (eurusd.index.max(), i.e. 2014-12-31), is very efficient.

Moreover we will need to concatenate eurusd and eurusd_new and write this new concatenate dataset to the library.

I think that we should also store the frequency (daily in our case) as metadata.

Isn't there any API for this?

Updating a library automatically and the whole datastore is something to also consider.
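For the write side, arctic's VersionStore does expose an append() method. The deduplication step can be sketched with plain pandas (the helper name and sample data are illustrative):

```python
import pandas as pd


def incremental_update(stored, fetched):
    # Keep only the fetched rows strictly after the last stored date,
    # then append them; this avoids re-writing overlapping rows.
    last = stored.index.max()
    return pd.concat([stored, fetched[fetched.index > last]])


stored = pd.DataFrame({'px': [1.0, 2.0]},
                      index=pd.to_datetime(['2014-12-30', '2014-12-31']))
fetched = pd.DataFrame({'px': [2.0, 3.0]},
                       index=pd.to_datetime(['2014-12-31', '2015-01-02']))
updated = incremental_update(stored, fetched)
print(len(updated))  # 3
```

With arctic itself, the same pattern would be read-the-tail, filter, then library.append('EURUSD', new_rows); it still requires knowing the last stored date, which is the inefficiency this issue is asking the API to hide.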

Kind regards

str(tick library) fails in IPython

When stringifying a library object, it blows up in IPython:

In [38]: tickdb = Arctic('database_name')
In [39]: tickdb['library_name']
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
/ipython-3.2.0_1_ahl1-py2.7.egg/IPython/core/formatters.pyc in __call__(self, obj)
    688                 type_pprinters=self.type_printers,
    689                 deferred_pprinters=self.deferred_printers)
--> 690             printer.pretty(obj)
    691             printer.flush()
    692             return stream.getvalue()

/ipython-3.2.0_1_ahl1-py2.7.egg/IPython/lib/pretty.pyc in pretty(self, obj)
    407                             if callable(meth):
    408                                 return meth(obj, self, cycle)
--> 409             return _default_pprint(obj, self, cycle)
    410         finally:
    411             self.end_group()

/ipython-3.2.0_1_ahl1-py2.7.egg/IPython/lib/pretty.pyc in _default_pprint(obj, p, cycle)
    527     if _safe_getattr(klass, '__repr__', None) not in _baseclass_reprs:
    528         # A user-provided repr. Find newlines and replace them with p.break_()
--> 529         _repr_pprint(obj, p, cycle)
    530         return
    531     p.begin_group(1, '<')

/ipython-3.2.0_1_ahl1-py2.7.egg/IPython/lib/pretty.pyc in _repr_pprint(obj, p, cycle)
    709     """A pprint that just redirects to the normal repr function."""
    710     # Find newlines and replace them with p.break_()
--> 711     output = repr(obj)
    712     for idx,output_line in enumerate(output.splitlines()):
    713         if idx:

/arctic/tickstore/tickstore.py in __repr__(self)
    107 
    108     def __repr__(self):
--> 109         return str(self)
    110 
    111     def delete(self, symbol, date_range=None):

/arctic/tickstore/tickstore.py in __str__(self)
    104     def __str__(self):
    105         return """<%s at %s>
--> 106 %s""" % (self.__class__.__name__, hex(id(self)), indent(str(self._arctic_lib), 4))
    107 
    108     def __repr__(self):

NameError: global name 'indent' is not defined
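The root cause is that tickstore.py's __str__ references an indent helper that is never imported or defined in that module. A minimal stand-in consistent with how it is called (the exact signature is assumed):

```python
def indent(text, width):
    # Prefix every line of `text` with `width` spaces, matching the
    # indent(str(self._arctic_lib), 4) call site in __str__.
    pad = ' ' * width
    return '\n'.join(pad + line for line in text.split('\n'))


print(repr(indent('a\nb', 4)))  # '    a\n    b'
```

The fix in the repo is simply to define this helper (or import an equivalent) in tickstore.py, after which repr() and IPython's pretty-printer both work.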

append not working as expected

I have a dataframe stored in mongodb using arctic and I would like to append to the existing dataframe, e.g. updating daily prices.
I've tried using version storage and the append() function; however, it gives me a "not implemented for handler" error:

File "C:\Anaconda\lib\site-packages\arctic\store\version_store.py", line 496, in append
    raise Exception("Append not implemented for handler %s" % handler)
Exception: Append not implemented for handler <arctic.store._pickle_store.PickleStore object at 0x09274AB0>

I've tried register_library_type('dataframestore', PandasDataFrameStore), but received some other error.

Do you have an example of how to update existing dataframe/series data or is there a rule of thumb?

Code quality metrics: add codify / landscape to project

Hello,

It would be nice to enable Landscape https://landscape.io/.
It can help to track many bugs (even before you notice them).

For example
this #49
https://github.com/manahl/arctic/blob/master/arctic/tickstore/tickstore.py#L106
should be found easily by Landscape

Codacy https://www.codacy.com/ might also be something to consider

Coveralls provides code coverage https://coveralls.io/
and is also something that should be considered.

Landscape and Codacy are very easy to use as they are services.
Coveralls is a little bit harder to setup... but nothing impossible.

It can also be interesting to test code quality locally,
so it may be worth setting up tox https://testrun.org/tox/latest/ and
flake8 https://flake8.readthedocs.org/.
(Other tools are pylint http://www.pylint.org/,
pychecker http://pychecker.sourceforge.net/,
pyflakes https://launchpad.net/pyflakes/,
pep8 http://pep8.readthedocs.org/ and
mccabe https://github.com/PyCQA/mccabe,)
but flake8 is a wrapper around PyFlakes, pep8 and Ned Batchelder's McCabe script,
so it may be enough.
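As a local starting point, a minimal tox.ini wiring flake8 alongside the test run might look like this (env names, paths, and settings are assumptions, not the project's actual config):

```ini
[tox]
envlist = py27, flake8

[testenv]
deps = pytest
commands = py.test tests

[testenv:flake8]
deps = flake8
commands = flake8 arctic tests

[flake8]
max-line-length = 120
```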

You can find interesting project templates with these tools enabled
https://github.com/audreyr/cookiecutter-pypackage

We use Travis, Landscape, Coveralls in
https://github.com/pydata/pandas-datareader

So a first step could be to enable Landscape as it's easy and can help a lot.

Kind regards

Build fails with src/_compress.c:348:10: fatal error: 'omp.h' file not found

Using OSX 10.9.5 with Anaconda Python 2.7

Installation fails with both pip and setuptools.

jasons-mbp:arctic jason$ sudo pip install git+https://github.com/manahl/arctic.git
Password:
Collecting git+https://github.com/manahl/arctic.git
  Cloning https://github.com/manahl/arctic.git to /tmp/pip-g3vzNe-build
Requirement already satisfied (use --upgrade to upgrade): decorator in /usr/local/anaconda/lib/python2.7/site-packages (from arctic==1.0.0)
Requirement already satisfied (use --upgrade to upgrade): enum34 in /usr/local/anaconda/lib/python2.7/site-packages (from arctic==1.0.0)
Requirement already satisfied (use --upgrade to upgrade): lz4 in /usr/local/anaconda/lib/python2.7/site-packages (from arctic==1.0.0)
Requirement already satisfied (use --upgrade to upgrade): mockextras in /usr/local/anaconda/lib/python2.7/site-packages (from arctic==1.0.0)
Requirement already satisfied (use --upgrade to upgrade): pandas in /usr/local/anaconda/lib/python2.7/site-packages (from arctic==1.0.0)
Collecting pymongo>=3.0 (from arctic==1.0.0)
  Downloading pymongo-3.0.3.tar.gz (419kB)
    100% |████████████████████████████████| 421kB 646kB/s 
Requirement already satisfied (use --upgrade to upgrade): python-dateutil in /usr/local/anaconda/lib/python2.7/site-packages (from arctic==1.0.0)
Requirement already satisfied (use --upgrade to upgrade): pytz in /usr/local/anaconda/lib/python2.7/site-packages (from arctic==1.0.0)
Collecting tzlocal (from arctic==1.0.0)
  Downloading tzlocal-1.2.tar.gz
Requirement already satisfied (use --upgrade to upgrade): mock>=0.8.0 in /usr/local/anaconda/lib/python2.7/site-packages (from mockextras->arctic==1.0.0)
Requirement already satisfied (use --upgrade to upgrade): numpy>=1.7.0 in /usr/local/anaconda/lib/python2.7/site-packages (from pandas->arctic==1.0.0)
Installing collected packages: pymongo, tzlocal, arctic
  Running setup.py install for pymongo
  Running setup.py install for tzlocal
  Running setup.py install for arctic
    Complete output from command /usr/local/anaconda/bin/python -c "import setuptools, tokenize;__file__='/tmp/pip-g3vzNe-build/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-TDueqF-record/install-record.txt --single-version-externally-managed --compile:
    running install
    running build
    running build_py
    creating build
    creating build/lib.macosx-10.5-x86_64-2.7
    creating build/lib.macosx-10.5-x86_64-2.7/arctic
    copying arctic/__init__.py -> build/lib.macosx-10.5-x86_64-2.7/arctic
    copying arctic/_compression.py -> build/lib.macosx-10.5-x86_64-2.7/arctic
    copying arctic/_util.py -> build/lib.macosx-10.5-x86_64-2.7/arctic
    copying arctic/arctic.py -> build/lib.macosx-10.5-x86_64-2.7/arctic
    copying arctic/auth.py -> build/lib.macosx-10.5-x86_64-2.7/arctic
    copying arctic/decorators.py -> build/lib.macosx-10.5-x86_64-2.7/arctic
    copying arctic/exceptions.py -> build/lib.macosx-10.5-x86_64-2.7/arctic
    copying arctic/hooks.py -> build/lib.macosx-10.5-x86_64-2.7/arctic
    copying arctic/hosts.py -> build/lib.macosx-10.5-x86_64-2.7/arctic
    copying arctic/logging.py -> build/lib.macosx-10.5-x86_64-2.7/arctic
    creating build/lib.macosx-10.5-x86_64-2.7/tests
    copying tests/__init__.py -> build/lib.macosx-10.5-x86_64-2.7/tests
    copying tests/conftest.py -> build/lib.macosx-10.5-x86_64-2.7/tests
    copying tests/util.py -> build/lib.macosx-10.5-x86_64-2.7/tests
    running build_ext
    cythoning src/_compress.pyx to src/_compress.c
    warning: src/_compress.pyx:167:38: Use boundscheck(False) for faster access
    warning: src/_compress.pyx:177:19: Use boundscheck(False) for faster access
    warning: src/_compress.pyx:29:5: Cannot profile nogil function.
    warning: src/_compress.pyx:35:5: Cannot profile nogil function.
    warning: View.MemoryView:294:5: Cannot profile nogil function.
    warning: View.MemoryView:772:5: Cannot profile nogil function.
    warning: View.MemoryView:908:5: Cannot profile nogil function.
    warning: View.MemoryView:1070:5: Cannot profile nogil function.
    warning: View.MemoryView:1077:5: Cannot profile nogil function.
    warning: View.MemoryView:1131:5: Cannot profile nogil function.
    warning: View.MemoryView:1138:5: Cannot profile nogil function.
    warning: View.MemoryView:1149:5: Cannot profile nogil function.
    warning: View.MemoryView:1170:5: Cannot profile nogil function.
    warning: View.MemoryView:1230:5: Cannot profile nogil function.
    warning: View.MemoryView:1301:5: Cannot profile nogil function.
    warning: View.MemoryView:1323:5: Cannot profile nogil function.
    warning: View.MemoryView:1358:5: Cannot profile nogil function.
    warning: View.MemoryView:1368:5: Cannot profile nogil function.
    building 'arctic._compress' extension
    creating build/temp.macosx-10.5-x86_64-2.7
    creating build/temp.macosx-10.5-x86_64-2.7/src
    gcc -fno-strict-aliasing -I/usr/local/anaconda/include -arch x86_64 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/usr/local/anaconda/include/python2.7 -c src/_compress.c -o build/temp.macosx-10.5-x86_64-2.7/src/_compress.o -fopenmp
    clang: warning: argument unused during compilation: '-fopenmp'
    src/_compress.c:348:10: fatal error: 'omp.h' file not found
    #include "omp.h"
             ^
    1 error generated.
    error: command 'gcc' failed with exit status 1

    ----------------------------------------
Command "/usr/local/anaconda/bin/python -c "import setuptools, tokenize;__file__='/tmp/pip-g3vzNe-build/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-TDueqF-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-g3vzNe-build

attribute '__dict__' of 'type' objects is not writable

Hello,

$ pip install git+https://github.com/manahl/arctic.git
Collecting git+https://github.com/manahl/arctic.git
  Cloning https://github.com/manahl/arctic.git to /var/folders/j_/v8b1bst93_94t724ptsswfsr0000gn/T/pip-u36_hcpf-build
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 20, in <module>
      File "/var/folders/j_/v8b1bst93_94t724ptsswfsr0000gn/T/pip-u36_hcpf-build/setup.py", line 63, in <module>
        m.Extension.__dict__ = m._Extension.__dict__
    AttributeError: attribute '__dict__' of 'type' objects is not writable

    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /var/folders/j_/v8b1bst93_94t724ptsswfsr0000gn/T/pip-u36_hcpf-build

I'm using Mac OS X 10.10 with Anaconda (Python 3)

Kind regards

chunk slice read from version_store

Not sure if this is the right place, but I have a question about using the library.read function.
Say I have a symbol called AA, a pandas DataFrame that looks like the following:

                open_ps  high_ps  close_ps
    date
    1994-05-16    21459      NaN     21459
    1994-05-17    21402      NaN     21402
    1994-05-18    21296      NaN     21296

and it's being stored in version_store (not as a blob). Is it possible for me to query based on a start date and an end date? I think @jamesblackburn had a slide on chunking in the tick data store, but I'm not sure how this is done in the version store.
The obvious workaround is to load the whole DataFrame into memory and filter afterwards, but I'm not sure how much of a performance impact that has.
cheers
B.
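For what it's worth, the tracebacks elsewhere in this tracker show `version_store.read` being called with a `date_range` keyword, so `library.read('AA', date_range=DateRange(start, end))` (with `arctic.date.DateRange`) should subset at read time rather than loading everything; verify the exact behaviour against your installed version. The in-memory workaround is just a DatetimeIndex slice, sketched here pandas-only (no MongoDB needed):

```python
import pandas as pd

# Rebuild the frame from the issue (the NaN column is omitted for brevity).
idx = pd.to_datetime(["1994-05-16", "1994-05-17", "1994-05-18"])
df = pd.DataFrame({"open_ps": [21459, 21402, 21296],
                   "close_ps": [21459, 21402, 21296]}, index=idx)

# The in-memory workaround: read everything, then slice the DatetimeIndex.
window = df.loc["1994-05-16":"1994-05-17"]
print(len(window))  # 2
```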

Is Python 3.4 supported?

Hey there. I see Python 2.7 in the Requirements, do you know if this works with Python 3.4?

I only ask because it isn't clear if it isn't supported or just isn't tested. Personally I've been using Python 3.4+Pandas+PyMongo (on Linux+Windows) for almost 2 years with no hiccups.

Unit test failure in bitemporal store

    def test_last_update(bitemporal_library):
        bitemporal_library.update('spam', ts1, as_of=dt(2015, 1, 1))
        bitemporal_library.update('spam', ts1, as_of=dt(2015, 1, 2))

        assert bitemporal_library.read('spam').last_updated == dt(2015, 1, 2)

E assert Timestamp('2015-01-02 05:00:00') == datetime.datetime(2015, 1, 2, 0, 0)
E + where Timestamp('2015-01-02 05:00:00') = BitemporalItem(symbol='spam', library=u'arctic_test.TEST', data= ...:06:11.040 3.0, metadata=None, last_updated=Timestamp('2015-01-02 05:00:00')).last_updated
E + where BitemporalItem(symbol='spam', library=u'arctic_test.TEST', data= ...:06:11.040 3.0, metadata=None, last_updated=Timestamp('2015-01-02 05:00:00')) = <bound method BitemporalStore.read of <arctic.store.bitemporal_store.BitemporalStore object at 0x7f1c2da62650>>('spam')
E + where <bound method BitemporalStore.read of <arctic.store.bitemporal_store.BitemporalStore object at 0x7f1c2da62650>> = <arctic.store.bitemporal_store.BitemporalStore object at 0x7f1c2da62650>.read
E + and datetime.datetime(2015, 1, 2, 0, 0) = dt(2015, 1, 2)

tests/integration/store/test_bitemporal_store.py:60: AssertionError

Basically we're comparing Timestamp('2015-01-02 05:00:00') to Timestamp('2015-01-02 00:00:00').

I'm not sure why last_updated is reporting 05:00:00. Is this because my machine isn't running on GMT, or is it some other anomaly? It doesn't appear to be version related (I tried 1.9.2/0.16.2 and latest; same behaviour).
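The 05:00 offset is exactly what a machine in US Eastern time would produce: a local midnight stored and returned as UTC lands at 05:00 in January. A minimal pandas sketch of the mismatch (the zone name is an assumption about the reporter's machine):

```python
from datetime import datetime
import pandas as pd

# Midnight local time on a UTC-5 machine serializes to 05:00 UTC.
local_midnight = pd.Timestamp(datetime(2015, 1, 2)).tz_localize("America/New_York")
as_utc = local_midnight.tz_convert("UTC")
print(as_utc)  # 2015-01-02 05:00:00+00:00

# The test's expected value is a naive datetime (00:00 with no zone),
# so the comparison fails by exactly the UTC offset.
print(as_utc.tz_localize(None) == pd.Timestamp(datetime(2015, 1, 2)))  # False
```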

With lib_type='TickStoreV3': No field of name index - index.name and index.tzinfo not preserved - max_date returning min date (without timezone)

Hello,

this code

    from pandas_datareader import data as pdr
    symbol = "IBM"
    df = pdr.DataReader(symbol, "yahoo", "2010-01-01", "2015-12-29")
    df.index = df.index.tz_localize('UTC')

    from arctic import Arctic
    store = Arctic('localhost')
    store.initialize_library('library_name', 'TickStoreV3')
    library = store['library_name']
    library.write(symbol, df)

raises

ValueError: no field of name index

I'm using TickStoreV3 as the lib_type because, at least for now, I'm not interested in audited writes, versioning, and so on.

I noticed that

>>> df['index']=0
>>> library.write(symbol, df)
1 buckets in 0.015091: approx 6626466 ticks/sec

seems to fix this... but

>>> library.read(symbol)
                           index        High   Adj Close     ...             Low       Close        Open
1970-01-01 01:00:00+01:00      0  132.970001  116.564610     ...      130.850006  132.449997  131.179993
1970-01-01 01:00:00+01:00      0  131.850006  115.156514     ...      130.100006  130.850006  131.679993
1970-01-01 01:00:00+01:00      0  131.490005  114.408453     ...      129.809998  130.000000  130.679993
1970-01-01 01:00:00+01:00      0  130.250000  114.012427     ...      128.910004  129.550003  129.869995
1970-01-01 01:00:00+01:00      0  130.919998  115.156514     ...      129.050003  130.850006  129.070007
...                          ...         ...         ...     ...             ...         ...         ...
1970-01-01 01:00:00+01:00      0  135.830002  135.500000     ...      134.020004  135.500000  135.830002
1970-01-01 01:00:00+01:00      0  138.190002  137.929993     ...      135.649994  137.929993  135.880005
1970-01-01 01:00:00+01:00      0  139.309998  138.539993     ...      138.110001  138.539993  138.300003
1970-01-01 01:00:00+01:00      0  138.880005  138.250000     ...      138.110001  138.250000  138.429993
1970-01-01 01:00:00+01:00      0  138.039993  137.610001     ...      136.539993  137.610001  137.740005

[1507 rows x 7 columns]

It looks as if write was expecting a DataFrame with a column named 'index'... which is quite odd.

If I do

    df['index'] = 1
    library.write(symbol, df)

then

    library.write(symbol, df)

raises

OverflowError: Python int too large to convert to C long

Any ideas?
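A hedged guess at the root cause (not verified against the tickstore source): the write path appears to go through `DataFrame.to_records()`, where an unnamed DatetimeIndex produces a field literally called `'index'`, while pandas_datareader hands back an index named `'Date'`. If that is right, clearing the index name is a cleaner workaround than adding a fake `'index'` column:

```python
import pandas as pd

df = pd.DataFrame({"Close": [1.0, 2.0]},
                  index=pd.to_datetime(["2010-01-04", "2010-01-05"]))

df.index.name = "Date"               # what pandas_datareader gives you
print(df.to_records().dtype.names)   # ('Date', 'Close') -> no field named 'index'

df.index.name = None                 # clear it before library.write(symbol, df)
print(df.to_records().dtype.names)   # ('index', 'Close')
```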

AttributeError: 'module' object has no attribute 'version'

When trying to run the unit tests I encounter this error.

    File "/home/bryant/Desktop/arctic/arctic/store/_version_store_utils.py", line 71, in <module>
      pickle_compat_load = _define_compat_pickle_load()
    File "/home/bryant/Desktop/arctic/arctic/store/_version_store_utils.py", line 66, in _define_compat_pickle_load
      if pd.version.version.startswith("0.14"):
    AttributeError: 'module' object has no attribute 'version'

In an interpreter:

Python 2.7.10 (default, Oct 14 2015, 16:09:02)
[GCC 5.2.1 20151010] on linux2
Type "help", "copyright", "credits" or "license" for more information.
    >>> import pandas as pd
    >>> pd.version
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'module' object has no attribute 'version'
    >>> pd.__version__
    u'0.17.0'

Does arctic support version 0.17 of Pandas?

version_store.delete fails has_symbol() assertion

We're still seeing the has_symbol() assertion fail sporadically in a clustered mongo environment, even after making has_symbol force reads from the primary in the v1.5.0 release.

arctic/store/version_store.py", line 666, in delete
    assert not self.has_symbol(symbol)
AssertionError

A retry on the operation will work, but it's not ideal!

intermittent test failure in 3.4

See https://travis-ci.org/manahl/arctic/jobs/103143033

For an unknown reason this test fails periodically in Python 3.4. Usually a re-run will pass.

in test_ts_write_pandas[tickstore]

        read = tickstore_lib.read('SYM', columns=None)
>       assert_frame_equal(read, data, check_names=False)
tests/integration/tickstore/test_ts_write.py:79: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../../virtualenv/python3.4.2/lib/python3.4/site-packages/pandas/util/testing.py:1024: in assert_frame_equal
    obj='{0}.columns'.format(obj))
../../../virtualenv/python3.4.2/lib/python3.4/site-packages/pandas/util/testing.py:690: in assert_index_equal
    obj=obj, lobj=left, robj=right)
pandas/src/testing.pyx:58: in pandas._testing.assert_almost_equal (pandas/src/testing.c:3809)
    ???
pandas/src/testing.pyx:139: in pandas._testing.assert_almost_equal (pandas/src/testing.c:2450)
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
>   ???
E   AssertionError: 'c' != 'b'

Mock version dependency

Why is mock pinned to version 1.0.1? It would be ideal if we could run with the latest version of mock and have more of the unit tests working under Python 3.

configuring TIME_ZONE_DATA_SOURCE

Hi,

I have installed and run arctic on my Windows box. The only issue is the configuration of TIME_ZONE_DATA_SOURCE in arctic/date/_mktz.py; currently it points to a directory:

TIME_ZONE_DATA_SOURCE = '/usr/share/zoneinfo/'

Can you tell me how to configure it so that the test example

    stocks.list_versions('aapl')

runs without error?

c:\python2710_64\lib\site-packages\arctic-1.11.0-py2.7-win-amd64.egg\arctic\date\_mktz.py in mktz(zone)
     71         tz = tzfile(_path)
     72     except (ValueError, IOError) as err:
---> 73         raise TimezoneError('Timezone "%s" can not be read, error: "%s"' % (zone, err))
     74     # Stash the zone name as an attribute (as pytz does)
     75     tz.zone = zone if not zone.startswith(TIME_ZONE_DATA_SOURCE) else zone[len(TIME_ZONE_DATA_SOURCE):]

TimezoneError: Timezone "Europe/London" can not be read, error: "[Errno 2] No such file or directory: '/usr/share/zoneinfo/Europe/London'"
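On Windows there is no /usr/share/zoneinfo, but pytz ships its own copy of the IANA zoneinfo database, so one workaround (an assumption on my part: arctic reads TIME_ZONE_DATA_SOURCE at call time, so patching it before any timezone lookup should take effect) is to point the constant at pytz's bundled files:

```python
import os
import pytz  # bundles the IANA zoneinfo database with the package

zoneinfo_dir = os.path.join(os.path.dirname(pytz.__file__), "zoneinfo") + os.sep
print(os.path.exists(os.path.join(zoneinfo_dir, "Europe", "London")))  # True

# Hypothetical patch, applied before any arctic timezone lookup:
#   import arctic.date._mktz as _mktz
#   _mktz.TIME_ZONE_DATA_SOURCE = zoneinfo_dir
```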

Benchmarking

Hello,

it would be nice to provide some benchmark files;
nose-timer can help: https://github.com/mahmoudimus/nose-timer

Here is an example which could be extended to Arctic:

import time
import numpy as np
import numpy.ma as ma
import pandas as pd
pd.set_option('max_rows', 10)
pd.set_option('expand_frame_repr', False)
pd.set_option('max_columns', 12)

import pymongo
import monary
import xray
from odo import odo  # used by Test04OdoPandasDataFrame below

URI_DEFAULT = 'mongodb://127.0.0.1:27017'
N_DEFAULT = 50000


def ticks(N):
    idx = pd.date_range('20150101',freq='ms',periods=N)
    bids = np.random.uniform(0.8, 1.0, N)
    spread = np.random.uniform(0, 0.0001, N)
    asks = bids + spread
    df_ticks = pd.DataFrame({'Bid': bids, 'Ask': asks}, index=idx)
    df_ticks['Symbol'] = 'CUR1/CUR2'
    df_ticks = df_ticks.reset_index()
    return df_ticks


class Test00Pandas:
    @classmethod
    def setupClass(cls):
        N = N_DEFAULT
        cls.df = ticks(N)

    def test_01_to_dict_01_records(self):
        d = self.df.to_dict('records')

    def test_01_to_dict_02_split(self):
        d = self.df.to_dict('split')


class Test01PyMongoPandasDataFrame:
    """
    PyMongo and Pandas DataFrame
    """


    @classmethod
    def setupClass(cls):
        N = N_DEFAULT
        URI = URI_DEFAULT
        cls.db_name = 'benchdb_pymongo'
        cls.collection_name = 'ticks'
        cls.df = ticks(N)
        cls.columns = ['Bid', 'Ask']
        cls.df = cls.df[cls.columns]

        cls.client = pymongo.MongoClient(URI)
        cls.client.drop_database(cls.db_name)
        cls.collection = cls.client[cls.db_name][cls.collection_name]

    def setUp(self):
        pass

    def tearDown(self):
        pass

    def test_01_store(self):
        print(self.df)
        self.collection.insert_many(self.df.to_dict('records'))
        #time.sleep(2)

    def test_02_retrieve(self):
        df_retrieved = pd.DataFrame(list(self.client[self.db_name][self.collection_name].find()))
        print(df_retrieved)


class Test02MonaryPandasDataFrame:
    """
    Monary and Pandas DataFrame
    """

    @classmethod
    def setupClass(cls):
        N = N_DEFAULT
        URI = URI_DEFAULT
        cls.db_name = 'benchdb_monary'
        cls.collection_name = 'ticks'
        cls.df = ticks(N)
        cls.columns = ['Bid', 'Ask']

        cls._client = pymongo.MongoClient(URI)
        cls._client.drop_database(cls.db_name)

        cls.m = monary.Monary(URI)

    def test_01_store(self):
        #ma.masked_array(self.df['Symbol'].values, self.df['Symbol'].isnull()),
        mparams = monary.MonaryParam.from_lists([
            ma.masked_array(self.df['Bid'].values, self.df['Bid'].isnull()),
            ma.masked_array(self.df['Ask'].values, self.df['Ask'].isnull())],
            self.columns)
        self.m.insert(self.db_name, self.collection_name, mparams)

    def test_02_retrieve(self):
        arrays = self.m.query(self.db_name, self.collection_name, {}, self.columns, ['float64', 'float64'])
        print(arrays)
        df_retrieved = pd.DataFrame(arrays)
        print(df_retrieved)


class Test03MonaryXrayDataset:
    """
    Monary and xray

    https://bitbucket.org/djcbeach/monary/issues/21/use-xraydataset-with-monary
    """

    @classmethod
    def setupClass(cls):
        N = N_DEFAULT
        URI = URI_DEFAULT
        cls.db_name = 'benchdb_monary_xray'
        cls.collection_name = 'ticks'
        cls._df = ticks(N)
        cls.ds = xray.Dataset.from_dataframe(cls._df)
        cls.columns = ['Bid', 'Ask']
        cls.ds = cls.ds[cls.columns]
        cls._client = pymongo.MongoClient(URI)
        cls._client.drop_database(cls.db_name)

        cls.m = monary.Monary(URI)

    def test_01_store(self):
        lst_cols = list(map(lambda col: self.ds[col].to_masked_array(), self.ds.data_vars))
        mparams = monary.MonaryParam.from_lists(lst_cols, list(self.ds.data_vars), ['float64', 'float64'])
        self.m.insert(self.db_name, self.collection_name, mparams)

class Test04OdoPandasDataFrame:
    """
    Pandas DataFrame and odo
    """
    @classmethod
    def setupClass(cls):
        N = N_DEFAULT
        URI = URI_DEFAULT
        cls.db_name = 'benchdb_odo'
        cls.collection_name = 'ticks'
        cls.df = ticks(N)
        cls.columns = ['Bid', 'Ask']
        cls.df = cls.df[cls.columns]

        cls.client = pymongo.MongoClient(URI)
        cls.client.drop_database(cls.db_name)
        cls.collection = cls.client[cls.db_name][cls.collection_name]

    def test_01_store(self):
        odo(self.df, self.collection)

    def test_02_retrieve(self):
        df_retrieved = odo(self.collection, pd.DataFrame)

it shows:

test_mongodb.Test02MonaryPandasDataFrame.test_02_retrieve: 3.7676s
test_mongodb.Test01PyMongoPandasDataFrameToDictRecords.test_01_store: 3.1900s
test_mongodb.Test04OdoPandasDataFrame.test_01_store: 3.0213s
test_mongodb.Test00Pandas.test_01_to_dict_01_records: 1.6180s
test_mongodb.Test02MonaryPandasDataFrame.test_01_store: 1.3025s
test_mongodb.Test03MonaryXrayDataset.test_01_store: 1.2680s
test_mongodb.Test00Pandas.test_01_to_dict_02_split: 1.2489s
test_mongodb.Test01PyMongoPandasDataFrameToDictRecords.test_02_retrieve: 0.5064s
test_mongodb.Test04OdoPandasDataFrame.test_02_retrieve: 0.4867s

Pandas uses vbench https://github.com/pydata/vbench

Future Development? - Java API

Hi there,

I have been trialling this out and it looks like a great framework for storage and retrieval. Are there any plans for a Java API, or are you looking for contributors to help? I'm testing this out as data storage for the zipline/quantopian backtester and also for a JVM-based project, and was wondering what stage that is at, if any.

cython issues with Python3

As I've begun porting everything to Python 3, it's become apparent that something about the way the _compress shared library is created is not compatible with Python 3. Here is the output from the unit test:

$ py.test -x
=== test session starts ===
platform linux -- Python 3.4.3, pytest-2.8.1, py-1.4.30, pluggy-0.3.1
rootdir: /home/bryant/arctic, inifile:
plugins: dbfixtures-0.12.0
Couldn't import cython lz4
collecting 8 items / 1 errors
=== ERRORS ====
___ ERROR collecting tests/integration/test_compress_integration.py ___
tests/integration/test_compress_integration.py:7: in <module>
    import arctic._compress as c
E ImportError: /home/bryant/arctic/arctic/_compress.cpython-34m.so: undefined symbol: PyString_AsString
!!!!!! Interrupted: stopping after 1 failures !!!!
=== 1 error in 0.60 seconds ====

Any thoughts?

xfailed tests in integration/tickstore/test_toplevel.py

There are 4 xfailed tests in integration/tickstore/test_toplevel.py

Because the tests previously were not running, it's unknown whether they ever worked. Either the tests need to be fixed, or the code they are testing needs to be fixed.

@jamesblackburn can you either assign someone to this (ideally whoever originally wrote the tests, or at least someone who knows about the top level tickstore) or provide me with some background on the TopLevelTickStore so that I can attempt to fix the tests myself?

tickstore query slowly

Arctic is said to be able to query millions of rows per second per client, but when I tried it in our team I found it only reads thousands of rows per second. Here is the code. Has anyone hit the same problem, or am I using it the wrong way?

    @property
    def arctic(self):
        if not self._arctic:
            log.info("init arctic")
            mongo_conn = MongoDB()
            self._arctic = Arctic(mongo_host=mongo_conn.client)
            library = self._arctic.list_libraries()
            if self.tick_db not in library:
                self._arctic.initialize_library(self.tick_db, lib_type=arctic.TICK_STORE)
            if self.bar_db not in library:
                self._arctic.initialize_library(self.bar_db, lib_type=arctic.TICK_STORE)
        return self._arctic

    ...
    # res is a dict of tick data
    index = self.int_to_date(tick_time)
    data = pd.DataFrame(res, [index])
    self.arctic[self.tick_db].write(symbol, data)

...

>>> now = time.time(); ac['tick'].read('IF1601', date_range=dr); print(time.time() - now)
Output:
[4021 rows x 26 columns]
3.56284999847

thanks.
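The write pattern above stores a one-row DataFrame per tick, so every bucket holds a single row and read throughput collapses. Tickstore gets its speed from large buckets, so buffering ticks and writing them in batches usually fixes this; a pandas-only sketch (the buffer size and column names are made up for illustration):

```python
import pandas as pd

buffer = []  # accumulate raw ticks, write once per batch

def on_tick(ts, res):
    row = dict(res)  # res is a dict of tick fields, as in the issue
    row["ts"] = ts
    buffer.append(row)

# Simulate a burst of 1000 ticks one millisecond apart.
start = pd.Timestamp("2016-01-04 09:30:00")
for i in range(1000):
    on_tick(start + pd.Timedelta(milliseconds=i),
            {"price": 100.0 + 0.01 * i, "volume": 10})

batch = pd.DataFrame(buffer).set_index("ts")
print(batch.shape)  # (1000, 2)
# self.arctic[self.tick_db].write(symbol, batch)  # one bulk write, not 1000
```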

how_to_use_arctic.py fails with SyntaxError: unexpected EOF while parsing on library.read()

Attempting to run the demo as a test, and getting errors on library.read(). Also, I get an error the first time I run store.initialize_library('username.scratch'), but the second time it works. I did have to make the following change to get import Arctic to run:

    store/_version_store_utils.py, line 66: if pd.__version__.startswith("0.14"):

See the pasted output, including the EOF error, below. I'm not sure whether it's a numpy bug or an arctic bug, and I'm not sure where to go from here.

Python 2.7.10 |Anaconda 2.3.0 (64-bit)| (default, Oct 19 2015, 18:04:42)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org

    >>> from arctic import Arctic
    >>> from datetime import datetime as dt
    >>> import pandas as pd
    >>> store = Arctic('localhost')
    >>> store.list_libraries()
    [u'NASDAQ', u'username.scratch']
    >>> store.initialize_library('username.scratch')
    No handlers could be found for logger "arctic.store.version_store"
    >>> store.initialize_library('username.scratch')
    >>> library = store['username.scratch']
    >>> df = pd.DataFrame({'prices': [1, 2, 3]}, [dt(2014, 1, 1), dt(2014, 1, 2), dt(2014, 1, 3)])
    >>> library.write('SYMBOL', df)
    VersionedItem(symbol=SYMBOL,library=arctic_username.scratch,data=<type 'NoneType'>,version=19,metadata=None)
    >>> library.read('SYMBOL')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/jeff/anaconda/lib/python2.7/site-packages/arctic/store/version_store.py", line 319, in read
        date_range=date_range, read_preference=read_preference, **kwargs)
      File "/home/jeff/anaconda/lib/python2.7/site-packages/arctic/store/version_store.py", line 363, in _do_read
        data = handler.read(self._arctic_lib, version, symbol, from_version=from_version, **kwargs)
      File "/home/jeff/anaconda/lib/python2.7/site-packages/arctic/store/_pandas_ndarray_store.py", line 279, in read
        item = super(PandasDataFrameStore, self).read(arctic_lib, version, symbol, **kwargs)
      File "/home/jeff/anaconda/lib/python2.7/site-packages/arctic/store/_pandas_ndarray_store.py", line 193, in read
        date_range=date_range, **kwargs)
      File "/home/jeff/anaconda/lib/python2.7/site-packages/arctic/store/_ndarray_store.py", line 180, in read
        return self._do_read(collection, version, symbol, index_range=index_range)
      File "/home/jeff/anaconda/lib/python2.7/site-packages/arctic/store/_ndarray_store.py", line 219, in _do_read
        dtype = self._dtype(version['dtype'], version.get('dtype_metadata', {}))
      File "/home/jeff/anaconda/lib/python2.7/site-packages/arctic/store/_ndarray_store.py", line 139, in _dtype
        return np.dtype(string, metadata=metadata)
      File "/home/jeff/anaconda/lib/python2.7/site-packages/numpy/core/_internal.py", line 191, in _commastring
        newitem = (dtype, eval(repeats))
      File "<string>", line 1
        (
        ^
    SyntaxError: unexpected EOF while parsing

Querying VersionStore on hour/minute

Hello,

I understand that one can query data using a date range but could I specify instead a time range? For example getting all data between datetime.time(9,0) and datetime.time(12,0)?
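Not directly in a single query, as far as I can tell: date_range bounds one contiguous span. The usual pattern is to read the covering date range and then filter by time of day with pandas, which `DataFrame.between_time` does natively:

```python
from datetime import time
import pandas as pd

# Hourly bars from 08:00 to 19:00 on one day (illustrative data).
idx = pd.date_range("2016-01-04 08:00", periods=12, freq="60min")
df = pd.DataFrame({"px": range(12)}, index=idx)

# Keep rows whose wall-clock time falls in [09:00, 12:00], on every day present.
morning = df.between_time(time(9, 0), time(12, 0))
print(len(morning))  # 4
```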

Get the build running on CircleCI

Setup.py install is taking > 8GB for pymongo

Installed /home/ubuntu/virtualenvs/venv-system/lib/python2.7/site-packages/python_dateutil-2.4.2-py2.7.egg
Searching for pymongo>=3.0
Reading https://pypi.python.org/simple/pymongo/
Best match: pymongo 3.0.2
Downloading https://pypi.python.org/packages/source/p/pymongo/pymongo-3.0.2.tar.gz#md5=9a6af8b349946d9759d817e6f50db413
Processing pymongo-3.0.2.tar.gz
Writing /tmp/easy_install-oYSwDC/pymongo-3.0.2/setup.cfg

python setup.py install died unexpectedly

Running pymongo-3.0.2/setup.py -q bdist_egg --dist-dir /tmp/easy_install-oYSwDC/pymongo-3.0.2/egg-dist-tmp-Yqy3Kq Action failed: python setup.py install

a bug in _ndarray_store.py in windows?

Hi all, maybe there is a bug in _ndarray_store on Windows.

Using a pandas DataFrame:

    index = pd.date_range('1/1/2010', periods=8, tz=mktz())
    df = pd.DataFrame(np.random.randn(8, 3), index=index, columns=list('abc'))
    arctic = Arctic('localhost')
    arctic.initialize_library('nasdaq')
    store_db = arctic.get_library('nasdaq')
    store_db.append('sym001', df, metadata={'source': 'test'})

It's OK... then read it back:

    print store_db.read('sym001', date_range=DateRange(start=20100101)).data

Here I got an exception:

    File "D:\Python27\lib\site-packages\arctic-1.17.0-py2.7-win-amd64.egg\arctic\store\version_store.py", line 321, in read
      date_range=date_range, read_preference=read_preference, **kwargs)
    File "D:\Python27\lib\site-packages\arctic-1.17.0-py2.7-win-amd64.egg\arctic\store\version_store.py", line 366, in _do_read
      data = handler.read(self._arctic_lib, version, symbol, from_version=from_version, **kwargs)
    File "D:\Python27\lib\site-packages\arctic-1.17.0-py2.7-win-amd64.egg\arctic\store\_pandas_ndarray_store.py", line 301, in read
      item = super(PandasDataFrameStore, self).read(arctic_lib, version, symbol, **kwargs)
    File "D:\Python27\lib\site-packages\arctic-1.17.0-py2.7-win-amd64.egg\arctic\store\_pandas_ndarray_store.py", line 197, in read
      date_range=date_range, **kwargs)
    File "D:\Python27\lib\site-packages\arctic-1.17.0-py2.7-win-amd64.egg\arctic\store\_ndarray_store.py", line 170, in read
      return self._do_read(collection, version, symbol, index_range=index_range)
    File "D:\Python27\lib\site-packages\arctic-1.17.0-py2.7-win-amd64.egg\arctic\store\_ndarray_store.py", line 194, in _do_read
      for i, x in enumerate(collection.find(spec, sort=[('segment', pymongo.ASCENDING)],)):
    File "D:\Python27\lib\site-packages\pymongo\cursor.py", line 1097, in next
      if len(self.__data) or self._refresh():
    File "D:\Python27\lib\site-packages\pymongo\cursor.py", line 1019, in _refresh
      self.__read_concern))
    File "D:\Python27\lib\site-packages\pymongo\cursor.py", line 850, in __send_message
      **kwargs)
    File "D:\Python27\lib\site-packages\pymongo\mongo_client.py", line 794, in _send_message_with_response
      exhaust)
    File "D:\Python27\lib\site-packages\pymongo\mongo_client.py", line 805, in _reset_on_error
      return func(*args, **kwargs)
    File "D:\Python27\lib\site-packages\pymongo\server.py", line 108, in send_message_with_response
      set_slave_okay, sock_info.is_mongos, use_find_cmd)
    File "D:\Python27\lib\site-packages\pymongo\message.py", line 275, in get_message
      spec, self.fields, self.codec_options)
    bson.errors.InvalidDocument: Cannot encode object: 7

It looked very strange... I dug into this for a whole day, and found the problem here:

    spec = {'symbol': symbol,
            'parent': version.get('base_version_id', version['_id']),
            'segment': {'$lt': to_index}
            }
    if from_index:
        spec['segment']['$gte'] = from_index      # <-----

I changed this line:

    spec['segment']['$gte'] = from_index

to

    spec['segment']['$gte'] = long(from_index)

because type(from_index) was numpy.int64, which is not compatible with pymongo on 64-bit Windows. The solution is to use a Python type instead of a numpy type; here are two ways to do it:

  1. to_index.item() and from_index.item()
  2. long(to_index) and long(from_index)

Is there a better solution?
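Of the two, .item() is probably the cleaner fix: it works on both Python 2 and 3 (long does not exist in Python 3) and on any numpy scalar width. The type difference is easy to see:

```python
import numpy as np

from_index = np.int64(7)
print(type(from_index))   # numpy.int64 - this is what bson rejects on win64

as_py = from_index.item()  # .item() converts any numpy scalar to a native Python type
print(type(as_py))         # int
# spec['segment']['$gte'] = from_index.item()  # safe to hand to pymongo everywhere
```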

Cannot compile on Windows

I installed the recommended C++ Compiler for Python 2.7, however I can't seem to run the installer.

The fatal error I got is:
C1083: Cannot open include file: 'stdint.h': No such file or directory

Is it possible to release a compiled version of this?

End date bug in tickstore _pandas_to_bucket

Looks like a potential issue; it needs a test, plus the fix in
isentium@363cca9:

 @@ -531,7 +531,7 @@ def _ensure_supported_dtypes(self, array):

     def _pandas_to_bucket(self, df, symbol):
         start = self._to_dt(df.index[0].to_datetime())
-        end = self._to_dt(df.index[0].to_datetime())
+        end = self._to_dt(df.index[-1].to_datetime())
         rtn = {START: start, END: end, SYMBOL: symbol}
         rtn[VERSION] = CHUNK_VERSION_NUMBER
         rtn[COUNT] = len(df)
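The fix swaps df.index[0] for df.index[-1] in the end-date line; on any DatetimeIndex those are the first and last timestamps of the bucket, so the buggy line collapsed every bucket's range to a single instant:

```python
import pandas as pd

df = pd.DataFrame({"px": [1, 2, 3]},
                  index=pd.date_range("2015-01-01", periods=3, freq="D"))

# First and last timestamps of the frame; the bug used [0] for both.
start, end = df.index[0], df.index[-1]
print(start.date(), end.date())  # 2015-01-01 2015-01-03
```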

Quandl data-set in README doesn't exist

The README makes reference to "NASDAQ/AAPL" on Quandl. Perhaps they have reshuffled the data, but no such dataset exists. Everything worked fine for me using "WIKI/AAPL".
