Git Product home page Git Product logo

xlref's Introduction

xlref: Excel table reader.

Latest Version in PyPI Build status Code coverage Documentation status Issues count Supported Python versions Project License

release

1.2.4

date

2024-02-27 15:30:00

repository

https://github.com/vinci1it2000/xlref

pypi-repo

https://pypi.org/project/xlref/

docs

http://xlref.readthedocs.io/

wiki

https://github.com/vinci1it2000/xlref/wiki/

download

http://github.com/vinci1it2000/xlref/releases/

donate

https://donorbox.org/xlref

keywords

data, excel, tables, parser, reference, ranges

developers
license

EUPL 1.1+

About xlref

xlref is an useful library to capture by a simple reference (e.g., A1(RD):..:RD) a table with non-empty cells from Excel-sheets when its exact position is not known beforehand.

This code was inspired by the xleash module of the pandalone library. The reason of developing a similar tool was to have a smaller library to install and improve the performances of reading .xlsx files.

Installation

To install it use (with root privileges):

$ pip install xlref

Or download the last git version and use (with root privileges):

$ python setup.py install

Reference Syntax

The capturing is preformed according to an excel like reference syntax and the non-empty cells of the targeted excel-sheet. The syntax is defined as follows:

[<excel>]#[<sheet>!]<st-cel>[(<moves>)][:<nd-cel>[(<moves>)]][:<expansion>][<filters>]

Note

The fields between square parenthesis are optionals.

Follows the description of the parameters:

  • excel: excel file path relative to the parent reference file directory. If not defined, the parent reference excel is inherited.
  • sheet: excel sheet name if not defined, the parent reference excel sheet name is inherited.
  • st-cel: first cell coordinate of excel range. The cell coordinate (i.e., <column><row>) is defined by a column (letter) and row (number), like in excel. xlref allows two special characters ^ and _, that represents the leftmost/topmost and rightmost/bottommost non-empty cell column/row.
  • moves: the sequence of primitive directions (i.e., `L`:left, `U`: up, `R`: right, `D`: down) that xlref uses iteratively for finding the first non-empty cell. The allowed primitive direction combinations are L, U, R, D, LD, LU, UL, UR, RU, RD, DL, and DR. The following diagram shows the graphically the moves from the starting cell `X`:

    U

    UL◄───┐▲┌───►UR

    LU │││ RU

    ▲ │││ ▲ │ │││ │ └─────┼│┼─────┘

    L◄──────X──────►R

    ┌─────┼│┼─────┐ │ │││ │ ▼ │││ ▼

    LD │││ RD
    DL◄───┘▼└───►DR

    D

  • nd-cel: second cell coordinate of excel range. It has the same syntax of the st-cel, but it has and extra special character .. This represents the column or row of the st-cel after the application of the moves.
  • expansion: the sequence of primitive directions to expand the captured range.
  • filters: list of string and or dictionaries that defines the filters to apply iteratively on the captured range.

Reference Reading Steps

The library performs the following steps to read a reference:

  1. Open the excel file or inherits the parent's one,
  2. Open the sheet by its name or inherits the parent's one,
  3. Set the first range cell,
  4. Move the first cell according to the specified moves until it finds the first non-empty cell,
  5. Set the second range cell or inherits the moved first range cell,
  6. Move the second cell like in point 4,
  7. Expand the range according to the defined expansions,
  8. Apply the iteratively the filters on the captured range.

Tutorial

>>> import os.path as osp >>> from setup import mydir >>> _ref = osp.join(mydir, 'tests/files/excel.xlsx#ref!A1(RD):RD[%s]')

A typical example is capturing a table with a "header" row and convert into a dictionary. The code below shows how to do it:

>>> import xlref as xl >>> _ref = 'excel.xlsx#ref!A1(RD):RD[%s]' # doctest: +SKIP >>> ref = xl.Ref(_ref % '"dict"') >>> ref.range # Captured range. B2:C28 >>> values = ref.values; values # Captured values. {...} >>> values['st-cell-move'] '#D5(RU):H1(DL)'

You can notice from the code above that all the values of the dictionary are references. To parse it recursively, there are two options:

  1. add the "recursive" filter before the "dict":

    >>> values = xl.Ref(_ref % '"recursive", "dict"').values >>> values['st-cell-move'].tolist() [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]

  2. apply a filter onto dictionary' values using the extra functionality of the "dict" filter:

    >>> values = xl.Ref(_ref % '{"fun": "dict", "value":"ref"}').values >>> values['st-cell-move'].tolist() [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]

You have also the possibility to define and use your custom filters as follows:

>>> import numpy as np >>> xl.FILTERS['my-filter'] = lambda parent, x: np.sum(x) >>> xl.Ref('#D5(RU):H1(DL)["my-filter"]', ref).values 45.0

An alternative way is to use directly the methods of the filtered results as follows:

>>> xl.Ref('#D5(RU):H1(DL)["sum"]', ref).values 45.0

xlref's People

Contributors

vinci1it2000 avatar

Stargazers

Marco Magnano avatar Udi Finkelstein avatar Chris Hart avatar  avatar actuarial.tools avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.