Git Product home page Git Product logo

pylightxl's People

Contributors

albinkc avatar alexjj avatar alwinw avatar ammgws avatar boidolr avatar cgtobi avatar chessmith avatar danchianucci avatar g-as avatar hellomorrismoss avatar jfcorbett avatar kishkin avatar kxrob avatar mportesdev avatar pydpiper avatar sbardian avatar yohanboniface avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

pylightxl's Issues

RecursionError: maximum recursion depth exceeded while calling a Python object

Pylightxl Version: 1.55
Python Version: 3.8

Summary of Bug/Feature:
Throws RecursionError: maximum recursion depth exceeded while calling a Python object

Traceback:
j5a2q [2021-06-06 17:51:48,991] [ERROR] file_upload_views.post : FileUpload exception.
j5a2q Traceback (most recent call last):
j5a2q File "/home/app/web/api/all_views/file_upload_views.py", line 72, in post
j5a2q file_object = File.objects.create(**data)
j5a2q File "/usr/local/lib/python3.8/site-packages/django/db/models/manager.py", line 82, in manager_method
j5a2q return getattr(self.get_queryset(), name)(*args, **kwargs)
j5a2q File "/usr/local/lib/python3.8/site-packages/django/db/models/query.py", line 433, in create
j5a2q obj.save(force_insert=True, using=self.db)
j5a2q File "/usr/local/lib/python3.8/site-packages/django/db/models/base.py", line 748, in save
j5a2q self.save_base(using=using, force_insert=force_insert,
j5a2q File "/usr/local/lib/python3.8/site-packages/django/db/models/base.py", line 785, in save_base
j5a2q updated = self._save_table(
j5a2q File "/usr/local/lib/python3.8/site-packages/django/db/models/base.py", line 890, in _save_table
j5a2q results = self._do_insert(cls._base_manager, using, fields, returning_fields, raw)
j5a2q File "/usr/local/lib/python3.8/site-packages/django/db/models/base.py", line 927, in _do_insert
j5a2q return manager._insert(
j5a2q File "/usr/local/lib/python3.8/site-packages/django/db/models/manager.py", line 82, in manager_method
j5a2q return getattr(self.get_queryset(), name)(*args, **kwargs)
j5a2q File "/usr/local/lib/python3.8/site-packages/django/db/models/query.py", line 1204, in _insert
j5a2q return query.get_compiler(using=using).execute_sql(returning_fields)
j5a2q File "/usr/local/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1393, in execute_sql
j5a2q for sql, params in self.as_sql():
j5a2q File "/usr/local/lib/python3.8/site-packages/sql_server/pyodbc/compiler.py", line 434, in as_sql
j5a2q value_rows = [
j5a2q File "/usr/local/lib/python3.8/site-packages/sql_server/pyodbc/compiler.py", line 435, in
j5a2q [self.prepare_value(field, self.pre_save_val(field, obj)) for field in fields]
j5a2q File "/usr/local/lib/python3.8/site-packages/sql_server/pyodbc/compiler.py", line 435, in
j5a2q [self.prepare_value(field, self.pre_save_val(field, obj)) for field in fields]
j5a2q File "/usr/local/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1288, in pre_save_val
j5a2q return field.pre_save(obj, add=True)
j5a2q File "/usr/local/lib/python3.8/site-packages/django/db/models/fields/files.py", line 288, in pre_save
j5a2q file.save(file.name, file.file, save=False)
j5a2q File "/usr/local/lib/python3.8/site-packages/django/db/models/fields/files.py", line 87, in save
j5a2q self.name = self.storage.save(name, content, max_length=self.field.max_length)
j5a2q File "/usr/local/lib/python3.8/site-packages/django/core/files/storage.py", line 52, in save
j5a2q return self._save(name, content)
j5a2q File "/usr/local/lib/python3.8/site-packages/storages/backends/sftpstorage.py", line 130, in _save
j5a2q self._mkdir(dirname)
j5a2q File "/usr/local/lib/python3.8/site-packages/storages/backends/sftpstorage.py", line 115, in _mkdir
j5a2q self._mkdir(parent)
j5a2q File "/usr/local/lib/python3.8/site-packages/storages/backends/sftpstorage.py", line 115, in _mkdir
j5a2q self._mkdir(parent)
j5a2q File "/usr/local/lib/python3.8/site-packages/storages/backends/sftpstorage.py", line 115, in _mkdir
j5a2q self._mkdir(parent)
j5a2q [Previous line repeated 904 more times]
j5a2q File "/usr/local/lib/python3.8/site-packages/storages/backends/sftpstorage.py", line 114, in _mkdir
j5a2q if not self.exists(parent):
j5a2q File "/usr/local/lib/python3.8/site-packages/storages/backends/sftpstorage.py", line 151, in exists
j5a2q self.sftp.stat(self._remote_path(name))
j5a2q File "/usr/local/lib/python3.8/site-packages/storages/backends/sftpstorage.py", line 87, in sftp
j5a2q self._connect()
j5a2q File "/usr/local/lib/python3.8/site-packages/storages/backends/sftpstorage.py", line 67, in _connect
j5a2q self._ssh.connect(self._host, **self._params)
j5a2q File "/usr/local/lib/python3.8/site-packages/paramiko/client.py", line 340, in connect
j5a2q to_try = list(self._families_and_addresses(hostname, port))
j5a2q File "/usr/local/lib/python3.8/site-packages/paramiko/client.py", line 203, in _families_and_addresses
j5a2q addrinfos = socket.getaddrinfo(
j5a2q File "/usr/local/lib/python3.8/site-packages/gevent/_socketcommon.py", line 247, in getaddrinfo
j5a2q addrlist = get_hub().resolver.getaddrinfo(host, port, family, type, proto, flags)
j5a2q File "/usr/local/lib/python3.8/site-packages/gevent/resolver/thread.py", line 63, in getaddrinfo
j5a2q return self.pool.apply(_socket.getaddrinfo, args, kwargs)
j5a2q File "/usr/local/lib/python3.8/site-packages/gevent/pool.py", line 161, in apply
j5a2q return self.spawn(func, *args, **kwds).get()
j5a2q File "src/gevent/event.py", line 329, in gevent._gevent_cevent.AsyncResult.get
j5a2q File "src/gevent/event.py", line 359, in gevent._gevent_cevent.AsyncResult.get
j5a2q File "src/gevent/event.py", line 347, in gevent._gevent_cevent.AsyncResult.get
j5a2q File "src/gevent/event.py", line 327, in gevent._gevent_cevent.AsyncResult.raise_exception
j5a2q File "src/gevent/event.py", line 272, in gevent.gevent_cevent.AsyncResult.exc_info.get

j5a2q File "/usr/local/lib/python3.8/site-packages/gevent/_tblib.py", line 415, in g
j5a2q return f(a)
j5a2q File "/usr/local/lib/python3.8/site-packages/gevent/_tblib.py", line 476, in load_traceback
j5a2q return loads(s)
j5a2q File "/usr/local/lib/python3.8/site-packages/gevent/tblib.py", line 149, in _init
j5a2q self.co_filename = code.co_filename
j5a2q File "/usr/local/lib/python3.8/site-packages/gevent/tblib.py", line 133, in _getattr
j5a2q return self[name]
j5a2q RecursionError: maximum recursion depth exceeded while calling a Python object

pylightxl.Database.ws_names returns incorrect sheet names in some cases

Pylightxl Version: 1.60
Python Version: 3.9.14

Code to reproduce the problem:

import pylightxl

path = 'test_sheet.xlsx'
db = pylightxl.readxl(fn=path)
print(db.ws_names)

The output is:

["'#3969'", '#3969']

However the file contains only 1 sheet named #3969.
The weird thing is that if I rename the sheet to sheet for example, save it in LibreOffice (see the attached test_sheet_rename.xlsx), then I get from the above code what is expected: ws_names return only 1 sheet named sheet.

test_sheet.xlsx
test_sheet_rename.xlsx

writexl does not handle creating files in correct directory

Version: 1.51
Python Version: 3.8.2
Summary of Bug/Feature:
There are two issues in writexl_alt_writer function called by writexl

  1. It does not handle use case where a path with dirname/filename is passed. It ends up writing files in "." instead of creating/writing files in "dirname".
  2. Line 569 uses a windows directory separator ("") instead of using os independent approach to create path

Traceback:
None. I have suggested some fixes and diff between fixed version vs. 1.51

Suggestion for fix:
In terms of writing file to correct directory, as part of moving zipped up file out of temp folder,

  • first get an absolute path name for the newly created file (filename)
  • pop back to the directory where you were running (os.chdir(old_dir))
  • create targetpath name based on the path that was passed in (could be relative name and needs to handle current dir)
  • call shutil.move with filename and targetpath
  • Here is the diff between 1.51 (<). and working version (>)
  • 565a566,570
# to complete move, get absolute path, change back to directory where you were running and 
# use the pathname received to move it to right place
filepath = os.getcwd() + "/" + filename
os.chdir(old_dir)
targetpath = os.path.split(path)[0] if os.path.split(path)[0] != "" else "." 

567c572
< shutil.move(filename, old_dir)

    shutil.move(filepath, targetpath)

569,571c574,575
< os.remove(old_dir + '\' + filename)
< shutil.move(filename, old_dir)
< os.chdir(old_dir)

    os.remove(os.path.join(old_dir, filename))
    shutil.move(filename, targetpath) 

Cell values beyond the 27th column are not read

It looks like column AA is the rightmost column read by pylightxl. Columns AB and beyond all come up as empty.

For example, the last three cells in this sheet show up as empty in pylightxl:

sample2.xlsx

But AB1 should be 27, AC1 should be 28, and AD1 should be 29.

Type hinting issue

Pylightxl Version: 1.5.9
Python Version: 3.8

Summary of Bug/Feature: In the quickstart guide for reading excel files it states:

# pylightxl also supports pathlib as well
my_pathlib = pathlib.Path('folder1/folder2/excelfile.xlsx')
db = xl.readxl(my_pathlib)

This works, but when adding that to my IDE (pycharm) it complains about a path object not being a string but a Path object. This is just a typehinting issue, but I thought I would report it anyway.

Suggestion for fix: Just replace the string type hinting with a union of string and path

Problems when explicitly loading multiple sheets by name

Steps to reproduce:

  1. Create two trivial XLSX workbooks, one with 2 worksheets and another with 3 worksheets. I have attached examples.
  2. Create a new python 3.7+ environment that contains ipython for demonstration. (I used 3.8.1; my colleague used 3.7.6.)
  3. pip install pylightxl
  4. Explore corner cases.

We find problems, especially when loading the last worksheet when naming the sheets explicitly. The behavior seems to follow the following rules:

  • If the last sheet requested is the last sheet in the workbook, raise an IndexError.
  • Otherwise, count the number of sheets requested and return that number of sheets from the workbooks starting with the first worksheet. At this point the names don't seem to matter, just the number of names passed in the tuple to sheetnames=.

The following IPython transcript captures the problem:

(pylightxl) ➜  pylightxl pip install pylightxl
Collecting pylightxl
  Downloading pylightxl-1.3-py3-none-any.whl (8.1 kB)
Installing collected packages: pylightxl
Successfully installed pylightxl-1.3
(pylightxl) ➜  pylightxl ipython
Python 3.8.1 | packaged by conda-forge | (default, Jan 29 2020, 15:06:10)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.12.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import pylightxl as xl

In [2]: fn = 'Book1.xlsx'

In [3]: db1 = xl.readxl(fn)

In [4]: db1.ws_names
Out[4]: ['Sheet1', 'Sheet2']

In [5]: db2 = xl.readxl(fn, sheetnames=('Sheet2',))

In [6]: db2.ws_names
Out[6]: ['Sheet1']

In [7]: db3 = xl.readxl(fn, sheetnames=('bad_name', 'Sheet2',))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~/conda/envs/pylightxl/lib/python3.8/site-packages/pylightxl/readxl.py in readxl(fn, sheetnames)
     44                 try:
---> 45                     pop_index = sh_names.index(sn)
     46                     temp.append(zip_sheetnames.pop(pop_index))

ValueError: 'bad_name' is not in list

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-7-aaed2c1506d9> in <module>
----> 1 db3 = xl.readxl(fn, sheetnames=('bad_name', 'Sheet2',))

~/conda/envs/pylightxl/lib/python3.8/site-packages/pylightxl/readxl.py in readxl(fn, sheetnames)
     46                     temp.append(zip_sheetnames.pop(pop_index))
     47                 except ValueError:
---> 48                     raise ValueError('Error - Sheetname ({}) is not in the workbook.'.format(sn))
     49             zip_sheetnames = temp
     50

ValueError: Error - Sheetname (bad_name) is not in the workbook.

In [8]: db4 = xl.readxl(fn, sheetnames=('Sheet1', 'Sheet2',))
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-8-db91480dd23e> in <module>
----> 1 db4 = xl.readxl(fn, sheetnames=('Sheet1', 'Sheet2',))

~/conda/envs/pylightxl/lib/python3.8/site-packages/pylightxl/readxl.py in readxl(fn, sheetnames)
     44                 try:
     45                     pop_index = sh_names.index(sn)
---> 46                     temp.append(zip_sheetnames.pop(pop_index))
     47                 except ValueError:
     48                     raise ValueError('Error - Sheetname ({}) is not in the workbook.'.format(sn))

IndexError: pop index out of range

In [9]: fn_2 = 'Book2.xlsx'

In [10]: db5 = xl.readxl(fn_2, sheetnames=('Sheet1', 'Sheet2',))

In [11]: db5.ws_names
Out[11]: ['Sheet1', 'Sheet2']

In [12]: db6 = xl.readxl(fn_2, sheetnames=('Sheet1', 'Sheet3',))
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-12-0779be8e5664> in <module>
----> 1 db6 = xl.readxl(fn_2, sheetnames=('Sheet1', 'Sheet3',))

~/conda/envs/pylightxl/lib/python3.8/site-packages/pylightxl/readxl.py in readxl(fn, sheetnames)
     44                 try:
     45                     pop_index = sh_names.index(sn)
---> 46                     temp.append(zip_sheetnames.pop(pop_index))
     47                 except ValueError:
     48                     raise ValueError('Error - Sheetname ({}) is not in the workbook.'.format(sn))

IndexError: pop index out of range

In [13]: db7 = xl.readxl(fn_2, sheetnames=('Sheet2', 'Sheet3',))
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-13-144bfe03180b> in <module>
----> 1 db7 = xl.readxl(fn_2, sheetnames=('Sheet2', 'Sheet3',))

~/conda/envs/pylightxl/lib/python3.8/site-packages/pylightxl/readxl.py in readxl(fn, sheetnames)
     44                 try:
     45                     pop_index = sh_names.index(sn)
---> 46                     temp.append(zip_sheetnames.pop(pop_index))
     47                 except ValueError:
     48                     raise ValueError('Error - Sheetname ({}) is not in the workbook.'.format(sn))

IndexError: pop index out of range

In [14]: db8 = xl.readxl(fn_2, sheetnames=('Sheet3', 'Sheet2',))

In [15]: db8.ws_names
Out[15]: ['Sheet1', 'Sheet2']

In [16]: db9 = xl.readxl(fn_2, sheetnames=('Sheet3',))

In [17]: db9.ws_names
Out[17]: ['Sheet1']

In [18]: db10 = xl.readxl(fn_2, sheetnames=('Sheet2',))

In [19]: db10.ws_names
Out[19]: ['Sheet1']

In [20]:

Book1.xlsx
Book2.xlsx

KeyError bug in readxl

Pylightxl Version:1.51
Python Version:3.8.3
Office Version: LibreOffice 7.0.4.2

Summary of Bug/Feature: KeyError Bug

Traceback:

File "... ...\pylightxl\pylightxl.py", line 135, in readxl
    worksheet = ordered_ws[order]
KeyError: 1

Suggestion for fix:

def readxl(fn, ws=None):
        ... ...
        # get all worksheets
        for order in range(1, len(ordered_ws) + 1):      ---->   for order in ordered_ws.keys():
             worksheet = ordered_ws[order]
        ... ...

I found ordered_ws is {2: 'sheet1', 3: 'sheet2', 4: 'sheet3'}
It seems like the start number is 2 instead of 1.
so I do that change, that worked for me.
I am using LibreOffice to edit this xlsx file. So maybe there is some bug in LibreOffice ( or in pylightxl ).

Thanks :)

Incorrect values for spreadsheets with large amounts of text

In spreadsheets with large amounts of text, I'm finding that reading text from one field sometimes gives me the text (or even partial text) of another field.

As an example, using this publicly available Excel document: https://digital.nhs.uk/binaries/content/assets/legacy/excel/8/c/hes_data_dictionary_20170914.xlsx (note that there's a Content-Disposition header that means it will save under a different filename)

For example, print(db.ws('APC').address(address='H276')) should give "Age at start of the episode (STARTAGE), with decimalised values for babies." but it actually gives "Rule # 631, 641, 651 , 660 and 671", which appears to be the contents of cell J43 in a different sheet (OP). Requesting the contents of that cell (print(db.ws('OP').address(address='J43'))) gives the text from K32 in sheet APC.

Meanwhile, print(db.ws('APC').address(address='H261')) gives "34 Decision not to treat - decision not to treat made or no further contact required", which is one line from the middle of a large text cell (which is repeated in all three sheets, so I'm not exactly sure which one its coming from).

Other possibly useful observations: I certainly see correct results for large portions of the file. It seems to start going wrong about halfway through. The errors seem to be consistently reproducible (i.e. I get the exact same wrong text for each cell each time). Deleting all the contents of one of the text-heavy columns (column I) seemed to resolve the issue (but undoing this change and saving again caused it to come back). I guess there's some subtlety in how the Excel file format references text internally that's throwing off pylightxl.

Let me know if I can provide any more information.

Cannot open excel xlsx file

Pylightxl Version:
1.55 , 1.54
Python Version:
3.7

Summary of Bug/Feature:
Cannot open xlsx excel file. Using the following code:

  file, _ = urllib.request.urlretrieve("file.xlsx")
  path = pathlib.Path.cwd() / file
  db = pylightxl.readxl(path)

Traceback:

Traceback (most recent call last):
  File "open_excel.py", line 310, in <module>
    main()
  File "open_excel.py", line 303, in main
    read_from_the_excel_file()
  File "open_excel.py", line 245, in read_from_the_excel_file
    pylightxl.readxl(path)
  File "/opt/python/3.7/lib/python3.7/site-packages/pylightxl/pylightxl.py", line 118, in readxl
    wb_rels = readxl_get_workbook(fn)
  File "/opt/python/3.7/lib/python3.7/site-packages/pylightxl/pylightxl.py", line 227, in readxl_get_workbook
    for tag_sheet in root.findall('./default:sheets/default:sheet', ns):
  File "/opt/python/3.7/lib/python3.7/xml/etree/ElementPath.py", line 313, in findall
    return list(iterfind(elem, path, namespaces))
  File "/opt/python/3.7/lib/python3.7/xml/etree/ElementPath.py", line 292, in iterfind
    token = next()
  File "/opt/python/3.7/lib/python3.7/xml/etree/ElementPath.py", line 83, in xpath_tokenizer
    raise SyntaxError("prefix %r not found in prefix map" % prefix) from None
SyntaxError: prefix 'default' not found in prefix map

Pylightxl Encoding Problem Proccessing Chinese excel File.

Pylightxl Version:1.56beta
Python Version:3.8.10

Summary of Bug/Feature:
When Pylightxl deal with a chinese excel file(xlsx), it encounter a encoding problem

Traceback:
Traceback (most recent call last):
File "tjexcel.py", line 103, in
main()
File "tjexcel.py", line 98, in main
xl.writexl(db=db, fn=myfilepath)
File "C:\Program\py\pylightxl\pylightxl.py", line 559, in writexl
writexl_alt_writer(db, fn)
File "C:\Program\py\pylightxl\pylightxl.py", line 588, in writexl_alt_writer
text = writexl_alt_app_text(db, temp_folder + '/docProps/app.xml')
File "C:\Program\py\pylightxl\pylightxl.py", line 718, in writexl_alt_app_tex
ns = utility_xml_namespace(f)
File "C:\Program\py\pylightxl\pylightxl.py", line 2023, in utility_xml_namespce
for event, elem in ET.iterparse(file, events):
File "C:\Program\Python3\lib\xml\etree\ElementTree.py", line 1229, in iterato
data = source.read(16 * 1024)
UnicodeDecodeError: 'gbk' codec can't decode byte 0xa8 in position 412: illegal
multibyte sequence

Suggestion for fix:

def writexl_alt_app_text(db, filepath):

with open(filepath, 'r', encoding='utf-8') as f:

Writing to an existing excel file will cause a warning when opening it.

Pylightxl Version: 1.56
Python Version: 3.7.9
Platform: win 10 x64

Summary of Bug/Feature:
Writing to an existing excel file will cause the warning "We found a problem with some content in Excel. Do you want us to try to recover as much as we can? If you trust the source of this workbook, click Yes." when opening it.

Traceback:
There is no error in python.

Suggestion for fix:
Erase the warning popup.

Code for reproducing the problem:

import pylightxl

data = [('start', 'end'), ('1', '2'), ('321', '432')]
db = pylightxl.Database()
db.add_ws(ws='Sheet1')
for row, x in enumerate(data, start=1):
    for col, value in enumerate(x, start=1):
        db.ws(ws='Sheet1').update_index(row, col, value)
pylightxl.writexl(db, 'test.xlsx')  # All went well when opening the excel file.
pylightxl.writexl(db, 'test.xlsx')  # The warning happened.

Accessing a full column, row, or sheet named range fails

Pylightxl Version: 1.61
Python Version: 3.7.4

Summary of Bug/Feature:
Excel permits a named range for a full column, row, or sheet. Accessing an existing named range for one of these fails. The format of a full column named range is in the format: Sheet1!A:A

Pylightxl expects a number following a letter and halts when parsing the address.

The workaround is to define a named range with an explicit column/row boundary; however, one may not always have permission to re-define ranges.

Traceback:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
C:\ProgramData\Anaconda3\lib\site-packages\pylightxl\pylightxl.py in utility_address2index(address)
   2222     try:
-> 2223         row = int(strVSnum.split(address)[1])
   2224     except (IndexError, ValueError):

ValueError: invalid literal for int() with base 10: ''

During handling of the above exception, another exception occurred:

UserWarning                               Traceback (most recent call last)
<ipython-input-28-7f447233b8b5> in <module>
      7 #xldb.ws(ws='Sheet1').nr(name='dataset_id')
      8 #xldb.nr_names
----> 9 xldb.nr('dataset_name')

C:\ProgramData\Anaconda3\lib\site-packages\pylightxl\pylightxl.py in nr(self, name, formula, output)
   1677 
   1678         ws, address = full_address.split('!')
-> 1679         return self.ws(ws).range(address, output=output)
   1680 
   1681     def nr_loc(self, name):

C:\ProgramData\Anaconda3\lib\site-packages\pylightxl\pylightxl.py in range(self, address, formula, output)
   1847         if ':' in address:
   1848             address_start, address_end = address.split(':')
-> 1849             row_start, col_start = utility_address2index(address_start)
   1850             row_end, col_end = utility_address2index(address_end)
   1851 

C:\ProgramData\Anaconda3\lib\site-packages\pylightxl\pylightxl.py in utility_address2index(address)
   2224     except (IndexError, ValueError):
   2225         raise UserWarning('pylightxl - Incorrect address ({}) entry. Address must be an alphanumeric '
-> 2226                          'where the trailing character(s) are numeric characters 1-9'.format(address))
   2227 
   2228     return [row, col]

UserWarning: pylightxl - Incorrect address (D) entry. Address must be an alphanumeric where the trailing character(s) are numeric characters 1-9

Suggestion for fix:
Check for this condition when parsing.

Multiline string not working in writecsv

Pylightxl Version: 1.55
Python Version: 3

Summary of Bug/Feature:
Method writecsv does not support multiline contents (cell that includes "\n"), which will break the database structure sometime

Suggestion for fix:

My easiest implementation

def q(i):
    return '"' + i.replace('"', '""') + '"'
f.write(delimiter.join(map(q, row)))
f.write('\n')

f.write(delimiter.join(row))
f.write('\n')

cell formatting

Does this library read cell formats? I.e. "percentage", "text" etc?

Read content from stream

Pylightxl Version:
1.47
Python Version:
3.7

Summary of Feature:
Hi, thanks for a nice package. Is there any way to read from a file stream?
Like:

with open("my_file.xlsx", "r") as infile:
    db = pylightxl.readxl(infile)

Suggestion for fix

Add a new argument to readxl like pylightxl.readxl(file_content=excel_file.read())

Wrong index assignment: self.size[1] should be self.size[0]

def keyrow(self, key, keyindex=1):
"""
Takes a row key value (value of any cell within keyindex col) and returns the entire row,
no match returns an empty list
:param str/int/float key: any cell value within keyindex col (type sensitive)
:param int keyrow: option keyrow override. Must be >0 and smaller than worksheet size
:return list: list of the entire matched key row data (only first match is returned)
"""
if not keyindex > 0 and not keyindex <= self.size[1]:
raise UserWarning('pylightxl - keyindex ({}) entered must be >0 and <= worksheet size ({}.'.format(keyindex,self.size))
# find first key match, get its column index and return col list
for row_i in range(1, self.size[1] + 1):
if key == self.index(row_i, keyindex):
return self.row(row_i)
return []

Load excel from file-like object

Pylightxl Version:1.57
Python Version:3.9.7

Summary of Bug/Feature: I receive the data from an HTTP request, convert to an file-like obj using io.BytesIO() and it dosen't load the excel.... Using the openpyxl works, and if I write the data received from the HTTP request in to an file and pass the path to the pylightxl.readxl works...
sorry for my English (Using Google Translate)

Traceback:

  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/home/jhony/Documentos/Programação/Python/Projects/PySimpleGui/ConfiaSatSystem/confiasatsystem/handler/simcard/simABC.py", line 49, in run
    await self.run_main()
  File "/home/jhony/Documentos/Programação/Python/Projects/PySimpleGui/ConfiaSatSystem/confiasatsystem/handler/simcard/alllcom.py", line 101, in run_main
    excel = pylightxl.readxl(file_like_obj)
  File "/home/jhony/Documentos/Programação/Python/Projects/PySimpleGui/ConfiaSatSystem/env/lib/python3.9/site-packages/pylightxl/pylightxl.py", line 115, in readxl
    fn = readxl_check_excelfile(fn)
  File "/home/jhony/Documentos/Programação/Python/Projects/PySimpleGui/ConfiaSatSystem/env/lib/python3.9/site-packages/pylightxl/pylightxl.py", line 189, in readxl_check_excelfile
    raise UserWarning('pylightxl - Incorrect file entry ({}).'.format(fn))
UserWarning: pylightxl - Incorrect file entry (<_io.BytesIO object at 0x7f0e2c94c8b0>).

Suggestion for fix:

Writing output seems to not handle Windows file paths correctly

Steps to reproduce

path = os.path.join(os.getcwd(), "app", "test.xlsx")
xl.writexl(db, path)

To double check that path is valid I wrote:

path = os.path.join(os.getcwd(), "app", "test.xlsx")
print("Is it File? " + str(os.path.isfile(path)))

Which outputs Is it File? True.

Error

Traceback (most recent call last):
  File "C:\Users\grant\AppData\Roaming\Python\Python38\site-packages\flask\app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "C:\Users\grant\AppData\Roaming\Python\Python38\site-packages\flask\app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "C:\Users\grant\AppData\Roaming\Python\Python38\site-packages\flask_cors\extension.py", line 161, in wrapped_function
    return cors_after_request(app.make_response(f(*args, **kwargs)))
  File "C:\Users\grant\AppData\Roaming\Python\Python38\site-packages\flask\app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "C:\Users\grant\AppData\Roaming\Python\Python38\site-packages\flask\_compat.py", line 39, in reraise
    raise value
  File "C:\Users\grant\AppData\Roaming\Python\Python38\site-packages\flask\app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "C:\Users\grant\AppData\Roaming\Python\Python38\site-packages\flask\app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "C:\Users\path\views.py", line 362, in hardware_inventory
    xl.writexl(db, path)  # TODO - this needs to be sent back as a file
  File "C:\Users\grant\AppData\Roaming\Python\Python38\site-packages\pylightxl\writexl.py", line 50, in writexl
    alt_writer(db, path)
  File "C:\Users\grant\AppData\Roaming\Python\Python38\site-packages\pylightxl\writexl.py", line 69, in alt_writer
    f.extractall(temp_folder)
  File "C:\Program Files\Python38\lib\zipfile.py", line 1647, in extractall
    self._extract_member(zipinfo, path, pwd)
  File "C:\Program Files\Python38\lib\zipfile.py", line 1693, in _extract_member
    os.makedirs(upperdirs)
  File "C:\Program Files\Python38\lib\os.py", line 213, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "C:\Program Files\Python38\lib\os.py", line 213, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "C:\Program Files\Python38\lib\os.py", line 213, in makedirs
    makedirs(head, exist_ok=exist_ok)
  [Previous line repeated 8 more times]
  File "C:\Program Files\Python38\lib\os.py", line 223, in makedirs
    mkdir(name, mode)
OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: '_pylightxl_C:'
INFO:werkzeug:127.0.0.1 - - [03/Aug/2020 17:06:07] "GET /api/hardware_inventory HTTP/1.1" 500 -

Update:

xl.writexl(db, os.path.relpath(path, '.'))

The above works using the relative path leading me to believe it's something about the absolute path.

xl.writexl(db, "test.xlsx")
os.replace("test.xlsx", path)

also works.

Error opening Excel file

Pylightxl Version: 1.57
Python Version: 3.9.1

Summary of Bug/Feature:
Error opening excel file because of Sheet ID issue. See issue #53

Traceback:

Traceback (most recent call last):
  File "D:\FolhaCaixa2\lusitania\api.py", line 46, in <module>
    lusitania.convert()
  File "D:\FolhaCaixa2\utils.py", line 89, in convert
    db = xl.readxl('Lusitania_Recibos.xlsx')
  File "C:\Users\Andre\.virtualenvs\FolhaCaixa2-LlgGVunk\lib\site-packages\pylightxl\pylightxl.py", line 119, in readxl
    wb_rels = readxl_get_workbook(fn)
  File "C:\Users\Andre\.virtualenvs\FolhaCaixa2-LlgGVunk\lib\site-packages\pylightxl\pylightxl.py", line 235, in readxl_get_workbook
    sheetId = int(rId.split('rId')[-1])
ValueError: invalid literal for int() with base 10: 'R47fd958b504b4526'

Suggestion for fix:
PR: #56

read worksheet on demand

Is it possible to read 1 worksheet on demand to save on memory?

My research says this is not possible on xlsx files because they are zip/xml files. Is this correct?

Repository tags out of sync with PyPi release

I package this for the AUR using git tags, and this repository doesn't seem to have a tag associated with the recent release (1.60 on PyPi).

I'd appreciate if a tag for 1.60 was pushed to the repository. Thanks! 😄

Error when access formula with None stored

Pylightxl Version: 1.58
Python Version: 3.7

Summary of Bug/Feature: When using sheet.row with output='f', I get the traceback shown below. I've printed the sheet._data dictionary and it does in fact have None for some of the formulas. It is an array formula that I see it.

Traceback:

Traceback (most recent call last):
  File "testing.py", line 285, in test_read
    result = read_pylightxl(path)
  File "testing.py", line 158, in read_with_pylightxl
    page.append(sheet.row(row, output=output))
  File "<snip>\site-packages\pylightxl\pylightxl.py", line 1796, in row
    val = self.index(row, c, output=output)
  File "<snip>\site-packages\pylightxl\pylightxl.py", line 1733, in index
    return self.address(address, output=output)
  File "<snip>\site-packages\pylightxl\pylightxl.py", line 1656, in address
    rv = '=' + self._data[address]['f']
TypeError: can only concatenate str (not "NoneType") to str

Suggestion for fix (Option 1): Fix the immediate issue only

Change this

rv = '=' + self._data[address]['f']

to:

                rv = '=' + self._data[address]['f'] or ''

Suggestion for fix (Option 2): Avoid None earlier in the code for values, formulas, and comments

Change this

data.update({cell_address: {'v': cell_val, 'f': cell_formula, 's': '', 'c': comment}})

to:

        data.update({cell_address: {'v': cell_val or '', 'f': cell_formula or '', 's': '', 'c': comment or ''}})

Read comment from cell

Pylightxl Version:
1.54
Python Version:
3.5.3

Summary of Bug/Feature:
Is it possible to get the comment of each cell or maybe for the first cells in the first row ?

(feature request) please consider adding type defintions

Pylightxl Version: 1.59
Python Version: 3.9

Summary of Bug/Feature: at the moment pylightxl doesn't use types, nor it expose them. So tools like mypy cannot check the user interactions

Traceback:

Suggestion for fix:add typing annotations to the module

Unable to save an excel file after modifing it.

I tried modifying a file using the following script, but got an error as described below.
I am using python 2.7.13 with windows 7.
The excel sheet I imported is sales.xlsx

`import pylightxl as xl
from datetime import datetime

def transform():
db = xl.readxl("sales.xlsx")
sheet = db.ws("Sheet 1")
k = 2
while str(sheet.index(k, 2)).startswith("VOLKPH"):
inv_no_l = list(sheet.index(k, 2))
inv_no_l[5] = '0'
sheet.update_index(k, 2,val = ''.join(inv_no_l))
sheet.update_index(k, 1, int(datetime.strftime(datetime.strptime(sheet.index(k, 1), format("%d-%m-%Y")), format('%Y%m%d'))))
k+=1
xl.writexl(db, "salesmodified.xlsx")

if __name__=='__main__':
transform()
`

===== RESTART: G:\Automation\Python2_7\tally_xml_generator\transform.py =====

Traceback (most recent call last):
File "G:\Automation\Python2_7\tally_xml_generator\transform.py", line 18, in
transform()
File "G:\Automation\Python2_7\tally_xml_generator\transform.py", line 15, in transform
xl.writexl(db, "salesmodified.xlsx")
File "G:\Automation\Python2_7\lib\site-packages\pylightxl\pylightxl.py", line 432, in writexl
writexl_alt_writer(db, fn)
File "G:\Automation\Python2_7\lib\site-packages\pylightxl\pylightxl.py", line 456, in writexl_alt_writer
text = writexl_alt_app_text(db, temp_folder + '/docProps/app.xml')
File "G:\Automation\Python2_7\lib\site-packages\pylightxl\pylightxl.py", line 586, in writexl_alt_app_text
tag_vt_vector = root.find('./default:HeadingPairs//vt:vector', ns)
File "G:\Automation\Python2_7\lib\xml\etree\ElementPath.py", line 286, in find
return iterfind(elem, path, namespaces).next()
File "G:\Automation\Python2_7\lib\xml\etree\ElementPath.py", line 264, in iterfind
selector.append(ops[token[0]](next, token))
File "G:\Automation\Python2_7\lib\xml\etree\ElementPath.py", line 120, in prepare_descendant
token = next()
File "G:\Automation\Python2_7\lib\xml\etree\ElementPath.py", line 84, in xpath_tokenizer
raise SyntaxError("prefix %r not found in prefix map" % prefix)
SyntaxError: prefix 'vt' not found in prefix map

Make use of defusedxml if available

I understand that being packaged as a single-file module with no external dependencies was an explicit goal of this project, and it's one that I support. However, now that it relies on ElementTree for parsing rather than on regular expressions, perhaps you would consider at least optionally making use of defusedxml if installed.

Writing Excel to File Object

Pylightxl Version:
1.59
Python Version:
3.10.0

Summary of Feature:
In light of my previous issue/request regarding reading a CSV from a file object, I noticed there doesn't appear to be a way currently to write an xl file to a file object via writexl(). I could see something like that being useful where you need to put a file out to cloud storage or even in unit testing (my current situation) where you'd need to write to something in memory.

As with my other issue, I'd be happy to take a run at this too.

Doc creation errors

push doc && make html:

reading sources... [100%] sourcecode/writexl                                                                                                                                      
.../pylightxl-1.53/doc/source/revlog.rst:84: WARNING: Unknown target name: "new".
.../pylightxl-1.53/doc/source/revlog.rst:84: WARNING: Unknown target name: "new".
WARNING: autodoc: failed to import function 'address2index' from module 'pylightxl.pylightxl'; the following exception was raised:
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/sphinx/util/inspect.py", line 325, in safe_getattr
    return getattr(obj, name, *defargs)
AttributeError: module 'pylightxl.pylightxl' has no attribute 'address2index'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/sphinx/ext/autodoc/importer.py", line 106, in import_object
    obj = attrgetter(obj, mangled_name)
  File "/usr/lib/python3.9/site-packages/sphinx/ext/autodoc/__init__.py", line 289, in get_attr
    return autodoc_attrgetter(self.env.app, obj, name, *defargs)
  File "/usr/lib/python3.9/site-packages/sphinx/ext/autodoc/__init__.py", line 2186, in autodoc_attrgetter
    return safe_getattr(obj, name, *defargs)
  File "/usr/lib/python3.9/site-packages/sphinx/util/inspect.py", line 341, in safe_getattr
    raise AttributeError(name) from exc
AttributeError: address2index

WARNING: autodoc: failed to import function 'index2address' from module 'pylightxl.pylightxl'; the following exception was raised:
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/sphinx/util/inspect.py", line 325, in safe_getattr
    return getattr(obj, name, *defargs)
AttributeError: module 'pylightxl.pylightxl' has no attribute 'index2address'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/sphinx/ext/autodoc/importer.py", line 106, in import_object
    obj = attrgetter(obj, mangled_name)
  File "/usr/lib/python3.9/site-packages/sphinx/ext/autodoc/__init__.py", line 289, in get_attr
    return autodoc_attrgetter(self.env.app, obj, name, *defargs)
  File "/usr/lib/python3.9/site-packages/sphinx/ext/autodoc/__init__.py", line 2186, in autodoc_attrgetter
    return safe_getattr(obj, name, *defargs)
  File "/usr/lib/python3.9/site-packages/sphinx/util/inspect.py", line 341, in safe_getattr
    raise AttributeError(name) from exc
AttributeError: index2address

WARNING: autodoc: failed to import function 'columnletter2num' from module 'pylightxl.pylightxl'; the following exception was raised:
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/sphinx/util/inspect.py", line 325, in safe_getattr
    return getattr(obj, name, *defargs)
AttributeError: module 'pylightxl.pylightxl' has no attribute 'columnletter2num'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/sphinx/ext/autodoc/importer.py", line 106, in import_object
    obj = attrgetter(obj, mangled_name)
  File "/usr/lib/python3.9/site-packages/sphinx/ext/autodoc/__init__.py", line 289, in get_attr
    return autodoc_attrgetter(self.env.app, obj, name, *defargs)
  File "/usr/lib/python3.9/site-packages/sphinx/ext/autodoc/__init__.py", line 2186, in autodoc_attrgetter
    return safe_getattr(obj, name, *defargs)
  File "/usr/lib/python3.9/site-packages/sphinx/util/inspect.py", line 341, in safe_getattr
    raise AttributeError(name) from exc
AttributeError: columnletter2num

WARNING: autodoc: failed to import function 'num2columnletters' from module 'pylightxl.pylightxl'; the following exception was raised:
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/sphinx/util/inspect.py", line 325, in safe_getattr
    return getattr(obj, name, *defargs)
AttributeError: module 'pylightxl.pylightxl' has no attribute 'num2columnletters'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/sphinx/ext/autodoc/importer.py", line 106, in import_object
    obj = attrgetter(obj, mangled_name)
  File "/usr/lib/python3.9/site-packages/sphinx/ext/autodoc/__init__.py", line 289, in get_attr
    return autodoc_attrgetter(self.env.app, obj, name, *defargs)
  File "/usr/lib/python3.9/site-packages/sphinx/ext/autodoc/__init__.py", line 2186, in autodoc_attrgetter
    return safe_getattr(obj, name, *defargs)
  File "/usr/lib/python3.9/site-packages/sphinx/util/inspect.py", line 341, in safe_getattr
    raise AttributeError(name) from exc
AttributeError: num2columnletters

Issues reading range of cells, each cells have decimals

Pylightxl Version: 1.54
Python Version: 3.9.2

Summary of Bug/Feature: Issues reading range of cells, each cells have decimals

Traceback: i do not know how to do it

Suggestion for fix: No clue

My Code:

db = xl.readxl('IrisMod.xlsx')
conjEntrenoList = db.ws('Hoja1').range('A1:E135')
conjTestList = db.ws('Hoja2').range('A1:E15')

Add feature to write to a named cell

Pylightxl Version: 1.59
Python Version: 3.8.10

Summary of Bug/Feature:

Feature Request

Add ability to write to a named cell

Traceback:

Suggestion for fix:

I can not read file

Pylightxl Version: 1.58
Python Version: 3.9.2

Summary of Bug/Feature: I can not pass the path of file to read with "xl.readxl(path)"

Traceback:
TypeError Traceback (most recent call last)
in
3 path = pathlib.Path(r'C:\Users\Downloads\Report .xlsx')
4 print(path)
----> 5 db = xl.readxl(spath)
6 print(db)

c:\users\aitor.fernandez\appdata\local\programs\python\python39\lib\site-packages\pylightxl\pylightxl.py in readxl(fn, ws)
117 # {'ws': ws1: {'ws': str, 'rId': str, 'order': str, 'fn_ws': str}, ...
118 # 'nr': {nr1: {'nr': str, 'ws': str, 'address': str}, ...}
--> 119 wb_rels = readxl_get_workbook(fn)
120
121 for nr_dict in wb_rels['nr'].values():

c:\users\aitor.fernandez\appdata\local\programs\python\python39\lib\site-packages\pylightxl\pylightxl.py in readxl_get_workbook(fn)
232 # the output of openpyxl can sometimes not write the schema for "r" relationship
233 rId = tag_sheet.get('id')
--> 234 sheetId = int(re.sub('[^0-9]', '', rId))
235 wbrels = readxl_get_workbookxmlrels(fn)
236 rv['ws'][name] = {'ws': name, 'rId': rId, 'order': sheetId, 'fn_ws': wbrels[rId]}

c:\users\aitor.fernandez\appdata\local\programs\python\python39\lib\re.py in sub(pattern, repl, string, count, flags)
208 a callable, it's passed the Match object and must return
209 a replacement string to be used."""
--> 210 return _compile(pattern, flags).sub(repl, string, count)
211
212 def subn(pattern, repl, string, count=0, flags=0):

TypeError: expected string or bytes-like object

Suggestion for fix:

Workbook written by pylightxl crashes Excel

Pylightxl Version: 1.55
Python Version: 3.8

Summary of Bug/Feature:
Opening an Excel workbook written by pylightxl, then opening the same workbook again (without closing it first) crashes Excel. Also, it is not possible to add new Worksheets to these workbooks. Clicking + fails with the error message This action won't work on multiple selections..

error_message

For generating the workbook I did nothing special, just used this example from the docs:

import pylightxl as xl

# take this list for example as our input data that we want to put in column A
mydata = [10,20,30,40]

# create a black db
db = xl.Database()

# add a blank worksheet to the db
db.add_ws(ws="Sheet1")

# loop to add our data to the worksheet
for row_id, data in enumerate(mydata, start=1):
    db.ws(ws="Sheet1").update_index(row=row_id, col=1, val=data)

# write out the db
xl.writexl(db=db, fn="output.xlsx")

Traceback:
N/A (no exception)

Suggestion for fix:
The issue seems to be caused by the <sheetViews>...</sheetViews> element written to a worksheet. Removing it as in the screenshot below results in working worksheets for me. A GitHub issue on box/spout pointed me to this. Do the sheetViews serve any purpose or could they just be removed?

Screenshot from 2021-07-29 17-57-41

.database.num2columnletters doesn't work properly with indexes above 727 (AAY)

If the input was 728, we should espect "AAZ" but actually, the code returns "ABZ"

I could suggest to replace that function by :

def num2alpha(num: int) -> str:
    def pre_num2alpha(num) -> list:
        if num % 26 != 0:
            num = [num // 26, num % 26]
        else:
            num = [(num - 1) // 26, 26]

        if num[0] > 26:
            num = pre_num2alpha(num[0]) + [num[1]]
        else:
            num = list(filter(lambda x: False if x == 0 else True, num))

        return num

    return "".join(list(map(lambda x: chr(x + 64), pre_num2alpha(num))))

That should resolve the issue 😉 (even though it's a totally different code)

a Suggestion in readxl_get_sharedStrings

Pylightxl Version:1.58
Python Version:3.8

Suggestion for fix:
in
def readxl_get_sharedStrings(fn)

text = ''.join([tag.text for tag in tag_t])
maybe chang to:
text = ''.join([tag.text for tag in tag_t if tag.text])

because some time tag.text is None

Can not read xlsx file

Pylightxl Version: 1.47
Python Version: 3.8.5

Summary of Bug/Feature: When I try to use readxl() to load my excel file, it raised ValueError exception.

Traceback:

 File "/home/pengtao/code/excel2json/excel2json.py", line 29, in <module>
    licenses_sheet = xl.readxl(fn='opendataset.xlsx').ws('License')
  File "/home/pengtao/.pyenv/versions/3.8.5/lib/python3.8/site-packages/pylightxl/pylightxl.py", line 91, in readxl
    sh_names, namedranges = readxl_get_workbook(f, f_zip)
  File "/home/pengtao/.pyenv/versions/3.8.5/lib/python3.8/site-packages/pylightxl/pylightxl.py", line 177, in readxl_get_workbook
    ET.register_namespace(prefix, uri)
  File "/home/pengtao/.pyenv/versions/3.8.5/lib/python3.8/xml/etree/ElementTree.py", line 1031, in register_namespace
    raise ValueError("Prefix format reserved for internal use")
ValueError: Prefix format reserved for internal use

Suggestion for fix: None

Database.nr(name="SomeNameThatExists") always fails with KeyError

Names are redacted to prove the point -- SomeNameThatExists is an actual name on the SheetThatExists sheet of a .xlsx file. I've confirmed it shows up in the pylightxl.Dictionary._ws as well by printing the contents of the _ws just before this call.

Snippet:

import pylightxl as xl

wb = xl.readxl("excel_file.xlsx")
print(wb.nr(name="SomeNameThatExists"))

Output:

Traceback (most recent call last):
  File "<....>/lib/python3.8/site-packages/pylightxl/pylightxl.py", line 1430, in ws
    return self._ws[ws]
KeyError: "'SheetThatExists'"

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<....>/lib/python3.8/site-packages/pylightxl/pylightxl.py", line 1611, in nr
    return self.ws(ws).range(address, output=output)
  File "/home/britchie/.local/lib/python3.8/site-packages/pylightxl/pylightxl.py", line 1432, in ws
    raise UserWarning('pylightxl - Sheetname ({}) is not in the database'.format(ws))
UserWarning: pylightxl - Sheetname ('SheetThatExists') is not in the database

I found that the following change resolved this issue, but I'm not sure that this is the most appropriate fix:

- return self._ws[ws]
+ return self._ws[ws.strip('\'')]

Reading CSVs from a file object

Pylightxl Version:
1.59
Python Version:
3.10.0

Summary of Bug/Feature:
I'm scouting out libraries to use for parsing both csv and xlsx files, and I came across pylightxl. I like it since it can do both, and it's obviously lightweight. I'm wondering if there's a particular reason readcsv() is not able to read from a file object. If it's strategic, no worries. If it's just a matter of "not there yet", I'd be happy to submit a PR if you're okay with outside contributions.

Also, nice work!

Error while trying to open a file

Pylightxl Version: 1.56
Python Version: 3.9.1

Summary of Bug/Feature:
When trying to open a file, I get invalid literal error.

db = xl.readxl('Lusitania_Recibos.xlsx')
print(db.ws(ws='Sheet1').address(address='A7'))

If I open the file and save it on Excel, the above 2 lines work, so I don't know what's the issue (The file was downloaded from somewhere else, it is a generated file).
But I want to be able to use this without having Excel, or having to open the file and save it.

Traceback:

Traceback (most recent call last):
  File "D:\FolhaCaixa2\lusitania\api.py", line 46, in <module>
    lusitania.convert()
  File "D:\FolhaCaixa2\utils.py", line 89, in convert
    db = xl.readxl('Lusitania_Recibos.xlsx')
  File "C:\Users\Andre\.virtualenvs\FolhaCaixa2-LlgGVunk\lib\site-packages\pylightxl\pylightxl.py", line 119, in readxl
    wb_rels = readxl_get_workbook(fn)
  File "C:\Users\Andre\.virtualenvs\FolhaCaixa2-LlgGVunk\lib\site-packages\pylightxl\pylightxl.py", line 235, in readxl_get_workbook
    sheetId = int(rId.replace('rId', ''))
ValueError: invalid literal for int() with base 10: 'GemRid775394'

Suggestion for fix:

improvement write only

Fix database class for improvement write only model:

def add_ws(self, sheetname, data=None):
        """
        Logs worksheet name and its data in the database

        :param str sheetname: worksheet name
        :param data: dictionary of worksheet cell values (ex: {'A1': {'v':10,'f':'','s':''}, 'A2': {'v':20,'f':'','s':''}})
        :return: None
        """
        if not data:
              data = {'A1': {'v': '', 'f': '', 's': ''}}
        self._ws.update({sheetname: Worksheet(data)})
        self._ws_names.append(sheetname)

Incomplete licensing

The documentation includes a full copy of the license and pylightxl.py references the MIT license but does not appear to include the full.

Because the MIT license requires to be included in all "substantial portions of the Software" it is not clear that pylightxl.py is properly released under the MIT license.

Pathlib incompatibility and case sensitivity

Not sure if I used the right headers for this issue; these are things I noticed when trying out ur application. I like the idea of minimal dependencies and would love to use the program especially when heavy duty transformations that require Pandas are not required.

Anyways, here goes my observations:

Running pylightxl-1.41

  1. You cant read a file directly using Pathlib, u need to convert it back to a string
from pathlib import Path
import pylightxl as xl

folder = Path('excelfiles')

{f.stem:xl.readxl(f) for f in folder.rglob('*.xls*')}
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
 in 
----> 1 {f.stem:xl.readxl(f) for f in folder.rglob('*.xls*')}

 in (.0)
----> 1 {f.stem:xl.readxl(f) for f in folder.rglob('*.xls*')}

~/miniforge3/envs/pydata/lib/python3.8/site-packages/pylightxl/readxl.py in readxl(fn, sheetnames)
     21 
     22     # test that file entered was a valid excel file
---> 23     check_excelfile(fn)
     24 
     25     # zip up the excel file to expose the xml files

~/miniforge3/envs/pydata/lib/python3.8/site-packages/pylightxl/readxl.py in check_excelfile(fn)
     76 
     77     if type(fn) is not str:
---> 78         raise ValueError('Error - Incorrect file entry ({}).'.format(fn))
     79 
     80     if not isfile(fn):

ValueError: Error - Incorrect file entry (excelfiles/Seattle.xlsx).

However, when I add the str wrapper to it, everything works fine.

{f.stem:xl.readxl(str(f)) for f in folder.rglob('*.xls*')}

{'Seattle': pylightxl.Database,
 'Portland': pylightxl.Database,
 'Oakland': pylightxl.Database}
  1. pylightxl is case sensitive when it comes to the suffixes. It wont accept uppercase XLSX:
{f.stem:xl.readxl(str(f)) for f in folder.rglob('*.XLSX*')}

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
 in 
----> 1 {f.stem:xl.readxl(f) for f in folder.rglob('*.XLSX*')}

 in (.0)
----> 1 {f.stem:xl.readxl(f) for f in folder.rglob('*.XLSX*')}

~/miniforge3/envs/pydata/lib/python3.8/site-packages/pylightxl/readxl.py in readxl(fn, sheetnames)
     21 
     22     # test that file entered was a valid excel file
---> 23     check_excelfile(fn)
     24 
     25     # zip up the excel file to expose the xml files

~/miniforge3/envs/pydata/lib/python3.8/site-packages/pylightxl/readxl.py in check_excelfile(fn)
     76 
     77     if type(fn) is not str:
---> 78         raise ValueError('Error - Incorrect file entry ({}).'.format(fn))
     79 
     80     if not isfile(fn):

ValueError: Error - Incorrect file entry (excelfiles/Tacoma.XLSX).

Issue reading Excel file

Pylightxl Version: 1.54
Python Version: 3.8.6

Summary of Bug/Feature: Issue reading an excel file

Traceback:

Traceback (most recent call last):
  File "C:/Users/udbha/Downloads/Excel Formatter/mywork/change.py", line 12, in <module>
    main()
  File "C:/Users/udbha/Downloads/Excel Formatter/mywork/change.py", line 7, in main
    db = xl.readxl(fn=PATH)
  File "C:\Users\udbha\AppData\Local\Programs\Python\Python38\lib\site-packages\pylightxl\pylightxl.py", line 118, in readxl
    wb_rels = readxl_get_workbook(fn)
  File "C:\Users\udbha\AppData\Local\Programs\Python\Python38\lib\site-packages\pylightxl\pylightxl.py", line 215, in readxl_get_workbook
    rId = tag_sheet.get('{' + ns['r'] + '}id')
KeyError: 'r'

Process finished with exit code 1

Suggestion for fix: I do not know.

MY CODE:

import pylightxl as xl


def main():
    PATH = "document.xlsx"
    NEW_PATH = "done_document.xlsx"
    db = xl.readxl(fn=PATH)
    xl.writexl(db=db, fn=NEW_PATH)


if __name__ == "__main__":
    main()

My excel file is excel 2010, which versions of excel is this library capable of handling?

Typo In Error Message

Pylightxl Version: 1.53
Python Version: 3.9.2

Summary of Bug/Feature:
On readxl_check_excelfile(fn), the error message:
UserWarning('pylightxl - File ({}) does not exit.'.format(fn)) has a typo:
exit -> exist

Traceback:

Suggestion for fix:
Correct the typo

a question in readxl_get_workbook

def readxl_get_workbook(fn):
.......
for tag_sheet in root.findall('./default:sheets/default:sheet', ns):
name = tag_sheet.get('name')
try:
rId = tag_sheet.get('{' + ns['r'] + '}id')
except KeyError:
# the output of openpyxl can sometimes not write the schema for "r" relationship
rId = tag_sheet.get('id')
sheetId = int(re.sub('[^0-9]', '', rId))
wbrels = readxl_get_workbookxmlrels(fn)
rv['ws'][name] = {'ws': name, 'rId': rId, 'order': sheetId, 'fn_ws': wbrels[rId]}
.....

readxl_get_workbookxmlrels is run many times

wbrels = readxl_get_workbookxmlrels(fn)
for tag_sheet in root.findall('./default:sheets/default:sheet', ns):
    name = tag_sheet.get('name')
    try:
        rId = tag_sheet.get('{' + ns['r'] + '}id')
    except KeyError:
        # the output of openpyxl can sometimes not write the schema for "r" relationship
        rId = tag_sheet.get('id')
    sheetId = int(re.sub('[^0-9]', '', rId))
    rv['ws'][name] = {'ws': name, 'rId': rId, 'order': sheetId, 'fn_ws': wbrels[rId]}

is run faster

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.