Git Product home page Git Product logo

xl2apl's People

Contributors

paulmansour avatar rikedyp avatar

Stargazers

 avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar

Forkers

ffsk rikedyp

xl2apl's Issues

Sheet name to XML file name.

This mapping is not done properly. xl/_rels/workbook.xml.rels file must be inspected to properly map the sheet name to the file name.

Slashes and Backslashes

Zip archives created by Excel use slash (/) in the path. Third parties may use backslash. These currently fail as XL2APL expects slash (/). You would think the .NET ZipArchive object would handle this, but it does not.

Can't read files open by another process

Hey TCG, happy holidays!

The function XL2APL.Main.OpenExcelFile uses the .NET ZipFile.OpenRead method to read in a .NET ZipArchive. One shortcoming of this method is that it will fail if the file is open by another process. The owner of this other process could yourself, or even some other user if you're trying to open an Excel file that is located on a shared network drive. This latter situation can be a pretty common occurrence in a corporate environment. Reading from the COM Interop (with ⎕WC 'OleClient' 'Excel.application'... ) does not have this problem. So this is a pretty big showstopper for users who have been using COM to read Excel files.

After investigating further, I was surprised to find that the ZipFile.OpenRead method is not opening the file in ReadOnly mode. Some might be tempted to trap the exception, ⎕NCOPY the file into a temp dir, work on that, then delete it. But that can lead to a lot of problems + greatly increases the code complexity and liability. A much cleaner way to get around this is to create your own FileStream where you explicitly ask to open in ReadOnly mode. I am proposing to change the XL2APL.Main.OpenExcelFile function to something like:

OpenExcelFile←{
 ⍝ ⍵ ←→ File Name 
 ⍝ ← ←→ .NET ZipArchive 
 ⎕USING←'System.IO.Compression,System.IO.Compression.dll' 
 ⎕USING,←⊂'System.IO,System.Runtime.dll' ⍝ For FileStreams 
 fs ← ⎕NEW FileStream(⍵ FileMode.Open FileAccess.Read FileShare.ReadWrite) 
 ⎕NEW ZipArchive(fs ZipArchiveMode.Read) 

}

This way, you can open and read Excel files even if someone else has it open in ReadOnly mode.

I think this is actually a pretty important fix. Many developers might not run into this problem, but I'm sure clients will hit this problem of shared Excel files before most developers do!

I haven't done much testing of this with files located on a network. I will have to do that. It would be interesting to start doing some timing benchmarks and seeing if there is a lot of time involved in running the OpenExcelFile on network files, because that can get called a few times in the process of retrieving a workbook (maybe some redundancy can be removed by opening once and passing some arguments down?). I'd have to read up on the method more, but that's just a thought I had and maybe I'll make a separate thread if I get to experiment some more.

Regards,
Josh

Mac OS ⎕using issue

remove .dll from OpenExelFile

⎕USING←'System.IO.Compression,System.IO.Compression'
⎕USING,←⊂'System.IO,System.Runtime'

Test for various header widths

Header may be narrower or wider than data when more than one block is processed. Write tests to ensure that final results are same
width, padding data to match header or vice versa.

Possible Error: GetSheetData skips blank lines and the first non-blank line at the beginning of Excel file

It seems that GetSheetData does not read empty lines at the beginning of the Excel table. It skips them and also the first non-blank line. After that, the rest of the lines appear in the .Data variable.
Here is a test Excel:
TestExcel3.xlsx
The Excel has two worksheets: Konfig and Std
Konfig is emtpy. I want to retrieve all the data from sheet Std.
The first three lines in Std are empty.

In the next line (line 4) column A has Missing and column B ffff.
Line 5 has MMM.

Here the APL code (Dyalog version 18.2)

eInf←XL2APL.Main.GetWorkbookInfo e
 sh←eInf.SheetNames                            ⍝ Get names of work sheet
 ex←eInf∘XL2APL.Main.GetSheetData¨sh           ⍝ ex is the complete Excel
 d←ex[2].Data                       

The result shows:

┌──────┬──────────┬──────────────────┬─────────────────────┬──────────────────┐
│MMM   │          │                  │                     │                  │
│      │This      │is                │testing              │                  │
│Header│          │                  │                     │                  │
│      │          │OTM_OE_T1_PAC3200 │OTM_OE_T2_PAC3200    │OTM_OE_G1_PAC3200 │
│Data  │          │                  │                     │                  │
│      │01.01.2022│5086.7056360206607│1.8559737503528594E-4│0                 │
│      │02.01.2022│5053.7190978546141│1.4925257861614228E-4│0                 │
│      │03.01.2022│4845.2284321136476│1.9012190401554107E-4│264.17249439585208│
│      │04.01.2022│5146.4254146385192│8.3032652735710142E-5│0                 │
│      │05.01.2022│5097.8761791515353│1.7639732360839845E-4│0                 │
....

Instead of showing line Missing only the line MMM is the first line at the result.

MAC OS failed on

ReadZipFile had to be changed from "=" to "≡" on line [7]

e≡⎕NULL:''

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.