adrilo commented on June 8, 2024

Hi @acirulis! Thanks for the kind words, always appreciated :)

An XLSX/ODS file is indeed an archive containing many files that describe the contents/style of the spreadsheet.
PhpSpreadsheet (like most other libraries) loads all of these files into memory so that it has a full picture of every element at all times: data, cell/row styles, spreadsheet config, ... For a large spreadsheet, that is too much data to fit in memory, hence the crash! On the other hand, it gives you a lot of freedom: you can access/edit any cell, change styles at any time, etc.

Spout's approach is different. For reading, it goes through the structure row by row (keeping a single row in memory at a time), reading the cells' content as well as their styles (located in other files) and mapping content/style to each cell of the row being read. There are some optimizations, because it's possible that not all cell contents fit into memory. This definitely limits the amount of data that needs to be stored in memory. The drawback is that you can only access 1 row at a time. If you want to read the 90th row, you need to read the first 89 rows before it. And to know how many rows are in the spreadsheet (useful for progress), you must read the entire spreadsheet...
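To make the row-by-row reading idea concrete, here is a minimal sketch in Python. It is not Spout's code or API: the `<sheet>/<row>/<c>` layout is a simplified stand-in for the real sheet XML, and `iter_rows`, `make_sheet`, `nth_row` and `count_rows` are hypothetical names used only for illustration.

```python
import io
import xml.etree.ElementTree as ET
from itertools import islice


def iter_rows(xml_stream):
    """Yield one row at a time (as a list of cell strings) from a simplified sheet XML."""
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if elem.tag == "row":
            yield [cell.text or "" for cell in elem.findall("c")]
            # Free the cells of the row we just handed out. The emptied <row>
            # element itself stays attached to the root in this simple sketch.
            elem.clear()


def make_sheet(n_rows):
    """Build a small fake sheet (assumed format, not real XLSX) for the demo."""
    rows = "".join(
        f"<row><c>r{i}</c><c>{i * 2}</c></row>" for i in range(1, n_rows + 1)
    )
    return f"<sheet>{rows}</sheet>".encode()


def nth_row(xml_bytes, n):
    """Reaching row n means streaming past the n - 1 rows before it."""
    return next(islice(iter_rows(io.BytesIO(xml_bytes)), n - 1, None))


def count_rows(xml_bytes):
    """Counting rows requires a full pass over the sheet."""
    return sum(1 for _ in iter_rows(io.BytesIO(xml_bytes)))
```

Calling `nth_row(make_sheet(100), 90)` still parses rows 1 to 89 on the way, and `count_rows` has to consume the whole stream, which is the trade-off described above.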
Same goes for writing. You need to write row by row, and you can't go back: once row 2 is written, you cannot make changes to it later. This way, you have only a single row in memory at a time.
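The writing side can be sketched the same way. Again, this is only an illustration in Python under assumptions, not Spout's writer: `StreamingRowWriter` is a hypothetical class and the output is the same simplified `<sheet>/<row>/<c>` XML rather than a real XLSX/ODS file. Each row is serialized and sent to the output as soon as it is added, and nothing is kept around, so there is no way to edit it afterwards.

```python
import io
from xml.sax.saxutils import escape


class StreamingRowWriter:
    """Append-only writer: each row goes straight to the stream and is then forgotten."""

    def __init__(self, stream):
        self._stream = stream
        self._closed = False
        self.rows_written = 0
        self._stream.write("<sheet>")

    def add_row(self, cells):
        if self._closed:
            raise ValueError("writer is closed; rows can no longer be added")
        # Serialize the row and write it immediately; no reference to it is kept,
        # so memory use does not grow with the number of rows.
        body = "".join(f"<c>{escape(str(value))}</c>" for value in cells)
        self._stream.write(f"<row>{body}</row>")
        self.rows_written += 1

    def close(self):
        if not self._closed:
            self._stream.write("</sheet>")
            self._closed = True


if __name__ == "__main__":
    out = io.StringIO()  # a real file opened for writing would work the same way
    writer = StreamingRowWriter(out)
    writer.add_row(["name", "qty"])
    writer.add_row(["apples", 3])
    writer.close()
    print(out.getvalue())
```

There is deliberately no `update_row` or similar method here: supporting it would mean keeping earlier rows in memory, which is exactly what the streaming approach avoids.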

That's more or less how Spout can handle large spreadsheets, and why other libraries can't handle them!
Hope that's clear :)

from spout.
