
Comments (5)

vinayak-mehta commented on July 24, 2024 2

@akshowhini This is easy to do when the number and names of the columns are the same, which doesn't happen very often. A robust way to do this would be to group multiple tables from different pages by partially matching the column names (based on some threshold) and concatenate them.
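A rough sketch of the grouping idea described above, not anything Camelot ships. It assumes each table is already a pandas DataFrame whose columns hold the header text, compares headers position by position with `difflib.SequenceMatcher`, and only merges consecutive tables. The function names and the 0.8 threshold are hypothetical.

```python
from difflib import SequenceMatcher

import pandas as pd


def header_similarity(cols_a, cols_b):
    """Mean per-position similarity of two header lists, from 0.0 to 1.0."""
    # Assumption: tables with a different number of columns never match.
    if not cols_a or len(cols_a) != len(cols_b):
        return 0.0
    ratios = [
        SequenceMatcher(None, str(a).strip().lower(), str(b).strip().lower()).ratio()
        for a, b in zip(cols_a, cols_b)
    ]
    return sum(ratios) / len(ratios)


def merge_similar_tables(dfs, threshold=0.8):
    """Group consecutive DataFrames with similar headers and concatenate each group."""
    groups = []
    for df in dfs:
        # Compare against the first table of the current group.
        if groups and header_similarity(list(groups[-1][0].columns), list(df.columns)) >= threshold:
            groups[-1].append(df)
        else:
            groups.append([df])

    merged = []
    for group in groups:
        names = list(group[0].columns)
        aligned = []
        for part in group:
            part = part.copy()  # leave the caller's DataFrames untouched
            part.columns = names  # adopt the first table's header
            aligned.append(part)
        merged.append(pd.concat(aligned, ignore_index=True))
    return merged
```

Comparing only consecutive tables keeps a table on page 2 from being merged with an unrelated but similar-looking table on page 40. Whether that is the right trade-off depends on the documents.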

from camelot.

c0nb4 commented on July 24, 2024 1

The way I'm doing this in my personal project is with:

pd.concat(self.list_of_dfs)

The only problem I see is when tables have different column names, so I just rename them:

names = self.list_of_dfs[0].columns.tolist()
for df in self.list_of_dfs:
    df.columns = names
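For anyone trying the rename-then-concat approach outside a class, here is a minimal standalone sketch. The function name `concat_with_first_header` is hypothetical. It also adds a column-count check that the snippet above does not have, since assigning a list of names of the wrong length makes pandas raise an error anyway.

```python
import pandas as pd


def concat_with_first_header(list_of_dfs):
    """Rename every DataFrame's columns to match the first one, then concatenate."""
    if not list_of_dfs:
        return pd.DataFrame()

    names = list_of_dfs[0].columns.tolist()
    renamed = []
    for i, df in enumerate(list_of_dfs):
        # Renaming only makes sense when the column counts agree.
        if len(df.columns) != len(names):
            raise ValueError(
                f"table {i} has {len(df.columns)} columns, expected {len(names)}"
            )
        df = df.copy()  # avoid mutating the caller's DataFrames
        df.columns = names
        renamed.append(df)

    # ignore_index gives the merged table one continuous row index.
    return pd.concat(renamed, ignore_index=True)
```

Note that this renames by position, so it assumes the columns appear in the same order on every page.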


vinayak-mehta commented on July 24, 2024 1

This issue is low priority as there's no general way to merge tables spanning multiple pages across millions of different types of table structures.


akshowhini commented on July 24, 2024

@vinayak-mehta I would like to contribute to this. First, though, I would like to know which scenarios you expect this to cover and how each should be handled.


AnnasMazhar commented on July 24, 2024

Has this thread seen any progress? I have been looking into the same issue and haven't found a permanent solution for merging tables that span multiple pages.

