Read Table

Extract tables from an HTML page.

Example

This is our table:

image of table

Source: https://en.wikipedia.org/wiki/List_of_countries_by_population_(United_Nations)

from pprint import pprint

import requests
from read_table import read_table, to_dict

url = "https://en.wikipedia.org/wiki/List_of_countries_by_population_(United_Nations)"
tables = read_table(requests.get(url).text, attrs={"class": "wikitable"})
table = tables[0]

# As a list of lists
pprint(table)

"""
[
    [
        "Country / Area",
        "UN continental region [4]",
        "UN statistical subregion [4]",
        "Population (1 July 2018)",
        "Population (1 July 2019)",
        "Change",
    ],
    ["China [a]", "Asia", "Eastern Asia", "1,427,647,786", "1,433,783,686", "+0.43%"],
    ["India", "Asia", "Southern Asia", "1,352,642,280", "1,366,417,754", "+1.02%"],
    [
        "United States",
        "Americas",
        "Northern America",
        "327,096,265",
        "329,064,917",
        "+0.60%",
    ],
    ...
]
"""

# As a list of dicts
pprint(to_dict(table))

"""
[
    {
        "Change": "+0.43%",
        "Country / Area": "China [a]",
        "Population (1 July 2018)": "1,427,647,786",
        "Population (1 July 2019)": "1,433,783,686",
        "UN continental region [4]": "Asia",
        "UN statistical subregion [4]": "Eastern Asia",
    },
    {
        "Change": "+1.02%",
        "Country / Area": "India",
        "Population (1 July 2018)": "1,352,642,280",
        "Population (1 July 2019)": "1,366,417,754",
        "UN continental region [4]": "Asia",
        "UN statistical subregion [4]": "Southern Asia",
    },
    {
        "Change": "+0.60%",
        "Country / Area": "United States",
        "Population (1 July 2018)": "327,096,265",
        "Population (1 July 2019)": "329,064,917",
        "UN continental region [4]": "Americas",
        "UN statistical subregion [4]": "Northern America",
    },
    ...
]
"""
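
The relationship between the two output shapes above can be sketched in plain Python. This is only an illustration, not the library's actual `to_dict` implementation: `rows_to_dicts` and `parse_count` are hypothetical helpers, and the sketch assumes the first row of the table is the header row. The sample rows are copied from the output above, trimmed to two columns.

```python
def rows_to_dicts(rows):
    """Pair each data row with the header row (assumed to be rows[0])."""
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


def parse_count(text):
    """Turn a comma-grouped string such as "1,433,783,686" into an int."""
    return int(text.replace(",", ""))


# A trimmed copy of the list-of-lists output shown above.
sample = [
    ["Country / Area", "Population (1 July 2019)"],
    ["China [a]", "1,433,783,686"],
    ["India", "1,366,417,754"],
]

records = rows_to_dicts(sample)

# Cell values are strings, so numeric columns need an explicit conversion.
populations = {
    record["Country / Area"]: parse_count(record["Population (1 July 2019)"])
    for record in records
}
print(populations)
```

Because every cell comes back as a string, a conversion step like `parse_count` is likely needed before doing arithmetic on the population columns.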
