Git Product home page Git Product logo

pearlorient's Introduction

The Pearl of the Orient

This is the repository I created to keep track of the nuts and blots that handles a variety of data for Hong Kong.

Census data

Census data is downloaded from Census and Statistics Department, http://www.census2011.gov.hk. There are Excel XML files (*.xlsx) at the level of District Board Constituency Area.

The Excel files are verified to be in the same format for each of the 18 districts, 412 constituency areas. Briefly, the file is in the format of following:

  • The file is in "two-column" format, whereas columns A-E and H-N are two separate tables
  • Data description in text string is in columns A and H
  • Numerical values are in columns C,D,E and L,M,N
  • Tables are categorised in different sections. Section labels are in column A with style matched by regex \d\. .+
  • Each table is started by a header in green background and bold face
    • Background fill color: cell.style.fill.fgColor.value
    • Green colour code in ARGB: FFCCFFCC
    • Boldface: cell.style.font.bold == True
    • Header text may span in rows. Rules to check for span
      • Cells vertically adjacent in same green background
      • Cells vertically adjacent are text
      • Lower cell has no top border and upper cell has no bottom border, checked by cell.style.border.top == '' and cell.style.border.bottom == ''
    • Header must begin by the row atop is blank (columns A-F or H-O)
  • Table may end with a green footer as summary row. In same format as the header
    • Footer must end by the row below is blank (columns A-F or H-O)
  • In the middle of the table, rows are in alternate color of no-fill or light green
    • Light green colour code in ARGB: FFF0FFF0
  • Numbers are mostly integers
    • Number format defined iin cell.style.number_format
    • Usual format string: '#\\ ###\\ ###;\\-#\\ ###\\ ###;\\-'
    • Format for income number at summary row: '#,###,###;\\-#,###,###;\\-'
    • Number for income table excluding foreign domestic helpers: '\\(#,###,###\\);\\(\\-#,###,###\\);\\(\\-\\)'
    • Format for percentage data (floating point with 1 d.p.): '0.0;\\-0.0;\\-'
  • Mostly, each number can be associated with the description on left (column A or H of same row) and top (nearest green-backgrounded cell atop in the same column)
    • Exception for incomes table
    • Numbers can be bracketed to mean figures excluding foreign domestic helpers
    • Labels in column A/H need to check for upper and lower adjacent cells are in different colour. Otherwise are continuation of text spanning across rows.

pearlorient's People

Contributors

adrianswtam avatar righthandabacus avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.