

JSON Files

Introduction

We've started to investigate APIs and briefly previewed the most common response format for data: JSON. While there are other formats, such as XML, JSON is the current standard and the format you are most likely to encounter. With that, let's take a look at how JSON files are structured.

JSON

JSON stands for JavaScript Object Notation. It came after XML and was meant to streamline many data transportation issues at the time. It is now the common standard for data transfer on the web, and parsing packages exist for many languages (including Python)! Here's a brief preview of the same file above now in JSON:

The JSON Module

https://docs.python.org/3.6/library/json.html

import json

To load a JSON file, we first open the file using Python's built-in open function and then pass that file object to the json module's load method. As the output below shows, this loads the data as a dictionary.

# Open the file in a with block so it is closed automatically after loading
with open('nyc_2001_campaign_finance.json') as f:
    data = json.load(f)
print(type(data))
<class 'dict'>
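To try the same pattern without the campaign finance file, here is a minimal, self-contained sketch. The file name `example.json` and the record it holds are made up for illustration; they are not part of the dataset used in this lesson.

```python
import json
import os
import tempfile

# Hypothetical record used only for illustration.
record = {"name": "example", "values": [1, 2, 3], "active": True}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.json")

    # json.dump writes a Python object to an open file as JSON text.
    with open(path, "w") as f:
        json.dump(record, f)

    # json.load reads JSON text from an open file back into Python objects.
    with open(path) as f:
        loaded = json.load(f)

print(type(loaded))
print(loaded == record)
```

When the JSON is held in a string rather than a file, `json.loads` (note the trailing s) performs the same conversion.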

JSON files are often nested in a hierarchical structure and will have data structures analogous to Python dictionaries and lists. We can begin to investigate a particular file by using our traditional Python methods. Here are all of the built-in data types supported in JSON and their counterparts in Python:
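With the json module's default decoder, a JSON object becomes a dict, an array becomes a list, a string becomes a str, a number becomes an int or a float, true and false become True and False, and null becomes None. The short sketch below checks this on a made-up JSON string (the key names are assumptions chosen for illustration only).

```python
import json

# A made-up JSON document that uses every built-in JSON type.
text = '''
{
  "an_object": {"nested": "value"},
  "an_array": [1, 2, 3],
  "a_string": "hello",
  "an_integer": 42,
  "a_real_number": 3.14,
  "a_true": true,
  "a_false": false,
  "a_null": null
}
'''

parsed = json.loads(text)

# Record the name of the Python type each JSON value was converted to.
python_types = {key: type(value).__name__ for key, value in parsed.items()}

for key, type_name in python_types.items():
    print(f"{key:>15} -> {type_name}")
```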

Check the keys of the dictionary:

data.keys()
dict_keys(['meta', 'data'])

Investigate what data types are stored within the values associated with those keys:

for v in data.values():
    print(type(v))
<class 'dict'>
<class 'list'>
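Checking types one level at a time gets tedious with deeply nested files. One possible shortcut is a small recursive helper that reports the type and length of each nested container down to a chosen depth. The helper name `summarize` and the `sample` structure below are hypothetical; `sample` is only a tiny stand-in shaped loosely like the file above.

```python
def summarize(obj, name="root", max_depth=2, depth=0):
    """Return one line per node (name, type, length), down to max_depth levels."""
    pad = "  " * depth
    if isinstance(obj, dict):
        lines = [f"{pad}{name}: dict (len {len(obj)})"]
        if depth < max_depth:
            for key, value in obj.items():
                lines += summarize(value, str(key), max_depth, depth + 1)
    elif isinstance(obj, list):
        lines = [f"{pad}{name}: list (len {len(obj)})"]
        # Only the first item is inspected, assuming the list is homogeneous.
        if obj and depth < max_depth:
            lines += summarize(obj[0], "[0]", max_depth, depth + 1)
    else:
        lines = [f"{pad}{name}: {type(obj).__name__}"]
    return lines


# Made-up stand-in for a loaded JSON file.
sample = {
    "meta": {"view": {"name": "demo", "columns": [{"name": "sid"}]}},
    "data": [[1, "a"], [2, "b"]],
}

summary = summarize(sample)
print("\n".join(summary))
```

Raising `max_depth` shows more levels; keeping it small avoids flooding the screen on large files.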

We can quickly preview the first dictionary as a pandas DataFrame:

import pandas as pd
pd.DataFrame.from_dict(data['meta'])
view
attribution Campaign Finance Board (CFB)
averageRating 0
category City Government
columns [{'id': -1, 'name': 'sid', 'dataTypeName': 'me...
createdAt 1315950830
description A listing of public funds payments for candida...
displayType table
downloadCount 1470
flags [default, restorable, restorePossibleForType]
grants [{'inherited': False, 'type': 'viewer', 'flags...
hideFromCatalog False
hideFromDataJson False
id 8dhd-zvi6
indexUpdatedAt 1536596254
metadata {'rdfSubject': '0', 'rdfClass': '', 'attachmen...
name 2001 Campaign Payments
newBackend False
numberOfComments 0
oid 4140996
owner {'id': '5fuc-pqz2', 'displayName': 'NYC OpenDa...
provenance official
publicationAppendEnabled False
publicationDate 1371845179
publicationGroup 240370
publicationStage published
query {}
rights [read]
rowClass
rowsUpdatedAt 1371845177
rowsUpdatedBy 5fuc-pqz2
tableAuthor {'id': '5fuc-pqz2', 'displayName': 'NYC OpenDa...
tableId 932968
tags [finance, campaign finance board, cfb, nyccfb,...
totalTimesRated 0
viewCount 233
viewLastModified 1536605717
viewType tabular

Notice the columns entry, which lists the column names. These will be very useful when interpreting the values stored under the 'data' key!
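The preview suggests that the column descriptions sit at data['meta']['view']['columns'] as a list of dictionaries, each with a 'name' key. Under that assumption, the names can be pulled out with a list comprehension. The `data` dictionary below is a trimmed stand-in, not the real file: only the first entry ('sid', id -1) comes from the preview, and the other two entries are made up.

```python
# Trimmed stand-in for the loaded file, keeping only the keys used here.
# Only the first column entry appears in the preview; the rest are made up.
data = {
    "meta": {
        "view": {
            "columns": [
                {"id": -1, "name": "sid"},
                {"id": -2, "name": "example_a"},
                {"id": -3, "name": "example_b"},
            ]
        }
    }
}

# Collect the 'name' field from each column description.
column_names = [column["name"] for column in data["meta"]["view"]["columns"]]
print(column_names)
```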

Investigate further information about the list stored under the 'data' key:

len(data['data'])
285

Previewing the first entry:

data['data'][0]
[1,
 'E3E9CC9F-7443-43F6-94AF-B5A0F802DBA1',
 1,
 1315925633,
 '392904',
 1315925633,
 '392904',
 '{\n  "invalidCells" : {\n    "1519001" : "TOTALPAY",\n    "1518998" : "PRIMARYPAY",\n    "1519000" : "RUNOFFPAY",\n    "1518999" : "GENERALPAY",\n    "1518994" : "OFFICECD",\n    "1518996" : "OFFICEDIST",\n    "1518991" : "ELECTION"\n  }\n}',
 None,
 'CANDID',
 'CANDNAME',
 None,
 'OFFICEBORO',
 None,
 'CANCLASS',
 None,
 None,
 None,
 None]
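The eighth element of this entry is a string that itself contains JSON. A string like this is not parsed automatically when the file is loaded, but it can be converted with `json.loads`. The sketch below copies that string from the output above; the variable names `raw` and `nested` are only illustrative.

```python
import json

# The eighth element of the first entry, copied from the output above.
raw = (
    '{\n  "invalidCells" : {\n    "1519001" : "TOTALPAY",\n'
    '    "1518998" : "PRIMARYPAY",\n    "1519000" : "RUNOFFPAY",\n'
    '    "1518999" : "GENERALPAY",\n    "1518994" : "OFFICECD",\n'
    '    "1518996" : "OFFICEDIST",\n    "1518991" : "ELECTION"\n  }\n}'
)

# json.loads parses JSON text held in a string (json.load expects a file).
nested = json.loads(raw)

print(type(nested))
print(nested["invalidCells"]["1519001"])
```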

Summary

As you can see, there's still a lot going on here with the deeply nested structure of some of these data files. In the upcoming lab, you'll get a chance to practice loading files and conducting an initial preview of the data as we did here.
