Measuring Judge Ideology

Education Data Science Practicum Spring 2019

Project Started: 1/30/2019
Updated: 3/23/2019

Download and run the "Midterm_Sample" folder to reproduce the analysis on a sample small enough to run on your local device.

Accomplished so far:

Data collection:
- obtained all federal appellate court opinions
Pre-Analysis:
- confirmed that opinions can be represented as vectors (works for the 1st Circuit)
Text Cleaning (lots of regex nightmares!):
- need to build metadata, for each opinion:
  - authoring judge
  - year
  - court
- identify documents that contain dissents/concurrences from other judges
- identify documents that are per curiam
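
The per curiam and dissent/concurrence checks above could be sketched with a couple of regular expressions. This is a minimal Python sketch, not the project's code: the patterns and the `flag_opinion` helper are hypothetical, and it assumes separate opinions are introduced by a line such as "SMITH, Circuit Judge, dissenting." Real opinions will need more variants.

```python
import re

# Hypothetical patterns; real opinions use many more phrasings.
PER_CURIAM = re.compile(r"\bper\s+curiam\b", re.IGNORECASE)
# Capture the text that follows "Judge" up to the end of the sentence or line,
# e.g. ", concurring in part and dissenting in part".
JUDGE_LINE = re.compile(r"\bJudge\b([^.\n]{0,80})", re.IGNORECASE)

def flag_opinion(text):
    """Return coarse flags for one opinion's text."""
    tails = [m.group(1).lower() for m in JUDGE_LINE.finditer(text)]
    return {
        "per_curiam": bool(PER_CURIAM.search(text)),
        "has_dissent": any("dissent" in tail for tail in tails),
        "has_concurrence": any("concur" in tail for tail in tails),
    }
```

Anchoring on the word "Judge" is a design choice to avoid counting citations like "(Scalia, J., dissenting)" as a dissent in the current document. It will miss headers that abbreviate the title as "J.".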
Modeling:
- average document vectors for each judge to get judge vectors
- PCA on judge vectors to look for clusters (for 1st circuit)
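
The two modeling steps above can be sketched as follows. This is an illustrative Python/NumPy version under assumptions, not the project's code: `judge_vectors` and `pca_scores` are hypothetical helper names, and PCA is done here with a plain SVD on the centered judge matrix.

```python
import numpy as np

def judge_vectors(doc_vectors, judges):
    """Average document vectors per authoring judge.

    doc_vectors: array-like of shape (n_docs, dim)
    judges: sequence of n_docs judge names
    Returns (sorted judge names, array of shape (n_judges, dim)).
    """
    doc_vectors = np.asarray(doc_vectors, dtype=float)
    judges = np.asarray(judges)
    names = sorted(set(judges.tolist()))
    means = np.vstack([doc_vectors[judges == name].mean(axis=0) for name in names])
    return names, means

def pca_scores(X, n_components=2):
    """Project the rows of X onto the top principal components via SVD."""
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Toy data: five 2-dimensional "document vectors" written by three judges.
names, means = judge_vectors(
    [[1, 0], [3, 0], [0, 2], [0, 4], [5, 5]],
    ["A", "A", "B", "B", "C"],
)
scores = pca_scores(means, n_components=2)  # one row per judge, ready to plot
```

Plotting the first two columns of `scores`, labeled by `names`, is one way to look for clusters.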

Notes:

Part (3) above (Text Cleaning) is incomplete. The "year" field in the raw JSON data is sparse, so I need to extract the year from the opinion text.
Once Part (3) is complete, de-mean document vectors by court and by year.
How to de-mean by topic? I need topic labels. Ash and Chen won't share metadata with topic labels :(

Purpose:

Many models can be used to estimate the ideal points of judges (e.g. Poole and Rosenthal, Martin-Quinn, Clinton, etc.). These models rely mostly on the judges' voting records. The purpose of this study is to try ideal-point modeling using the text of opinions instead of voting behavior. The subjects of the study are federal judges in the Appellate Courts of the United States (e.g. the 1st Circuit Court of Appeals).

Data:

Analysis:

Part one:

Some things to watch out for include era effects, topic effects, and circuit effects.

Start with one circuit court
Learn to implement doc2vec using the R packages "fastTextR" and "textTinyR"
The idea is to apply this to all federal appellate court opinions
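
One simple way to turn word vectors into a document vector is to average them. The Python sketch below illustrates that general idea only; it is an assumption about the approach, not a description of what "fastTextR" or "textTinyR" do internally. The `document_vector` helper and the toy embeddings are hypothetical.

```python
def document_vector(tokens, word_vectors):
    """Average the vectors of the tokens that have an embedding.

    Returns None when no token is in the vocabulary.
    """
    found = [word_vectors[t] for t in tokens if t in word_vectors]
    if not found:
        return None
    return [sum(column) / len(found) for column in zip(*found)]

# Toy 2-dimensional embeddings (made up for illustration).
toy_vectors = {"court": [1.0, 0.0], "appeal": [0.0, 1.0]}
vec = document_vector("the court denied the appeal".split(), toy_vectors)
```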

Problem:

I need to extract "year" from the opinions. Where do I get topic labels? Should I model topics myself?
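
A rough heuristic for the year problem, sketched in Python. Everything here is an assumption: the `extract_year` helper is hypothetical, and it assumes the decision date appears near the top of the opinion, ideally on a "Decided" or "Filed" line. Docket numbers that look like years will fool the fallback.

```python
import re

# Prefer a year that follows "Decided" or "Filed" on the same line.
DECIDED = re.compile(
    r"\b(?:Decided|Filed)\b[^\n]{0,40}?\b(1[89]\d{2}|20\d{2})\b",
    re.IGNORECASE,
)
# Fallback: the first plausible four-digit year (1800-2099).
ANY_YEAR = re.compile(r"\b(1[89]\d{2}|20\d{2})\b")

def extract_year(text, header_chars=2000):
    """Return the decision year as an int, or None if no year is found."""
    header = text[:header_chars]  # only search the top of the opinion
    match = DECIDED.search(header) or ANY_YEAR.search(header)
    return int(match.group(1)) if match else None
```

Restricting the search to the header is meant to avoid picking up years from cases cited later in the text.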

Part two:

Now that I know I can implement vector representations of these documents, I need to clean the text.

Identify the judge who wrote the opinion (Metadata not available for all federal appellate court cases)
Identify dissent, concurrence, per curiam
Extract year from decisions
Then, de-mean document vectors by year and circuit THEN average for each judge
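
The de-meaning step can be sketched as below. This Python version assumes de-meaning is done jointly within each (circuit, year) cell; de-meaning by circuit and then by year separately is another reading of the step above. The helper names are hypothetical.

```python
from collections import defaultdict

def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def demean_by_group(vectors, groups):
    """Subtract each group's mean vector from the vectors in that group."""
    members = defaultdict(list)
    for vec, group in zip(vectors, groups):
        members[group].append(vec)
    means = {group: mean_vector(vs) for group, vs in members.items()}
    return [
        [x - m for x, m in zip(vec, means[group])]
        for vec, group in zip(vectors, groups)
    ]

# Toy data: four documents, two (circuit, year) cells, two judges.
docs = [[1.0, 2.0], [3.0, 4.0], [10.0, 0.0], [14.0, 4.0]]
cells = [("ca1", 1998), ("ca1", 1998), ("ca2", 2001), ("ca2", 2001)]
judges = ["A", "B", "A", "B"]

demeaned = demean_by_group(docs, cells)
# After de-meaning, average each judge's documents to get a judge vector.
judge_a = mean_vector([v for v, j in zip(demeaned, judges) if j == "A"])
```

The order matters: averaging first and de-meaning afterwards would leave circuit and era differences inside each judge's average.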

chansooligans / federal-appellate-court-opinions-text-analysis
