princetonuniversity / gerrymandertests
Lots of metrics for quantifying gerrymandering.
License: GNU General Public License v3.0
Line 53 fails with newer versions of pandas due to unexpected MultiIndex 'labels' syntax. Code runs by replacing 'labels' with 'codes'.
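The rename described above could be handled with a small compatibility helper. This is only a sketch: `level_codes` is a hypothetical name, and the issue does not say whether line 53 reads the attribute or passes a constructor keyword, so the helper covers the attribute case only. The stand-in objects mimic just the renamed attribute so the snippet runs without pandas installed.

```python
from types import SimpleNamespace


def level_codes(index):
    """Return the integer level codes of a MultiIndex-like object.

    Newer pandas exposes them as ``.codes``; older releases used ``.labels``.
    """
    codes = getattr(index, "codes", None)
    if codes is None:
        # Fall back to the pre-rename attribute name.
        codes = getattr(index, "labels")
    return codes


# Stand-ins that carry only the renamed attribute (an assumption made for
# illustration); real code would pass a pandas MultiIndex such as df.index.
new_style = SimpleNamespace(codes=[[0, 0, 1], [0, 1, 0]])
old_style = SimpleNamespace(labels=[[0, 0, 1], [0, 1, 0]])

print(level_codes(new_style) == level_codes(old_style))
```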
I've been playing with the methods in the metrics.py file and have found myself mimicking the functionality in new classes. For example, I've replaced use of _stats in metrics.py with a class I call VoteShares.
I had hoped to put up an explanatory PR, but I'm not allowed to make pull requests. Would you have interest in adding some of these classes to your repo? They make some uses of the methods a bit more concise, e.g.:
mwu: MannWhitneyU = MannWhitneyU([.1, .2, .3, .4, .5, .6, .7, .8, .9, 1.0])
mwu.get_statistic()
mwu.get_p_value()
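The class bodies were not included in the issue, so the following is only a guess at what a wrapper with that interface might look like, not the poster's code or anything from the repo. It assumes the input is a list of district vote shares for one party, splits them into each party's winning shares, and uses a plain normal approximation (no tie or continuity correction) for the p-value.

```python
import math
from typing import Sequence


class MannWhitneyU:
    """Hypothetical wrapper; not the class from the issue or the repo."""

    def __init__(self, vote_shares: Sequence[float]):
        # Assumption: shares above 0.5 are wins for party A; shares below
        # 0.5 are wins for party B, converted to B's own share of the vote.
        # Exact ties at 0.5 are dropped.
        self.wins_a = [v for v in vote_shares if v > 0.5]
        self.wins_b = [1.0 - v for v in vote_shares if v < 0.5]
        if not self.wins_a or not self.wins_b:
            raise ValueError("both parties need at least one win")

    def get_statistic(self) -> float:
        """U for party A: pairs where A's share beats B's; ties count 0.5."""
        u = 0.0
        for x in self.wins_a:
            for y in self.wins_b:
                if x > y:
                    u += 1.0
                elif x == y:
                    u += 0.5
        return u

    def get_p_value(self) -> float:
        """Two-sided p-value from the plain normal approximation."""
        n1, n2 = len(self.wins_a), len(self.wins_b)
        mean = n1 * n2 / 2.0
        std = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
        z = (self.get_statistic() - mean) / std
        return math.erfc(abs(z) / math.sqrt(2.0))


mwu = MannWhitneyU([0.9, 0.8, 0.7, 0.45, 0.4])
statistic = mwu.get_statistic()
p_value = mwu.get_p_value()
```

The normal approximation is rough for a handful of districts; for real analyses a library routine such as scipy.stats.mannwhitneyu is likely the better choice.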
An additional indent on line 37 of utils.py throws an error. The code runs smoothly if that line is outdented to the level of the surrounding for and else.
I'm trying to use Wang's 3 tests. However, when trying to use Test 1 (the excess seats test), I get the following errors:
KeyError: 'bootstrap'
TypeError: object of type 'numpy.float64' has no len()
Code to replicate the error:
from collections import defaultdict
import gerrymetrics as g  # assumed import; the package directory in the repo is gerrymetrics

impute_val = 1
chambers = defaultdict(lambda: defaultdict(list))
chambers['Congressional']['filepath'] = 'election_data/congressional_election_results_post1948.csv'
# Map each test name to the function that computes it.
metric_dict = {'bootstrap': g.bootstrap,
               't_test_diff': g.t_test_diff,
               'mean_median_diff': g.mean_median}
for chamber in chambers:
    chambers[chamber]['elections_df'] = g.parse_results(chambers[chamber]['filepath'])
    chambers[chamber]['tests_df'] = g.tests_df(g.run_all_tests(
        chambers[chamber]['elections_df'],
        impute_val=impute_val,
        metrics=metric_dict))
I'm using Python 3.6.
I guess the error may come from this part of the function 'bootstrap', but I'm not sure how to solve it:
n_matches = len(match_seats)
It is also possible that I'm using the function incorrectly. I would appreciate any help!
Update: I also got the same error when using g.t_test and g.mean_median_test.
This package uses deprecated features from pandas that are no longer available in pandas version 1.0 or higher. Accordingly, one should use a virtual environment to install an older version of pandas, e.g. pandas==0.25.1, until this issue is fixed.
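A quick guard at the top of a script or notebook can make the version constraint above explicit instead of failing later with an obscure error. This is a minimal sketch; `pandas_is_supported` is a hypothetical helper, and the 1.0 cutoff is taken from the report above rather than verified against the package.

```python
def pandas_is_supported(version: str) -> bool:
    """Return True when the version string is older than 1.0, e.g. '0.25.1'."""
    major = int(version.split(".")[0])
    return major < 1


# In practice the string would come from the installed library:
#     import pandas as pd
#     if not pandas_is_supported(pd.__version__):
#         raise RuntimeError("install pandas<1.0, e.g. pandas==0.25.1")
for candidate in ("0.25.1", "1.0.0"):
    print(candidate, pandas_is_supported(candidate))
```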
I'm trying to run the gerrymandertests, but apparently it relies on my separately downloading state-specific files (I'm particularly interested in New Mexico) and I can't find any documentation on where to get them.
If I just run the notebook, here's the error:
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<ipython-input-1-54dcfe840d25> in <module>
41
42 for chamber in chambers:
---> 43 chambers[chamber]['elections_df'] = g.parse_results(chambers[chamber]['filepath'])
44 chambers[chamber]['tests_df'] = g.tests_df(g.run_all_tests(
45 chambers[chamber]['elections_df'],
~/outsrc/gerrymandertests/gerrymetrics/utils.py in parse_results(input_filepath, start_year, coerce_odd_years)
12 '''
13
---> 14 df = pd.read_csv(input_filepath)
15
16 df = df[df['Year'] >= start_year]
~/pythonenv/gerry/lib/python3.7/site-packages/pandas/io/parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, dialect, error_bad_lines, warn_bad_lines, delim_whitespace, low_memory, memory_map, float_precision)
674 )
675
--> 676 return _read(filepath_or_buffer, kwds)
677
678 parser_f.__name__ = name
~/pythonenv/gerry/lib/python3.7/site-packages/pandas/io/parsers.py in _read(filepath_or_buffer, kwds)
446
447 # Create the parser.
--> 448 parser = TextFileReader(fp_or_buf, **kwds)
449
450 if chunksize or iterator:
~/pythonenv/gerry/lib/python3.7/site-packages/pandas/io/parsers.py in __init__(self, f, engine, **kwds)
878 self.options["has_index_names"] = kwds["has_index_names"]
879
--> 880 self._make_engine(self.engine)
881
882 def close(self):
~/pythonenv/gerry/lib/python3.7/site-packages/pandas/io/parsers.py in _make_engine(self, engine)
1112 def _make_engine(self, engine="c"):
1113 if engine == "c":
-> 1114 self._engine = CParserWrapper(self.f, **self.options)
1115 else:
1116 if engine == "python":
~/pythonenv/gerry/lib/python3.7/site-packages/pandas/io/parsers.py in __init__(self, src, **kwds)
1889 kwds["usecols"] = self.usecols
1890
-> 1891 self._reader = parsers.TextReader(src, **kwds)
1892 self.unnamed_cols = self._reader.unnamed_cols
1893
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.__cinit__()
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._setup_parser_source()
FileNotFoundError: [Errno 2] File election_data/state_legislative/state_legislative_election_results_post1971.csv does not exist: 'election_data/state_legislative/state_legislative_election_results_post1971.csv'
election_data/congressional_election_results_post1948.csv comes as part of the repository, but election_data/state_legislative/ is an empty directory. Where can I get the files that it expects there?
In NM we're actively fighting for better redistricting (I'm webmaster for fairdistrictsnm.org) and I'd love to get some quantitative measurements I could show to legislators and display on the website.
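Until the state legislative files are located, one workaround for the FileNotFoundError above is to check which input files exist and run only those chambers. The sketch below does not answer where the data comes from; `split_by_availability` and the "State Legislative" key are hypothetical names, while the two paths are the ones quoted in the report.

```python
from pathlib import Path


def split_by_availability(filepaths):
    """Split {chamber: path} into (available, missing) dicts by file existence."""
    available, missing = {}, {}
    for chamber, path in filepaths.items():
        target = available if Path(path).is_file() else missing
        target[chamber] = path
    return available, missing


filepaths = {
    "Congressional": "election_data/congressional_election_results_post1948.csv",
    # Hypothetical chamber key; the path is the one from the error message.
    "State Legislative": "election_data/state_legislative/state_legislative_election_results_post1971.csv",
}
available, missing = split_by_availability(filepaths)
for chamber, path in missing.items():
    print(f"Skipping {chamber}: {path} not found")
```

Only the chambers in `available` would then be passed on to the parsing loop.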