SHANGRLA's Introduction

Sets of Half-Average Nulls Generate Risk-Limiting Audits (SHANGRLA)

Risk-limiting audits (RLAs) offer a statistical guarantee: if a full manual tally of the paper ballots would show that the reported election outcome is wrong, an RLA has a known minimum chance of leading to a full manual tally. RLAs generally rely on random samples.

With SHANGRLA, we introduce a very general method for auditing a variety of election types by expressing an apparent election outcome as a set of assertions, each of the form "the mean of a list of non-negative numbers is greater than 1/2."

The lists of non-negative numbers correspond to assorters, which assign a number to the selections made on each ballot (and to the cast vote record, for comparison audits). Each assertion is tested using a sequential test of the null hypothesis that its complement holds. If all the null hypotheses are rejected, the election outcome is confirmed; if not, the audit proceeds to a full manual recount. SHANGRLA incorporates several different statistical risk-measurement algorithms and extends naturally to plurality and supermajority contests under a variety of social choice functions, including range voting, approval voting, and Borda count.
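
For intuition, consider a two-candidate plurality contest: the assorter assigns 1 to a ballot with a vote for the reported winner, 0 to a ballot with a vote for the reported loser, and 1/2 to any other ballot, so the reported winner really won if and only if the mean assorter value over all cast ballots exceeds 1/2. A minimal sketch of that idea (illustrative only, not the package's own assertion or assorter classes):

def plurality_assorter(ballot: dict, winner: str = "Alice", loser: str = "Bob") -> float:
    # 1 for a vote for the reported winner, 0 for a vote for the reported
    # loser, 1/2 otherwise (blank, invalid, or votes for other candidates).
    if ballot.get(winner):
        return 1.0
    if ballot.get(loser):
        return 0.0
    return 0.5

ballots = [{"Alice": 1}, {"Bob": 1}, {"Alice": 1}, {}]
mean = sum(plurality_assorter(b) for b in ballots) / len(ballots)
print(mean > 0.5)  # True: the assertion "Alice beat Bob" holds for these ballots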

It can even incorporate Instant Runoff Voting (IRV) using the RAIRE assertion generator, which produces a set of assertions sufficient to prove that the announced winner truly won. Observed paper ballots can be entered using Dan King and Laurent Sandrolini's tool for the San Francisco Election Board.

We provide an open-source reference implementation and exemplar calculations in Jupyter notebooks.

Installation

Installing from GitHub

Main version:

pip install git+https://github.com/pbstark/SHANGRLA.git@main

Development version:

pip install git+https://github.com/dvukcevic/SHANGRLA.git@dev

Installing from a local copy (in development mode)

Install just the code:

pip install -e .

Also include the optional dependencies for tests and examples:

pip install -e .[test,examples]

Authors and contributors

The initial code was written by Michelle Blom, Andrew Conway, Philip B. Stark, Peter J. Stuckey and Vanessa Teague.

Additional development by Amanda Glazer, Jake Spertus, Ian Waudby-Smith, David Wu, Alexander Ek, Floyd Everest and Damjan Vukcevic.

Licences

Copyright (C) 2019-2024 Philip B. Stark, Vanessa Teague, Michelle Blom, Peter Stuckey, Ian Waudby-Smith, Jacob Spertus, Amanda Glazer, Damjan Vukcevic, David Wu, Alexander Ek, Floyd Everest.

Software

GNU AGPL
The software and documentation of the software in this repository are provided under the GNU Affero General Public License (AGPL). You can redistribute and/or modify the software and documentation under the terms of the AGPL as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

The software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the AGPL for more details. A copy of the AGPL is provided in LICENSE.

Other files

Creative Commons License
The other documents in this repository (not including the software and documentation of the software) are provided under a Creative Commons Attribution-NoDerivs 4.0 International License (CC BY-ND 4.0).

SHANGRLA's People

Contributors

bsheehan-sf-rla, dvukcevic, pbstark, spertus, wannabesmith

SHANGRLA's Issues

kaplan_markov function is broken

The kaplan_markov function currently doesn't work: find_p_values tries to evaluate any(x < 0) while x is a plain Python list.

Wrapping the definition of d with np.array(...) in find_p_values on line 1442 of assertion_audit_utils fixes this issue.
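
A minimal illustration of the failure mode and the proposed fix (variable names simplified here for illustration):

import numpy as np

x = [0.5, 0.2, 0.9]   # a plain Python list, as in find_p_values
# any(x < 0)          # TypeError: '<' not supported between 'list' and 'int'
x = np.array(x)       # wrapping in np.array(...) makes the comparison elementwise
print(any(x < 0))     # False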

Bug in find_p_values()?

Line 1609 of assertion_audit_utils.py calls overstatement_assorter() with a "cards" argument (which was inserted in commit #37). However, overstatement_assorter() does not have a cards parameter, so an error is thrown.

Looking for oc_cvrs.zip

Hi All:

I'm working with Vanessa Teague et al. to understand SHANGRLA and evaluate porting it.

I'm trying to run examples/OC_example.ipynb in the SHANGRLA Python repo. It references RLA files not in the repo, apparently from the Orange County 2022 election:

'cvr_file': '/Users/amanda/Downloads/oc_cvrs.zip',
#'cvr_file': '/Users/Jake/Desktop/oc_cvrs.zip',

Are these available somewhere that I can download?

If not, is there another example data set I can use to run a complete SHANGRLA workflow?

Thanks,
John (https://github.com/JohnLCaron)

Break loop in new_sample_size

The while loop in new_sample_size needs to break at some point. The obvious choice is to break when the population size is reached; this is certainly correct when sampling without replacement. If the population size is reached without confirming the outcome, should the function raise a message? When sampling with replacement, we could add a parameter to break once a certain fraction (potentially bigger than 1) of the population has been sampled; for now, take it to be 1.
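
A sketch of the proposed termination logic (the names below are invented for illustration, not the actual new_sample_size signature):

def sample_size_with_break(population_size, outcome_confirmed, next_size,
                           replacement=False, max_fraction=1.0):
    # outcome_confirmed(n) and next_size(n) stand in for the audit's stopping
    # rule and sample-growth rule, respectively.
    limit = population_size if not replacement else int(max_fraction * population_size)
    n = 0
    while not outcome_confirmed(n):
        if n >= limit:
            # full population (or the chosen cap) reached without confirmation
            break
        n = next_size(n)
    return n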

Welford's Algorithm in NonnegMean

I compared the implementation of Welford's algorithm for calculating the running mean implemented in NonnegMean.shrink_trunc to an existing implementation online, and I found some discrepancy in the results.

The specific lines I'm referring to can be found here:

# Welford's algorithm for running mean and running sd
mj = [x[0]]
sdj = [0]
for i, xj in enumerate(x[1:]):
    mj.append(mj[-1] + (xj - mj[-1]) / (i + 1))
    sdj.append(sdj[-1] + (xj - mj[-2]) * (xj - mj[-1]))
sdj = np.sqrt(sdj / j)
# end of Welford's algorithm.

Extracting this code into a short function, we have:

import numpy as np

def shrink_trunc_welford(x: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    j = np.arange(1, len(x) + 1)
    mj = [x[0]]
    sdj = [0]
    for i, xj in enumerate(x[1:]):
        mj.append(mj[-1] + (xj - mj[-1]) / (i + 1))
        sdj.append(sdj[-1] + (xj - mj[-2]) * (xj - mj[-1]))
    sdj = np.sqrt(sdj / j)
    return mj, sdj

Then, a quick test:

x = np.array([1.5,2.5,1.7,1.0,1.2,2.1,1.4,1.8])
m, sd = shrink_trunc_welford(x)
print(m)
# [1.5, 2.5, 2.1, 1.7333333333333334, 1.6, 1.7000000000000002, 1.6500000000000001, 1.6714285714285715]
print(np.mean(x))
# 1.65

It seems the currently implemented running mean overshoots the true mean of the sample.

I'm not sure whether this is expected, but I compared it to an existing implementation in the welford package on PyPI:

from welford import Welford

welford = Welford()
m = []
for xi in x:
    welford.add(xi)
    m.append(float(welford.mean))
print(m)
# [1.5, 2.0, 1.9, 1.6749999999999998, 1.5799999999999998, 1.6666666666666665, 1.6285714285714283, 1.65]

This implementation does not overshoot the sample mean. If this is not intended, let me know and I should be able to tweak the implementation accordingly quite easily.
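
If the off-by-one divisor is the culprit (for enumerate(x[1:]), the second observation has i = 0, so its running count is i + 2, not i + 1), a corrected sketch reproduces the welford package's output:

def welford_mean_sd(x: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    # Standard Welford updates; the divisor is the running count i + 2.
    mj = [x[0]]
    m2 = [0.0]
    for i, xj in enumerate(x[1:]):
        mj.append(mj[-1] + (xj - mj[-1]) / (i + 2))
        m2.append(m2[-1] + (xj - mj[-2]) * (xj - mj[-1]))
    j = np.arange(1, len(x) + 1)
    return np.array(mj), np.sqrt(np.array(m2) / j)

# Using numpy and the array x from above:
m, sd = welford_mean_sd(x)
print(m[-1], np.mean(x))  # 1.65 1.65 -- the running mean now matches the sample mean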

Also, it appears the algorithm is implemented again in agrapa, although the code looks to be slightly different (perhaps calculating variance rather than sd):

# Welford's algorithm for running mean and running sd
mj = [x[0]]
sdj2 = [0]
for i, xj in enumerate(x[1:]):
    mj.append(mj[-1] + (xj - mj[-1]) / (i + 1))
    sdj2.append(sdj2[-1] + (xj - mj[-2]) * (xj - mj[-1]))
sdj2 = sdj2 / j
# end of Welford's algorithm.

kaplan_martingale returns nan

There are some cases where kaplan_martingale returns nan.

Steps to reproduce

s = [1, 0, 1, 1, 0, 0, 1] 
TestNonnegMean.kaplan_martingale(s, N=7, t=3/7, random_order=True)

returns

array([1.66666667, 0.72222222, 1.13888889, 2.64722222, nan, nan, nan])

I think that this was due to c values of infinity being sent to integral_from_roots.

In addition, changing the last observed data point results in the entire martingale changing.

Steps to reproduce

s = [1, 0, 1, 1, 0, 0, 0]                                                   
TestNonnegMean.kaplan_martingale(s, N=7, t=3/7, random_order=True)

returns

(1.0, array([1., 1., 1., 1., 1., 1., 1.]))

incremental sample size: condition on the observed sample

  • Condition on the observed sample.
  • Make an assumption about the error rate in the future sample (default: the rate observed in the sample so far).
  • Find the quantile or expected incremental sample size for each assertion; sum the maximum sampling probabilities across as-yet-unsampled cards.
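
A hypothetical sketch of that procedure for a single assertion (the helper names here are invented, not SHANGRLA's API):

import numpy as np

def incremental_sample_size(observed, simulate_draw, p_value,
                            alpha=0.05, n_sims=100, quantile=0.9, max_new=10_000):
    # observed: cards audited so far (the sample we condition on)
    # simulate_draw(): a hypothetical future card at the assumed error rate
    # p_value(sample): the risk measure for the assertion under audit
    needed = []
    for _ in range(n_sims):
        sample = list(observed)            # condition on the observed sample
        for k in range(1, max_new + 1):
            sample.append(simulate_draw())
            if p_value(sample) <= alpha:   # the assertion's null is rejected
                needed.append(k)
                break
        else:
            needed.append(max_new)         # never rejected within the cap
    return int(np.quantile(needed, quantile))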

initial sample size returns 1

Somehow the calculation of the initial sample size broke when ballot polling was introduced. Presumably the incremental sample size calculation is also wrong. The sample size calculations should also terminate if the size reaches the appropriate population size (max_cards or contest['cards']).

Where is ballot_comparison module?

The tests and notebooks related to SUITE and hybrid audits fail for lack of a ballot_comparison module. I presume that a dependency is missing.

# python Code/suite_tools.py
Traceback (most recent call last):
  File "Code/suite_tools.py", line 11, in <module>
    from ballot_comparison import ballot_comparison_pvalue
ModuleNotFoundError: No module named 'ballot_comparison'

Note also that there are some extraneous files, like Code/.ipynb_checkpoints/hybrid-audit-example-3-checkpoint.ipynb, that should be removed and avoided in future via an addition to .gitignore (e.g. the pattern .ipynb_checkpoints/).

Impact of RuntimeWarning: overflow encountered in Notebook and tests?

In the published notebook assertion-RLA.ipynb there are some warnings displayed:

Find initial sample size

/opt/tljh/user/lib/python3.6/site-packages/numpy/core/fromnumeric.py:90: RuntimeWarning: overflow encountered in reduce
  return ufunc.reduce(obj, axis, dtype, out, **passkwargs)

Find measured risks for all assertions

/opt/shared/Code/assertion_audit_utils.py:984: RuntimeWarning: overflow encountered in double_scalars
  a[k+1,j] += 0 if j==0 else (1-c[k])*(j/(k+1))*a[k,j-1]
/opt/shared/Code/assertion_audit_utils.py:983: RuntimeWarning: overflow encountered in double_scalars
  a[k+1,j] = -c[k]*((k+1-j)/(k+1))*a[k,j]
/opt/shared/Code/assertion_audit_utils.py:1048: RuntimeWarning: invalid value encountered in multiply
  mart_vec = np.multiply(Y_norm,integrals)

I also see these among the warnings displayed when running pytest assertion_audit_utils.py.

Should we worry about these? Fix them? Suppress them somehow?
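
One way to investigate before deciding: numpy can promote overflow and invalid-value warnings to errors inside the suspect block, so the offending inputs can be caught and inspected; once a warning is understood to be benign, the same mechanism can silence it. For example:

import numpy as np

# Promote the warnings to errors to locate where the overflow originates.
try:
    with np.errstate(over="raise", invalid="raise"):
        y = np.array([1e308]) * 10          # overflows double precision
except FloatingPointError as err:
    print("caught:", err)

# Silence the warnings once they are known to be harmless.
with np.errstate(over="ignore", invalid="ignore"):
    y = np.array([1e308]) * 10              # yields inf, with no warning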

ONEAudit implementation questions

  • For Dominion as used by San Francisco, treat tabulator scan batches as the ONEAudit batches for pooling assorters.
    • Treat any ballot in counting group 1 (cards cast in person) as a pooled CVR, pooled over its (tabulator, batch) combination.
  • For other jurisdictions, we need the flexibility to use ONEAudit reporting batches that do not correspond to scan batches, for instance the votes in a contest that were cast in person.
    • Add a function in class CVR that assigns tabulation groups based on the contests the CVR contains?
  • The implementation uses a Contest-level flag for whether to use ONEAudit (the Audit.AUDIT_TYPE.ONEAUDIT class constant), but it might make sense to specify the audit strategy at the Audit level or the Stratum level.
  • Need a way to estimate sample sizes. We can use the actual CVRs as MVRs together with the ONEAudit CVRs in a simulation, perhaps with additional hypothetical errors in the actual CVRs.
  • Need to work through the sampling when use_style == True.
    • Sample from CVRs that contain the contest. If a selected CVR is in a pool batch, select a random card from the pool batch that contains it.
    • Every card in a batch for which at least one CVR contains the contest should be treated as if it contains the contest: if it does not, the MVR should record a non-vote (rather than treating it as an error, as it would for use_style without ONEAudit). We could do that by adding the contest (with an empty vote dict) to every CVR in the pool batch that did not already contain it.** Existing functions automatically take care of the rest.

**A "scoped" defaultdict could do that, if such a construct exists, but it's also straightforward to code; a sketch follows.
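
A plain-Python sketch of that padding step (the dict layout below is an assumption for illustration, not the actual CVR class):

from collections import defaultdict

def pad_pool_contests(cvrs):
    # cvrs: list of dicts with a "pool" key and a "votes" dict mapping
    # contest id -> vote dict (illustrative layout, not the CVR class).
    contests_by_pool = defaultdict(set)
    for cvr in cvrs:
        contests_by_pool[cvr["pool"]].update(cvr["votes"])
    for cvr in cvrs:
        for contest in contests_by_pool[cvr["pool"]]:
            cvr["votes"].setdefault(contest, {})  # empty vote dict = non-vote
    return cvrs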
