
Comments (4)

joeldenning commented on August 15, 2024

Some docket PDFs list the fines and fees that were required to be paid, along with the payment dates.

This GitHub issue is to explore whether we can reliably parse that information out of the PDFs.

from utahexpungements.org.

jamesschlader commented on August 15, 2024

Judging by the content of one PDF that is thick with accounts, it seems that we could parse out each account to be paid, with special attention to ones that look like this:
REVENUE DETAIL - TYPE: FINE
We could search for "REVENUE DETAIL - TYPE:" to get all of the relevant accounts. This would, hopefully, distinguish fee/fine/sentence accounts from bail or refund accounts. At any rate, the plan would be to section out those parts similarly to how the entire docket is sectioned out. Then, those sections could be walked looking for the line with "Balance:". The amount on this line would tell us what the payment status is.
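
A minimal sketch of the sectioning idea described above, written in Python purely for illustration. Only the "REVENUE DETAIL - TYPE:" header and the "Balance:" label come from this thread; the sample text, the "Amount Due:" line, the amount format, and the function name `parse_revenue_details` are assumptions, and real docket text may be laid out differently.

```python
import re
from decimal import Decimal

# Header text quoted in the comment above.
SECTION_HEADER = re.compile(r"^\s*REVENUE DETAIL - TYPE:\s*(?P<type>.+?)\s*$")
# Assumption: a balance line looks like "Balance: 1,234.56", optionally with "$".
BALANCE_LINE = re.compile(r"Balance:\s*\$?(?P<amount>\d[\d,]*(?:\.\d+)?)")


def parse_revenue_details(text: str) -> list[dict]:
    """Start a section at each header and record the first balance found in it."""
    accounts = []
    current = None
    for line in text.splitlines():
        header = SECTION_HEADER.match(line)
        if header:
            current = {"type": header.group("type"), "balance": None}
            accounts.append(current)
            continue
        # Skip lines before the first header, and sections already resolved.
        if current is None or current["balance"] is not None:
            continue
        balance = BALANCE_LINE.search(line)
        if balance:
            current["balance"] = Decimal(balance.group("amount").replace(",", ""))
    return accounts


# Hypothetical sample text, not copied from a real docket.
REVENUE_SAMPLE = """\
ACCOUNT SUMMARY
REVENUE DETAIL - TYPE: FINE
Amount Due: 500.00
Balance: 125.00
REVENUE DETAIL - TYPE: FEE
Amount Due: 50.00
Balance: 0.00
"""

for account in parse_revenue_details(REVENUE_SAMPLE):
    print(account["type"], account["balance"])
```

One limitation of this sketch: a section only ends when the next "REVENUE DETAIL - TYPE:" header appears, so if a bail or refund section follows a section that has no balance line, its balance would be picked up by mistake. Handling that would need the headers of those other account types, which this thread does not list.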

I wonder what this parsing function ought to return. I see three ways to go:
(1) Parse every account section and return an object with all the relevant detail: name of account and balance seem like obvious field candidates.
(2) Like (1) except only return objects for "TYPE:" accounts.
(3) Create a field called "accountsPaid", or something like that, which defaults to true. Whilst parsing, look for "TYPE:" sections and set it to false if any balance is > 0.00.
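
To make the three options concrete, here is a rough sketch of what each could return, assuming the account sections have already been parsed into simple records. The `Account` class, its field names, and the function names are all hypothetical; only the "accountsPaid" idea and the "balance > 0.00" test come from the list above.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Account:
    name: str                # hypothetical field, e.g. "FINE" or "BAIL"
    balance: Decimal
    is_revenue_detail: bool  # True if the header matched "REVENUE DETAIL - TYPE:"


def all_accounts(accounts: list[Account]) -> list[Account]:
    """Option (1): return every parsed account section."""
    return list(accounts)


def revenue_detail_accounts(accounts: list[Account]) -> list[Account]:
    """Option (2): return only the "TYPE:" accounts."""
    return [a for a in accounts if a.is_revenue_detail]


def accounts_paid(accounts: list[Account]) -> bool:
    """Option (3): true unless some "TYPE:" account has a balance above 0.00."""
    return all(a.balance <= 0 for a in revenue_detail_accounts(accounts))


# Hypothetical parsed data for illustration only.
example_accounts = [
    Account("FINE", Decimal("125.00"), True),
    Account("FEE", Decimal("0.00"), True),
    Account("BAIL", Decimal("0.00"), False),
]

print(len(all_accounts(example_accounts)))
print([a.name for a in revenue_detail_accounts(example_accounts)])
print(accounts_paid(example_accounts))
```

Written this way, option (3) is a small function over the output of option (2), so returning the richer data does not rule out computing the boolean afterwards.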

Also, there is a TOTAL REVENUE section. If the Balance field here is the one that the BCI uses to make their determination, then we would just need to target the first Balance line after that and we'd be done. That section always appears, based on the cases we have available, so that would be the quickest path to victory. My suspicion is that this section is not the salient one, so I don't think this is a viable option. I mention it here in case I'm wrong about that.
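
For comparison, the TOTAL REVENUE shortcut would be only a few lines. This is a sketch under the same caveat as the paragraph above: the figure may not be the salient one. The function name `total_revenue_balance`, the sample text, and the assumed "Balance: 123.45" format are hypothetical.

```python
import re
from decimal import Decimal
from typing import Optional

# Assumption: a balance line looks like "Balance: 1,234.56", optionally with "$".
BALANCE_LINE = re.compile(r"Balance:\s*\$?(?P<amount>\d[\d,]*(?:\.\d+)?)")


def total_revenue_balance(text: str) -> Optional[Decimal]:
    """Return the first balance found on a line after the TOTAL REVENUE header."""
    seen_header = False
    for line in text.splitlines():
        if not seen_header:
            seen_header = "TOTAL REVENUE" in line
            continue
        match = BALANCE_LINE.search(line)
        if match:
            return Decimal(match.group("amount").replace(",", ""))
    return None  # header missing, or no balance line after it


# Hypothetical sample text, not copied from a real docket.
TOTAL_SAMPLE = """\
TOTAL REVENUE
Amount Due: 550.00
Balance: 125.00
REVENUE DETAIL - TYPE: FINE
Balance: 100.00
"""

print(total_revenue_balance(TOTAL_SAMPLE))
```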

Considerations:

  1. Maybe we want to parse first, test for qualification later. In that case, (1) or (2) but not (3) would be viable options. If we don't mind mixing those tasks, then (3) might be a good option.
  2. If the listing of relevant fines and fees does not always appear in the "REVENUE DETAIL - TYPE:" pattern, then we'll need to use method (1).
  3. Nothing appears in the ACCOUNT SUMMARY section unless it has been mentioned in the PROCEEDINGS section. It seems highly likely that the PROCEEDINGS section will need to be fully parsed anyway, so perhaps we can find a way to parse out fee/fine/sentence detail, including final account balance info, whilst parsing the PROCEEDINGS section.


tuckersamuelsen commented on August 15, 2024

My advice:

  1. The fine/fee section at the top, believe it or not, is NOT helpful. Many accounts with unpaid fines are simply sent to the state debt collection agency, and will show a 0 balance at the top.

  2. If there is a balance near the top, then there is an unpaid fine. If the balance is 0, there still might be an unpaid fine, so that isn't dispositive.

  3. There may be a minute entry somewhere in the docket with the words "Office of State Debt Collection". That means the defendant has/had an open fine amount there, and may still need to satisfy it. Those exact words will be very helpful for determining eligibility.
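
Point 3 lends itself to a simple text search. Below is a small sketch, assuming the docket has already been extracted to plain text; the function name and the sample minute entry are hypothetical. Matching case-insensitively and across a line wrap is also an assumption, made because PDF text extraction often breaks phrases over two lines.

```python
import re

# The phrase from point 3; \s+ lets it match even when wrapped across lines.
OSDC_PATTERN = re.compile(r"Office\s+of\s+State\s+Debt\s+Collection", re.IGNORECASE)


def find_debt_collection_mentions(text: str) -> list[tuple[int, str]]:
    """Return (1-based line number, line text) for each place the phrase starts."""
    lines = text.split("\n")
    hits = []
    for match in OSDC_PATTERN.finditer(text):
        # Count newlines before the match to find the line it starts on.
        line_no = text.count("\n", 0, match.start()) + 1
        hits.append((line_no, lines[line_no - 1].strip()))
    return hits


# Hypothetical sample text, not copied from a real docket.
PROCEEDINGS_SAMPLE = """\
PROCEEDINGS
01-05-15 Minute entry: balance of fine referred to the Office of State
Debt Collection.
02-01-15 Case closed.
"""

for line_no, line in find_debt_collection_mentions(PROCEEDINGS_SAMPLE):
    print(line_no, line)
```

Returning the line number along with the text keeps enough context for a person to check the surrounding minute entry, since a mention alone does not say whether the amount was later satisfied.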


joeldenning commented on August 15, 2024

Thanks for the info Tucker. Sounds like the ACCOUNT SUMMARY section doesn't always have everything needed in it.

> Maybe we want to parse first, test for qualification later

@jamesschlader yeah, I think this is a good approach. Let's get as much parsed as we can and then apply rules for eligibility qualification in future phases.
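
One way to picture "parse first, qualify later" is a plain data record produced by the parser and a separate rule function that reads it. The sketch below is illustrative only: `ParsedDocket`, `FineStatus`, and `fine_status` are hypothetical names, and the rule is just a restatement of the advice in this thread (a positive balance means an unpaid fine, a zero balance is not dispositive, and a debt-collection mention needs a closer look), not an actual eligibility rule.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from enum import Enum


@dataclass
class ParsedDocket:
    """Output of the parsing phase: facts only, no eligibility judgement."""
    account_balances: list[Decimal] = field(default_factory=list)
    mentions_debt_collection: bool = False


class FineStatus(Enum):
    UNPAID = "unpaid"
    NEEDS_REVIEW = "needs review"
    NO_BALANCE_FOUND = "no balance found"  # not proof that everything was paid


def fine_status(docket: ParsedDocket) -> FineStatus:
    """Qualification phase: a rule applied to already-parsed data."""
    if any(balance > 0 for balance in docket.account_balances):
        return FineStatus.UNPAID
    if docket.mentions_debt_collection:
        return FineStatus.NEEDS_REVIEW
    return FineStatus.NO_BALANCE_FOUND


# Hypothetical parsed result: zero balance, but a debt-collection mention.
example_docket = ParsedDocket([Decimal("0.00")], mentions_debt_collection=True)
print(fine_status(example_docket).value)
```

Keeping the rule separate means it can change later without touching the parser.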

