cfie-frse's Introduction

About

This is version 2.0 of the CFIE-FRSE tool. The Corporate Financial Information Environment (CFIE) Final Report Structure Extractor (FRSE) is a desktop application that detects the structure of UK annual reports and extracts their contents at the section level. For version 1.0, see the releases page: https://github.com/drelhaj/CFIE-FRSE/releases

Please note we do not provide a web version of the tool. This is the only official and stable version of CFIE-FRSE.

System Requirements

Your machine must have Java installed to run the tool. You may also need to set the JAVA_HOME environment variable and add Java to your PATH, especially if you get the following error when running the tool: "'Java' is not recognized as an internal or external command". For how to set up JAVA_HOME, see: https://stackoverflow.com/questions/15796855/java-is-not-recognized-as-an-internal-or-external-command
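Before launching the tool, it can help to confirm that Java is reachable from the command line. The following is a minimal sketch, not part of CFIE-FRSE: the helper names `find_java` and `java_version_text` are hypothetical, and the script only reports what it finds on your PATH.

```python
import shutil
import subprocess


def find_java():
    """Return the full path of the `java` executable on PATH, or None if absent."""
    return shutil.which("java")


def java_version_text(java_path):
    """Run `java -version` and return its output (Java prints this to stderr)."""
    result = subprocess.run(
        [java_path, "-version"], capture_output=True, text=True, check=False
    )
    return (result.stderr or result.stdout).strip()


if __name__ == "__main__":
    java = find_java()
    if java is None:
        # Same situation as the "'Java' is not recognized" error described above.
        print("Java was not found on PATH; install Java or fix JAVA_HOME/PATH.")
    else:
        print(java_version_text(java))
```

If the script reports that Java was not found, the tool is likely to fail with the error quoted above until Java is installed or the PATH is corrected.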

How to run

  • NOTE: The code works with Java 8 (it might also work with Java 7).

  • [MS Windows]: Clone the repository to your machine, place your PDF annual reports in the pdfs directory, and run (double-click) the runnable.bat file.

  • [Linux Ubuntu]: Clone the repository to your machine and place your PDF annual reports in the pdfs directory. Then cd to the directory where runnable.sh is located and run: ./runnable.sh

  • [Unix/Mac]: Clone the repository to your machine and place your PDF annual reports in the pdfs directory. Then cd to the directory where runnable.sh is located and run: sh runnable.sh or bash runnable.sh

  • The analysis output (one directory per PDF file) is written to the pdfs directory.

  • Please do not delete any of the files or directories or change their structure.

  • The only modifications you can make are adding or deleting PDF files in the pdfs directory and editing userKeywords.txt in the keywords directory to include your own keyword list. To do so, empty the file and insert one keyword (or keyphrase) per line. Avoid empty lines, especially at the end of the file.

  • Please email [email protected] for any questions. More details can be found on http://ucrel.lancs.ac.uk/cfie/.
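The userKeywords.txt rule above (one keyword or keyphrase per line, no empty lines) is easy to break when editing by hand. The sketch below is an illustration only, not part of the tool: `clean_keywords` and `write_keywords` are hypothetical helpers, UTF-8 encoding is an assumption, and omitting the trailing newline is a conservative reading of "no empty lines at the end of the file".

```python
from pathlib import Path


def clean_keywords(keywords):
    """Strip surrounding whitespace and drop entries that are empty."""
    cleaned = []
    for keyword in keywords:
        keyword = keyword.strip()
        if keyword:
            cleaned.append(keyword)
    return cleaned


def write_keywords(path, keywords):
    """Replace the file contents with one keyword per line and no blank lines.

    Assumption: the file is written as UTF-8 with no trailing newline,
    so that the last line is never empty.
    """
    Path(path).write_text("\n".join(clean_keywords(keywords)), encoding="utf-8")
```

For example, run from the root of the cloned repository, `write_keywords("keywords/userKeywords.txt", ["going concern", "dividend"])` would overwrite the file with two lines. The two keyphrases are placeholders; substitute your own list.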

More about the tool:

  • Detects the structure of UK annual reports by identifying the key sections and their start and end pages, and extracts their contents.

  • The tool provides a section classification mechanism that identifies the type of each extracted section. Each extracted section is annotated with a number between 0 and 12, as follows:

    • 1 Letter from board chair (this works with synonyms as well, e.g. Chairman's Statement)
    • 2 CEO review
    • 3 Governance statement
    • 4 Remuneration report
    • 5 Business review
    • 6 Financial review
    • 7 Operating review
    • 8 Highlights
    • 9 Auditors report
    • 10 Risk management
    • 11 Chairman’s governance
    • 12 CSR disclosures
    • 0 Any other section that does not belong to sections 1 to 12
  • The analysis results for the processed reports can be found in the output directory in the file output.csv, which shows the results for all processed PDF files.
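When post-processing the results, it is often convenient to turn the numeric section labels back into readable names. The sketch below encodes the label list above as a lookup table. It is not part of the tool: `SECTION_LABELS` and `section_name` are hypothetical names, and the column layout of output.csv is not documented here, so the sketch only covers the label mapping itself.

```python
# Section labels as listed in this README (0 is the catch-all category).
SECTION_LABELS = {
    0: "Other",
    1: "Letter from board chair",
    2: "CEO review",
    3: "Governance statement",
    4: "Remuneration report",
    5: "Business review",
    6: "Financial review",
    7: "Operating review",
    8: "Highlights",
    9: "Auditors report",
    10: "Risk management",
    11: "Chairman's governance",
    12: "CSR disclosures",
}


def section_name(code):
    """Return the section name for a label given as an int or a numeric string."""
    return SECTION_LABELS.get(int(code), "Unknown label")
```

Accepting numeric strings is a design choice made so that values read from a CSV file can be passed in without manual conversion.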

Enjoy CFIE-FRSE 2019, CFIE Team [email protected]

Contributors

drelhaj, perayson
