Research on Modern Slavery

This repository contains a collection of experiments and analyses performed on the Modern Slavery Statements Dataset.

Introduction

The UN Sustainable Development Goal 8.7 states: Take immediate and effective measures to eradicate forced labour, end modern slavery and human trafficking and secure the prohibition and elimination of the worst forms of child labour, including recruitment and use of child soldiers, and by 2025 end child labour in all its forms.

In 2018, the Global Slavery Index found that there were 40.3 million people in modern slavery, of whom 25 million were in forced labor producing computers, clothing, agricultural products, raw materials, etc., and 15 million were in forced marriage.

The Future Society, an independent nonprofit think-and-do tank, launched a partnership with the Walk Free Initiative to automate the analysis of modern slavery statements produced by businesses, with the aim of boosting compliance and helping to combat and eradicate modern slavery. The team at The Future Society is curating an up-to-date repository of >16K modern slavery statements (and counting) to boost machine learning research in this area. The data is scraped from the collection of report links provided by modernslaveryregistry.org.

By sharing your analysis and contributing to this repository, you help the global community hold multinational corporations accountable for how they treat their workforce and suppliers.

Prerequisites

Quickstart

It's recommended that you use a virtual environment such as virtualenv, pipenv or similar.

Option 1 - notebook

Copy this notebook and follow the instructions.

Option 2 - command line

Install the package:

pip install modern-slavery-statements-research

Pass your AWS credentials with the -i (AWS access key ID) and -a (AWS secret access key) arguments, replacing the placeholders and omitting the curly brackets:

download-corpus -i {aws_access_key_id} -a {aws_secret_access_key}

The logs printed in the console will tell you the name of the data folder.

If your AWS CLI credentials for the modern slavery project are set as the default, you can simply run:

download-corpus

You can explore more options by running download-corpus --help
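
If you prefer to drive the download from a Python script, the two invocations above can be assembled as an argument list. This is a minimal sketch, not part of the package: build_download_command is a hypothetical helper, and it uses only the -i and -a flags described above.

```python
def build_download_command(access_key_id=None, secret_access_key=None):
    """Return the argument list for the download-corpus command.

    Hypothetical helper: with both credentials given, they are passed via
    -i and -a; with neither, the command relies on default AWS credentials.
    """
    command = ["download-corpus"]
    if access_key_id and secret_access_key:
        command += ["-i", access_key_id, "-a", secret_access_key]
    elif access_key_id or secret_access_key:
        # A single credential is not usable on its own.
        raise ValueError("provide both credentials or neither")
    return command
```

The resulting list can be handed to subprocess.run(command, check=True) once the package is installed. Reading the credentials from environment variables such as AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY keeps them out of the script itself.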

Data Schema

The dataset includes the following columns:

Company ID                                    Unique company identifier
Company                                       Company name
Is Publisher                                  Whether the company is a publisher
Statement ID                                  Unique statement identifier
URL                                           Original URL where the statement could be found
Override URL                                  Edited URL
Companies House Number                        Company's registered number in companieshouse.gov.uk
Industry                                      Company's main area of activity
HQ                                            Country of company's headquarters
Is Also Covered                               
UK Modern Slavery Act                         Whether the company is subject to the UK Modern Slavery Act
California Transparency in Supply Chains Act  Whether the company is subject to the California Transparency in Supply Chains Act
Australia Modern Slavery Act                  Whether the company is subject to the Australia Modern Slavery Act
Period Covered                                Year that is being reported on
Text                                          Extracted statement text
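
As an illustration of working with these columns, the sketch below filters statements by one of the legislation flags. It rests on several assumptions: the data is readable as CSV, the flag columns hold the strings "True" or "False", and the sample rows are invented for the example. The actual file format and cell values of the corpus may differ, and load_rows and covered_by are hypothetical helpers.

```python
import csv
import io

# Hypothetical sample in CSV form; the real corpus layout and values may differ.
SAMPLE_CSV = """Company ID,Company,Statement ID,HQ,UK Modern Slavery Act,Period Covered,Text
1,Example Co,10,United Kingdom,True,2019,Example statement text.
2,Sample Ltd,11,Australia,False,2020,Another example statement.
"""


def load_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def covered_by(rows, act_column):
    """Keep rows whose legislation flag reads as true (assumed encoding: 'True')."""
    return [row for row in rows if row.get(act_column, "").strip().lower() == "true"]


rows = load_rows(SAMPLE_CSV)
uk_rows = covered_by(rows, "UK Modern Slavery Act")
print([row["Company"] for row in uk_rows])
```

The same covered_by call works for the other two legislation columns by passing their names instead.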
 

The corpus is a work in progress, and all feedback is welcome in the repository issues. At present, if you'd like to work with this data, please send an email to [email protected] with a link to your social profile (LinkedIn, Facebook or similar). You'll receive IAM user credentials at the earliest opportunity, which will allow you to download and access the data.

Get Help

If you'd like to get help with domain expertise or technical requirements and implementations then get in touch with Adriana or Karyna respectively.

Roadmap

Over the next few weeks and months, the following improvements are planned for the dataset and the repository:

  1. Provide a convenient one-command entry point to the data.
  2. Improve the dataset quality by continuously including more documents and improving the data cleaning pipeline.
  3. Provide examples of analysis.
  4. Provide manually annotated labels for a subset of the corpus to enable analyses using supervised methods.
  5. Open source the data and research for public access.

Citation

If you intend to publish any research or analysis based on the data from this repository and the modern-slavery-dataset bucket in AWS S3, please include the following citation in your publication:

The Future Society. (2020) Modern Slavery Statements Research. Retrieved from https://github.com/the-future-society/modern-slavery-statements-research.

Contributions

If you'd like to contribute to the research then take a look at any of the issues or get in touch with Adriana or Karyna.

Take a look at colab notebooks based on the modern slavery corpus:
