
pylearningcrowds's Introduction

PyLearningCrowds

Learning-from-crowds methods implemented in Python. The available methods are:

  • Majority Voting: soft, hard, weighted
  • Dawid and Skene: ground truth (GT) inference based on confusion matrices (CM) of annotators.
  • Raykar et al.: predictive model over GT inference based on the CMs of the annotators.
  • Mixture Models: inference of the model and of groups, either over the annotations of the data or over the annotators.
  • Global Behavior: based on label-noise solutions, uses a global confusion matrix to infer a predictive model.
    • Without predictive model: as in Dawid and Skene, infers only the GT, based on a global confusion matrix.
  • Rodrigues et al. (2013): predictive model over GT inference based on annotator reliability.
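
To make the confusion-matrix idea behind Dawid and Skene concrete, here is a minimal NumPy sketch of the EM loop. It is an illustration only, not this package's implementation: the function name `dawid_skene`, the smoothing constant, and the soft-majority-vote initialisation are assumptions, and every sample is assumed to have at least one annotation (missing annotations are marked with -1, as in the example further down).

```python
import numpy as np

def dawid_skene(y_obs, n_classes, n_iter=50):
    """Illustrative Dawid-Skene EM (hypothetical helper, not the codeE API).

    y_obs: (n_samples, n_annotators) integer labels, -1 = no annotation.
    Returns the posterior over the ground truth, shape (n_samples, n_classes),
    and one confusion matrix per annotator, shape (n_annotators, K, K).
    """
    n_samples, n_annotators = y_obs.shape
    # One-hot encode the annotations; missing entries stay all-zero.
    onehot = np.zeros((n_samples, n_annotators, n_classes))
    rows, cols = np.nonzero(y_obs != -1)
    onehot[rows, cols, y_obs[rows, cols]] = 1.0

    # Initialise the posterior with soft majority voting.
    q = onehot.sum(axis=1)
    q /= q.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # M-step: class prior and conf[t, k, j] = P(annotator t says j | truth k).
        prior = q.mean(axis=0)
        conf = np.einsum("nk,ntj->tkj", q, onehot) + 1e-6  # small smoothing
        conf /= conf.sum(axis=2, keepdims=True)
        # E-step: posterior over the ground truth, computed in log space.
        log_q = np.log(prior + 1e-12) + np.einsum("ntj,tkj->nk", onehot, np.log(conf))
        log_q -= log_q.max(axis=1, keepdims=True)
        q = np.exp(log_q)
        q /= q.sum(axis=1, keepdims=True)
    return q, conf

# Toy data: 5 samples, 3 annotators, 2 classes.
y_toy = np.array([[0, 0, 1], [1, 1, -1], [0, 0, 0], [1, 1, 0], [0, -1, 0]])
q_toy, conf_toy = dawid_skene(y_toy, n_classes=2)
print(q_toy.shape, conf_toy.shape)
```

Each row of `q_toy` is a probability vector over the classes, and each `conf_toy[t]` is a row-stochastic confusion matrix for annotator `t`.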

New methods are listed under Updates.

For examples of how to use the methods, see the tutorial notebooks:


Documentation


Example

Read some dataset annotations

import numpy as np
y_obs = np.loadtxt("./data/LabelMe/answers.txt", dtype='int16')  # -1 marks a missing annotation
T_weights = np.sum(y_obs != -1, axis=0)  # number of annotations per annotator
print("Removing %d annotators with no annotations in this set" % np.sum(T_weights == 0))
y_obs = y_obs[:,T_weights!=0]
print("Shape (n_samples,n_annotators)= ",y_obs.shape)

For further details on the representation, see the documentation.
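
As a rough picture of the two representations used in this README, the sketch below builds them by hand with NumPy. It assumes that the "onehot" representation is a one-hot tensor of shape (N, T, K) with all-zero rows for missing annotations, and that the "global" representation is the per-sample count of votes for each class, shape (N, K). The helper names `to_onehot` and `to_global` are hypothetical, and the actual `set_representation` output may differ in dtype or normalisation.

```python
import numpy as np

def to_onehot(y_obs, n_classes):
    """(N, T) labels with -1 for 'missing' -> (N, T, K) one-hot tensor."""
    n_samples, n_annotators = y_obs.shape
    onehot = np.zeros((n_samples, n_annotators, n_classes), dtype="int8")
    rows, cols = np.nonzero(y_obs != -1)
    onehot[rows, cols, y_obs[rows, cols]] = 1
    return onehot

def to_global(y_obs, n_classes):
    """(N, T) labels -> (N, K) matrix counting the votes each class received."""
    return to_onehot(y_obs, n_classes).sum(axis=1)

# Two samples, three annotators, three classes; annotator 3 skipped sample 1.
y_demo = np.array([[0, 1, -1],
                   [2, 2, 2]])
print(to_onehot(y_demo, 3).shape)  # (2, 3, 3)
print(to_global(y_demo, 3))        # [[1 1 0]
                                   #  [0 0 3]]
```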

You can estimate the ground truth with an aggregation technique such as Majority Voting (MV):

from codeE.representation import set_representation
r_obs = set_representation(y_obs,"global")
print("Global representation shape (n_samples, n_classes)= ",r_obs.shape)
from codeE.methods import LabelAgg
label_A = LabelAgg(scenario="global")
mv_soft = label_A.predict(r_obs, 'softMV')
mv_hard = label_A.predict(r_obs, 'hardMV')
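
For intuition, soft and hard majority voting over a global (vote-count) representation can be sketched in a few lines of NumPy. This assumes that soft MV returns the normalised vote frequencies and hard MV the most-voted class index; the functions `soft_mv` and `hard_mv` are illustrative stand-ins, not the `LabelAgg` implementation, and ties are broken here by the lowest class index.

```python
import numpy as np

def soft_mv(r_obs):
    """Normalise per-sample vote counts into a probability vector."""
    r_obs = np.asarray(r_obs, dtype=float)
    return r_obs / r_obs.sum(axis=1, keepdims=True)

def hard_mv(r_obs):
    """Pick the class with the most votes (ties go to the lowest index)."""
    return np.argmax(r_obs, axis=1)

# Vote counts for two samples over three classes.
r_demo = np.array([[3, 1, 0],
                   [0, 2, 2]])
print(soft_mv(r_demo))  # [[0.75 0.25 0.  ]
                        #  [0.   0.5  0.5 ]]
print(hard_mv(r_demo))  # [0 1]
```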

Read the dataset input patterns

X_train = ...

Define a predictive model over the ground truth

fz_x = ...

You can infer a predictive model jointly with the ground truth

from codeE.representation import set_representation
y_obs_categorical = set_representation(y_obs,'onehot')
print("Individual representation shape (N,T,K)= ",y_obs_categorical.shape)
from codeE.methods import ModelInf_EM as Raykar
R_model = Raykar()
R_model.set_model(fz_x)
R_model.fit(X_train, y_obs_categorical, runs=20)
raykar_fx = R_model.get_basemodel()
raykar_fx.predict(new_X)

You can infer the predictive model and groups of behaviors

from codeE.methods import ModelInf_EM_CMM as CMM
CMM_model = CMM(M=3)
CMM_model.set_model(fz_x)
CMM_model.fit(X_train, r_obs, runs=20)
cmm_fx = CMM_model.get_basemodel()
cmm_fx.predict(new_X)

For the other available methods see the methods documentation


Updates

  • The predictive model supports Logistic Regression from sklearn

This works only with one run in the configuration of the methods. Example:

from sklearn.linear_model import LogisticRegression as LR
model_sklearn_A = LR(C= 1, multi_class="multinomial")
from codeE.methods import ModelInf_EM as Raykar
R_model = Raykar(init_Z="softmv")
args = {'epochs':1, 'optimizer': "newton-cg", 'lib_model': "sklearn"}
R_model.set_model(model_sklearn_A, **args)
R_model.fit(X_train, y_obs_categorical, runs=1)
  • New methods for learning from crowds without EM (using only backpropagation on neural networks)

Define your base predictive model over ground truth:

fz_x = ...  # any Keras model

Rodrigues & Pereira - CrowdLayer (based on Raykar et al.)

from codeE.methods import ModelInf_BP as Rodrigues18
Ro_model = Rodrigues18()
args = {'batch_size':BATCH_SIZE, 'optimizer':OPT}
Ro_model.set_model(fz_x, **args)
Ro_model.fit(X_train, y_obs_categorical, runs=10)
learned_fz_x = Ro_model.get_basemodel()
# ... use learned_fz_x

Goldberger & Ben-Reuven - NoiseLayer (based on Global Behavior)

from codeE.methods import ModelInf_BP_G as G_Noise
GNoise_model = G_Noise()
args = {'batch_size':BATCH_SIZE, 'optimizer':OPT}
GNoise_model.set_model(fz_x, **args)
GNoise_model.fit(X_train, r_obs, runs=10)
learned_fz_x = GNoise_model.get_basemodel()
# ... use learned_fz_x

More detailed examples can be found in the V2 tutorial notebooks:


Extensions

  • Prior on Label noise without EM
  • Guan et al. 2018 (models with label aggregation)
  • Kajino et al. 2012 (models with model aggregation)
  • Fast estimation, based on hard (discrete) assignments, for methods other than DS

License

Copyright (C) 2022 authors of the github.

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.

pylearningcrowds's People

Contributors

fmenat, pepijndereus


Forkers

pepijndereus

pylearningcrowds's Issues

ModuleNotFoundError: No module named 'keras.engine'

from keras.engine.topology import Layer

The line of code above outputs:

ModuleNotFoundError: No module named 'keras.engine'

There is no module keras.engine. From TensorFlow 2.x onwards, the submodules that used to live under keras.engine are in different modules within tf.keras. You can import Keras with import keras directly or with from tensorflow import keras.

The engine module is now integrated into the layers module. A suggested fix is provided on Stack Overflow.
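
One possible way to make such an import tolerant of different Keras layouts is to try several module paths in order. The sketch below is only an illustration: the helper `import_first` is hypothetical and not part of this repository, and the candidate module paths mentioned afterwards are assumptions about where `Layer` may live in a given installation.

```python
import importlib

def import_first(module_names, attr):
    """Return `attr` from the first module in `module_names` that provides it."""
    for module_name in module_names:
        try:
            module = importlib.import_module(module_name)
        except ImportError:  # ModuleNotFoundError is a subclass of ImportError
            continue
        if hasattr(module, attr):
            return getattr(module, attr)
    raise ImportError(f"{attr!r} not found in any of: {', '.join(module_names)}")

# Standard-library demonstration of the fallback: the first module does not
# exist, so the attribute is taken from the second one.
OrderedDict = import_first(["no_such_module_xyz", "collections"], "OrderedDict")
print(OrderedDict.__name__)  # OrderedDict
```

For the issue above, a call such as `Layer = import_first(["keras.layers", "tensorflow.keras.layers", "keras.engine.topology"], "Layer")` would use the newer location when it exists and fall back to the old one otherwise.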

AttributeError: Can't set the attribute "name", likely because it conflicts with an existing read-only @property of the object. Please choose a different name.

When running cell 36 of the LabelMe tutorial I came across the following error:


AttributeError Traceback (most recent call last)
File ~/anaconda3/lib/python3.10/site-packages/keras/src/engine/base_layer.py:3153, in Layer.__setattr__(self, name, value)
3152 try:
-> 3153 super(tf.__internal__.tracking.AutoTrackable, self).__setattr__(
3154 name, value
3155 )
3156 except AttributeError:

AttributeError: can't set attribute 'name'

During handling of the above exception, another exception occurred:

AttributeError Traceback (most recent call last)
Cell In[36], line 6
3 #R_model = Raykar(init_Z="model", priors='laplace', n_init_Z=5)
5 args = {'epochs':1, 'batch_size':BATCH_SIZE, 'optimizer':OPT}
----> 6 R_model.set_model(model_R, **args)

File ~/Downloads/clone/codeE/methods.py:324, in Super_ModelInf.set_model(self, model, optimizer, epochs, batch_size, lib_model)
322 if self.base_model_lib == "keras":
323 self.base_model.compile(optimizer=self.optimizer, loss='categorical_crossentropy')
--> 324 self.base_model.name = "base_model_z"
325 self.max_Bsize_base = estimate_batch_size(self.base_model)
327 elif self.base_model_lib =="sklearn":

File ~/anaconda3/lib/python3.10/site-packages/keras/src/engine/training.py:390, in Model.__setattr__(self, name, value)
383 except AttributeError:
384 raise RuntimeError(
385 "It looks like you are subclassing Model and you "
386 "forgot to call super().__init__()."
387 " Always start with this line."
388 )
--> 390 super().__setattr__(name, value)

File ~/anaconda3/lib/python3.10/site-packages/keras/src/engine/base_layer.py:3157, in Layer.__setattr__(self, name, value)
3153 super(tf.__internal__.tracking.AutoTrackable, self).__setattr__(
3154 name, value
3155 )
3156 except AttributeError:
-> 3157 raise AttributeError(
3158 (
3159 'Can\'t set the attribute "{}", likely because it '
3160 "conflicts with an existing read-only @property of the "
3161 "object. Please choose a different name."
3162 ).format(name)
3163 )
3164 return
3166 # Wraps data structures in Trackable, unwraps NoDependency objects.

AttributeError: Can't set the attribute "name", likely because it conflicts with an existing read-only @property of the object. Please choose a different name.
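
The traceback shows the assignment `self.base_model.name = "base_model_z"` failing because `name` is a read-only property on the model. A defensive pattern that tolerates this is sketched below in plain Python, so it runs without Keras. `ReadOnlyName` is a stand-in class and `try_rename` a hypothetical helper; neither is part of codeE or Keras, and whether skipping the rename is acceptable for the rest of `methods.py` is an assumption.

```python
class ReadOnlyName:
    """Stand-in for a model whose `name` is exposed as a read-only property."""

    def __init__(self, name):
        self._name = name

    @property
    def name(self):
        return self._name


class WritableName:
    """Stand-in for a model whose `name` is an ordinary attribute."""

    def __init__(self, name):
        self.name = name


def try_rename(model, new_name):
    """Rename `model` if allowed; return False and keep the old name otherwise."""
    try:
        model.name = new_name
        return True
    except AttributeError:
        # `name` is read-only here, so the existing name is left untouched.
        return False


old_style = WritableName("model_1")
new_style = ReadOnlyName("model_1")
print(try_rename(old_style, "base_model_z"), old_style.name)  # True base_model_z
print(try_rename(new_style, "base_model_z"), new_style.name)  # False model_1
```

If a specific name is required, passing it when the model is constructed (Keras models accept a `name` argument) avoids the assignment altogether.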
