Issue discussions for roc-star by iridiumblue

PyTorch implementation as a class

Hi, thanks a lot for your work. The idea described here is smart and simple!

To apply it in my work I had to rewrite your code as a class. Using PyTorch broadcasting also made the code a bit cleaner. I hope it will be useful for someone else as well =)

I also slightly changed the logic. Here batches are saved into a FIFO queue, and during each forward call the last sample_size elements are taken instead of a random subset.

Update: added a small fix from brandenkmurray in the update_gamma function.
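
As a rough illustration of the FIFO idea described above, here is a minimal sketch in plain Python lists rather than tensors. The helper names `push_batch` and `last_n` are hypothetical and only mirror the slicing used in the class; they are not part of roc-star.

```python
def push_batch(history, batch):
    """Drop the oldest len(batch) items and append the new batch (length stays fixed)."""
    return history[len(batch):] + list(batch)


def last_n(history, n):
    """Return the n most recent items, as history[-n:] does for a tensor."""
    return history[-n:]


history = [0, 0, 0, 0, 0, 0]            # placeholder initial contents
history = push_batch(history, [1, 2])   # oldest two entries fall out
history = push_batch(history, [3, 4])
recent = last_n(history, 3)             # the three newest entries
```

The history keeps its length as long as each batch is no longer than the history itself, so the newest predictions always sit at the end and a negative slice picks them up.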


import torch
from torch.nn.modules.loss import _Loss


class RocStarLoss(_Loss):
    """Smooth approximation for ROC AUC
    """
    def __init__(self, delta = 1.0, sample_size = 1000, sample_size_gamma = 10000, update_gamma_each=500):
        r"""
        Args:
            delta (float): Multiplier on the number of wrongly ordered pairs used when choosing gamma (see update_gamma)
            sample_size (int): Number of examples to take for ROC AUC approximation
            sample_size_gamma (int): Number of examples to take for Gamma parameter approximation
            update_gamma_each (int): Number of steps after which to recompute gamma value.
        """
        super().__init__()
        self.delta = delta
        self.sample_size = sample_size
        self.sample_size_gamma = sample_size_gamma
        self.update_gamma_each = update_gamma_each
        self.steps = 0
        size = max(sample_size, sample_size_gamma)

        # Randomly initialize the prediction and label history
        self.y_pred_history = torch.rand((size, 1))
        self.y_true_history = torch.randint(2, (size, 1))
        

    def forward(self, y_pred, y_true):
        """
        Args:
            y_pred: Tensor of model predictions in [0, 1] range. Shape (B x 1)
            y_true: Tensor of true labels in {0, 1}. Shape (B x 1)
        """
        if self.steps % self.update_gamma_each == 0:
            self.update_gamma()
        self.steps += 1
        
        positive = y_pred[y_true > 0]
        negative = y_pred[y_true < 1]
        
        # Take last `sample_size` elements from history
        y_pred_history = self.y_pred_history[- self.sample_size:]
        y_true_history = self.y_true_history[- self.sample_size:]
        
        positive_history = y_pred_history[y_true_history > 0]
        negative_history = y_pred_history[y_true_history < 1]
        
        if positive.size(0) > 0:
            diff = negative_history.view(1, -1) + self.gamma - positive.view(-1, 1)
            # Clamp first, then square, so only pairs that violate the margin contribute
            loss_positive = torch.nn.functional.relu(diff).pow(2).mean()
        else:
            loss_positive = 0
 
        if negative.size(0) > 0:
            diff = negative.view(1, -1) + self.gamma - positive_history.view(-1, 1)
            # Clamp first, then square, so only pairs that violate the margin contribute
            loss_negative = torch.nn.functional.relu(diff).pow(2).mean()
        else:
            loss_negative = 0
            
        loss = loss_negative + loss_positive
        
        # Update FIFO queue
        batch_size = y_pred.size(0)
        # Detach predictions so the history does not hold on to old autograd graphs
        self.y_pred_history = torch.cat((self.y_pred_history[batch_size:], y_pred.detach()))
        self.y_true_history = torch.cat((self.y_true_history[batch_size:], y_true))
        return loss

    def update_gamma(self):
        # Take last `sample_size_gamma` elements from history
        y_pred = self.y_pred_history[- self.sample_size_gamma:]
        y_true = self.y_true_history[- self.sample_size_gamma:]
        
        positive = y_pred[y_true > 0]
        negative = y_pred[y_true < 1]
        
        # Pairwise differences, shape (num_positive, num_negative)
        diff = positive.view(-1, 1) - negative.view(1, -1)
        AUC = (diff > 0).type(torch.float).mean()
        num_wrong_ordered = (1 - AUC) * diff.flatten().size(0)
        
        # Adjust gamma so that, among correctly ordered pairs, `delta * num_wrong_ordered`
        # would be counted as incorrectly ordered once gamma is added
        correct_ordered = diff[diff > 0].flatten().sort().values
        idx = min(int(num_wrong_ordered * self.delta), len(correct_ordered)-1)
        self.gamma = correct_ordered[idx]
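
To make the gamma selection above easier to follow, here is a small worked sketch in plain Python (no PyTorch), following the same steps as update_gamma. The function name `choose_gamma` and the `None` return for the case with no correctly ordered pair are assumptions added for illustration; the class above does not handle that case.

```python
def choose_gamma(positive, negative, delta=1.0):
    """Pick a margin from pairwise score differences (illustrative sketch)."""
    # All pairwise differences: positive score minus negative score
    diffs = [p - n for p in positive for n in negative]
    # Correctly ordered pairs have a strictly positive difference
    correct = sorted(d for d in diffs if d > 0)
    num_wrong = len(diffs) - len(correct)
    if not correct:
        return None  # assumption: nothing to pick a margin from
    # Index into the sorted correct differences, clipped to the last element
    idx = min(int(num_wrong * delta), len(correct) - 1)
    return correct[idx]


# Scores chosen as exact binary fractions so the differences are exact
positive = [0.875, 0.625, 0.375]
negative = [0.5, 0.25, 0.125]
gamma_1 = choose_gamma(positive, negative, delta=1.0)
gamma_2 = choose_gamma(positive, negative, delta=2.0)
```

With these scores one of the nine pairs is wrongly ordered (0.375 vs 0.5), so delta=1.0 selects the sorted correct difference at index 1 and delta=2.0 selects the one at index 2. A larger delta therefore gives a larger margin.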

Interesting Cross-Reference

I landed here because I previously used TensorFlow Research's "global objectives" ROC AUC loss and was trying to help someone else who was interested find a PyTorch equivalent. You can see the removed TF code here: https://github.com/tensorflow/models/tree/3e73c76c7a5373dafd71ef9231896dabcb696cc5/research/global_objectives
I was hoping the original authors could add a note about how to do the upgrade in the issue tensorflow/models#8062, but no dice.

What's interesting is that it appears they based their objective on another paper, Scalable Learning of Non-Decomposable Objectives. It looks like the paper is newer, but I'd be curious to see how the author here thinks the approach stacks up from a theoretical perspective. What do you think?

[Bug] Link to Kaggle kernel is broken

The link to the Kaggle kernel in README.md is broken. Are there any updated implementations?

cap_pos and cap_neg

When working with imbalanced data, a training batch may contain no positive or no negative samples (depending on the direction of the imbalance). In that case cap_pos or cap_neg becomes 0, and epoch_pos or epoch_neg can't be calculated. Would it be reasonable to do something like the following?

cap_pos = max(1, epoch_pos.shape[0])
cap_neg = max(1, epoch_neg.shape[0])
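
A minimal sketch of why such a guard helps, assuming the count is used as a divisor when subsampling. The function `keep_probability` and its parameters are hypothetical names for illustration and are not taken from the roc-star code.

```python
def keep_probability(max_samples, num_available):
    """Fraction of examples to keep so that roughly max_samples survive."""
    cap = max(1, num_available)  # guard: an empty class would otherwise divide by zero
    # Clipping to 1.0 is an extra assumption so the result stays a valid probability
    return min(1.0, max_samples / cap)


empty_class = keep_probability(1000, 0)      # no examples of this class in the data
large_class = keep_probability(1000, 4000)   # more examples than needed
```

With the guard, an empty class gives a well-defined probability instead of raising ZeroDivisionError. Any code that later indexes into the empty set still has to cope with getting no examples back.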
