Comments (3)
Re:
- You can find a reference implementation for Equation (5) here.
- Reference implementation for that is here, but this does indeed depend on the scale. That's what the temperature term $T$ is for: $\sigma((E_\theta(x, k) - \tau)/T)$. Also, you can use the log-probabilities $E_\theta(x, k) = \log \pi_{\theta,k}(x)$, which works a bit better in practice.
- They can be penalized, but this is generally not necessary. Basically, as long as there is one true label for each example and $\alpha$ is reasonably low, the majority of prediction sets will contain at least the true label (so not be empty). This is mainly a result of the simple conformity score (for other conformal predictors this can be different). Beyond that, you are of course free to penalize that, but I am just saying that it is generally not required to learn good classifiers.
- Gradients with respect to what is the question. Generally, the gradient is not a problem as long as the sorting is fixed. The key is getting gradients through the sorting, and this is what the smooth sorter is for.
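To make the role of the temperature concrete, here is a minimal Python sketch (not the reference implementation linked above, and not code from the Julia package). The function names are hypothetical, and the hinge form $\max(0, \sum_k \sigma((E_\theta(x,k) - \tau)/T) - \kappa)$ for the size loss is an assumption made for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_membership(score, tau, temperature=1.0):
    # Soft indicator that a label enters the set: sigma((E - tau) / T).
    return sigmoid((score - tau) / temperature)

def smooth_size_loss(scores, tau, temperature=1.0, kappa=1.0):
    # Assumed form: soft set size minus target size kappa, clipped at zero.
    soft_size = sum(soft_membership(s, tau, temperature) for s in scores)
    return max(0.0, soft_size - kappa)

# Hypothetical softmax outputs for one example and a hypothetical threshold.
probs = [0.7, 0.2, 0.1]
tau = 0.15

for T in (1.0, 0.1, 0.01):
    memberships = [soft_membership(p, tau, T) for p in probs]
    print(T, memberships, smooth_size_loss(probs, tau, T))
```

With scores in $[0,1]$ and $T = 1$, the soft memberships stay within roughly $[0.27, 0.73]$; lowering $T$ pushes them towards 0 and 1, so the soft size approaches the hard set size.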
Hope that helps. If I am slow to respond on here, feel free to send me an email to follow up. I am always curious to see what people do with conformal training, especially as I had some follow-up ideas but couldn't really pursue them.
from conformalprediction.jl.
Some questions that have come up so far:
- Is the Dirac delta really supposed to be an indicator function? Equation (5) on page 5. Maybe I'm just not familiar with this notation.
- Doesn't the smooth size loss depend a lot on the scale of the (non-)conformity scores? For $E_{\theta}(x,k)=\pi_{\theta,k}(x)\in[0,1]$, for example, we have that $\sigma(E_{\theta}(x,k) - \tau) \in [0.27,0.73]$. We can use temperature scaling, but can we really speak of 'probabilities' that labels are assigned to the set?
- More on smooth size loss: What about empty sets? Shouldn't they be penalised at least as heavily as complete sets?
  a. Could just penalise these cases as $K - \kappa$, that is the maximum set size minus the target set size (1).
  b. Perhaps even better: penalise $\sum(1-C) - \kappa$, that is the total sum of probabilities that labels are not assigned to $C$.
- As for the smooth quantile computation, it seems that Zygote.jl's AD actually lets me compute grads as long as I sort values beforehand (see this answer on SO). Is this surprising?
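On the last question, a small Python sketch (an illustration only, not Zygote.jl and not the package's code) shows why gradients are available once the ordering is treated as fixed: the empirical quantile then simply selects one input, so its gradient is a one-hot vector. The helper names and the non-interpolating quantile rule are assumptions made for this example.

```python
import math

def empirical_quantile(values, q):
    # Lower order-statistic quantile without interpolation (an assumed rule).
    # Returns the selected value and the index it came from.
    order = sorted(range(len(values)), key=lambda i: values[i])
    rank = min(len(values) - 1, max(0, math.ceil(q * len(values)) - 1))
    idx = order[rank]
    return values[idx], idx

def quantile_gradient(values, q):
    # With the ordering held fixed, the quantile just picks values[idx],
    # so the gradient is 1 at idx and 0 elsewhere.
    _, idx = empirical_quantile(values, q)
    return [1.0 if i == idx else 0.0 for i in range(len(values))]

def finite_difference_gradient(values, q, eps=1e-6):
    # Numerical check: bump each input slightly and re-evaluate the quantile.
    base, _ = empirical_quantile(values, q)
    grads = []
    for i in range(len(values)):
        bumped = list(values)
        bumped[i] += eps
        grads.append((empirical_quantile(bumped, q)[0] - base) / eps)
    return grads

# Hypothetical calibration scores.
scores = [0.9, 0.1, 0.5, 0.3, 0.7]
print(quantile_gradient(scores, 0.5))
print(finite_difference_gradient(scores, 0.5))
```

The two gradients agree as long as a small perturbation does not change the ordering. What this fixed-order gradient cannot capture is how the ordering itself responds to the parameters, which is the part a smooth sorter addresses.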
@davidstutz would much appreciate your thoughts, if you get the chance. This is still early stages here, so there's absolutely no rush. Amazing paper by the way!!
Wow this was quick, thanks a lot 🙏
That all makes sense. Regarding the quantile computation, thanks for the clarification. For my current use case, I just need to differentiate with respect to a conformal model that has already been calibrated, but I see now why you need information about the sorting itself for training.
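For the already-calibrated case, a rough Python sketch of what this looks like (names are hypothetical, and this is not the package's code): with $\tau$ held constant, the soft membership $\sigma((E - \tau)/T)$ has the closed-form derivative $\sigma(1-\sigma)/T$ with respect to the score $E$.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_membership(score, tau, temperature=1.0):
    # sigma((E - tau) / T) with tau treated as a fixed, calibrated constant.
    return sigmoid((score - tau) / temperature)

def soft_membership_grad(score, tau, temperature=1.0):
    # Derivative with respect to the score: sigma * (1 - sigma) / T.
    s = soft_membership(score, tau, temperature)
    return s * (1.0 - s) / temperature

# Hypothetical score, threshold and temperature.
score, tau, T = 0.4, 0.5, 0.1
eps = 1e-6
numeric = (soft_membership(score + eps, tau, T)
           - soft_membership(score - eps, tau, T)) / (2 * eps)
print(soft_membership_grad(score, tau, T), numeric)
```

No sorting enters this computation, because the threshold is a constant rather than a function of the calibration scores.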
Thanks again for being responsive!