Confidence Interval for categorical outcome about econml HOT 3 OPEN

ellpri commented on September 1, 2024

Confidence Interval for categorical outcome

from econml.

Comments (3)

kbattocchi commented on September 1, 2024

If you just care about the ATE on the training set, then use the doubly robust ATE (which you can get programmatically from the ate_ attribute). The ate() method is more flexible, allowing you to also compute the ATE for other populations X, but it is not doubly-robust.
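As a rough sketch of the two routes described above, the helper below reads both estimates from an already-fitted estimator. It assumes `est` is an econml `CausalForestDML` fitted with `discrete_treatment=True` (the doubly robust `ate_` and `ate_stderr_` attributes are only populated in that case); the helper name, `X_new`, and the returned dictionary keys are hypothetical and only for illustration.

```python
def training_vs_population_ate(est, X_new, T0=0, T1=5, alpha=0.05):
    """Collect the training-set ATE and the ATE over another population.

    Assumption: `est` is a fitted econml CausalForestDML created with
    discrete_treatment=True, so that `ate_` / `ate_stderr_` are available.
    """
    # Doubly robust ATE on the training set. With a discrete treatment this
    # holds one entry per non-baseline treatment level, each measured
    # relative to the baseline level.
    dr_ate = est.ate_
    dr_stderr = est.ate_stderr_

    # ATE of moving from level T0 to level T1, averaged over the rows of
    # X_new. This is more flexible but not doubly robust.
    pop_ate = est.ate(X_new, T0=T0, T1=T1)
    lower, upper = est.ate_interval(X_new, T0=T0, T1=T1, alpha=alpha)

    return {
        "dr_ate_train": dr_ate,
        "dr_stderr_train": dr_stderr,
        "ate_X_new": pop_ate,
        "ci_X_new": (lower, upper),
    }
```

Something like `training_vs_population_ate(est, X_test)` would then give both numbers side by side, where `X_test` stands for whatever population you want to average over.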

In terms of interpretation, a value of 0.313 means that increasing the probability of assigning an individual to treatment 5 instead of treatment 0 by p will increase the likelihood of a severe outcome by 0.313p. (This estimate is linear in the treatment probability, which may not be completely realistic for a discrete outcome: for some values of X, small variations in treatment may correspond to large variations in outcome, which would extrapolate to more than a 100% change in severity probability for a 100% change in treatment from one level to another, which is impossible.)
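To spell out the arithmetic under this linear reading, write θ for the estimated effect of treatment 5 versus treatment 0 and Δp for the shift in assignment probability (both symbols are introduced here only for illustration):

```latex
\Delta \Pr(\text{severe}) = \theta \cdot \Delta p,
\qquad \theta = 0.313:\quad
\Delta p = 0.1 \;\Rightarrow\; 0.0313,\qquad
\Delta p = 1 \;\Rightarrow\; 0.313 .
```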


ellpri commented on September 1, 2024

@kbattocchi Hi Keith, thanks for the reply. I am working with accident data. The treatment variable here is the relative velocity, where '5' indicates more than 80 kmph and '0' is 20 kmph. The target variable is injury severity. I expected a result that would say that if the relative velocity changes from 0 to 5, it would increase the injury severity probability by x. But the way you interpreted it is a little different.

So is the use case not applicable here? In general, I want to analyse the parameters from the accident database and their influence on injury severity, which is a categorical variable.
As you mentioned, the ATE here is linear, so should I use a treatment featurizer?


kbattocchi commented on September 1, 2024

@ellpri I think that my answer is consistent with what you're looking for: changing 100% from '0' to '5' means changing the severity probability by 100% of 0.313, i.e. increasing it by 0.313. I only added the caveat because the linearity of the model is not necessarily completely realistic for discrete outcomes. We perform the estimate conditional on X by regressing the unexpected variation in outcome conditional on X and W on the unexpected variation in treatment conditional on X and W. Empirically, it's possible that for some X there was a big unexpected change in Y (there was a severe injury when we thought that was only 10% likely given X, say) but only a small unexpected change in T (the relative velocity was 5, and we thought that was 95% likely given X). In that case it looks like a very small change in T leads to a big change in Y, which will extrapolate to a more than 100% change in outcome given a change in treatment from '0' to '5'. Despite this, DML seems to generally perform well in practice with discrete outcomes, even though theoretically something like a "double machine learning for logistic regression" setup might be more appropriate.
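To put numbers on the example in the comment above, here is a minimal sketch of the residual arithmetic for a single observation. It simplifies the treatment to one indicator ("relative velocity is level 5") and looks at one data point in isolation; the actual estimator pools many observations, so this only illustrates how a slope above 1 can arise and is not how the library computes its estimate.

```python
def residual(observed, predicted):
    """Unexpected variation: what happened minus what the nuisance model predicted."""
    return observed - predicted


# Severe injury occurred (Y = 1) although it was judged only 10% likely given X.
y_residual = residual(1.0, 0.10)

# Relative velocity was level 5 (indicator = 1), which was judged 95% likely given X.
t_residual = residual(1.0, 0.95)

# Slope that a residual-on-residual fit would imply from this one point alone.
implied_slope = y_residual / t_residual

# A slope above 1 extrapolates to more than a 100% change in severity
# probability for a full switch from level 0 to level 5, which is impossible.
exceeds_full_probability = implied_slope > 1.0
```

Here the outcome residual is about 0.9 and the treatment residual about 0.05, so the implied slope is roughly 18.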

The treatment featurizer won't affect this. You're already fitting a CATE model that is flexible in X (because you are using CausalForestDML), so featurizing X won't buy you anything. The linearity that I'm talking about is linearity in the treatment (probability), and with a discrete treatment a model that is linear in the treatment is without loss of generality.
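The point about discrete treatments can be checked with a small sketch: any function of a treatment that takes finitely many levels can be written exactly as an intercept plus one coefficient per non-baseline level indicator. The response function and the helper names below are hypothetical and chosen only to be visibly non-linear in the level number.

```python
def one_hot(t, levels):
    """Indicators for the non-baseline levels (the baseline is levels[0])."""
    return [1.0 if t == k else 0.0 for k in levels[1:]]


def linear_in_indicators(f, levels):
    """Intercept and coefficients such that f(t) = intercept + sum_k coef_k * 1[t == k]."""
    intercept = f(levels[0])
    coefs = [f(k) - intercept for k in levels[1:]]
    return intercept, coefs


def severity_probability(t):
    """Hypothetical response, deliberately non-linear in the level number."""
    return 0.05 + 0.01 * t ** 2


levels = [0, 1, 2, 3, 4, 5]
intercept, coefs = linear_in_indicators(severity_probability, levels)

# Rebuild the response at every level from the linear-in-indicators form.
reconstructed = [
    intercept + sum(c * d for c, d in zip(coefs, one_hot(t, levels)))
    for t in levels
]
```

Because the reconstruction matches at every level, adding further features of a discrete treatment would not make the model any more expressive.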
