Git Product home page Git Product logo

Comments (295)

erictraut avatar erictraut commented on September 27, 2024 8

Several thoughts...

This thread starts with the assertion that Any is a "top type". That's kind of misleading because it also acts as a "bottom type". That's what makes its behavior difficult to characterize using standard subtyping rules. When Any is involved, we sometimes need to define special-case rules to disambiguate the intended behavior. In Python, the true "top type" is object and the true "bottom type" is Never (which also goes by the name NoReturn).

I think there are two questions that we need to consider when Any is used within an intersection: 1) the meaning of T & Any and 2) whether or not T & Any is reducible.

The meaning of T & Any
IMO, the correct meaning should fall out of the generalized rules for intersections. In this comment I propose rules for subtype and attribute type evaluation. If you agree with these rules, then the meaning of T & Any falls out of those definitions:

  • X is a subtype of T & Any only if it is a subtype of T.
  • When accessing attributes on an Any value, it is assumed that the attribute exists, and its type is Any. That means the type of attribute foo accessed from T & Any is either <type of T.foo> & Any (if T contains an attribute foo) or Any.

Note that both of these rules assume no reducibility. If T & Any always reduces to Any, then:

  • X is always a subtype (and supertype) of T & Any.
  • Accessing attribute foo from T & Any always evaluates to Any.

Reducibility of T & Any
TypeScript reduces both T | any and T & any to any.

PEP 483 & 484 never discussed the reducibility for unions (T | Any). Type checkers were able to decide whether or not to reduce this type to Any. Both mypy and pyright have chosen not to reduce in the case of unions.

class A:
    a: int

class B:
    a: Any

def func(x: A | B):
    reveal_type(x.a) # Both mypy and pyright reveal `int | Any`

As I've explained in the past, there is value in not reducing T | Any. If we follow that precedent, I don't think T & Any should be reduced either. (PEP 483 alluded to the fact that T & Any should be reduced, but I don't feel beholden to PEP 483 in this respect because Intersection was not formalized in that PEP or introduced into the type system at that time. PEP 483 simply indicates that Intersection might be added in the future and explains that its rules should mirror those of unions.)

If there is not consensus on this point and the Intersection PEP formalizes that T & Any should be reduced to Any, then I think the PEP should also formalize the same rule for T | Any reducing to Any (for consistency), and mypy and pyright should change their current behavior for unions accordingly.

@ippeiukai, thanks for pointing to the TypeScript documentation. It's useful to understand how other type systems have solved problems. That doesn't mean we need to adopt the same solution for Python, but it's useful information to consider.

@DiscordLiz, I think Python's Any is closer to TS's any than TS's unknown. The TS unknown type is like a union of all possible types. It's similar to the object type in Python. (There is a type called object in Javascript, but it's not a base type for all other types. That's why unknown needed to be added to the TS type system.) TS's any is outside of the subtype hierarchy — it's both a bottom and top type like Python's Any type. The distinction between object and Any is important when it comes to invariant contexts. A list[object] is different from list[Any]. The latter is exempt from the normal invariance requirements.

from intersection_examples.

JelleZijlstra avatar JelleZijlstra commented on September 27, 2024 6

Going back to the original post, my thinking is that T & Any should not be reduced to either T or Any, but is meaningful and should be supported without a need for a special case. This should be a fourth option in addition to the three already listed.

Consider this class:

class X:
    attr: Any & int

When setting the attribute, you need an object that satisfies both Any and int, so in this case it is equivalent to int. x.attr = 1 is valid, x.attr = "x" is not. But when getting the attribute, you get an object that satisfies both Any and int, which means you can do anything with it (e.g., len(x.attr) is valid).

This isn't by itself a strong use case, but as far as I can see it flows directly out of general rules about how Any and Intersection work, without a need for special cases. Note that instead I had written attrs: Any | int, I could have written a very similar paragraph above, just with the getter and setter reversed.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 4

Arguments in favor of Removing Any from Intersections

Any is essentially the universal set of types. The identity element of intersection on sets is the universal set for domains in which a universal set does exist. The Identity element for unions is the empty set (conceptually, typing.Never) The divergence in behavior is easily explainable as a consequence of logical ordering, and can be taught easily utilizing python builtins any and all and their corresponding behavior, as well as the corresponding set operations.

Edit: Argument retracted, Any's definitional mismatch between ? and Ω does matter here.

from intersection_examples.

antonagestam avatar antonagestam commented on September 27, 2024 4

I think it's worth mentioning that TypeScript considers T & any = any, though I find this surprising. Perhaps there have been previous discussions there that can be learned from?

I think it's also worth mentioning that the description of Any in the original post here is incomplete. It is both a subtype and a supertype of all other types.

from intersection_examples.

DiscordLiz avatar DiscordLiz commented on September 27, 2024 4

First of all, I'd like to apologize for earlier. I interjected with frustration at the arguments that seemed to not even know basic theory which I would consider fundamental to even work on this sort of problem, when the reality appears to be a mixing of formal terms and commonly accepted terms which do not have formal definitions.

The problem here is that Any exists at all, but that won't go away due to gradual typing. I agree with @mikeshardmind that you could modify the definition could be changed to remove any logical contradiction, though I disagree that doing so would be easy in any way. His proposed redefinition to actually make it the universal set, which many people think of it as currently skips a small detail that needs to be included, and introduces a new ambiguity.

It would need to be defined as "A construct indicating the potential for any possible type, yielding to other more specific types whenever necessary" rather than "The set of all possible types", as otherwise, you still have Any as incompatible with concrete types in an intersection.

This definition leaves a lot as ambiguous, in particular "whenever necessary" is not definable without first knowing all current and future type constructs, but it is correct and allows eliminating it from places where it would be redundant or problematic.

You can't have a universal set that is actually a universal set that fully satisfies every requirement of python's type system and add an intersection type (or any other operation on sets of types which is composed using negative logical operations, there are positive set theories for which a universal set holds)

His new definition for Never works flawlessly from a formal standpoint, but I don't know enough about how Never is used in real python code to speak to accuracy of it continuing to work as-is for existing code.

The best case scenario here in terms of correctness would appear to be to error or warn on an intersection containing Any, but to simultaneously to this make a best effort attempt to determine possible interfaces and continue to type check uses of the types relying on the intersection using a charitably modified definition of Any.

It isn't a clean solution, but it is one that works well with the idea of gradual typing, which is the only reason Any should exist at all. Fully typed code bases accepting any object should use object | type[object] and become more specific as they can.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 4

This is an opinion, yes? Another option would be to allow the intersection, but define it to expose the shared compatible interface.

Even if we went that route, Any is incompatible with every portion of every interface in an intersection under the current definition of Any being compatible with all uses and having all interfaces. The nature of it being attempted to be defined in an infinitely compatible way is paradoxically making it infinitely incompatible in this case. It is defined in such a way that it has an interface for any possible identifier and multiple definitions of that interface that are valid.

It presents a logical contradiction here with Intersection (and not prior) because Intersection is the first typing feature operating on groups of types to be composed using a logical negation. (This actually isn't strictly accurate, but the other existing cases were already disallowed 1. directly subclassing Any, 2. subclassing incompatibly with the parent class)

All that said, I don't think we need to disallow Any (even if this would be my personal preference), but I think to not disallow Any, we should include updating the definition of Any to one which remains compatible with current use and which fixes this contradiction. Definitions being logically consistent can reduce confusion, and make sure people are all on the same page as more advanced features are added to the type system.

It would need to be defined as "A construct indicating the potential for any possible type, yielding to other more specific types whenever necessary" rather than "The set of all possible types", as otherwise, you still have Any as incompatible with concrete types in an intersection.

This is correct from a mathematically pure perspective, but python has natural limitations on what can compose a type, so we could have a universal set for the domain of possible python types.... and I already prefer your modified definition compared to attempting to prove that universal on that domain is logically consistent, and not even being sure that it would be a successful attempt.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 4

the obvious rules from existing theory are insufficient, rather than simple composable rules.

At the start of this thread there was a misapprehension that Any was the universal set, but this was based on a mistaken understanding of Python's definition of Any. And based on that mistake, there were some unjustified conclusions drawn based on theory.

For the record, as got discussed briefly in a discord call while I was working on an adjacent paper to come up with more formal definitions which are accurate to current use, we were able to show that it holds for a top type as well as the universal set. by reaching for category theory for answers rather than set theory applied to types.

If We can show that Any is functioning as a top type for python's type system in the context of an intersection, it is still logically removable from that intersection by being a functional identity element on the operation of intersection over categories of types, for the domain of valid python types.

those maintaining type checking implementations have expressed that we stick to simple composable rules.

Both Jelle and Eric have argued for irreducibility.

I'd like to discuss this a little further on what irreducibility would actually result in, as there are clearly some sharp edges to it.

As stated above, Any & T -> T comes from the same simplification as T | Never -> T, not to Any | T -> Any (removal of identity element)

Any | T -> Any (not being reduced) has been made as a comparison, but it is different logic to argue for that being reduced. This may also explain why this wasn't in the PEP when the intersection example was, without the existence of intersections.

There may be other reasons to still have it irreducible, but I'd like to ensure that the people making that point have a moment to consider the difference here.

from intersection_examples.

erictraut avatar erictraut commented on September 27, 2024 4

I just thought of another argument in favor of not reducing intersections with Any and not ascribing any special meaning to Any in this context.

It turns out that Any is not unique. There are at least two other typing constructs in the Python type system that, like Any, do not follow the standard rules of subtyping. This includes Callable and tuple when used with an ellipsis: Callable[..., X] and tuple[X, ...]. These constructs, like Any, were added to support gradual typing, and they do not honor the rules of set theory.

If we create a special case for Any, then we need to consider similar special cases for these other constructs — along with any others that I may have forgotten or are added in the future. (Can anyone think of other type constructs in the Python type system that do not follow the standard rules of subtyping?)

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 4

If we create a special case for Any, then we need to consider similar special cases for these other constructs — along with any others that I may have forgotten or are added in the future. (Can anyone think of other type constructs in the Python type system that do not follow the standard rules of subtyping?)

Following the various discussions of how Any interacts, I'm no longer in favor of Any being reducible (in either direction)

Specifically, when it comes to attribute access on members of an intersection, the problem of Any being reducible rears in a way that I don't think it can reduce to either Any or T for Any & T

I believe the rule I have provided here correctly generalizes to the other special types you have shown there, can we remove all special cases as special by validating use of the resulting intersections under this rule?

If we can, are there other special types that do not follow this more general rule?

from intersection_examples.

randolf-scholz avatar randolf-scholz commented on September 27, 2024 4

I don't think disallowing intersections with Any is a good idea. They can easily occur naturally, most commonly when performing multiple isinstance checks like isinstance(x, A) and isinstance(x, B). If either A or B type-alias to Any, there you have it.

In particular, note that since python 3.11 one can use Any as a base class.

from intersection_examples.

NeilGirdhar avatar NeilGirdhar commented on September 27, 2024 4

Any & X makes X.foo(some_str) compatible use. This is what I mean by it having surprising effects for people expecting their code to be typed, as it is (With some limitations) very similar to just reducing to Any

I think that is what you want in practice. Having Any there is supposed to broaden the interface. You should examine in which circumstance Any could actually be there.

from intersection_examples.

NeilGirdhar avatar NeilGirdhar commented on September 27, 2024 3

Incidentally, we should at least then change the language in the top post of this thread to concord with the language you've been using especially because these differ from the existing PEPs. Specifically:

  • we should also provide definitions for "type" and "interface", and
  • most importantly, instead of asking "what type is produced by T & Any", we should ask what "type" and "interface" is produced by T & Any where T is a set of types and an interface, and Any has a set of types and an interface.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 3

Agreed. I didn't realize how far apart we were on definitions initially, despite that I was consciously avoiding certain terms I felt had some ambiguity, there were others which also had mixed meanings in play.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 3

When setting the attribute, you need an object that satisfies both Any and int

I believe this is already precluded by the current definition of Any, as Any is defined to have all possible interfaces, including those incompatible with int. It would be legal to do x.bit_length("hex") (legal, not necessarily sensical) on something typed as Any, which is clearly incompatible with the interface of int

I think the strongest options here remain to modify the existing definitions to be more rigorous and compatible with both current and future goals (not create new terminology that exists parallel to it, and therefore doesn't fix existing things like Any) or exclude Any entirely for that contradiction.

I also think this is an incredibly vacuous use of an intersection, if you want len on something that's an int, you can require that via an intersection of int, and a protocol requiring __len__. (or whatever else you actually need) It's easy to write off Any here as "YAGNI" because people using Any already aren't having interfaces checked, so what would motivate this use? Why would someone write Any & int ? What situations in real code would cause this to happen that would not be accurately described by either Any, int, or int & SomeProtocol

The lack of an easy parallel here with Any compared to Union is the definition of Any including all possible interfaces. Union expands the potential types and only cares about the overlap in interfaces, whereas Intersection narrows what types are valid and imposes that any shared interfaces must be compatible.

Fixing that definition allows there to not need to be special cases.

from intersection_examples.

DiscordLiz avatar DiscordLiz commented on September 27, 2024 3

When you say "yielding to more specific types" as in types that have already been defined?

When I included this in the updated definition I provided, my intent there is if Any is in conflict with a specific type, the specific type wins.

This is in line with how isinstance already works

foo: Any
if isinstance(foo, T):
    reveal_type(foo)  # T

It would work okay with Any | T because there is no conflict here. It would make Any & T become T because Any provides no specific information about a type, and the intersection is imposing compatibility between types. I don't think there's any other current case of conflict.

from intersection_examples.

NeilGirdhar avatar NeilGirdhar commented on September 27, 2024 3

object cannot be considered a top type because types themselves are not a member of it

Are you sure? issubclass(type, object) is true.

object | type[object]

This is the second time this expression has shown up in this thread. I think this just reduces to object since issubclass(type[object], object) is true.

from intersection_examples.

randolf-scholz avatar randolf-scholz commented on September 27, 2024 3

So in summary, there appear to be 3 different interpretations / mental models of Any:

  1. "Any=Top" — Any is the universal top type.
  2. "Any = Union[T1, T2, T3, ...]" — Any is the union of all types
  3. "Any = TypeVar('Any')" — Any is an ad-hoc variable of unknown type, assumed to be used compatibly.
① Is most definitively incompatible with the definition in the documentation.

The Any type

A special kind of type is Any. A static type checker will treat every type as being compatible with Any and Any as being compatible with every type.

The second sentence basically implies that Any cannot be the top type Top=object, since x: int = object() would be a type error.

② Is compatible with the definition and would logically imply that Any | X = Any and Any & X = X.

Proof: w.l.o.g. assume T1=X, then:

  1. X | Any = X | Union[T1, T2, …] = Union[X, X, T2, …] = Union[X, T2, …] = Any
  2. X & Any = X & Union[T1, T2, …] = Union[X & X, X & T2, …] = X | Union[T2 & X, …] = X
③ Is compatible with the definition* and would logically imply that Any | X and Any & X are irreducible forms.

This has some crucial advantages over ②:

* This appears how Any gets actually often used in practice. ("this entity has some type, but I am too lazy / do not want to explicitly express it at the moment"), with the only caveat that Any is universal, whereas using actual ad-hoc variables could create impossible intersections and collapse to Never.

from intersection_examples.

ippeiukai avatar ippeiukai commented on September 27, 2024 3

By the way, in terms of gradual typing, I feel Intersection[T, Any] = T is awful.

some_fn(some_obj)

# ===

def wrap_obj(val: T) -> T & SomeAddedTrait:
  …

a = wrap_obj(some_obj)
some_fn(a)

Here, let’s assume some_fn and some_obj are provided by two different libraries both currently untyped. What if some_fn got typed before some_obj gets typed? I expect I can do anything to a that I can do to some_obj. I don’t expect a to be reduced to mere SomeAddedTrait.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 2

I entirely disagree with the above. Any is the universal set of python objects and object is not. Not only is this the actual definition of Any per pep 484, but object is clearly not the universal set of objects, as the type of types is type While having a problematic defined interface.

The current definition of Any itself creates contradictions in any attempt at a logically consistent type system

A special kind of type is Any. Every type is consistent with Any. It can be considered a type that has all values and all methods. Note that Any and builtin type object are completely different.

The attempt to treat Any as if it is a subtype or supertype is misguided at best and leads to inconsistent results.

While Any is implemented as a type in python, it isn't conceptually a type, but the set of all types when it comes to how it is treated in type checking and it's purpose in the type system.

Additionally, the claim that intersection should work like Union in the below is a complete misunderstanding of logical operators

class A: pass
class B: pass
def f(b: B): ...
x: A & Any
f(x)  # No problem!  This is because intersection broadens the interface to everything.

Intersection is "all" requirements, Union is "Any" requirements. Intersection should not be seen as broadening the allowable types, that is what Union does. Intersection is more restrictive on the types allowed the more types which are added to it.

Conflating the potential interface with the allowable types is problematic. In the Case of A & Any Things which are not A strictly are not part of the intersection.

The other problem comes in from the other part of the definition of Any per pep 484

A special kind of type is Any. Every type is consistent with Any. It can be considered a type that has all values and all methods. Note that Any and builtin type object are completely different.

This speaks to the potential interface of Any, not to the type of Any. Without changing the definition of Any, this part should make Any unsafe as an operand in an intersection, as by having all potential methods, it also has all potential conflicting method definitions with other operands of the intersection.

consider:

class A:

    def do_something(self):
        ...


class B:

    def do_something(self, x: int, /):
        ...

These are incompatible interfaces for an intersection and Any can be either.

from intersection_examples.

NeilGirdhar avatar NeilGirdhar commented on September 27, 2024 2

Any is the universal set of python objects

Where do you see in PEP 484 that Any is the universal set? Perhaps it would be best to clarify using mathematical language exactly what you mean by "universal set". It's not the union of all types Omega for the reason I described above. Do you mean something else?

and object is not.

I didn't say object was the universal set. I said that object is the universal base class. This is a technical term. I'll add a link to the wikipedia article.

Intersection should not be seen as broadening the allowable types,

Intersection broadens the allowable parameter types for which x can be passed for the exact same reason that union narrows it. I will add a section on a comparison to unions then to make all of the rules clearer and more obvious. Since unions are implemented today, you will be able to verify these rules yourself.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 2

This is entirely wrong, you are conflating types and their interfaces here,

I added a section with an example so that you can see for yourself that intersection does indeed broaden what you can do with an object. Is it possible you misunderstood what I'm saying?

your edit adds "Why does intersection broaden the way a variable can be used?" this is the definition of an interface, not a type. You are still conflating the two, and the difference between types and their interfaces is actually why this is such a problem with the current definition of Any with intersections. Please review what other people have told you about this.

Unions are broadening of type, but only the shared interfaces may be safely used because we only know that it has at least one of these interfaces.

Intersections are narrowing of type, but because of the requirement of compatibility with all of an intersection's types, allow using the combined interface of the types.

In any interpretation of Any based on its current definition in accepted peps, Any presents a potential problem here.

Any having all potential interfaces makes Any means that all of the potential interfaces of Any are in conflict with the other operands as they can be defined as incompatible, it has all interfaces, compatible versions of them and not. For this reason, either the definition of Any needs changing or Any is unsafe as an intersection operand.

But even with this, the other type requirement of the intersection does not go away in the process.

The definition of Any is logically inconsistent and intersection and its logical consequences just shine a light on this pre-existing problem.

For what it is worth, mypy provides configuration options forbidding the use of Any specifically because of the problems of Any, and Ruff provides lints disallowing Any for the same reasons.

If it comes between implementing a typing feature in a way that is logically broken, and not allowing some code that already isn't typed to make use of a typing feature, the latter is preferable.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 2

in conflict with the other operands as they can be defined as incompatible, and it has all interfaces, compatible versions of them and not.

That's not a conflict for the same reason that T | Any is not a "conflict". You can simply do whatever you want with such an object of this type.

It is a conflict in the intersection case though. T & Any requires the interfaces of both T and Any. Any has all interfaces, those compatible with T and not.

You can't do whatever you want with T | Any

from typing import Any
from datetime import datetime


def function(x: datetime | Any) -> None:
    x.f()
main.py:6: error: Item "datetime" of "datetime | Any" has no attribute "f"  [union-attr]
Found 1 error in 1 file (checked 1 source file)

This also isn't unique to parameters, as the type also matters when it comes to potential reassignment, it happens even with a direct typing of a variable:

from typing import Any
from datetime import datetime


def function(x: datetime | Any) -> None:
    x.f()


def other_function() -> None:
    x: datetime | Any = datetime.utcnow()
    reveal_type(x)
    x.f()
main.py:6: error: Item "datetime" of "datetime | Any" has no attribute "f"  [union-attr]
main.py:11: note: Revealed type is "Union[datetime.datetime, Any]"
main.py:12: error: Item "datetime" of "datetime | Any" has no attribute "f"  [union-attr]
Found 2 errors in 1 file (checked 1 source file)

You can only use the interface provided by T safely, as you could receive T here, rather than Any. While Any has a compatible interface, you still need to care about the interface of T.

The difference between types and their interfaces matters. you cannot handwave them away with less rigorous definitions, and this is highly important here because the behavior of Union and Intersection are diametrically opposed in their effects of broadening vs narrowing of both interfaces and types

To be honest, I don't think anything is logically broken.

The logical contradiction has already been presented, and you have not given an explanation which satisfactorily shows it isn't in contradiction.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 2

I don't know why you are still arguing this @NeilGirdhar but type theory is pretty clear here and you keep asking for things you've already been provided by people being far too patient with you.

My patience on this is my own, please don't presume that I should be more or less patient with people, thanks.

Why don't you show some code that is ambiguous?

Already have presented it, but I'm fine to explain one more time.

Any is defined as having all interfaces. Some interfaces are incompatible with others, as shown above, but here it is again.

class A:

    def do_something(self):
        ...


class B:

    def do_something(self, x: int, /):
        ...


A & B  # unsatisfyable condition

class AP(Protocol):

    def do_something(self):
        ...


class BP(Protocol):

    def do_something(self, x: int, /):
        ...

AP & BP  # unsatisfyable condition

An intersection must satisfy all of its requirements, conflicting interfaces create a situation where there are no types that could possibly satisfy the requested interface. Any is defined as having all interfaces. So while it may be safe to interact as if any has any version of that interface, it would be impossible to correctly say that Any has a non-conflicting interface with non-infinitely defined interfaces.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 2

the "type" and the "interface". I think your definitions roughly correspond to Sub(X) and Sup(X), but it's hard to say.

This would be a far easier discussion to have from a shared mathematical understanding of type or category theory, rather than the non-rigorous definitions that python currently has adopted, but if python had a mathematically rigorous type system, we wouldn't need to have this discussion currently. The moment you attempt to formalize all of the existing rules into concrete logic, there are some types which can be created at runtime, but are inexpressible by the typing, as well as logical contradictions that go away if you ignore the existence of Any

Sub(X) and Sup(X)

These don't have definitions agreed upon anywhere outside of type theory, and your use of them here does not match the use in type theory. I've seen the doc floating around that proposes using this, but the use isn't consistent with existing type theory, so I've avoided potentially muddying this conversation with the use of one definition where the other might be assumed.

Interface means the obvious thing here, namely the known methods, attributes, etc of an instance of a type, and their signatures and/or types.

Type also means the obvious thing here. (The type which an instance of an object is)

Interfaces and types are largely identical outside of LSP violations when talking about only one type. When discussing unions and intersections of types, this changes.

Treating Any as a singular type with an infinite interface is the mathematical problem here. Python does not have the means to have multiple methods of the same identifier with differing arity/signature like some other languages, so the infinite interface contradicts other interfaces in the case of an intersection where any shared interfaces must be compatible

from intersection_examples.

NeilGirdhar avatar NeilGirdhar commented on September 27, 2024 2

Okay, productive and interesting discussion, I'm going to take a break, but looking forward to seeing the next draft of this proposal. Intersections solve so many type errors; it's a feature I'm really looking foward to.

from intersection_examples.

DiscordLiz avatar DiscordLiz commented on September 27, 2024 2

Could we refine some of the arguments for removing Any

I think the best argument is that the definitions in accepted PEPs already say we should, and that those maintaining type checking implementations have expressed that we stick to simple composable rules. Deviating from PEP483 requires a special case and a strong reason why the obvious rules from existing theory are insufficient, rather than simple composable rules.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 2

TypeScript reduces both T | any and T & any to any

While I understand there is a rationale that allows this as any is both a top and bottom type, I believe that it is unnecessarily paradoxical to reduce T & Any to Any, and that it throws away a known constraint to do so.

Generally speaking, Unions (And all real-world motivations which have been provided for Intersections in python) are used for their use in defining compatibility. If someone defines compatibility by saying "we accept any type", it's not controversial to for the reduction to be more type compatible. If we define compatibility for an interface as "must satisfy a set of constraints", it would be surprising for one or more of the constraints to be discarded simply because Any leaked in from somewhere as one of the constraints.

I think it is fine to view Any as equivalent to an object in this specific context (and in the case of intersections as TypeVars, base it on variance), but it is not and cannot be broadly equivalent for reasons you've already covered in detail. To me, it's all about the direction in which the constraint flows by treating it as top or bottom. This is closely tied to Any's behavior and variance in my mental model as to why this reduction does not make sense.

Note that both of these rules assume no reducibility. If T & Any always reduces to Any, then:

X is always a subtype (and supertype) of T & Any.
Accessing attribute foo from T & Any always evaluates to Any.

The lack of reducibility while treating the entirety of possible attributes as Any carries a similar issue in that the interface is broader than we know it must be. An object which satisfies Any & T must be an instance of T for a concrete type T, or implement the provided protocol in the case of a Protocol, and may have other unknown interfaces which we have to assume exist because of Any. But for this to be compatible, if T provides a definition for foo, (T & Any).foo should not evaluate to Any, unless that's the provided definition from T or we are still throwing away information.

For these specific concerns, I'm pretty negative on taking a direct lead from TS on this one.

  • I can see a value in not reducing it, but not as you have presented the consequences of that.
  • I can see a value of reducing it conditionally based on how the constraint applies to a consumer (And not modifying Union)
  • I can see a value of reducing Any & T unconditionally to T (with or without modifying Union)

I don't think this requires modifying Union to be consistent

The actual parallel to The automatic removal of Any by treating it as only a top type in this context is the removal of Never from Unions, not the expansion of Any In them.

This is because if we view Any as a top type and only a top type in this context, then it is the identity element for intersections of sets of types for the domain of valid Python types. We can remove a top type from an intersection of types because of it being an effective identity element on the set of possible types. Neither adding nor removing such a type from an intersection of types should ever change the set of types the set contains. (So this logic only holds if this is the rationale for this reduction which we go with, but I do contend that in this context, it is a reasonable interpretation, even if any is defined as both)

Similarly, Never is the identity element for unions of types.

from intersection_examples.

randolf-scholz avatar randolf-scholz commented on September 27, 2024 2

@mikeshardmind This would introduce several inconsistencies on its own, consider:

from typing import Any

class Foo(Any): ...

x = "test"
isinstance(x, Foo) and isinstance(x, int)  # does not raise error at runtime
isinstance(x, Foo & int)  # raises error at runtime??

from intersection_examples.

randolf-scholz avatar randolf-scholz commented on September 27, 2024 2

@mikeshardmind

If Any & Foo doesn't reduce or reduces to Foo, then suddenly you threw away type info of the requirement Foo, and should have just written Any if this was your intent.

How has one thrown away any information if Any & Foo does not reduce?

@DiscordLiz Banning Any would greatly hurt gradual typing, because if some_library exports A, but A aliases to Any, I can no longer write any hinted code that has Intersections with A. Until recently, this was for example the case with the very popular pandas library. (and pandas-stubs is still far from complete). It is needed.

I still don't quite see the issue you have with keeping Any & X as an irreducible form. Maybe you can provide a definitive example why the irreducible approach does not work, and @CarliJoy can pin it to the top.

Ideally, the PEP should have an extended alternative approaches section with examples that explains well why certain decisions were made.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 1

Arguments against An Intersection containing Any becomes Any

This reduces the effectiveness of gradual typing, as it causes a known constraint of the intersection to be thrown away, causing Any existence of Any to become contagion to code using intersections. This could further fragment the ecosystem between fully typed code and not.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 1

Arguments in favour to Disallow Any in Intersection

It is possible that Any & T is inherently unsafe.

Consider the case Any & AsyncCallback where AsyncCallback is a protocol or abstract base class describing the expected interface a coroutine will be scheduled with. It is possible that a type that satisfies Any conflicts with the required interface of AsyncCallback. (This can also be seen as a reason to remove Any and only treat Any as the universal set, rather than disallow)

from intersection_examples.

NeilGirdhar avatar NeilGirdhar commented on September 27, 2024 1

Dear everyone.

I'd like to propose a fourth option, which is that T&Any collapses to Any in some situations, collapses to T in others, and is irreducible in general. I organized this post in a few sections:

  • some background information that I think highlights some pitfalls,
  • a mathematical analysis with Python examples,
  • various consequences of the analysis,
  • an argument against universally collapsing intersection with Any, and
  • a very short conclusion.

I briefly discussed this with Jelle who suggested I post this to this thread. Please let me know if I've made any errors.

Background: Unions with Any

Before I get into the meat of it, here's an interesting discrepancy between PyRight and MyPy:

from typing import reveal_type, Any, TypeVar
def f(x: bool, t: str, a: Any):
    y = t if x else a
    reveal_type(y)  # Pyright: str | Any; MyPy: Any
    y.f()  # Pyright: Error; MyPy: Okay

As we can see, Pyright is not collapsing str | Any to Any. It treats it as irreducible.

I think MyPy is doing the wrong thing here. (I discussed this with Jelle who said it was due to "mypy's general behavior to use the join operator for ternary expressions. We'd like to fix that at some point but it's difficult for compatibility reasons.")

The reason this is incorrect is for the same reason it's an error to do this:

from typing import reveal_type, Any, TypeVar
def f(x: bool, t: str, a: int):
    y = t if x else a
    reveal_type(y)  # str | int
    y.strip()  # Cannot access member "strip" for type "int".

Thus, we can see that in general T | Any is not Any. Although, there are cases where it can be collapsed to Any.

Background: Definition of Any

It's also worth exploring what Any really means since that seems to be a source of confusion. Any is not:

  • the universal base class object,
  • nor is it the union of all types (the "universal set" as mentioned in a comment above) Omega = T_1 | T_2 | ... for all types T_i.

That's because its supertypes Sup(Any) = Any whereas Sup(Omega) = ∩_i Sup(T_i) = Sup(object) = object.
This is why we cannot reason from set theory.

Analysis

For any type T, consider its subtypes Sub(T) and supertypes Sup(T). For an intersection, A & B, we have:

  • Sub(A & B) = Sub(A) ∩ Sub(B)
    This means:
def f(y: A & B): ...
if isinstance(x, A) and isinstance(x, B):
  f(x)  # okay.
  • Sup(A & B) = Sup(A) ∪ Sup(B)
    This means:
def g(a: A): ...
def h(b: B): ...
x: A & B
g(x)  # okay
h(x)  # okay

So whereas it is true that each component of the intersection narrows the set of Sub(A&B), it increases the set Sup(A&B). Intuitively, this means that each element is an extra constraint on what you can pass to a function that accepts an intersection, but is a broader interface of what you can do with a variable of type A&B.

We can now deduce the rules:

  • Sub(A&C) = Sub(A) ∩ Sub(C)
  • Sup(A&C) = Sup(A) ∪ Sup(C)

For ordinary types A and C<A, then we get:

  • Sub(A&C) = Sub(C)
  • Sup(A&C) = Sup(C).
    And thus, the regular rule that A&C=C is true for ordinary types.

However, this is not the case when C is Any! Then we have

  • Sub(A&Any) = Sub(A) ∩ Sub(Any) = Sub(A)
  • Sup(A&Any) = Sup(A) ∪ Sup(Any) = Sup(Any) = Omega!

Thus, in general, A&Any is an irreducible form.
But as a parameter of a function (or similar subtype question), we have

class A: pass
class B: pass
def f(x: A&Any): ...
f(A())  # Okay, of course.  A is both a subtype of A and Any.
f(B())  # Error, of course.  

As a parameter to a function (or similar supertype question), we have

class A: pass
class B: pass
def f(b: B): ...
x: A & Any
f(x)  # No problem!  This is because intersection broadens the interface to everything.

Why does intersection broaden the way a variable can be used?

This seems paradoxical, but intersection broadens the way a variable can be used in exactly the same way that its dual, union, narrows the way a variable can be used.

Consider:

class A: ...
class B(A): pass

def g(a: A): pass
def h(b: B): pass

def f(x: bool):
    y = A() if x else B()
    reveal_type(y)  # A | B
    g(y)
    h(y)  # Argument of type A | B cannot be assigned to parameter of type B

Given a union, only those operations that can be done to every element of the union can be done to the union.

This is why any operation that can be done to any element of the intersection can be done to the whole intersection. We can test this in Pyright:

class A:
    def a(self): pass
class B:
    def b(self): pass

def f(x: A):
    assert isinstance(x, B)
    reveal_type(x)  # subclass of A and B; i.e., A & B
    x.a()  # Okay!
    x.b()  # Okay!

Thus, the intersection A&B can do anything that any of its components can do.

Consequences

In a function, the desired behavior of collapsing Any is correct:

def f(x: T & Any) -> None: ...
reveal_type(f)  # (T) -> None

(This is what everyone wants.)

If a variable ends up with type T & Any, for example as the output of a generic function, it should be usable everywhere:

def f[T](x: U) -> T & U: ...
x: Any
y = f(x)
reveal_type(y)  # T & Any

def g(s: str): ....
g(y)  # Always passes!

We need some kind of special case for isinstance(x, T) when x has type Any.

def f(x: T, y: Any):
  assert isinstance(x, U)
  reveal_type(x)  # Ideally, T & U; currently, PyRight can sometimes do this; MyPy gives U.
  assert isinstance(y, T)
  reveal_type(y)  # Must remain T;  Cannot be Any & T or type checking will be severely impacted.

I think the easiest thing to do would be for isinstance to intersect types unless one type is Any, in which case it replaces it. Some thought should go to this to ensure that no weird cases pop up. In the worst case, a new parameter to isinstance may be necessary to select intersection or replacement. This addresses the "contagion" effect mentioned above. It also means that T & Any will be extremely rare, which I think is what everyone wants as well 😄.

Along the same lines, there are other ways to narrow types, including singledispatch, match statements, and function overloads. I think the logical thing to do is to allow Any & T to leak into them and out of them. If this becomes problematic, a user may be able to use isinstance to eliminate Any.

Danger of unconditionally collapsing T & Any to T

Consider:

class A:
    def f(self):
        pass

class B:
    pass

T = TypeVar('T')
def g(x: T) -> T & B:
    assert isinstance(x, B)
    return x

def j() -> A:  # Case 1: return type annotated A; Case 2: Any
    return A()
def k() -> Any:
    return A()

y = g(j())
reveal_type(y)  # A & B
z = g(k())
reveal_type(z)  # Any & B

y.f()  # Allowed.
z.f()  # Only works if Any & B is not collapsed to B.

If intersection collapses types automatically, then z.f() fails even though all that's changed between y and z is that the argument to g has been changed from a type to Any. It would be extremely weird for broadening a type to Any to create type errors! These errors can be very hard to track down since they can be totally unlocalized (one library can change a type to Any, and the error can pop up miles away in user code that doesn't even use that function directly). Thus, these errors would be extremely pernicious.

Conclusion

T & Any is irreducible in general. Its subtypes are the subtypes of T, and its supertypes are the universe of all types.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 1

Intersection broadens the allowable parameter types for which x can be passed for the exact same reason that union narrows it. I will add a section on a comparison to unions then to make all of the rules clearer and more obvious. Since unions are implemented today, you will be able to verify these rules yourself.

This is entirely wrong, you are conflating types and their interfaces here, as was already explained to you in the main thread here: python/typing#213 (comment)

I didn't say object was the universal set. I said that object is the universal base class. This is a technical term. I'll add a link to the wikipedia article.

Except that it isn't, you can see this with the type of object being type, and the type of object() being object, I'm quite aware of how the type system works and don't need you to link me a Wikipedia article on the matter.

Where do you see in PEP 484 that Any is the universal set? Perhaps it would be best to clarify using mathematical language exactly what you mean by "universal set". It's not the union of all types Omega for the reason I described above. Do you mean something else?

I've edited in more detail. The term universal set of objects is not used, but definitions consistent with it are. However there's another part of the definition which is problematic by both saying Any is consistent with all objects, and also saying Any has all interfaces. This is quite literally self-contradictory the moment we consider that interfaces can have multiple conflicting definitions.

from intersection_examples.

DiscordLiz avatar DiscordLiz commented on September 27, 2024 1

There's no contradiction. Why don't you show some code that is ambiguous?

I don't know why you are still arguing this @NeilGirdhar but type theory is pretty clear here and you keep asking for things you've already been provided by people being far too patient with you.

from intersection_examples.

NeilGirdhar avatar NeilGirdhar commented on September 27, 2024 1

I don't know why you are still arguing this @NeilGirdhar but type theory is pretty clear here and you keep asking for things you've already been provided by people being far too patient with you.

I only posted because Jelle looked my post over and said it looked right. Let's try to keep the discussion civil.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 1

The way you've been using type doesn't match what is returned by reveal type in Python, which I think is the source of the confusion. I imagine, you would say that T | Any is not a type either? And yet it's returned by reveal type. So this is the way I've been using it.

Correct, I consider T | Any a set of types defined by the union of types which satisfy either T or Any, having the knowable interface shared between them.

I think the fact that python's terminology is so loose compared to more well-defined type theory creates more situations like this than I'd like.

Allow intersecting types, but only allow the intersection to have an interface of compatible methods.

I believe this should already be considered a requirement, but I disagree about it solving Any with Any's current definition.

I think it's possible to redefine Any and Never in terms that resolve this without needing to remove the usefulness of Any and without disrupting any current uses (because the problem only arises currently with the not yet implemented intersection, and other things that have not been accepted yet, such as Not[T])

A definition for Any which would work:

The set of all possible types. Type checkers should allow any operation on things which are only known to be a member Any, as this is the default state of objects. Type checkers may restrict the allowed interface or the known type as more information becomes known about the object through narrowing via instance checks, type checks, or intersections with other types.

And a corresponding definition of Never:

The empty set of types. Functions which always raise never return anything. Objects which have no valid types should be marked with this, and code which is statically typed and attempts to use objects with this type should be considered as an error as this code is either unreachable or there has been a logical error in the types given to the analysis tool

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 1

What are the consequences of this defintion for the examples above?

For the case of Any, I believe it leaves all existing code exactly as is due to the restriction on type-checker behavior included. Type checkers already coerce Any to specific types via isinstance, so saying that this is desirable behavior doesn't change anything either. Leaving it as may leaves room for type checkers to not be considered in error if they have an opinion about not narrowing something for either performance, edge case, or any other reasons not considered at this time. This language could be restricted more in the future if python's type system is ever considered to be near-complete, but currently I think may leaves a good amount of freedom, while still having a strong definition of what behavior is allowed here.

The behavior on intersection would result in the type Any & T being reducible to T as rather than being all compatible types, it's merely the set of all types. I don't have a good answer for the impact on the edge case you presented near the start with Any, but I believe that this case is likely to be rare, and only happen in cases where the required interface of T is what is cared about, as users of Any have no benefit for using intersection to say it conforms with other interfaces as Any is already treated that way.

It may be something that type checkers should warn is happening by default or include a configuration setting to inform users is happening

The behavior of Never with these definitions would be entirely as it currently is. The current use of Never is already consistent with the set-based definition.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 1

most importantly, instead of asking "what type is produced by T & Any", we should ask what "type" and "interface" is produced by T & Any where T is a set of types and an interface, and Any has a set of types and an interface.

It may also help to provide some examples, so starting that.

Intersection A & B

Type: Intersection of A and B. If A and B are both concrete types and not protocols, this must be a subclass of both. If either is a protocol, the protocol must be satisfied. If A or B is already an intersection, the intersections are combined.
Interface: The interface of A and B are both available and must not conflict.

(A & B) & C is equivalent to A & B & C

Union A | B

Type: Union of A and B. If A or B are already Unions, The unions are combined

Interface: The attributes, methods, and properties shared between A and B which have compatible types. If A and B use an identifier in conflicting ways, the accessor of the object cannot know which way it is used without checking.

(A | B) | C is equivalent to A | B C

(The below may seem contrived, but I think including it can assure the definitions we end up on remain consistent)

The order of intersections and unions matters

Unions of Intersections (A & B) | (B & C)

Type: a union of the intersection A & B and the intersection B & C
Interface: The shared compatible interface of 1) the combined interfaces of A and of B which must not conflict with each other and 2) the combined interfaces of B and of C which must not conflict with each other

(A & B) | (B & C) is equivalent to (A | C) & B

Intersection of Unions (A | B) & (B | C)

Type: Intersection of the unions A | B and B | C
Interface: The interface of both the unions A | B and B | C, where each of those unions has the shared minimum nonconflicting interface, and no permutation of union elements results in a conflicting interface for the intersection*

(A | B) & (B | C) is not equivalent to B & (A | C), but to (A & C) | (B & (A | C))

* A contrived case, to be sure, but...

class A(Protocol):
    def this(self: Self) -> Self:
        ...

class B(Protocol):
    def that(self: Self, other: T) -> T:
       ...

class C(Protocol):
    def this(self: Self) -> None:
        ...

Problematic = (A | B) & (B | C)

There are two possibilities here

  1. This reduces to: (A & B) | B | (B & C) due to the incompatibility between A & C
  2. This is considered unsatisfiable as a set of types

I believe 2 to be the correct interpretation. 1 results in a different interface than may be expected.

Why this matters when considering Any

This is a set of ordering and satisfiability constraints that works with the proposed update to the definitions of Any and Never. Should these orderings and constraints be considered undesirable, we may be back to the drawing board on handling Any

from intersection_examples.

tomasr8 avatar tomasr8 commented on September 27, 2024 1

Thanks for the writeup @mikeshardmind! FWIW, I'm also working on trying to figure out how to combine intersections with unions and I think we mostly agree except for the last case - on the surface it seems like having the usual laws like distributivity satisfied is desirable, but let's maybe discuss it over at discord, so we don't pollute this issue :D

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 1

I was thinking about chaing the language of "type" and "interface". Let's call this the "literature type".

I'll give some examples of why I don't know that more definitions are strictly helpful and that it's more about understanding that there are some things we should not consider a type, but a specification of what we know about possible/allowed types.

The moment many of these are thought of this way, the distinction between a concrete runtime type and the interface we can semantically and statically show to exist becomes automatic.

In many cases, we don't have a concrete exact type. This is true even without considering Unions or Intersections when it comes to protocols. An annotation of a protocol is not a type, it's an expectation of a required interface and is sometimes referred to as structural typing (which is distinct from typing based on composition or inheritance)

In many places, we are not specifying a type, but what we know about the possible types or interfaces an object might have.

A protocol as an annotation is a specification about a known interface but says nothing about the concrete type other than that instances of it which satisfy it must implement that interface.

This is a protocol taken from real library code. The intent is that a user could provide most choices for a numeric container, including those that I might not have knowledge of as the library author, such as a custom vector class. The library does abstract math and simplification of terms on objects for which these are well-defined operations. This includes symbolic simplifications and in the future higher-order operations on functions of functions.

class SupportsBasicArithmetic(Protocol):
    
    def __add__(self: Self, other: Self) -> Self:
        ...
    
    def __sub__(self: Self, other: Self) -> Self:
        ...
    
    def __mul__(self: Self, other: Self) -> Self:
        ...
    
    def __truediv__(self: Self, other: Self) -> Self:
        ...

    def __eq__(self: Self, other: Self) -> bool:
        ...

Code that accepts parameters annotated with this, has no knowledge of the type of the object they are accepting, only of a required portion of the interface.

Additionally, writing

x: int | Fraction = 1

The type of x at that point in time is int (or Literal[1], but the annotation states that this isn't the intent), but it is annotated to specify that it could also be assigned as a Fraction.

Saying that x has a type of int | Fraction there feels wrong to me in terms of terminology, that's only what types (plural) x is allowed to have.

As to specific things you wanted defined:

a type returned by reveal_type,

With this model of thinking about it, reveal_type is a request to the type checker for what it knows about the potential types of an object (and type(thing) is still the canonical way to get the actual type at runtime)

the type of a parameter annotation, and
the type of a variable annotation.

The annotation is a specification of what must be true about the type of the object for code relying on this annotation to be sound from the perspective of static analysis.... a set of specifications about the allowed type(s) and interface(s) required.


Edit: Actually, we probably still need more definitions accepted in some regard, and many definitions could stand to be made more clear, more rigorous, and/or more consistent, I just don't think we need multiple definitions for type. We just need to consider where we don't have a type, but a description of potential types.

We should still have interface as a defined term. I believe it was defined well in the discussions about protocol and some of the discussions about variance, but I don't think there's a good canonically accepted definition of interface for python currently

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 1

(answering parts of this out of order)

I think we should consider consistency with Any as a mechanism to selectively disable type checking where type checking is impractical.

python/typing#213 (comment)

This is more likely to break people through Any being contagion than intentional use. If you want to disable type checking, you can write Any instead of Any & T, but if instead Any always marked for disabling type checking, other code which interacts with types you export would find Any leaking out in this manner as destructive on their types.

Python's Any has a definition in typing PEPs and is extremely commonly used. I'm not sure I totally understand the consequences (to actual code) of the change of definition that you're proposing. Do you want to elaborate?

If (and I understand that it's a big if to change this definition given how established it is) we were to change the definition, I would only be willing to argue it if the definition change is done in such a way that the actual impact is only on logical consistency and done in a way to not change any existing established behavior of Any.

If you go your route of balking at the logical contradiction then Any & Any is not even going to be Any. You don't think that's even more paradoxical?

I think that's the correct logical conclusion currently and that that is paradoxical. This is why I'm in favor of updating the definition.

Any's current definition leaves a lot to be desired, and I want to see python able to benefit from established theory. Any's current definition makes that challenging without special casing it.

Any exists because it is impractical for everything to be typed for a variety of reasons including limitations of the type system, developer time, use as generic parameters.

If I could go back in time and argue only one change to the original specification, it would be not to have Any and to instead have a # type: untyped directive, to just allow things that are not well-typed to be marked as not for inference, and to not interact with typing directly. With that said, I understand the need for Any, and am not going to attempt to argue for it's removal, that's a non-goal of mine.

from intersection_examples.

NeilGirdhar avatar NeilGirdhar commented on September 27, 2024 1

This is more likely to break people through Any being contagion than intentional use.

Right, we need to be practical. That's why it's important to look at examples, I think. It's hard enough to create an intersection with Any, so I wonder how often this will happen in the first place. But when it does happen, it can create weird type checking errors that appear when you introduce Any. So there's a balance.

it would be not to have Any and to instead have a # type: untyped directive

The problem with that i that there are generic parameters to functions or classes, for example, that can be Any or a concrete type, so the typed directive would essentially be conditional. And you can in general have a stretch of code that is written agnostically to surrounding code that becomes typed or untyped based on the use of Any in the surrounding code.

from intersection_examples.

CarliJoy avatar CarliJoy commented on September 27, 2024 1

Did this but as written there already - have no idea how that should work in practice. Could you paste some examples or even better an algorithm how a type checker should handle intersection with Any?

a: int & Any
a = 1   # valid
a = "Help" # invalid
a.foo() # valid

def foo() -> Any&int:
    return "a"  # invalid

In short: So for assignments the Any is ignored but for subtypes the it is considered?

from intersection_examples.

NeilGirdhar avatar NeilGirdhar commented on September 27, 2024 1

In short: So for assignments the Any is ignored but for subtypes the it is considered?

All of your examples look right to me. Are there any other examples that we need to reason about?

Did you notice that PyRight currently (and rightly) treats X | Any as irreducible? See above for examples of that.

from intersection_examples.

DiscordLiz avatar DiscordLiz commented on September 27, 2024 1

I wonder what's Python's equivalent for TS's unknown, and how Python's Any differs from TS's any.

Python's Any is much closer to being equivalent to TS's unknown than TS's Any.

from intersection_examples.

DiscordLiz avatar DiscordLiz commented on September 27, 2024 1

For the record, as got discussed briefly in a discord call while I was working on an adjacent paper to come up with more formal definitions which are accurate to current use, we were able to show that it holds for a top type as well as the universal set. by reaching for category theory for answers rather than set theory applied to types.

If We can show that Any is functioning as a top type for python's type system in the context of an intersection, it is still logically removable from that intersection by being a functional identity element on the operation of intersection over categories of types, for the domain of valid python types.

While it was a highly productive conversation, that's a big if still. All use cases I could find in support of and motivating Intersection, if Any leaked in, it could only leak in while functioning as a top type, but that's a few steps away from being confident that either that's all that is possible, or alternatively, that even if it isn't, type checkers can cheaply detect this.

You also have your work cut our for you in providing both the formal logic and a matching a corresponding explanation of why it works that is more accessible.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 1

I could have sworn I double-checked that first... Nevermind, it should hold for object as well.

I need to double-check my work later on a few other things tangential to this, but I'm positive it would have no impact on the Any case, as that was done by hand (And using more abstract rules than some of the specific things I checked later) I seem to have misrepresented python's type in coq, so the 4 or so interesting things I checked after that need to be rechecked after I fix the definitions.

from intersection_examples.

CarliJoy avatar CarliJoy commented on September 27, 2024 1

Can we conclude -> Any is not reduced?
As this created a huge discussion, can someone summarize in short an comprehensible words the decision and how we got here?
There is always a section in the PEPs of Recjected Idea -> I think this belongs into it.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 1

Can we conclude -> Any is not reduced? As this created a huge discussion, can someone summarize in short an comprehensible words the decision and how we got here? There is always a section in the PEPs of Recjected Idea -> I think this belongs into it.

Maybe. I'll make sure a summary happens in any outcome, I heavily appreciate PEPs that go through why things were done a specific way.

I'm still looking into formal correctness right now, and there's still an extremely strong argument to forbid Any from intersections all together which comes from group theory, but I need to be prepared to argue it pretty carefully with everything lined up because this would be saying that gradual typing can't participate in Intersections.


Edit: The same group theory says we shouldn't allow Any as defined in python in: Unions of types, Intersections of Types, any kind of acyclic topology or graph based on subtype relationships, and (ultimately, if you look enough) at all period. All existing theory says that Any as defined cannot exist in a well-defined type system. Since we clearly do allow any in some of these contexts, we need a formalization that allows it, special cases it, or which has rules that ignore the contradictions of Any.

For completeness' sake, a few approaches utilizing concepts from set theory as applied to types, homotopy type theory, and category theory all also had some logical inconsistency relating to Any, some of them showed up prior to Unions and Intersections.

Since the inconsistencies are so everywhere, I will continue what I have been doing, advocate for the most correct thing we can do given the circumstances, and continue forbidding the use of Any in code I own. We can still have a pragmatic definition for Intersection.

from intersection_examples.

NeilGirdhar avatar NeilGirdhar commented on September 27, 2024 1

② Is compatible with the definition and would logically imply that Any | X = Any and Any & X = X.

2 is not compatible with the documentation.

Most of the discussion in this thread was essentially created by the misinterpretation of Any as your 1 or 2, and I went over both of these in my comment in the section "Background: Definition of Any" in which I go over both cases.

"Any = TypeVar('Any')" — Any is an ad-hoc variable of unknown type, assumed to be used compatibly.

You can say this about any type:

int = TypeVar('int', bound=int)
A | B = TypeVar('X', bound=A | B)
Any = TypeVar('Any', bound=Any) = TypeVar('Any')

Although the type variable interpretation might be a good justification for conversion of a type toAny never creating type errors.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 1

@randolf-scholz @NeilGirdhar

I brought up some additional information in the less formal discussion about some of this in discord, but hadn't typed it up for github yet.

It's provably impossible for Any to exist as documented in a coherent type system with transitive subtyping behavior.*

Here it is in 1 way (more of these are shown in the discord conversation)

Any :> T1 :> T2 ... :>TN :> Any
~ TN :> T1
results in an immediate contradiction if transitivity is a requirement for the subtyping relationship :>

We can also show that this is the current behavior of type checkers, matching the definition, and we cannot simply say the definition should not allow this without being highly disruptive to gradual typing.

def f(x: int) -> Any:
    return x

def g(x: Any) -> int:
    return x

I'll probably be writing a draft PEP to amend the definition of Any to have it clearly be stated as the exception, clearly note that it is not a type in the type system, but an escape hatch from it for the purposes of gradual typing, document its behavior extensively, and also recommend that all future typing PEPs have a section on how they interact with gradual typing, even if it is just a note that there are no special considerations for gradual typing relating to the PEP.

The only logically sound answers here are to exclude Any from intersections or accept that Any cannot be logically consistent, but that we can have how it behaves in an intersection documented.

Trying to claim that Any has fully logically consistent behavior in any form here will only lead to further disagreements based on the lens people are looking at Any from.


* Edit: To clarify, PEP 483 already says that Any doesn't follow this transitive behavior, it's not new. I'm just recommending that we keep in mind that Any is the exception here.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 1

③ Is compatible with the definition* and would logically imply that Any | X and Any & X are irreducible forms.

Also brought up in the discord briefly. While this appears to be the "most consistent" approach, it's functionally problematic to do this without also defining "what is the expected structural interface available after something satisfies this type", and "what is the assignability of this type?"

The structural interface appears to be either equivalent to Any by definition of Any and compatibility, or if we are attempting to not discard the known constraint provided by T, There are possible definitions for (Any & T).foo which widen the compatible use of T.foo with the exception of attributes that are non-subclassable types, or any other types which cannot be expressed compatibly to T.foo.

The assignability though still appears to require satisfying T.

If Any is allowed in intersections, Then we really need clear handling for each bit of this, and it is going to be somewhat arbitrary due to the inconsistencies of Any, so we're basically picking for behavior that is most helpful to users.

And If we're acknowledging that this is somewhat picking for behavior arbitrarily...,

  • Either banning Any in intersections or saying that type checkers must provide a configuration to warn for uses of Any (mypy already does the latter) both provide more safety in line with the reasons for static typing in the first place. Banning from intersections may be a cause of friction though. A mandated warning configuration (off by default is fine) for the use of Any to allow users to be warned of its inconsistencies at least makes many of the surprise cases below less surprising.

  • Any & T being Any would be the most surprising to users who do not have a deep understanding of the type system, the history of gradual typing, and the exceptional behavior of Any. (unless they just blindly accept that this is what TS does, so python doing it is "fine".) It doesn't match intuitive thought that comes up from sets of types, or of categories, or the other uses of & and | in python. (and it's "ok" that it doesn't, Any can't, but this would be picking arbitrarily something which doesn't)

  • Any & T being T would be more restrictive, but a somewhat pragmatic outcome as it doesn't ban Any from intersections breaking places where Any may leak in from partially typed library code, while also retaining the additional guard rails people want from static typing. If people intentionally want the behavior of Any & T as Any in their own code, they can always just write Any, and it may encourage more people to use protocols at least defining the parts they know and rely on if they can't fully type it still.

  • Any & T being Any & T, and Any & T not being equivalent to Any | T or Any behaviorally, requires users to understand a complex exception whenever considering the ramifications of this type. If instead we have it behaviorally equivalent to Any | T or to Any (some interpretations of the behavior of Any & T would result to one of these) we would then circle back to arguments about it having surprising behavior to users and that there are already existing ways to specify that.

  • Alternatively, to all of this, there's also the option to treat intersections as strictly as structural in nature only.

from intersection_examples.

ippeiukai avatar ippeiukai commented on September 27, 2024 1

I'll probably be writing a draft PEP to amend the definition of Any to have it clearly be stated as the exception, clearly note that it is not a type in the type system, but an escape hatch from it for the purposes of gradual typing, document its behavior extensively, and also recommend that all future typing PEPs have a section on how they interact with gradual typing, even if it is just a note that there are no special considerations for gradual typing relating to the PEP.

Following this issue in the context of intersection, I think this is a great idea. Any is so special that it deserves its own PEP. This thread is all about Any and became less and less about Intersection. Why not leave the precise treatment of Any out of the Intercection’s spec as much as possible? Perhaps only stating a few things that it really should or should not allow.

The PEP for Any’s definition can discuss how Any should interact with Union, Intersection, and other constructs freely. Type checkers can choose to be looser or stricter in terms of reducing Intersection with Any until the other PEP comes in, at which point treatment of Union with Any can be also looked at.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 1

Here, let’s assume some_fn and some_obj are provided by two different libraries both currently untyped. What if some_fn got typed before some_obj gets typed? I expect I can do anything to a that I can do to some_obj. I don’t expect a to be reduced to mere SomeAddedTrait.

This is exactly the reason I think Any shouldn't even be allowed in an intersection. There is no outcome with it in it that wouldn't be surprising to someone, no matter how well it is documented.

  • If you surprise the gradual typing user with more they have to fix, you reduce the benefit of gradual typing for them.
  • If you surprise the person expecting their code to be typed, they lose pre-runtime safety checks they expect to have.

Neither of these is good, and the only outcome that avoids both is banning Any from the intersection.

If you instead error and inform them of the ambiguity, there are a couple solutions available to the person the error is presented to, at least one of which should work.

  • Use Any instead of the intersection until the operands are typed (building the gradual typing by typing dependencies first...)
  • Type the side of the intersection that would result in Any with a protocol indicating how you intend to use it instead, being compatible with both the current state and a future state where it is typed.

There's also the matter here that banning Any from the intersection should not disrupt gradual typing.

  • The person writing the intersection has to be the side with the typed code already. If they aren't, then They can just write Any
  • There are two ways for Any to make it into an intersection from untyped code from the perspective of typed code.
    1. By importing a name which is aliased to Any and directly including it.
    2. By TypeVar use.

In the first case, this should be a pretty clear case of "they probably don't know they are importing something aliased to Any, because an intersection with Any is lossy on type information no matter how we try to resolve it"

In the second case, by the necessity of how generics and type variables are scoped, the type var has to belong to the same codebase as the intersection, and the typed code user can set the bound of the typevar to object or to a protocol based on what they need from that side of the intersection (rather than to Any)

from intersection_examples.

DiscordLiz avatar DiscordLiz commented on September 27, 2024 1

@randolf-scholz

@ mikeshardmind @ DiscordLiz What do you 2 even mean by banning? Do you suggest that int & Any raises an error at runtime, like in your example of using a generic as a bound of TypeVar?

What @ mikeshardmind clarified is fine for a definition of banning it.

You don't need it. The person with typed code can specify what they expect and still have things that are typed as Any work via Any's compatability definition.

# untyped code
x: Any = SomeLibraryType()

# typed code
def foo(param: SupportsAbs & SupportsIndex):
    # Here we can assume abs(param) is fine, we don't lose typing the body


foo(x)  # It's Any, it's compatible, we don't lose being able to use this.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 1

What? Why? Any & T.foo is still irreducible and not equal to Any. Only things compatible with T.foo would be accepted.

If that's the actual answer people reach on it, that's fine, but here's a quick counter example that complicates this issue because of how pervasive Any becomes the moment it is included

class X:

    def foo(x: int) -> int:
        ...

class Z:
    def foo(x: int | str) -> int:
        ...

Well, what if T.foo is the definition provided by X? There are "compatible" definitions (such as the one in Z) that Any could be compatible with that would be unexpected use.

Any & X makes X.foo(some_str) compatible use. This is what I mean by it having surprising effects for people expecting their code to be typed, as it is (With some limitations) very similar to just reducing to Any

from intersection_examples.

randolf-scholz avatar randolf-scholz commented on September 27, 2024 1

@mikeshardmind I retract the sentence Only things compatible with T.foo would be accepted. that was obviously wrong. The question rather becomes how type-checkers shall treat irreducible forms Any & X.

Again, I think one should let type-checkers decide how strict they want to be, instead of outright banning it.

  1. I think we can agree that (Any & X).bar should always be fine if X does no supply bar
  2. If X provides foo, then type-checkers could provide a strictness flag that (Any & T).foo aliases either to T.foo or Any.

But what's the need to set this in stone as part of the PEP? PEP484 didn't formally specify how to handle Union[Any, X] either.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024 1

Any & X makes X.foo(some_str) compatible use. This is what I mean by it having surprising effects for people expecting their code to be typed, as it is (With some limitations) very similar to just reducing to Any

I think that is what you want in practice. Having Any there is supposed to broaden the interface. You should examine in which circumstance Any could actually be there.

It isn't. If I wanted this, I have that right now by just typing Any. To what end would I ever write the intersection if that was what I wanted? I do not want to throw away interface details, and I largely assume that intersections only make sense between already compatible interfaces. The fact that Any clobbers every interface is just not useful in this context.

from intersection_examples.

randolf-scholz avatar randolf-scholz commented on September 27, 2024 1

@mikeshardmind

If I wanted this, I have that right now by just typing Any. To what end would I ever write the intersection if that was what I wanted?

It happens if you import A from some_library and A happens to alias to Any. I already gave pandas as an example of a library which did this. They aliased Series and DataFrame to Any because they didn't have stubs yet, and these were very complex classes that would raise too many false-positives otherwise.

It still makes sense to type-hint code with these classes for several reasons:

  1. typing-stubs might become available in the future.
  2. documentation engines like sphinx read type hints and add them to documentation.

from intersection_examples.

NeilGirdhar avatar NeilGirdhar commented on September 27, 2024 1

Only from the side of the intersection author, not the untyped code as already mentioned above.

I don't know what you mean by this. The intersection can be in library code that accepts types from the user, or the library can provide types that the user intersects. Two libraries can even interact with each other, and synthesize an intersection that the user uses.

So, no, you can't just say forget about intersections with Any. They can happen for all kinds of reasons outside of user control.

And returning an error is not reasonable here. In part because silencing the error could be extremely problematic. The error has to be associated with a line number for the type: ignore to be on.

from intersection_examples.

NeilGirdhar avatar NeilGirdhar commented on September 27, 2024

This is entirely wrong, you are conflating types and their interfaces here,

I think the fundamental disagreement here is that you are identifying a type by the set of subtypes only Sub(X)? The inferface (Sup(X)) is also important, and it's affected by union and intersection. This is why X | Any is irreducible in Pyright today.

Except that it isn't, you can see this with the type of object being type, and the type of object() being object,

I think it's fairly well accepted that object is the universal base class since isinstance(x, object) is true for all x, which is the definition.

The term universal set of objects is not used, but definitions consistent with it are.

Normally, in mathematics, the universal set is the union of all elements. Any is not union of all types Omega, so I assume you mean something different.

there's another part of the definition which is problematic by both saying Any is consistent with all objects, and also saying Any has all interfaces.

I don't think this is problematic. This is why I prefer to consider to type expressions X and Y to be equivalent only when Sup(X) == Sup(Y) and Sub(X) == Sub(Y).

I think at heart, the main point of contention is that are thinking about intersection only in parameter lists or similar subtype questions. In those situations, you are right that you can reduce T & Any to T. However, those are not the only situations in which intersections pop up. For example, see the consequences example I gave. Similarly, unions T | Any do reduce to Any in parameter lists and other subtyping questions, and similarly, they are irreducible in other situations (see the example in the background section).

So, I'm not contradicting you in those parameter list situations. However, there are other situations to consider, and it is possible for a revealed to type to meaningfully have type T & Any, which would be irreducible.

from intersection_examples.

NeilGirdhar avatar NeilGirdhar commented on September 27, 2024

this is the definition of an interface, not a type. Y

When we talk about types, they can occur in a variety of places. One of those places is that they can be associated with a variable as a "static type". These static types must encode all of the information that we have about the static type, including the interface (the way in which the variable can be used).

This discussion is about how type checkers should manipulate intersections with static types. Because the interface matters, for the static type of variables, you cannot collapse T&Any to T.

Ultimately, type checkers have to report the static types with reveal_type. And so the types returned by reveal-type will sometimes report T&Any, which is distinct from T.

Maybe for you a type checker's "static type" is not a "type". But this is one sense in which I am discussing types.

Another place types come up is in parameter lists. In that location, T&Any does collapse to T. But type checkers deal with types in a variety of places—not just parameter lists.

in conflict with the other operands as they can be defined as incompatible, and it has all interfaces, compatible versions of them and not.

That's not a conflict for the same reason that T | Any is not a "conflict". You can pass whatever you want to a function that accepts this type.

If it comes between implementing a typing feature in a way that is logically broken, and not allowing some code that already isn't typed to make use of a typing feature, the latter is preferable.

To be honest, I don't think anything is logically broken.

from intersection_examples.

NeilGirdhar avatar NeilGirdhar commented on September 27, 2024

You can't do whatever you want with T | Any

Sorry, I meant the dual of "pass whatever you want". You can do whatever you want with T & Any the same way that you can pass whatever you want to a function accepting T | Any.

The difference between types and their interfaces matters.

I think we're going in circles again. The static type includes all of the information known about the type. You seem to be dividing this information into something you're calling the "type" and the "interface". I think your definitions roughly correspond to Sub(X) and Sup(X), but it's hard to say. Nevertheless, when I use the word "type", I mean the static type (as returned by reveal type). The static type must encode what you call the "interface"—it is not distinct.

The logical contradiction has already been presented, and you have not given an explanation which satisfactorily shows it isn't in contradiction.

Just what I said above: "You can do whatever you want with T & Any the same way that you can pass whatever you want for T | Any." There's no contradiction. Why don't you show some code that is ambiguous?

from intersection_examples.

NeilGirdhar avatar NeilGirdhar commented on September 27, 2024

An intersection must satisfy all of its requirements, conflicting interfaces create a situation where there are no types that could possibly satisfy the requested interface. Any is defined as having all interfaces. So while it may be safe to interact as if any has any version of that interface, it would be impossible to correctly say that Any has a non-conflicting interface with non-infinitely defined interfaces.

Oh I see. That's a great point.

Let's see what PyRight does for this:

class A:
    def f(self, x: int): pass
class B:
    def f(self, x: str): pass

def f(x: A):
    assert isinstance(x, B)
    reveal_type(x)  # subclass of A and B
    x.f(2)  # Okay!
    x.f('s')  # Fails; expects int.

So it seems that PyRight is not really intersecting, but rather creating a virtual subclass:

class X(A, B): ...

What are the possible solutions?

  • Forbid intersecting types that have conflicts including __init__. That is going to be seriously problematic since derived classes very often conflict with their parents' __init__ method.
  • Forbid intersecting types that have conflicts other than __init__. That could be reasonable.
  • Allow intersecting types, but only allow the intersection to have an interface of compatible methods.

My preference is probably for case 3 as the most permissive. It would allow you to do something like this:

class A:
  def f(self): ...
  def g(self) -> str: pass

class B:
  def f(self): ...
  def g(self) -> int: pass  # g is different

def g(x):
  assert isinstance(x, A) and isinstance(x, B)  # No subclass like this exists yet, but we may yet create one.
  x.f()  # No problem
  x.g()  # Error!

class C(A, B): ...  # C.g is A.g
class D(B, A): ...  # D.g is B.g
g(C())
g(D())

Type also means the obvious thing here.

The way you've been using type doesn't match what is returned by reveal type in Python, which I think is the source of the confusion. I imagine, you would say that T | Any is not a type either? And yet it's returned by reveal type. So this is the way I've been using it.

Treating Any as a singular type with an infinite interface is the mathematical problem here. Python does not have the means to have multiple methods with differing arity like some other languages, so the infinite interface contradicts other interfaces in the case of an intersection where any shared interfaces must be compatible

Right, that's a fair point.

In cases 1 and 2 above, Any should be forbidden from intersection. In case 3, Sup(T & Any) = object. (This resolves your conflict by basically saying that you can't use the variable except as an object.)

My preference with Any though is to let it leak through so that the interface fo T & Any is Any for the reason described in my long post at the top under "Danger".

from intersection_examples.

NeilGirdhar avatar NeilGirdhar commented on September 27, 2024

I think the fact that python's terminology is so loose compared to more well-defined type theory creates more situations like this than I'd like.

Right, now that I understand how you're using terms, I can just use your vocabulary.

So my preference is for T & Any to have the interface of Any, and what you're calling the "type" of T. It should be reported as a static type as irreducible T & Any since we need to keep track of all of information.

A definition for Any which would work:

What are the consequences of this defintion for the examples above?

from intersection_examples.

NeilGirdhar avatar NeilGirdhar commented on September 27, 2024

I was thinking about chaing the language of "type" and "interface". Let's call this the "literature type".

At the end of the day though, you're going to have to translate the "literature type/interface" into things like:

  • a type returned by reveal_type,
  • the type of a parameter annotation, and
  • the type of a variable annotation.

These three types need to have a canonical representation that represents the literature type and interface—we can call this representation the canonical type. I'm not sure whether it will ultimately be easier to stick with the Python language for this reason.

Either way, there will need to be a canonical way to specify T & Any. If T & Any has a different interface than whatever "literature type" is chosen for T, then it will need a different canonical type than the canonical type of T. I guess we'll have to see what meaning T & Any takes first, but we may end up the irreducible T & Any that I suggested as the canonical type.

from intersection_examples.

NeilGirdhar avatar NeilGirdhar commented on September 27, 2024

I just don't think we need multiple definitions for type. We just need to consider where we don't have a type, but a description of potential types.

I understand, but the problem is that I'm not the person that needs to be swayed by the PEP. You're going to communicate with all of the Python people who are used to one set of definition.

For these people, Any is the "canonical type"; so, above, when you claimed that Any was the "universal set", this is incorrect since the universal set for them is Omega.

Now, I've figured what you mean by type ("literature type"); so, I know that for you the literature type LiteratureAny is the universal set. We can speak the same langauge.

But ultimately, you'll need to write a PEP and that PEP is going to have to relate to the other typing PEPs. When it uses symbols to represent types, those symbols have to make sense in the world of symbols that we already have. They have to be intelligible to people who are using those symbols.

If, as I think you're suggesting, the interface to X & Any is going to be different than the interface to X, the irreducible proposal that I had is probably the best way to communicate that—even though it may not be the type theory that you're used to. After all, this is the same choice that Pyright made with irreducible X | Any

Edit: Actually, we probably still need more definitions accepted in some regard, and many definitions could stand to be made more clear, more rigorous, and/or more consistent, I just don't think we need multiple definitions for type. We just need to consider where we don't have a type, but a description of potential types.

Yes, that's fair. Nevertheless, you're going to have to turn that "description" into some canonical representation for display and annotation.

from intersection_examples.

NeilGirdhar avatar NeilGirdhar commented on September 27, 2024

whereas Intersection narrows what types are valid and imposes that any shared interfaces must be compatible.

This is an opinion, yes? Another option would be to allow the intersection, but define it to expose the total compatible interface. You have to consider that people have slightly different definitions of common methods. Someone may for example have a method like

    def __exit__(self ,type, value, traceback):

and someone else may have

    def __exit__(self ,type, value, traceback, /):

and then you're going to block intersection because they're not compatible?

from intersection_examples.

NeilGirdhar avatar NeilGirdhar commented on September 27, 2024

Any is incompatible with every portion of every interface in an intersection under the current definition of Any being compatible with all uses and having all interfaces.

If you go your route of balking at the logical contradiction then Any & Any is not even going to be Any. You don't think that's even more paradoxical?

And by your logic that a type (in the Python/canonical sense) with multiple compatible interfaces should not have any of those interfaces be usable. And yet that's exactly what Any does. So, if we accept your logic, then there's no Any in the first place.

Any exists because it is impractical for everything to be typed for a variety of reasons including limitations of the type system, developer time, use as generic parameters.

I think the best thing to do is to come up with examples that produce intersections with Any and to come up with practical solutions for those cases. If you look at the example in my big post, it does not make sense to me that converting a type to Any would create type errors, but that's exactly what it will do.

I think a big part of the problem is that you're identifying Python's Any with the universal set of types. Python's Any isn't the universal set. The Any that you want to define is. Python's Any has a definition in typing PEPs and is extremely commonly used. I'm not sure I totally understand the consequences (to actual code) of the change of definition that you're proposing. Do you want to elaborate? When you say "yielding to more specific types" as in types that have already been defined?

My feeling is that instead we should consider consistency with Any as a mechanism to selectively disable type checking where type checking is impractical.

from intersection_examples.

NeilGirdhar avatar NeilGirdhar commented on September 27, 2024

@CarliJoy Can you add the proposal that Jelle made to treat T & Any as irreducible in general? For arguments in favor:

  • In my long comment under "Danger of unconditionally collapsing". Changing a type annotation from T to Any should not cause type errors to appear.
  • The supertype of T & Any is not the same as the supertypes of T (In the section on Analysis), but the subtypes of T & Any are T.
  • The interface fo T & Any is not the same as the interface of T (at least using Python's current definition of Any).

from intersection_examples.

NeilGirdhar avatar NeilGirdhar commented on September 27, 2024

Could we refine some of the arguments for removing Any?

In particular the statement that "Any is essentially the universal set of types" is not true for Python's definition of Any. As we discussed above, the types we're talking about are not the "type theory" types. They define interfaces too, and Any's interface is not the same as the "universal set of types". This could be clarified: "With respect to subclasses, Any is the universal set of types."

Second, there's a claim that Union[T, Any] is Any, but as I showed in my long comment, this is not true in all contexts. It is true that they have the same subclasses. So, perhaps amend this statement to "T | Any has the same subclasses as Any"

It's no surprise that since these statement apply to subtypes only that T & Any reduces to T for subtypes—no one disagrees with that. So, perhaps it would be best to find some new arguments?

from intersection_examples.

ippeiukai avatar ippeiukai commented on September 27, 2024

I think it's worth mentioning that TypeScript considers T & any = any, though I find this surprising. Perhaps there have been previous discussions there that can be learned from?

Might be interesting to analyse TypeScript types in a set theory way:

https://ivov.dev/notes/typescript-and-set-theory

Oddly, any is a mixture of never and unknown. any is assignable to every type, just like never, and every type is assignable to any, just like unknown. As a blend of two opposites, any has no equivalent in set theory, and is best seen as an escape hatch to disable TypeScript's assignability rules.

https://blog.thoughtspile.tech/2023/01/23/typescript-sets/

any is a paradox — every set is a subset of any, but any might be empty. The only good news I have is that any extends unknown, so unknown is still the universe, and any does not allow "alien" values.

I wonder what's Python's equivalent for TS's unknown, and how Python's Any differs from TS's any.

from intersection_examples.

ippeiukai avatar ippeiukai commented on September 27, 2024

The TS unknown type is like a union of all possible types. It's similar to the object type in Python. (There is an object type in Javascript, but it's not a base type for all other types. That's why unknown was required.)

@erictraut Thank you for pointing that out. Didn't know Python's object is like TS's unknown. Good to know.

from intersection_examples.

NeilGirdhar avatar NeilGirdhar commented on September 27, 2024

the obvious rules from existing theory are insufficient, rather than simple composable rules.

At the start of this thread there was a misapprehension that Any was the universal set, but this was based on a mistaken understanding of Python's definition of Any. And based on that mistake, there were some unjustified conclusions drawn based on theory.

those maintaining type checking implementations have expressed that we stick to simple composable rules.

Both Jelle and Eric have argued for irreducibility.

from intersection_examples.

NeilGirdhar avatar NeilGirdhar commented on September 27, 2024

If We can show that Any is functioning as a top type for python's type system in the context of an intersection, it is still logically removable from that intersection by being a functional identity element on the operation of intersection over

As you said earlier, Any (the annotation or static type) is not a "type" in the sense that you mean it since Any implies an interface, and that interface is different than the "universal set".

As stated above, Any & T -> T comes from the same simplification as T | Never -> T, not to Any | T -> Any (removal of identity element)

I would say that object & T = T comes from the same simplification. object is a top type and the universal base class.

I'd like to discuss this a little further on what irreducibility would actually result in, as there are clearly some sharp edges to it

I think that could be productive. I tried to flesh this out in my "Consequences" section of my post.

There may be other reasons to still have it irreducible, but I'd like to ensure that the people making that point have a moment to consider the difference here.

Looking forward to seeing where this develops.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024

I would say that object & T = T comes from the same simplification. object is a top type and the universal base class.

Funnily enough, we proved it doesn't work for object with a simple proof by contradiction. object cannot be considered a top type because types themselves are not a member of it and participate in the type system. (this is wrong, see below)

we were also able to show that: (object | type[object]) & T =T though under current definitions (Edit: object works too)

from intersection_examples.

DiscordLiz avatar DiscordLiz commented on September 27, 2024

Here, let’s assume some_fn and some_obj are provided by two different libraries both currently untyped. What if some_fn got typed before some_obj gets typed? I expect I can do anything to a that I can do to some_obj. I don’t expect a to be reduced to mere SomeAddedTrait.

Do you really expect you can do anything to that, or do you then use the documentation or the source code to determine what is valid on what the library is providing? Typing is a safety feature. It can and does catch bugs, and code bases which have added typing have discovered inherently unsafe things they were doing in the process. Gradual typing should not be implemented in a way that compromises this.

Protocols already give you a way to interact with untyped code without having to accept Any directly. Something that is Any will pass the protocol check, but you're still specifying your expectations for your code.

And are you really suggesting that in a third library (yours) you have two other libraries that you expect people to subclass from to use your library instead of exposing what your library actually needs them to provide via protocols? Or at the very least providing a abstract base class to your users to inherit from rather than expect your users to own the internal implementation details of your library.

The arguments for not forbidding any are vapid from my perspective, and speak to larger design issues that should be fixed, not catered to. The lack of ergonomics created by such twisted arguments makes me wonder if people think typing has to be painful because they have boxed themselves into designs that are painful with or without typing.

from intersection_examples.

randolf-scholz avatar randolf-scholz commented on September 27, 2024

@mikeshardmind Honestly, I feel this whole discussion is a bit about out of scope. The PEP should simply inform what the type-theoretically correct type to use in this situation. From this point, "banning" Any is no option, some type has to be returned! The closest option to "banning", whatever you actually mean by this word, would then be to return Never.

Leave it to type checkers to implement strictness flags to allow or disallow intersections with Any on their own.

from intersection_examples.

DiscordLiz avatar DiscordLiz commented on September 27, 2024

And not only does Banning Any not hurt untyped or gradually typed libraries, it does the opposite and helps them. Banning Any from intersections is good for gradual type users. If users already need to write a protocol to get the benefits of intersections, you have yet another reason that if it's anywhere near as common as people think it is, they can get the people impacted (those writing intersections) to be aware that a library maintainer may need help with adopting typing and contribute upstream to help improve the whole ecosystem

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024

@randolf-scholz

Honestly, I feel this whole discussion is a bit about out of scope. The PEP should simply inform what the type-theoretically correct type to use in this situation. From this point, "banning" Any is no option, some type has to be returned! The closest option to "banning", whatever you actually mean by this word, would then be to return Never.

Leave it to type checkers to implement strictness flags to allow or disallow intersections with Any on their own.

  • This is one of the options being considered, see the very first post.
  • There are other places where certain type constructs are not valid because of issues with the implications of it. TypeVar bounds are not allowed to be generic, as this requires support for higher kinded types. There is no reason why this must return Never and instead cannot be an error to write. Any & int, a similar situation where this results in a situation that the type system does not support this arises out of the current ambiguity of what to do.

Do note that banning it would not be banning the use of untyped variables or the use of Any. It would only ban the use of Any as part of the intersection operands, something typed as Any would be compatible with the intersection anyway per the current definition of Any.

from intersection_examples.

randolf-scholz avatar randolf-scholz commented on September 27, 2024

@mikeshardmind @DiscordLiz What do you 2 even mean by banning? Do you suggest that int & Any raises an error at runtime, like in your example of using a generic as a bound of TypeVar?

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024

That's quite literally what banning it from the intersection would mean. It can't be there. That doesn't mean things that are Any can't be compatible, just that Any is problematic as part of specifying an intersection.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024

If that works they way you are showing and it isn't because Foo is no longer Any, then...

I think it was a mistake to have Any be runtime checkable with isinstance (edit: it isn't, see below), everything which can be instantiated to be checked in this way is always Any anyhow, so I'm not exactly worried about that particular inconsistency, as nobody should ever need to check that. I'm more worried about inconsistencies that actually cause issues where something is detected as fine by a type checker, but cause errors at runtime, as this removes the primary reason to even have type checking.

If Any & Foo doesn't reduce or reduces to Foo, then suddenly you threw away type info of the requirement Foo, and should have just written Any if this was your intent.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024

And for the record, it doesn't work with Any itself on any currently released python version

from typing import Any
instance(1, Any)  # TypeError: typing.Any cannot be used with isinstance()

If your example works, it should only be because Foo is no longer Any. It is Foo

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024

Banning Any would greatly hurt gradual typing, because if some_library exports A, but A aliases to Any, I can no longer write any hinted code that has Intersections with A. Until recently, this was for example the case with the very popular pandas library. It is needed.

you don't need an intersection with Any though. you don't gain any type information from this compared to Any without the intersection, and you can write a protocol based on your use of the library type if you instead want a basic "I'm using this how I expect it to be valid" check.

I still don't quite see the issue you have with keeping Any & X as an irreducible form.

All things are compatible with Any, including things that don't and can't exist or which would be incompatible with other operands. As an irreducible form, we reach the conclusion that for a concrete type T:

(Any & T).foo = Any.foo & T.too = Any & T.foo, so now any use of foo, even those that on the surface appear to conflict with T.foo must be considered valid by Any's compatibility guarantee. If instead, the person using an untyped library merely defines a protocol based on their use, it just works without effectively erasing T's type information.

from intersection_examples.

tomasr8 avatar tomasr8 commented on September 27, 2024

If we raise an error for Any & int, what do we do for e.g. (Any | int) & str? Since that distributes to (Any & str) | (int & str) so it's technically not allowed either?

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024

Yes. It shouldn't be allowed there either if the approach gone with is to ban Any. I don't see a problem with this. In what case will someone be writing partially typed code and an intersection? The one where they didn't know about the preexisting tool of protocols for partially typing things, and typing them based on their known use.

from intersection_examples.

randolf-scholz avatar randolf-scholz commented on September 27, 2024

you don't need an intersection with Any though. you don't gain any type information from this compared to Any without the intersection, and you can write a protocol based on your use of the library type if you instead want a basic "I'm using this how I expect it to be valid" check.

The point is that I can write code containing intersections now, and don't have to wait until typing-stubs for some_library become available. Once they do, I get immediate feedback for what works and what doesn't.

(Any & T).foo = Any.foo & T.too = Any & T.foo, so now any use of foo, even those that on the surface appear to conflict with T.foo must be considered valid by Any's compatibility guarantee.

What? Why? Any & T.foo is still irreducible and not equal to Any. Only things compatible with T.foo would be accepted.

If we define Any & X to be irreducible, it obviously has to apply to attributes as well. It seems like you are suddenly trying to apply the other proposed definition Any & X = Any to the attributes...

from intersection_examples.

tomasr8 avatar tomasr8 commented on September 27, 2024

Yes. It shouldn't be allowed there either if the approach gone with is to ban Any. I don't see a problem with this. In what case will someone be writing partially typed code and an intersection? The one where they didn't know about the preexisting tool of protocols for partially typing things, and typing them based on their known use.

I don't disagree with that but this means that every time you use intersections or unions, python will have to recursively check the expression to make sure there is no accidental use of intersections with Any which seems potentially costly.

from intersection_examples.

DiscordLiz avatar DiscordLiz commented on September 27, 2024

I'd rather be told when I write SomeLibraryType & MyType that there's something unexpected lurking there than python typing erase my expectations because a library exported something untyped. If I'm far enough involved in typing to be considering Intersections, It's unlikely that I want to have Any used in my code.

from intersection_examples.

DiscordLiz avatar DiscordLiz commented on September 27, 2024

I don't disagree with that but this means that every time you use intersections or unions, python will have to recursively check the expression to make sure there is no accidental use of intersections with Any which seems potentially costly.

Only in the case of intersections, and only for the operands. distributivity means we don't have the check the unions until they appear in an intersection, and then it's a simple linear search through a concatenation of the operands.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024

But what's the need to set this in stone as part of the PEP? PEP484 didn't formally specify either how to handle Union[Any, X] either.

And type checkers vary on how they handle this case creating incompatibility between different type checkers. This should be rigidly defined if it is allowed, or banned if it is not. This is from the typing summit, a talk that went over this inconsistency between type checkers https://youtu.be/BNTkWQfqP_c?t=8465

If a library is written assuming one type checker, should other type checkers be incompatible? what happens if multiple dependencies are using incompatible rules for something which was both allowed and poorly defined?

from intersection_examples.

NeilGirdhar avatar NeilGirdhar commented on September 27, 2024

2. If X provides foo, then type-checkers could provide a strictness flag that (Any & T).foo aliases either to T.foo or Any.

I don't think it can even be T.foo. All that's guaranteed is that the return type is covariant. The parameter specification is contravariant, so there's still a lot of flexibility with that.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024
  1. I think we can agree that (Any & X).bar should always be fine if X does no supply bar
  2. If X provides foo, then type-checkers could provide a strictness flag that (Any & T).foo aliases either to T.foo or Any.
  1. Yep. I came up with a set of rules here that led to me viewing Any & T irreducibly as structurally mostly equivalent to Any and thus re-raising alarm bells about surprising behavior for me. I would conceptually agree with that, and the rules provided match it.

  2. There would need to be a way for library authors to do this too, otherwise, you get the same compatibility issues / differing expectations. Suggesting such a flag to me points to the deeper problem here directly. If you need configuration on that level, the feature itself is inconsistent in what people will expect from it by including Any. I don't want someone using my library complaining something is broken because they changed a configuration setting and suddenly my expectations that I expressed don't hold for their use because they were changed out from under what I wrote.

from intersection_examples.

NeilGirdhar avatar NeilGirdhar commented on September 27, 2024

It isn't. If I wanted this, I have that right now by just typing Any

In practice, Any & T can be created synthetically.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024

In practice, Any & T can be created synthetically.

Only from the side of the intersection author, not the untyped code as already mentioned above.

If I write a type var, I control the type var. If I include a type var in the intersection I wrote, it has to be my type var because of the scoping rules of type variables.

So this is still entirely the person who is writing the intersection creating this situation via a type var, or via direct use of an exported Any. (And this was already addressed above)

In both cases, we can provide a better option The error messages don't need to be cryptic, they can tell the person writing the intersection "See this for how to be compatible with gradual typed code without Any here"

from intersection_examples.

DiscordLiz avatar DiscordLiz commented on September 27, 2024

It happens if you import A from some_library and A happens to alias to Any. I already gave pandas as an example of a library which did this. They aliased Series and DataFrame to Any because they didn't have stubs yet, and these were very complex classes that would raise too many false-positives otherwise.

I'd rather be told I imported an Any that could overwrite my expectations. I can handle it from there. with a protocol. I don't need the information I'm providing being discarded for the sake of people who already aren't benefiting from typing.

from intersection_examples.

mikeshardmind avatar mikeshardmind commented on September 27, 2024

It still makes sense to type-hint code with these classes for several reasons:

  1. typing-stubs might become available in the future.
  2. documentation engines like sphinx read type hints and add them to documentation.

The protocol-based approach doesn't invalidate 1, it works with it.
The protocol based approach also works fine with documentation such as sphinx(2.) . you can name your protocols things that make sense for your documentation and then have them themselves either documented for what they correspond to, or set up aliasing in your sphinx config. you can have a mapping of classes which redirect elsewhere rather easily.

Bringing this up is a total red-herring to whether Any has well defined behavior in an intersection that is both unsurprising and useful.

from intersection_examples.

randolf-scholz avatar randolf-scholz commented on September 27, 2024

@DiscordLiz Well, then why can't we just have a strictness flag warn-import-any, and then you can write your replacement Protocol if you want to, but if I don't, then I won't be bothered. The Protocol might have to be extremely complicated, for example, the pands.Series stub currently is over 1500 lines and requires plently of auxiliary typevars and whatnot.

from intersection_examples.

DiscordLiz avatar DiscordLiz commented on September 27, 2024

@DiscordLiz Well, then why can't we just have a strictness flag warn-import-any, and then you can write your replacement Protocol if you want to, but if I don't, then I won't be bothered. The Protocol might have to be extremely complicated, for example, the pands.Series stub currently is over 1500 lines and requires plently of auxiliary typevars and whatnot.

Even with such a flag, you need to define the behavior of Any & SomeType for the people using it. You haven't fixed the inconsistency, just given people a way to never have to see it and arguably create a bigger rift between typed and untyped code because now I can't import untyped code even if it is compatible with my typings. If you give me that flag as the only way to fix this, it won't result in me using intersection more, it will result in me using untyped code less as causing annoyances.

And with that need to define the behavior still there, do you think you have such a definition that doesn't still have other issues with inconsistency in people's expectations?

from intersection_examples.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.