draft-ip-address-privacy's People

Contributors

bslassey, chris-wood, davidschinazi, ggx, jbradleychen, saradickinson, shivankaul, sysrqb, tfpauly

draft-ip-address-privacy's Issues

Add some more use cases of IP addresses from PAT

The Private Access Tokens draft highlights a couple of use cases that the current text does not quite capture: https://datatracker.ietf.org/doc/html/draft-private-access-tokens-01.txt#section-1.4

Two are:

  • limit the amount of content an IP address can access over a given time period (referred to as a "metered paywall")
  • rate-limit access from an IP address to prevent fraud and abuse

We already include anti-abuse usage, but we don't mention "rate-limiting", which seems worth adding.
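
For concreteness, here is a minimal sketch of the kind of per-address rate limiting described above. It is an illustration only and is not taken from either draft: the class name `PerAddressRateLimiter`, its parameters, and the sliding-window policy are all assumptions. The same counter, with a long window, could stand in for a "metered paywall".

```python
import time
from collections import defaultdict, deque


class PerAddressRateLimiter:
    """Sliding-window limiter keyed by client IP address (hypothetical helper)."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # address -> timestamps of allowed requests

    def allow(self, address):
        """Return True if this request from `address` is within the limit."""
        now = self.clock()
        hits = self._hits[address]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Note that the limiter's only notion of "who" is the address string. If many users share one proxy egress address they share one budget, and if one user rotates addresses they get a fresh budget each time, which is the gap a replacement signal would need to fill.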

Geo signals

From the current draft, "There are 7 classes of signals..."

There are definitely abuse scenarios where reliably knowing the country, and sometimes finer-grained geolocation, is important. For example, laws on copyright abuse vary from country to country. Geolocation is also useful for detecting impersonation scenarios (e.g. a user in the US pretending to be Canadian). Would this be a new class of signals, or does it fit into an existing one?

A mechanism for first-party re-identification

Summary: IP blindness seems like it's mainly aimed at combatting the cross-site tracking which IP addresses facilitate. But individual sites also use IP addresses to correlate traffic for individuals across multiple visits to that one site, for combatting certain kinds of abuse. This proposal ought to have some way to let those individual sites still re-identify users across multiple visits.


Moving discussion from here. Let me quote some bits of my comments there:

It seems reasonable to say that the focus of this document is to provide alternatives for the use cases served by cross-site re-identification, but I think it's important to consider the effects of IP privacy on same-site re-identification as well.
(For context, I work on an anti-abuse product at Shape Security which does exactly this sort of same-site re-identification.)
Cookies are opt-in, so that's not particularly viable as an anti-abuse mechanism, particularly if account takeover or denial of service is in scope.
[...]
Attackers need to not be able to opt out of sending the signal. Or rather, real users need to opt out so infrequently that outright blocking anyone who does not send it is acceptable. Cookies don't work here because any first-time visitor will lack cookies for the site, which means you can't simply block anyone who lacks cookies.

There's some discussion in that thread about the feasibility of a mechanism which required the server to request some additional signal from the client, which I won't copy over, but we can continue discussing it here.

Augmenting replacement signals with reporting mechanisms

The current draft proposal lists signals that may compensate for some IP attributes (e.g. loss of longitudinal stability). In addition to providing signals from the proxy to internet-facing services, have we thought about patterns and mechanisms through which services could report abusive connections back to the proxy?

Ideally, this would allow the proxy to curtail the access of specific users, as opposed to having multi-tenant IP addresses blocked by the service under attack. The OHAI proposal (https://datatracker.ietf.org/doc/html/draft-rdb-ohai-feedback-to-proxy) is one such attempt to provide a path for feedback, and may be extensible to two-hop proxies and off-line reporting of abuse.

Should such mechanisms be considered in scope, in addition to signals emitted from the proxy?

Potential tweak to structure of document

Right now the document does a good job of listing good and bad uses for IP, though it doesn't explicitly list them as such. Given that there might not be universal consensus on which is which, it's hard to draw a hard line - but it would still make the document easier to follow. I propose that we tweak the structure to say:

  • IP addresses are used for two classes of features: tracking and anti-abuse
  • there are new systems being developed that are focusing on preventing tracking but are not aimed at providing anti-abuse
  • therefore we should discuss how IP is used for these and what replacement signals exist

That would give us the split discussed above by leveraging statements of facts about existing products instead of basing the split in opinion. I'll write a PR to show what this looks like.

Add rough geolocation as use case for IP

IP addresses are used productively in the case of tailoring content to the user's rough geolocation. There are probably many uses of this rough geolocation information, but two that come immediately to mind are conforming to local laws and providing locally relevant content (for example: a merchant's website showing the nearest physical store, a search engine showing local coffee shops when a user searches for "coffee shop" or a news site showing locally relevant news).

Does a reputation system solve a problem?

One of the original goals of this draft was to describe a "user reputation system" (for example, as a replacement for IP address reputation) that would, in theory, solve problems for platforms while giving people/users some measure of control over that reputation. The draft envisioned some amount of transparency and an appeal process built into the system. More recently, the draft has shifted to focus broadly on a toolbox of replacement signals, where each signal replaces some use case of IP addresses. However, it is not clear where reputation fits into this new collection of signals, what the breadth and scale of such a reputation system would be, or whether such a system is desirable in practice.

Define categories of anti-abuse patterns

At a very broad level, I believe there are two forms of abuse against a single system that an IP privacy solution must consider. The first is a small set of attackers pretending to be many to prevent their abuse from being detected and blocked, which is generally the case when it comes to Denial of Service attacks (which is why distributed Denial of Service is a useful attack pattern).

The second category is when an attacker is attempting to impersonate a victim. This is the case with credential theft in general and can be protected against with technologies such as WebAuthn and techniques such as 2FA.

Being specific about the types of abuse will help us determine the types of preventative measures that may be appropriate.

I will note that these two categories are intentionally scoped to a single system. There is a third broad category where independent systems use IP addresses as identifiers to warn others of threats. RFC 5782 is one example of a standardized system for sharing such threat intelligence. This category would also include tying an IP address to a real-world identity.
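
To make the RFC 5782 example concrete: a DNS blocklist is queried by reversing the address (octets for IPv4, nibbles for IPv6) and prepending it to the list's zone name. The sketch below only builds the query name and performs no DNS lookup; the function name `dnsbl_query_name` and the zone `bl.example.invalid` are placeholders chosen for illustration.

```python
import ipaddress


def dnsbl_query_name(address, zone):
    """Build the DNS name used to look up `address` in a DNSBL `zone`."""
    ip = ipaddress.ip_address(address)
    if ip.version == 4:
        # IPv4: reverse the four dotted-decimal octets.
        labels = str(ip).split(".")[::-1]
    else:
        # IPv6: reverse the 32 hex nibbles of the fully expanded address.
        labels = list(ip.exploded.replace(":", ""))[::-1]
    return ".".join(labels) + "." + zone


print(dnsbl_query_name("192.0.2.99", "bl.example.invalid"))
# prints "99.2.0.192.bl.example.invalid"
```

The lookup key is the IP address itself, so the scheme assumes that one address maps to one actor for long enough to be worth listing. That assumption weakens once many unrelated users share a proxy's egress addresses.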

And of course if there are other broad categories I'm missing, we should identify and define them.

Counterabuse: multi-platform threat models

As siloed defenses against abuse have improved, abusers have moved to multi-platform threat models. For example, a public discussion platform with a culture of anonymity may redirect traffic to YouTube as a video library, bypassing YouTube defenses that otherwise reduce exposure of potentially harmful content. Similarly, a minor could be solicited by an adult impersonating a child on a popular social media platform, then redirected to a smaller, less established and less defended platform where illegal activity could occur. There are many such cross-platform abuse models and they cause significant public harm. In a world with strong cross-platform privacy barriers, how should such threats be managed?

Counterabuse: avoiding benefits to bad actors.

Privacy-providing technologies can serve good ends (protecting the average user's privacy) and bad ones (providing cover for criminal activity). What principles and guidelines can we establish to support good user privacy while not making it harder to manage abuse?

Move information about laws/regulations into separate document?

The IP Privacy Protection and Law section will grow quite large with #26. Documenting the current legal implications of IP addresses seems helpful in the context of finding solutions for IP address privacy; however, this draft should not overwhelm readers with information on the various topics. We should consider factoring some of this information out into a new draft, perhaps creating a "snapshot" of the current legal landscape on this topic (similar to the survey of censorship techniques).

Define cross-site versus same-site privacy risks

Fingerprinting in general and IP addresses in particular can be used to identify users both across sites and within a single website. IP Privacy and anti-fraud and abuse solutions will vary greatly based on which of these privacy risks we are attempting to address.

My suggestion is to focus on preventing cross-site re-identification and tracking, but to keep same-site re-identification and tracking out of scope for this document. WDYT?

Add Signal for GeoIP replacement

Private Access Tokens provides a mechanism for geo-fencing, and geo-fencing is mentioned in this draft as a use case of IP addresses, but no particular signal is described as a replacement for GeoIP. SOURCE_ASN comes close, but it has a different purpose and the wrong scope. I'll suggest a GEO_LOCATION signal as a placeholder.
