Comments (4)
Measurement endpoints in the original bandwidth scanner design are set up per-authority, so each operator also has a different test endpoint. That's the intent here, with the baked-in URL (https://bwauth.torproject.org) as a default fallback for testing, etc.
However, one proposal is to create 'loop' circuits back to the bandwidth scanner, which also runs a tor relay (with an ExitPolicy that accepts the test endpoint); see the "there and back again" circuit generator and https://trac.torproject.org/projects/tor/ticket/9762. Thoughts?
from bwscanner.
The bandwidth scanning process shouldn't depend on the anonymity of the scanners or endpoints, because this is hard to guarantee. Ideally a malicious exit relay shouldn't be able to learn the endpoints AND selectively limit throughput to influence the measurements of other relays in the same circuit. I.e. a circuit that looks like:
bwscanner client ---> measured relay ---> malicious exit ---> known measurement endpoint
means that a malicious exit relay operator could collude with guard relays and bias measurement results. I believe this is made worse by the 'slice' approach of the current scanners, because controlling enough relays in the manner above can probably be used to block new relays from being fairly measured and rising in rank, via a kind of 'consensus wall'. I also suspect this might happen as a side effect of circuits built between relays that coincidentally happen to be in the same datacenter, because connections between nearby hosts scale throughput faster, which biases measurements on circuits that are geographically closer.
Any relay that is aware it's being measured can also do whatever it can to improve its measurement results (e.g. drop every other cell that isn't part of a measurement circuit). I'm not sure what we can do about this type of biasing, though. I think this is the sort of thing that Peerflow can mitigate, because its measurements are the result of passive observations from the rest of the network rather than active probes.
So a scanner process that looks like this:
bwscanner client ---> local relay ---> measured relay ---> endpoint exit ---> endpoint local to exit
(e.g. an exit whose ExitPolicy only allows the measurement endpoint)
should mean that a malicious relay can only influence the results of circuits that it is also part of. An active network adversary who is able to attack network infrastructure (e.g. the scanner's ISP or upstreams) can of course still degrade measurements towards relays that it wishes to degrade with this approach.
The geographical biases are also difficult to address if we continue to run a small set of scanners (i.e. one or fewer per dirauth), as most of the scanners are located in the US or Europe - so even moving the test endpoints to a CDN will still leave the scanners wherever they may be. So you'll see biases that arise from a relay being near a bwscanner and a CDN node - all of which might be in the same datacenter!
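To make the "exit with an ExitPolicy only to the measurement endpoint" idea concrete, here is a minimal sketch that renders the two torrc lines such an exit might carry. The helper name `endpoint_only_exit_policy` and the example address are hypothetical and not part of bwscanner; the sketch is deliberately limited to IPv4 endpoints.

```python
import ipaddress
from typing import List


def endpoint_only_exit_policy(endpoint_ip: str, port: int) -> List[str]:
    """Render torrc ExitPolicy lines that allow exiting to one endpoint only.

    Hypothetical helper for illustration; IPv4 only.
    """
    # Raises ValueError (AddressValueError) if the string is not a valid IPv4 address
    addr = ipaddress.IPv4Address(endpoint_ip)
    if not 0 < port < 65536:
        raise ValueError("port must be in 1..65535")
    return [
        # Allow traffic to the measurement endpoint...
        f"ExitPolicy accept {addr}:{port}",
        # ...and reject everything else.
        "ExitPolicy reject *:*",
    ]


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address used as a placeholder
    print("\n".join(endpoint_only_exit_policy("192.0.2.10", 443)))
```

Tor evaluates exit policy rules in order, so the accept line has to come before the catch-all reject.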
I guess it's obvious that using a single CDN hands a lot of power to the CDN operator, too.
The redesigned scanner was built with the possibility of 'sharding' the scans across a set of scanners - see circuit.TwoHop and its arguments "partitions, this_partition". The intent here was to be able to run multiple scanners in parallel on different machines in order to scale better (i.e. reduce the time to complete a scan of the entire network). For example, a bandwidth authority operator could use a cloud computing service to spin up nodes for the duration of the scan and then combine the results. That might make attacking bandwidth scanner infrastructure harder, because endpoints won't be defined statically, though exits with a single-line ExitPolicy towards a test endpoint are going to stand out.
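As a rough illustration of the sharding idea, the sketch below splits a relay list into disjoint shards using `partitions` and `this_partition` values. It is not the actual circuit.TwoHop logic, which may select relays differently; the function name `relays_for_partition` and the placeholder fingerprints are assumptions made for the example.

```python
from typing import List, Sequence


def relays_for_partition(relays: Sequence[str], partitions: int,
                         this_partition: int) -> List[str]:
    """Return the relays that scanner number `this_partition` should measure.

    Hypothetical sketch: shards are taken by striding over a sorted list.
    """
    if partitions < 1:
        raise ValueError("partitions must be at least 1")
    if not 0 <= this_partition < partitions:
        raise ValueError("this_partition must be in range(partitions)")
    # Sort first so every scanner derives the same ordering from the same input
    ordered = sorted(relays)
    # Take every `partitions`-th relay, starting at this scanner's offset
    return ordered[this_partition::partitions]


if __name__ == "__main__":
    fingerprints = ["D", "A", "C", "B", "E"]  # placeholder fingerprints
    for shard in range(2):
        print(shard, relays_for_partition(fingerprints, 2, shard))
```

Each relay lands in exactly one shard, so the per-scanner results can be combined afterwards without de-duplication, provided all scanners start from the same relay list.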
Single onion services make a lot of sense because we can avoid the requirement of using an exit in the measurement path, so a circuit can look something like:
bwscanner client ---> bwscanner local relay ---> measured relay ---> bwscanner local relay single onion endpoint
There's a lot of speculation above :) - Thoughts?
P.S. there are other ways to measure performance that could be used for feedback rather than bandwidth measurements - e.g. circuit failure rates, extend latency, particularly for high bandwidth relays that may be more CPU constrained than bandwidth limited.
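For instance, a per-relay circuit failure rate could be derived from a log of circuit build attempts. This is only a sketch: the `(fingerprint, succeeded)` record format and the function name `circuit_failure_rates` are assumptions, not something bwscanner currently records.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def circuit_failure_rates(attempts: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Map each relay fingerprint to the fraction of its circuits that failed."""
    total = defaultdict(int)
    failed = defaultdict(int)
    for fingerprint, succeeded in attempts:
        total[fingerprint] += 1
        if not succeeded:
            failed[fingerprint] += 1
    # Every key in `total` has a count of at least 1, so the division is safe
    return {fp: failed[fp] / total[fp] for fp in total}


if __name__ == "__main__":
    log = [("A", True), ("A", False), ("B", True)]  # placeholder data
    print(circuit_failure_rates(log))
```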