Comments (7)
The use case you've proposed makes sense, but let me share the motivation behind our current implementation.
We currently return an exception so that we can respond as quickly as possible to the client without blocking on responses from any of the test instances (primary, secondary, candidate). Our premise is that some resource (thread/task/memory) is tied up on the client side (whoever is sending test requests to Diffy) and should be freed as quickly as possible.
This is especially relevant when you eavesdrop on your production clusters to sample and emit large volumes of one-way "dark" traffic to Diffy. Whatever instrumentation sends this traffic to Diffy does not care about the response, since its only purpose is to send a sufficiently large volume of traffic while consuming minimal resources.
Ultimately, the constant exception is just about being able to respond to any type of request without any blocking on the Diffy side.
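To make the client side concrete, here is a minimal fire-and-forget sketch (not from the Diffy codebase; `sendRequest` and `emitDark` are hypothetical stand-ins for whatever instrumentation forwards sampled traffic) that never waits on Diffy's reply:

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

// Hypothetical dark-traffic emitter: fire the request and discard the
// outcome (including Diffy's constant exception), freeing the caller
// immediately.
def emitDark(sendRequest: String => Future[String], request: String): Unit = {
  sendRequest(request).recover { case _ => "" } // swallow the expected failure
  ()
}
```

The point is only that the emitter returns as soon as the request is dispatched; nothing on the client ever blocks on Diffy's response.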
from diffy.
Just to make sure I understand: given the current implementation of Diffy, the only way to truly utilize its power is to add an additional piece of software/infrastructure to record and replay traffic against the Diffy proxy, since Diffy does not return anything that is usable for upstream consumers.
Does the Diffy team plan on releasing a tool to help with this? That seems like a crucial part of the puzzle.
I can see other approaches:
- Set up another proxy in front of Diffy that sends it duplicate traffic and ignores the response.
- Place a load balancer in front of Diffy that routes part of the traffic to it, and make sure production systems retry on an empty response, hoping the load balancer sends the retry to the production URL.
- Modify Diffy to return one of the three responses.
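The first option can be sketched as a simple tee; `production` and `diffy` below are hypothetical `String => Future[String]` stand-ins for the two backends, not real Diffy APIs:

```scala
import scala.concurrent.Future

// Forward every request to production; mirror it to Diffy as a side
// effect and drop Diffy's response (the constant exception).
def tee(production: String => Future[String],
        diffy: String => Future[String])(request: String): Future[String] = {
  diffy(request)      // fire and forget
  production(request) // only production's answer reaches the caller
}
```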
I would really love to use Diffy; I'm just having a hard time justifying the additional infrastructure needed to utilize it.
Can you expand on how Twitter uses Diffy? I realize you might not be able to, since it may not be authorized public knowledge, but I would find it helpful.
Thanks for your response!
What about passing a command-line flag to determine whether a response should be returned?
Looking at the code, I saw that the current implementation calls all the backing services in sequence. I also saw a parallel implementation, but it seems unused.
Question: why was sequential access chosen?
If passing a flag is acceptable, then in order to send the response quickly, I think Diffy should return the primaryService response (or the flag could specify which response to return) without waiting for anything else to be processed.
As a prototype, I slightly modified the DifferenceProxy#proxy method to query all services in parallel and return the future of the primary service's response.
Does that make sense? Would you accept a PR that introduces a flag to return a response?
```scala
def proxy = new Service[Req, Rep] {
  override def apply(req: Req): Future[Rep] = {
    // Query primary, candidate, and secondary in parallel.
    val rawResponses = Seq(primary.client, candidate.client, secondary.client) map { service =>
      service(req).liftToTry
    }
    val responses: Future[Seq[Message]] =
      Future.collect(rawResponses) flatMap { reps =>
        Future.collect(reps map liftResponse) respond {
          case Return(rs) => log.debug(s"success lifting ${rs.head.endpoint}")
          case Throw(t)   => log.debug(t, "error lifting")
        }
      }
    // Analysis happens as a side effect, off the response path.
    responses foreach {
      case Seq(primaryResponse, candidateResponse, secondaryResponse) =>
        liftRequest(req) respond {
          case Return(m) => log.debug(s"success lifting request for ${m.endpoint}")
          case Throw(t)  => log.debug(t, "error lifting request")
        } foreach { req =>
          analyzer(req, candidateResponse, primaryResponse, secondaryResponse)
        }
    }
    // Return the primary service's response without waiting for the rest.
    rawResponses.head.flatMap { Future.const }
  }
}
```
I was in the process of composing a response proposing the approach of adding a command line flag. A pull request is welcome.
Regarding parallel vs. sequential: we actually have code that does that in the proxy package, but we moved from parallel to sequential because it serves as a noise-reduction trick when used in the primary -> candidate -> secondary sequence. This helps when the underlying data may be live, and it skews the odds of noise in favor of the candidate: primary is now more likely to disagree with secondary than with candidate, because the request to secondary is delayed more than the request to candidate (relative to primary), and the underlying data is more likely to change over a longer time interval than a shorter one.
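A minimal sketch of that ordering, using plain `scala.concurrent` futures rather than Diffy's actual Finagle code: each call is issued only after the previous response arrives, so the primary-to-secondary gap is always wider than the primary-to-candidate gap:

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

// Sequential invocation: candidate starts only after primary completes,
// and secondary starts last, skewing live-data drift toward the
// primary/secondary pair rather than primary/candidate.
def sequentialCalls(primary: () => Future[String],
                    candidate: () => Future[String],
                    secondary: () => Future[String]): Future[(String, String, String)] =
  for {
    p <- primary()   // issued first
    c <- candidate() // issued once primary's response is in
    s <- secondary() // issued last; widest gap from primary
  } yield (p, c, s)
```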
That was pretty confusing for me, as I always assumed I would get a response from a proxy! Please at least mention that in the README.
By the way, for duplicating traffic (at least in a test setup): https://github.com/agnoster/duplicator