confidis's People

Contributors

seveibar

confidis's Issues

Configurable Convergence

The execution order of GET ANSWER TO queries can affect the returned confidence early on. This doesn't look good in demos, even though it seems to be mostly irrelevant for production applications.

Answer Interpolation

Interpolating answers, meaning being able to pick an answer between two answers, would be incredibly powerful, because new answers could be produced that maximize a particular belief.

That said, this may be too complex, so it is out of scope unless someone has a strong need for it.

Multi Truth Mode

Currently, only single truth mode is supported. That is to say, the highest confidence answer is selected and considered true, then sources are evaluated based on the correctness of their answer. Multi-truth would mean that any answer with a high enough confidence would be considered correct.

The biggest problem with this approach that I anticipate is that it will make it easier to do a "popular, but wrong answer" attack, wherein attacking sources with low confidences continually guess the same popular wrong answer, eventually making it considered one of the true answers.
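The difference between the two modes can be sketched as two selection rules over a question's answer confidences. This is only an illustration: the function names, the `threshold` parameter, and the sample confidences are assumptions, not part of confidis.

```python
def single_truth(confidences):
    """Single truth mode: only the highest-confidence answer is considered true."""
    return max(confidences, key=confidences.get)


def multi_truth(confidences, threshold):
    """Multi truth mode: every answer at or above the threshold is considered true."""
    return {answer for answer, conf in confidences.items() if conf >= threshold}


# Hypothetical confidences for one question.
confidences = {"paris": 0.62, "lyon": 0.55, "nice": 0.10}
print(single_truth(confidences))
print(sorted(multi_truth(confidences, threshold=0.5)))
```

With a threshold rule, a wrong answer only needs to cross the threshold rather than beat the best answer, which is why the "popular, but wrong answer" attack becomes easier.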

Mitigate dependent sources with "dependence mode"

Dependent sources are described in ATTACKS.md.

At the moment, I think this is the best fix:

  • Each source has a latent qualities vector that represents their latent beliefs
  • Two correlated sources will have a similar latent qualities vector
  • Two correlated sources that differ in some belief will have a different latent quality somewhere in the vector
  • When computing answer confidence, the confidence contribution is determined by the maximum latent qualities of the combined sources. (Two identical sources will have the same latent quality vectors, so the maximum across each vector is the same.) Essentially, the quality (correctness) of each independent belief is used to calculate the confidence.
  • A global belief quality vector should be tracked. (this is essentially a subgraph)
  • (possibly) Configuration parameter or adaptive parameter of graph determines the minimum independence that sources have (two identical sources could contribute to overall confidence, like voting)
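The "maximum latent qualities" step in the list above might look like the following sketch. It assumes each source's latent qualities are a plain list of numbers of equal length; how the combined vector is then turned into a confidence is not specified here and is left out.

```python
def combined_latent_quality(vectors):
    """Element-wise maximum across the latent quality vectors of the sources
    that support an answer (assumed to all have the same length)."""
    return [max(column) for column in zip(*vectors)]


# Two identical (fully dependent) sources add nothing beyond one of them.
print(combined_latent_quality([[0.9, 0.1], [0.9, 0.1]]))
# Two sources that differ in a belief each contribute their stronger component.
print(combined_latent_quality([[0.9, 0.1], [0.2, 0.8]]))
```

Because the maximum is idempotent, adding a copy of an existing source leaves the combined vector unchanged, which is the property that blunts the dependent-source attack.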

The latent quality vector is computed via a PCA of source similarity (similarity = percentage of time in agreement), e.g. the top 5 PCA factors can be the latent quality descriptors. Each latent quality descriptor tells you how to construct each latent quality from the average agreement of each constituent source. There's a version of PCA that forces each component to be either 0 or 1, which makes it much easier to interpret the bias, since it's fully characterized by a couple of sources. That also simplifies computation, since you only need to examine a couple of sources rather than add agreement across many.
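A minimal sketch of this computation, using NumPy and ordinary PCA (via SVD) rather than the 0/1-constrained variant mentioned above. The function names and the toy answer table are assumptions made for illustration, and every source is assumed to have answered every question.

```python
import numpy as np


def agreement_matrix(answers):
    """answers: (n_sources, n_questions) array of answer ids.
    Returns the fraction of questions on which each pair of sources agrees."""
    answers = np.asarray(answers)
    return (answers[:, None, :] == answers[None, :, :]).mean(axis=2)


def latent_descriptors(similarity, k):
    """Top-k principal directions of the column-centered similarity matrix.
    Each row weights the sources that make up one latent quality."""
    centered = similarity - similarity.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k]


def latent_qualities(similarity, descriptors):
    """Project each source's agreement profile onto the descriptors."""
    return similarity @ descriptors.T


# Toy data: sources 0 and 1 are copies, sources 2 and 3 are copies.
answers = [
    [0, 1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0, 1],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
]
sim = agreement_matrix(answers)
desc = latent_descriptors(sim, k=1)
print(latent_qualities(sim, desc))
```

In this toy case the copied sources end up with identical latent quality vectors, which is the behaviour the dependence mode relies on.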

Prediction Kernel

In prediction kernel mode, a prediction kernel model learns the confidence of belief answers from the differences between the latent beliefs of sources. For example, if there are three latent beliefs, each will have an answer, and those answers will differ from one another. So the training data is

X=[d(b1_answer, b2_answer), d(b2_answer, b3_answer), d(b1_answer, b3_answer)]

where b_i is a belief and d is a distance between two answers. The answer for each belief can be selected from the best representative of each belief (naive, but probably the theoretical best unless answer interpolation is implemented).

The ground truth/prediction is

y=[conf_b1, conf_b2, conf_b3]

Any model can be used to run the prediction. The model used would be called the "prediction kernel".
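As a rough sketch of the shapes involved, the features can be built from pairwise answer distances and fed to any regressor. Here a least-squares linear model stands in for the prediction kernel; the helper names, the absolute-difference distance, and the numbers are made up purely to exercise the code. Note that `itertools.combinations` yields the pairs in the order (b1,b2), (b1,b3), (b2,b3), which differs from the order written above but carries the same information.

```python
from itertools import combinations

import numpy as np


def pairwise_features(answers, distance):
    """One feature per unordered pair of belief answers."""
    return [distance(a, b) for a, b in combinations(answers, 2)]


def fit_linear_kernel(X, y):
    """Least-squares linear map (with bias) from distances to confidences."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xb = np.hstack([X, np.ones((len(X), 1))])
    W, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return W


def predict(W, x):
    """Predicted confidence for each belief given one feature row."""
    return np.append(np.asarray(x, dtype=float), 1.0) @ W


# Made-up numeric answers for three beliefs, four training examples.
examples = [[1.0, 2.0, 4.0], [3.0, 3.0, 5.0], [0.0, 2.0, 2.0], [1.0, 2.0, 2.0]]
X = [pairwise_features(e, lambda a, b: abs(a - b)) for e in examples]
y = [[0.6, 0.3, 0.1], [0.5, 0.5, 0.2], [0.2, 0.6, 0.6], [0.3, 0.7, 0.7]]
W = fit_linear_kernel(X, y)
print(predict(W, X[0]))
```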

Latent Truth Model

The performance of the system should be compared against a latent truth model (LTM). An easy way to do this is to implement a latent truth model (via a new graph system) and compare the difference. LTMs have several adversaries against which they are not expected to perform as well as the base model.

Mixed order test

Confidis is built to be fast for streaming sources. As a result, the execution order of each query can affect final confidences and qualities.

For this test, try randomized orders of SET to construct variations of a realistic graph. Measure the difference between the highest confidence and lowest confidence for each question and quality for each source.
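A possible harness for this test is sketched below. It does not call confidis: `run` is a hypothetical callable that replays a list of SET operations in the given order and returns a mapping from each question (or source) to its final confidence (or quality). The `first_writer_wins` runner is a deliberately order-sensitive stand-in used only to show the harness working.

```python
import random


def order_spread(sets, run, trials=20, seed=0):
    """Replay `sets` in `trials` random orders and return, per key,
    the difference between the highest and lowest value observed."""
    rng = random.Random(seed)
    lowest, highest = {}, {}
    for _ in range(trials):
        order = list(sets)
        rng.shuffle(order)
        for key, value in run(order).items():
            lowest[key] = min(lowest.get(key, value), value)
            highest[key] = max(highest.get(key, value), value)
    return {key: highest[key] - lowest[key] for key in highest}


def first_writer_wins(order):
    """Stand-in runner (NOT confidis): the first SET seen for a question
    fixes its value, so the result depends on ordering."""
    seen = {}
    for _source, question, value in order:
        seen.setdefault(question, value)
    return seen


sets = [("s1", "q1", 0.2), ("s2", "q1", 0.9), ("s3", "q1", 0.5)]
print(order_spread(sets, first_writer_wins))
```

A spread of zero for every key would mean the final state is independent of the order in which the SETs arrived.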

Normal Source Test (100 questions, 10 sources)

Create a scenario with 100 questions and 10 sources. Each source has a predetermined mean representing the probability they will get the right answer. Each question has a predetermined answer.

Run the following tests:

  • Every source answers every question (1,000 answers)
  • [250,500] answers assigned uniformly and randomly across sources
  • [250,500] answers assigned randomly across sources, biasing sources in a linear fashion (source 10 receives 10x source 1)
  • [250,500] answers assigned randomly across sources, biasing sources in an exponential fashion (source 1 answers once, source 10 answers 100, exponentially fit in between)
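The three random assignment schemes above could be generated with per-source sampling weights, as in this sketch. It reads [250,500] as the total number of answers, which is an assumption; the function names and the `mode` strings are also illustrative.

```python
import random


def source_weights(n_sources, mode):
    """Relative chance that each source (1..n_sources) receives an answer."""
    if mode == "uniform":
        return [1.0] * n_sources
    if mode == "linear":
        # Source 10 receives 10x as many answers as source 1.
        return [float(i) for i in range(1, n_sources + 1)]
    if mode == "exponential":
        # Source 1 has weight 1, the last source weight 100, geometric in between.
        return [100.0 ** (i / (n_sources - 1)) for i in range(n_sources)]
    raise ValueError(f"unknown mode: {mode}")


def assign_answers(n_answers, n_sources, mode, rng):
    """Pick which source gives each of the n_answers answers."""
    weights = source_weights(n_sources, mode)
    return rng.choices(range(1, n_sources + 1), weights=weights, k=n_answers)


rng = random.Random(0)
n_answers = rng.randint(250, 500)
assignment = assign_answers(n_answers, 10, "exponential", rng)
print(len(assignment))
```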

For each test consider the following mean source accuracy scenarios:

  • uniformly random mean across the interval [0,1]
  • uniformly random mean across the interval [0,0.25]
  • uniformly random mean across the interval [0.75,1.0]

After each test, measure the difference between the quality of the source (GET SOURCE <source>) and their mean. Present the parameters of the test in a meaningful way (a bunch of tables). Compute overall scores for use in regression testing with a meaningful measurement name.
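One way to generate the scenario and score a run is sketched here. The simulation assumes multiple-choice questions with `n_choices` options (the number of options is not stated above), and `mean_abs_quality_error` is just one candidate for the "meaningful measurement name". The qualities passed to it would come from GET SOURCE <source>; this sketch does not call confidis.

```python
import random


def simulate_answers(means, n_questions, rng, n_choices=4):
    """Every source answers every question; a source answers correctly
    with probability equal to its predetermined mean."""
    truth = [rng.randrange(n_choices) for _ in range(n_questions)]
    answers = {}
    for source, mean in enumerate(means):
        for question, correct in enumerate(truth):
            if rng.random() < mean:
                answers[(source, question)] = correct
            else:
                wrong = [c for c in range(n_choices) if c != correct]
                answers[(source, question)] = rng.choice(wrong)
    return truth, answers


def mean_abs_quality_error(means, qualities):
    """Average absolute gap between each source's true mean and its reported quality."""
    return sum(abs(m - q) for m, q in zip(means, qualities)) / len(means)


rng = random.Random(0)
means = [rng.uniform(0.75, 1.0) for _ in range(10)]
truth, answers = simulate_answers(means, 100, rng)
print(len(answers))
```

A single number like this per test and accuracy scenario is easy to track across commits for regression testing.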
