seveibar / confidis Goto Github PK

An uncertain key store for data from agreeing and disagreeing sources

License: MIT License

Rust 31.67% HTML 1.35% JavaScript 66.65% CSS 0.33%

confidis's Issues

Configurable Convergence

The execution order of GET ANSWER TO queries can affect the returned confidence at the very beginning. This doesn't look good in demos, even though it seems to be mostly irrelevant for production applications.

Answer Interpolation

Interpolating answers, meaning being able to pick an answer between two answers, would be incredibly powerful, because new answers could be produced that maximize a particular belief.

That said, this may be too complex, so it is out of scope unless someone has a strong need for it.

Multi Truth Mode

Currently, only single truth mode is supported. That is to say, the highest confidence answer is selected and considered true, then sources are evaluated based on the correctness of their answer. Multi-truth would mean that any answer with a high enough confidence would be considered correct.

The biggest problem with this approach that I anticipate is that it will make it easier to do a "popular, but wrong answer" attack, wherein attacking sources with low confidences continually guess the same popular wrong answer, eventually making it considered one of the true answers.
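The difference between the two modes can be sketched as follows. This is a minimal illustration, not confidis code: the function names, the dict-of-confidences input, and the `threshold` value are all assumptions made for the example.

```python
# Hypothetical sketch: answer confidences for one question, as a plain dict.
def single_truth(confidences):
    """Return the one answer with the highest confidence."""
    return max(confidences, key=confidences.get)


def multi_truth(confidences, threshold=0.6):
    """Return every answer whose confidence meets an assumed threshold."""
    return sorted(a for a, c in confidences.items() if c >= threshold)


confidences = {"paris": 0.9, "lyon": 0.7, "nice": 0.2}  # made-up values
print(single_truth(confidences))  # paris
print(multi_truth(confidences))   # ['lyon', 'paris']
```

In this sketch "lyon" counts as true under multi-truth, which is exactly the opening a "popular, but wrong answer" attack would try to exploit.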

Mitigate dependent sources with "dependence mode"

Dependent sources are described in ATTACKS.md

At the moment, I think this is the best fix:

Each source has a latent qualities vector that represents their latent beliefs
Two correlated sources will have a similar latent qualities vector
Two correlated sources that differ in some belief will have a different latent quality somewhere in the vector
When computing answer confidence, the confidence contribution is determined by the element-wise maximum of the latent quality vectors of the combined sources. Two identical sources have the same latent quality vector, so the maximum across both is the same as for either one alone. Essentially, the quality (correctness) of each independent belief is used to calculate the confidence.
A global belief quality vector should be tracked (this is essentially a subgraph).
(Possibly) a configuration parameter or adaptive parameter of the graph determines the minimum independence that sources have (so two identical sources could still contribute to overall confidence, like voting).
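A small sketch of the "maximum latent qualities" idea, assuming latent quality vectors are plain lists of floats. The element-wise maximum follows the description above; collapsing the combined vector into one number with a mean is purely an assumption for illustration, since the issue does not specify that step.

```python
def combined_latent_quality(vectors):
    """Element-wise maximum over the latent quality vectors of agreeing sources."""
    return [max(column) for column in zip(*vectors)]


def confidence_contribution(vectors):
    """Assumed aggregation (not from the issue): mean of the combined vector."""
    combined = combined_latent_quality(vectors)
    return sum(combined) / len(combined)


a = [0.9, 0.1, 0.4]
b = [0.9, 0.1, 0.4]  # identical to a, i.e. a fully dependent source
c = [0.2, 0.8, 0.4]  # differs from a in two latent qualities

print(combined_latent_quality([a, b]))  # [0.9, 0.1, 0.4]
print(combined_latent_quality([a, c]))  # [0.9, 0.8, 0.4]
```

Adding the duplicate source `b` leaves the combined vector unchanged, while adding the partly independent source `c` raises it.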

The latent quality vector is computed via a PCA of source similarity (similarity = percentage of the time two sources are in agreement); e.g. the top 5 PCA factors can be the latent quality descriptors. Each latent quality descriptor tells you how to construct the corresponding latent quality from the average agreement of its constituent sources. There's a version of PCA that forces each component to be either 0 or 1, which makes it much easier to interpret the bias, since it's fully characterized by a couple of sources. That also simplifies computation, since you only need to examine a couple of sources rather than add agreement across many.
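The similarity matrix that would feed such a PCA can be sketched like this. The data layout (a dict of source name to question/answer dict) and the zero-overlap rule are assumptions; the PCA step itself is left out.

```python
def agreement(answers_a, answers_b):
    """Fraction of shared questions on which two sources give the same answer."""
    shared = [q for q in answers_a if q in answers_b]
    if not shared:
        return 0.0  # assumption: no overlap counts as zero similarity
    agree = sum(1 for q in shared if answers_a[q] == answers_b[q])
    return agree / len(shared)


def similarity_matrix(sources):
    """Return sorted source names and their pairwise agreement matrix."""
    names = sorted(sources)
    rows = [[agreement(sources[r], sources[c]) for c in names] for r in names]
    return names, rows


sources = {  # made-up answers for illustration
    "s1": {"q1": "a", "q2": "b", "q3": "c", "q4": "d"},
    "s2": {"q1": "a", "q2": "b", "q3": "c", "q4": "x"},
    "s3": {"q1": "z", "q2": "b", "q3": "y", "q4": "x"},
}
names, matrix = similarity_matrix(sources)
for name, row in zip(names, matrix):
    print(name, row)
# s1 [1.0, 0.75, 0.25]
# s2 [0.75, 1.0, 0.5]
# s3 [0.25, 0.5, 1.0]
```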

Prediction Kernel

In prediction kernel mode, a prediction kernel model learns the confidence of belief answers from the differences between the latent beliefs of sources. For example, if there are three latent beliefs, each will have an answer, and those answers will differ from one another. So the training data is

X=[d(b1_answer, b2_answer), d(b2_answer, b3_answer), d(b1_answer, b3_answer)]

where b_i is a belief and d is the difference between two answers. The answer for each belief can be selected from the best representative of each belief (naive, but probably the theoretical best unless answer interpolation is implemented).

The ground truth/prediction is

y=[conf_b1, conf_b2, conf_b3]

Any model can be used to run the prediction. The model used would be called the "prediction kernel".
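As a rough sketch, one training row for three beliefs could be assembled as below. The numeric answers, the absolute-difference `distance`, and the helper names are assumptions; the issue does not define d or the answer type.

```python
# Pair order follows the X definition above: (b1,b2), (b2,b3), (b1,b3).
PAIRS = [("b1", "b2"), ("b2", "b3"), ("b1", "b3")]


def distance(a, b):
    """Assumed d(): absolute difference between numeric answers."""
    return abs(a - b)


def training_row(belief_answers, belief_confidences):
    """Build one (X, y) pair from per-belief answers and confidences."""
    x = [distance(belief_answers[i], belief_answers[j]) for i, j in PAIRS]
    y = [belief_confidences[name] for name in ("b1", "b2", "b3")]
    return x, y


answers = {"b1": 10.0, "b2": 12.0, "b3": 15.0}  # made-up values
confs = {"b1": 0.7, "b2": 0.2, "b3": 0.1}
x, y = training_row(answers, confs)
print(x)  # [2.0, 3.0, 5.0]
print(y)  # [0.7, 0.2, 0.1]
```

Rows like this could then be handed to whichever regression model is chosen as the prediction kernel.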

Latent Truth Model

The performance of the system should be compared against a latent truth model. An easy way to do this is just to implement a latent truth model (via a new graph system) and compare the difference. LTMs have several adversaries that are not expected to perform as well as the base model.

Images

Mixed order test

Confidis is built to be fast for streaming sources. As a result, the execution order of the queries can affect final confidences and qualities.

For this test, try randomized orders of SET to construct variations of a realistic graph. Measure the difference between the highest confidence and lowest confidence for each question and quality for each source.
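A possible harness for this test is sketched below. `spread_over_orders` and its `run` callback are hypothetical names; `toy_run` is a deliberately order-dependent stand-in, not the confidis algorithm, and would be replaced by something that replays SET commands against a real instance and reads back confidences and qualities.

```python
import random


def spread_over_orders(commands, run, trials=20, seed=0):
    """Replay `commands` in shuffled orders; return max - min per reported key."""
    rng = random.Random(seed)
    lows, highs = {}, {}
    for _ in range(trials):
        order = list(commands)
        rng.shuffle(order)
        for key, value in run(order).items():
            lows[key] = min(value, lows.get(key, value))
            highs[key] = max(value, highs.get(key, value))
    return {key: highs[key] - lows[key] for key in lows}


def toy_run(order):
    """Order-dependent stand-in for a confidis session (NOT the real algorithm)."""
    confidence = {}
    for _source, question, answer in order:
        # Moving average: later SETs weigh more, so the order matters.
        previous = confidence.get(question, 0.5)
        vote = 1.0 if answer == "yes" else 0.0
        confidence[question] = 0.5 * previous + 0.5 * vote
    return confidence


commands = [("s1", "q1", "yes"), ("s2", "q1", "no"), ("s3", "q1", "yes")]
print(spread_over_orders(commands, toy_run))
```

A spread of zero for every key would mean the result is independent of SET order for that graph.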

Normal Source Test (100 questions, 10 sources)

Create a scenario with 100 questions and 10 sources. Each source has a predetermined mean representing the probability they will get the right answer. Each question has a predetermined answer.

Run the following tests:

Every source answers every question (1,000 answers)
[250,500] answers assigned uniformly and randomly across sources
[250,500] answers assigned randomly across sources, biasing sources in a linear fashion (source 10 receives 10x as many as source 1)
[250,500] answers assigned randomly across sources, biasing sources in an exponential fashion (source 1 answers once, source 10 answers 100 times, exponentially fit in between)
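The linear and exponential biases above could be generated along these lines. This sketch treats the stated counts as relative sampling weights rather than exact counts, which is an assumption; the function names and seed are also illustrative.

```python
import random


def linear_weights(n=10):
    """Source i gets weight i, so source n receives n times source 1."""
    return [float(i) for i in range(1, n + 1)]


def exponential_weights(n=10, top=100.0):
    """Geometric fit from weight 1 (source 1) up to `top` (source n)."""
    return [top ** (i / (n - 1)) for i in range(n)]


def assign_answers(weights, total, seed=0):
    """Draw `total` answers across sources in proportion to `weights`.

    Returns the number of answers per source, in source order.
    """
    rng = random.Random(seed)
    sources = range(1, len(weights) + 1)
    picks = rng.choices(sources, weights=weights, k=total)
    return [picks.count(i) for i in sources]


counts = assign_answers(exponential_weights(), total=250)
print(counts, sum(counts))
```

With ten sources, the geometric weights from 1 to 100 sum to roughly 248, so a total of 250 answers lands close to the "once" and "100 times" endpoints on average.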

For each test consider the following mean source accuracy scenarios:

uniformly random mean across the interval [0,1]
uniformly random mean across the interval [0,0.25]
uniformly random mean across the interval [0.75,1.0]

After each test, measure the difference between the quality of the source (GET SOURCE <source>) and their mean. Present the parameters of the test in a meaningful way (a bunch of tables). Compute overall scores for use in regression testing with a meaningful measurement name.
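One candidate for the overall regression score is the mean absolute difference between measured quality and true mean. The metric choice and the name `quality_error` are assumptions; `measured` stands in for values that would be read back with GET SOURCE <source>.

```python
def quality_error(true_means, measured_qualities):
    """Mean absolute difference between each source's quality and its true mean."""
    diffs = [abs(measured_qualities[s] - true_means[s]) for s in true_means]
    return sum(diffs) / len(diffs)


true_means = {"s1": 0.9, "s2": 0.5}  # predetermined per-source accuracy
measured = {"s1": 0.8, "s2": 0.7}    # made-up stand-ins for GET SOURCE results
score = quality_error(true_means, measured)
print(round(score, 3))  # 0.15
```

Lower is better, and a score of 0 would mean every source's quality matches its predetermined mean exactly.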

confidence limited to maximum qualities constraint

A constraint that limits the confidence in an answer to the maximum quality of the sources that agree with that answer. This prevents "popular, but wrong answer" attacks.
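The constraint amounts to a simple cap, sketched here under the assumption that confidences and qualities are floats in [0, 1]. The function name and the handling of an answer with no backers are hypothetical.

```python
def capped_confidence(raw_confidence, agreeing_qualities):
    """Limit confidence to the best quality among sources backing the answer."""
    if not agreeing_qualities:
        return 0.0  # assumption: an answer with no backers gets no confidence
    return min(raw_confidence, max(agreeing_qualities))


# Many low-quality sources pushing one answer cannot exceed their best quality.
print(capped_confidence(0.95, [0.2, 0.25, 0.3]))  # 0.3
# A high-quality backer leaves the raw confidence untouched.
print(capped_confidence(0.6, [0.9, 0.4]))  # 0.6
```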
