wbreeze / acd_stats
Aerobatic contest data statistical analyses and reports
The following data given to the clustering call, prechi.cluster_neighbors
causes the cluster process to run long with no sign of termination:
$range
[1] 42 47 52 57 62 67 72 77 82
$counts
[1] 1 0 0 0 0 0 0 0 0
$grades
[1] 79 70 84 74 79 70 70 42 84 76 76 84 76 76
First, the counts aren't correct given the grades. (The ranges are correct.) Second, the cluster algorithm ought to spit that out as unsolvable more or less immediately.
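A quick recount makes the mismatch concrete. This is only a sketch in Python, not the project's own code, and it assumes that each value in $range is the lower edge of a bin five wide; if the values are bin centers the tallies shift. The name `recount` is hypothetical.

```python
def recount(range_starts, grades, width=5):
    """Tally grades into bins, treating each range value as a lower edge.

    ASSUMPTION: bins are [start, start + width); the issue does not say
    whether the range values are edges or centers.
    """
    counts = [0] * len(range_starts)
    first = range_starts[0]
    for grade in grades:
        index = int((grade - first) // width)
        if 0 <= index < len(counts):
            counts[index] += 1
    return counts


range_starts = [42, 47, 52, 57, 62, 67, 72, 77, 82]
grades = [79, 70, 84, 74, 79, 70, 70, 42, 84, 76, 76, 84, 76, 76]
recounted = recount(range_starts, grades)
print(recounted)  # under the lower-edge assumption: [1, 0, 0, 0, 0, 3, 5, 2, 3]
```

Under that assumption all fourteen grades land in a bin, whereas the reported counts account for only one of them.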
The contest that raised this is #652, which is odd in that it has grades on tenths, not rounded to 0.5. This contest and contest #641 were manually added to the processed list in order to omit them.
The prechi algorithm currently fixes the minimum number of partitions at three. The ChiSq test we are using has n-3 degrees of freedom. Fixing three means doing tests with 3-3=0 degrees of freedom.
Parameterize the fixed number three, so that we can pass four or five and not be doing meaningless ChiSq tests. This will mean fewer distributions that can be tested with ChiSq; however, the tests aren't really useful with so few partitions.
Prechi does not terminate in a reasonable time given:
2, 0, 0, 0, 3, 0, 0, 0, 2, 3, 5, 1, 4, 2, 8
(on intervals of five from twenty up through ninety)
It arrives at 5, 5, 6, 6, 8 quite quickly, but then runs on seemingly forever, trying the large number of unproductive combinations. "Forever" is hours or days; I haven't seen it terminate. Another solution is 5, 5, 5, 7, 8; however, I think that's worse than the one first found.
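Both solutions can be checked against the input by confirming that each is a merge of neighboring counts. The sketch below is Python written for this note, not part of prechi; `is_contiguous_merge` is a hypothetical helper and it assumes every target sum is positive.

```python
def is_contiguous_merge(counts, sums):
    """Return True if `sums` can be produced by merging adjacent counts.

    ASSUMPTION: counts are non-negative and every target sum is positive,
    so a group can be closed as soon as its running total hits the target.
    """
    target = 0  # index of the group currently being filled
    running = 0
    for count in counts:
        if target == len(sums):
            if count != 0:
                return False  # data left over after the last group
            continue
        running += count
        if running > sums[target]:
            return False  # overshot: no contiguous grouping gives this sum
        if running == sums[target]:
            target += 1
            running = 0
    return target == len(sums) and running == 0


counts = [2, 0, 0, 0, 3, 0, 0, 0, 2, 3, 5, 1, 4, 2, 8]
print(is_contiguous_merge(counts, [5, 5, 6, 6, 8]))  # True
print(is_contiguous_merge(counts, [5, 5, 5, 7, 8]))  # True
```

A check like this only confirms that a candidate is reachable; it says nothing about which candidate is better.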
There are two things that can be done:
If those don't help, I'll have to think of allowing it to be heuristic but not optimal.
The data contains many distributions with large runs of zero counts. This makes for an explosion of fruitless combinations, and the prechi algorithm is timing out frequently.
A zero run can be merged with the left or right neighbor (there are always both), whichever is smaller. Add a preprocessing step which does this.
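One way the preprocessing step could look, sketched in Python rather than in the project's own code. The bin representation as (start, end, count) tuples, the name `merge_zero_runs`, and the choice to prefer the left neighbor on a tie are all assumptions made for illustration.

```python
def merge_zero_runs(bins):
    """Fold each run of zero-count bins into its smaller non-zero neighbor.

    `bins` is a list of (start, end, count) tuples in order.
    ASSUMPTIONS: ties go to the left neighbor; a zero run at either end
    (which the issue says does not occur) goes to the one neighbor it has.
    """
    merged = []
    i, n = 0, len(bins)
    while i < n:
        start, end, count = bins[i]
        if count != 0:
            merged.append([start, end, count])
            i += 1
            continue
        j = i
        while j < n and bins[j][2] == 0:
            j += 1  # bins[i:j] is the zero run
        run_start, run_end = bins[i][0], bins[j - 1][1]
        has_left, has_right = bool(merged), j < n
        if not has_left and not has_right:
            merged.append([run_start, run_end, 0])  # everything was zero
        elif has_left and (not has_right or merged[-1][2] <= bins[j][2]):
            merged[-1][1] = run_end  # widen the left neighbor
        else:
            merged.append([run_start, bins[j][1], bins[j][2]])  # widen the right
            j += 1
        i = j
    return [tuple(b) for b in merged]


counts = [2, 0, 0, 0, 3, 0, 0, 0, 2, 3, 5, 1, 4, 2, 8]
bins = [(20 + 5 * k, 25 + 5 * k, c) for k, c in enumerate(counts)]
result = merge_zero_runs(bins)
print([c for _, _, c in result])  # [2, 3, 2, 3, 5, 1, 4, 2, 8]
```

On the fifteen-bin example above this leaves nine bins, so the search has far fewer neighbor combinations to consider.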