Comments (15)
I did an experiment and found that moving averages do seem to converge on what looks like a Gaussian.
https://bl.ocks.org/curran/853fa00b8f0732fb2bee7fccfd7b4523
Also related to d3/d3-shape#43
from d3-array.
Moving average: https://observablehq.com/@d3/moving-average
(would be great to have it in d3-array)
from d3-array.
And here is another fork, this time using the d3.blur proposal.
from d3-array.
To close the loop here, the implementation was published as a separate package https://github.com/Fil/array-blur .
from d3-array.
Mind blown after seeing this
Due to the central limit theorem, the Gaussian can be approximated by several runs of a very simple filter such as the moving average. The simple moving average corresponds to convolution with the constant B-spline ( a rectangular pulse ), and, for example, four iterations of a moving average yields a cubic B-spline as filter window which approximates the Gaussian quite well.
from Wikipedia: Gaussian filter.
Makes me wonder if we could add a moving average function, then use that to implement blur.
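To make the quoted idea concrete, here is a minimal sketch, not a proposed d3-array API. The names `movingAverage` and `repeatedAverage` are hypothetical, and the choice to shrink the window at the array ends is an assumption. Feeding in a unit impulse shows the effective filter window: one pass gives a rectangular pulse, four passes give a bell shape.

```javascript
// Hypothetical helper: one pass of a centered moving average over plain numbers.
// Assumption: windows are truncated at the array ends, so edge values
// average fewer samples.
function movingAverage(values, radius) {
  return values.map((_, i) => {
    const start = Math.max(0, i - radius);
    const end = Math.min(values.length - 1, i + radius);
    let sum = 0;
    for (let j = start; j <= end; j++) sum += values[j];
    return sum / (end - start + 1);
  });
}

// Hypothetical helper: apply the moving average `iterations` times,
// each pass reading only the output of the previous pass.
function repeatedAverage(values, radius, iterations) {
  let result = values.slice();
  for (let k = 0; k < iterations; k++) result = movingAverage(result, radius);
  return result;
}

// A unit impulse reveals the effective filter window.
const impulse = new Array(21).fill(0);
impulse[10] = 1;
const onePass = repeatedAverage(impulse, 1, 1);    // rectangular pulse
const fourPasses = repeatedAverage(impulse, 1, 4); // bell-shaped window
```

With a radius of 1, the four-pass window is proportional to 1, 4, 10, 16, 19, 16, 10, 4, 1, which already looks a lot like a sampled Gaussian.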
from d3-array.
I keep coming across scenarios where a bit of smoothing would be useful, for example this one:
CO2 Emissions Stacked Area Chart
from d3-array.
Proposed implementation (from Histogram Smoothing):
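const blur = (data, property) => data.map((d, i) => {
  // Clamp the neighbor indices at both ends of the array.
  const previous = (i === 0) ? i : i - 1;
  const next = (i === data.length - 1) ? i : i + 1;
  const sum = data[previous][property] + d[property] + data[next][property];
  // Note: this overwrites the input objects in place, so each later
  // iteration reads an already-smoothed previous value.
  d[property] = sum / 3;
  return d;
});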
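One thing to watch in the snippet above: because it writes back into `data` while mapping, the "previous" value it reads has already been smoothed, so the filter is not symmetric. A minimal non-mutating sketch (the name `blurCopy` is hypothetical, and returning shallow copies is my assumption about what callers would want) could look like this:

```javascript
// Hypothetical non-mutating variant of the 3-point blur.
// Every average is computed from the untouched input, and a shallow
// copy of each datum is returned with the smoothed value.
function blurCopy(data, property) {
  return data.map((d, i) => {
    const previous = data[Math.max(0, i - 1)];
    const next = data[Math.min(data.length - 1, i + 1)];
    const mean = (previous[property] + d[property] + next[property]) / 3;
    return { ...d, [property]: mean };
  });
}

const rows = [{ x: 0, y: 0 }, { x: 1, y: 3 }, { x: 2, y: 0 }];
const smoothed = blurCopy(rows, "y");
// rows is unchanged; smoothed has y = 1, 1, 1
```

On the same input, the in-place version yields y = 1, 4/3, 4/9, which shows the asymmetry.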
from d3-array.
Also did a Gaussian Smoothing notebook
https://observablehq.com/@fil/gaussian-smoothing
Obvious advantage over moving average is it's smoother :) It shows quite well on the electricity dataset which has a 24h base period.
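For readers who want to see what direct Gaussian smoothing can look like, here is a rough sketch. It is not taken from the notebook: the name `gaussianSmooth`, the cutoff at three standard deviations, and the renormalization of weights near the array ends are all assumptions of mine.

```javascript
// Hypothetical sketch of direct Gaussian kernel smoothing on plain numbers.
// Assumptions: sigma > 0, the kernel is truncated at 3 * sigma, and weights
// that fall outside the array are dropped and the rest renormalized.
function gaussianSmooth(values, sigma) {
  const radius = Math.ceil(3 * sigma);
  const kernel = [];
  for (let k = -radius; k <= radius; k++) {
    kernel.push(Math.exp(-(k * k) / (2 * sigma * sigma)));
  }
  return values.map((_, i) => {
    let sum = 0;
    let weight = 0;
    for (let k = -radius; k <= radius; k++) {
      const j = i + k;
      if (j < 0 || j >= values.length) continue; // skip samples outside the array
      const w = kernel[k + radius];
      sum += w * values[j];
      weight += w;
    }
    return sum / weight;
  });
}

const flat = gaussianSmooth([5, 5, 5, 5, 5], 1); // a constant series stays constant
```

Each output costs a full kernel's worth of multiplications, which is one reason a moving average is cheaper per pass.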
from d3-array.
Here's an application of that blur function.
https://vizhub.com/curran/14bdac88ebf747c2b8a9a919bbe0a831
from d3-array.
I think we would also want optimized 2D or n-D blurs, like we have in https://github.com/d3/d3-contour/blob/master/src/blur.js (note the number of TODO in the comments ;-) ).
from d3-array.
Yes that would be very cool!
One "problem" I notice with the Gaussian blur approach of repeated averaging is that the slope of the curves appears to tend towards zero the closer you get to the endpoints (see the red rectangle in the image).
Is this a known problem with Gaussian blur in general? How do folks usually deal with this?
To me, it seems to skew the data into showing something that's not there (the "flattening out" effect at the end of the curve), which makes the dataviz misleading. The original data does not seem to exhibit that behavior.
I'm thinking of just removing the data points that could have been indirectly impacted by the edge case, so chopping off numIterations values from the beginning and end of the time series.
from d3-array.
Here's a revised implementation, taking that end flattening effect into account:
export const blur = (data, property, numIterations) => {
  const n = data.length;
  for (let j = 0; j < numIterations; j++) {
    for (let i = 0; i < n; i++) {
      // Clamp the neighbors at both ends of the array.
      const previous = data[i === 0 ? i : i - 1];
      const current = data[i];
      const next = data[i === n - 1 ? i : i + 1];
      const sum = previous[property] + current[property] + next[property];
      // Note: values are overwritten in place, so `previous` has already
      // been smoothed during the current pass.
      current[property] = sum / 3;
    }
  }
  // Chop off the ends, as they may represent misleading "flattening".
  return data.slice(numIterations, data.length - numIterations);
};
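As a sanity check on the trimming idea, here is a sketch on plain numbers that writes each pass into a fresh array instead of mutating in place. The name `blurAndTrim` is hypothetical and the buffering is my change, not part of the implementation above. With that change, the clamped ends can only influence one more index per pass, so dropping numIterations values from each end removes exactly the affected points.

```javascript
// Hypothetical variant of the revised blur, working on plain numbers.
// Each pass reads from `current` and writes into a new array, so the
// clamped ends influence at most one additional index per pass.
function blurAndTrim(values, numIterations) {
  const n = values.length;
  let current = values.slice();
  for (let j = 0; j < numIterations; j++) {
    const next = new Array(n);
    for (let i = 0; i < n; i++) {
      const previous = current[i === 0 ? i : i - 1];
      const following = current[i === n - 1 ? i : i + 1];
      next[i] = (previous + current[i] + following) / 3;
    }
    current = next;
  }
  // Drop the values that the clamped ends could have influenced.
  return current.slice(numIterations, n - numIterations);
}

const ramp = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
const trimmed = blurAndTrim(ramp, 2);
// trimmed has 6 values, close to 2, 3, 4, 5, 6, 7
```

A straight ramp is a handy test case, since a 3-point average leaves interior points of a line unchanged and any remaining distortion must come from the ends.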
from d3-array.
Here's a fork which uses my gaussian kernel blurring; it doesn't seem to suffer from the problem you describe?
from d3-array.
Very nice! Indeed, the problem is not there. The problem must lie in the repeated iteration technique.
from d3-array.
This new notebook shows how we could use the same API for 1D and 2D blur:
https://observablehq.com/@fil/moving-average-blur
I've also straightened the implementation a little to get more juice out of it (about 30% faster in my tests). I've tried to implement Gaussian blur, but it does not compete at all in terms of speed.
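For anyone curious how one routine can serve both 1D and 2D, here is a rough sketch. It is not the notebook's code: `blurLines`, `boxBlur1D` and `boxBlur2D` are hypothetical names, the grid is assumed to be a flat row-major array, and the inner loop is the naive sum rather than an optimized running sum.

```javascript
// Hypothetical helper: box blur along one axis of a flat, row-major grid.
// `count` lines of `length` samples each; `stride` is the step between
// samples on a line, `lineStep` is the step between the starts of lines.
function blurLines(source, count, length, stride, lineStep, radius) {
  const target = new Float64Array(source.length);
  for (let line = 0; line < count; line++) {
    const offset = line * lineStep;
    for (let i = 0; i < length; i++) {
      // Assumption: windows are truncated at the ends of each line.
      const start = Math.max(0, i - radius);
      const end = Math.min(length - 1, i + radius);
      let sum = 0;
      for (let j = start; j <= end; j++) sum += source[offset + j * stride];
      target[offset + i * stride] = sum / (end - start + 1);
    }
  }
  return target;
}

// 1D: a single line covering the whole array.
function boxBlur1D(values, radius) {
  return blurLines(Float64Array.from(values), 1, values.length, 1, 0, radius);
}

// 2D: blur every row, then every column of the row-blurred grid.
function boxBlur2D(values, width, height, radius) {
  const rows = blurLines(Float64Array.from(values), height, width, 1, width, radius);
  return blurLines(rows, width, height, width, 1, radius);
}
```

The 2D case is just the 1D pass run along rows and then along columns, which works because a box filter is separable.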
from d3-array.