Comments (3)
There's also some movement here I should mention.
Transformations on a working branch are allowed to return promises. Since duckdb-wasm queries are async, that means you could take the Arrow record batch on a tile, send it to DuckDB with arbitrary SQL, and then get a record batch back.
from deepscatter.
So this is a little more straightforward now, although the API is still in flux. Here's what the steps would be on this new branch.
- Define a transformation function that works on Tile objects with a signature (tile: Tile) => Promise<Float32Array>, where the array is the same length as the tile. For example, you could do:
[...]
scatterplot._root.transformations['has dog in it'] = async function(tile) {
  const output = new Float32Array(tile.record_batch.numRows);
  // The column being searched.
  const all_rows = tile.record_batch.getChild("full_text");
  // The reason to use duckdb is that this step, deserializing a lot of text
  // from Arrow UTF-8 to JavaScript UTF-16 strings, is *extremely* slow.
  const all_strings = all_rows.toArray();
  let i = 0;
  for (const string of all_strings) {
    if (string.match(/dog/)) {
      output[i] = 1; // Store a match as a float.
    }
    i++; // Advance to the next row whether or not it matched.
  }
  return output; // This array will be attached to the record batch at render-time, lazily.
}
(The function is allowed to be synchronous rather than async, which is what you'd want in this simple JS regex case. But with duckdb you would ship the full_text column to the db, and then get back the results of a db query as Arrow.)
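For the database-backed variant described above, a minimal sketch might look like the following. It is not deepscatter's or duckdb-wasm's API: `makeSqlTransformation` and `runQuery` are hypothetical names, and `runQuery(recordBatch, sql)` is assumed to be a function you write yourself that loads the batch into the database, runs the SQL, and resolves to one numeric value per input row, in row order. The stand-in runner below does the match in plain JS so the example runs without a database.

```javascript
// Hypothetical helper, not part of deepscatter or duckdb-wasm.
// `runQuery` is an assumed, user-supplied async function:
//   (recordBatch, sql) => Promise<array-like of numbers, one per row>
function makeSqlTransformation(runQuery, sql) {
  return async function (tile) {
    const batch = tile.record_batch;
    const result = await runQuery(batch, sql);
    // The transformation must return exactly one value per row of the tile.
    if (result.length !== batch.numRows) {
      throw new Error(`Expected ${batch.numRows} values, got ${result.length}`);
    }
    // Coerce each value with Number() in case the database returns
    // BigInt or boolean values, then pack into a Float32Array.
    return Float32Array.from(result, (v) => Number(v));
  };
}

// Stand-in runner for illustration: ignores the SQL and matches in JS.
const fakeRunQuery = async (batch, _sql) =>
  batch.getChild("full_text").toArray().map((s) => (/dog/.test(s) ? 1 : 0));

// Minimal fake tile exposing only the fields the sketch touches.
const fakeTile = {
  record_batch: {
    numRows: 3,
    getChild: () => ({ toArray: () => ["hot dog", "cat", "dogma"] }),
  },
};

const hasDog = makeSqlTransformation(
  fakeRunQuery,
  "SELECT (full_text LIKE '%dog%')::INT AS hit FROM tile" // illustrative only
);
```

Calling `hasDog(fakeTile)` returns a promise of a Float32Array with one entry per row. Keeping the database call behind an injected function means the transformation itself does not depend on which Arrow or DuckDB version is in use.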
And then actually plot it, which causes the transformation to be run on tiles as they're needed.
scatterplot.plotAPI({
  encoding: {
    foreground: {
      field: 'has dog in it',
      op: 'eq',
      a: 1
    }
  }
})
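One way to read the encoding above: a point counts as foreground when its value in the 'has dog in it' column equals `a`. The sketch below is only a CPU-side mental model of that comparison, assuming that reading is right; `foregroundMask` is a hypothetical name and this is not how deepscatter itself evaluates the filter.

```javascript
// Mental model only: mirrors {field, op: 'eq', a: 1} on a plain array.
// `values` is the Float32Array a transformation returned for one tile.
function foregroundMask(values, a) {
  return Array.from(values, (v) => v === a);
}

// With the regex transformation, matches were stored as 1 and misses as 0.
const mask = foregroundMask(new Float32Array([1, 0, 1]), 1);
```

This is why the transformation stores matches as the float 1: the filter compares against a number, not a boolean.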
If anyone's following along at home, David and I got a prototype of this running over the weekend (https://observablehq.com/d/cae8e4a3a8b7d4db). The hardest part turned out to be misalignment of Arrow versions among arrow-js, duckdb, and deepscatter.