Comments (3)
Pinging @elastic/es-analytical-engine (Team:Analytics)
from elasticsearch.
This is related to the scheduled work for ST_DISTANCE, which covers at least the distance calculation part. However, calculating speed is a separate concern. At its simplest, this could be `distance/duration`, which does not require a new function, so it could be considered complete once ST_DISTANCE is done. However, there are two further considerations:
- The above assumes we have some way of having the two points and two timestamps in the same row. That could be true of some datasets, but is far more likely not to be true when each document contains only the current location and timestamp. So we need a way of getting both the `current` and `previous` document into the same row.
- This feature feels suitable for TSDB. During the original TSDB work we built a feature involving optimized `geo_line` aggregations, which collected sequences of locations grouped by TSID into `LineString` geometries, ordered by time, including a feature for line simplification for very large geometries. There was a request to filter out outliers that deviate too much from the line, and the above feature sounds related, in that we want speed outliers to be detected and highlighted. If the users of this feature are likely to use TSDB features, since they are working with time-ordered event data, perhaps we should consider a TSDB feature around outlier detection (spatial, temporal, and spatiotemporal/speed)?
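To make the `distance/duration` idea concrete, here is a minimal Python sketch, assuming both points and both timestamps are already available together. The haversine formula is only a stand-in for whatever ST_DISTANCE ends up computing, and the names `haversine_m` and `speed_m_per_s` are hypothetical, not part of any Elasticsearch API.

```python
import math
from datetime import datetime, timezone

# Assumption: a spherical Earth with the mean radius in metres.
EARTH_RADIUS_M = 6_371_008.8


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def speed_m_per_s(prev, curr):
    """Speed between two (lat, lon, datetime) observations.

    Returns None when the duration is not positive, since the speed is
    undefined in that case.
    """
    duration = (curr[2] - prev[2]).total_seconds()
    if duration <= 0:
        return None
    return haversine_m(prev[0], prev[1], curr[0], curr[1]) / duration


# Hypothetical observations one hour apart, one degree of longitude on the equator.
previous = (0.0, 0.0, datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc))
current = (0.0, 1.0, datetime(2024, 1, 1, 13, 0, tzinfo=timezone.utc))
print(speed_m_per_s(previous, current))
```

The arithmetic itself is trivial once both observations sit side by side; the hard part, as noted above, is getting the previous document into the same row as the current one.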
I also took a look at the linked enhancement request and the SPL query they use, and have a few comments:
- It looks like the main missing feature on our side is `eventstats`, which I believe we're working on (calling it 'inline stats' at the moment).
- The SPL query seems to do a lot of unnecessary, inefficient work. In particular, it appears to use `eventstats` to associate every single event of a user with every other event of that user, and then calculate the distance, duration and speed between every combination. This is extremely inefficient if we assume we only really need to consider consecutive events in time order.
It would be far more efficient to use a time-ordering or event-ordering approach and look at windowing functions. @alex-spies pointed out the SQL functions `LEAD` and `LAG` as a good approach to this. They also seem generally useful for event data, log data and the security use cases.
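As a rough illustration of what a `LAG`-style window would provide, the following Python sketch pairs each event with the previous event of the same user in time order. The field names `user` and `ts` and the helper `consecutive_pairs` are hypothetical; this is not how ES|QL would implement it, only a model of the resulting rows.

```python
from collections import defaultdict
from operator import itemgetter


def consecutive_pairs(events, key="user", time="ts"):
    """Yield (previous, current) event pairs per entity, ordered by time.

    This mimics LAG(..., 1) over a window partitioned by `key` and
    ordered by `time`: each event is paired only with its predecessor.
    """
    by_key = defaultdict(list)
    for event in events:
        by_key[event[key]].append(event)
    for group in by_key.values():
        group.sort(key=itemgetter(time))
        # zip(group, group[1:]) produces n - 1 pairs for n events.
        yield from zip(group, group[1:])


# Hypothetical, unordered events for two users.
events = [
    {"user": "a", "ts": 3},
    {"user": "a", "ts": 1},
    {"user": "b", "ts": 1},
    {"user": "a", "ts": 2},
]
pairs = [(p["ts"], c["ts"]) for p, c in consecutive_pairs(events)]
print(pairs)
```

For a user with n events this yields n - 1 pairs, whereas comparing every event with every other event yields n(n - 1)/2 combinations, which is the source of the inefficiency described above.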