Comments (9)
I can reproduce the same regression without the named queries, @shimpeko. As explained above, the issue is related to the number of term queries in a single request. The change in apache/lucene#12183 introduces an overhead for large boolean queries composed of many term queries. The overhead is amplified when using named queries, since they execute the query a second time during the fetch phase.
Regarding performance, #108659 is more important for us, as we haven't figured out a workaround. For this (possible) named query issue, removing named queries worked for us as a workaround.
I suspect that #108659 is a duplicate of this problem. Are you running lots of term queries (similar to this example) in a single boolean query?
@javanna I wonder if we should allow opting out of apache/lucene#12183? Using multiple threads to load terms can add significant overhead when the number of terms is large, as demonstrated in this issue.
from elasticsearch.
This was reported at: https://discuss.elastic.co/t/359189
from elasticsearch.
Pinging @elastic/es-search (Team:Search)
from elasticsearch.
Looking at the reproduction (thanks for providing one) the issue seems to be around a single query with 4k named term queries.
First of all, the reproduction query matches no documents, hence named queries, which are executed during the fetch phase, are not the culprit.
Given the number of term queries, the main culprit would be apache/lucene#12183, which creates term states concurrently using the searcher executor. Each term in the query creates one task per segment, each of which executes in a different thread. The overhead in this scenario is tens of milliseconds due to the number of terms. It is significant in this setup because none of the terms are present in the dictionary, so the work done by each thread is minimal.
The Lucene change was made to parallelise the IOs during a single query. In this case there is no IO involved, so it ends up hurting performance.
Another strategy is being investigated for Lucene 10, where the goal is to rely on system calls to parallelise the IOs (rather than real Java threads). This might limit the impact when no IO is required, as in this case.
@shimpeko is the scenario exposed here representative of your use case? I expect that the difference in performance should be much smaller when the query terms are actually present in the dictionary.
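To make the overhead described above concrete, here is a minimal sketch, not Lucene's actual code. It assumes a hypothetical `lookup` that always misses (no IO, as in the reproduction) and compares running every (term, segment) lookup inline with submitting one task per lookup to an executor. The class and method names are made up for illustration, and the timings it prints depend entirely on the machine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class TaskOverheadSketch {
    // Hypothetical stand-in for a term dictionary lookup that misses immediately (no IO).
    static boolean lookup(int term, int segment) {
        return false;
    }

    // Sequential variant: every (term, segment) lookup runs on the calling thread.
    static int countHitsInline(int terms, int segments) {
        int hits = 0;
        for (int t = 0; t < terms; t++) {
            for (int s = 0; s < segments; s++) {
                if (lookup(t, s)) hits++;
            }
        }
        return hits;
    }

    // Concurrent variant: one task per (term, segment) pair is handed to the executor.
    static int countHitsWithExecutor(int terms, int segments, ExecutorService executor) {
        List<Future<Boolean>> futures = new ArrayList<>(terms * segments);
        for (int t = 0; t < terms; t++) {
            for (int s = 0; s < segments; s++) {
                final int term = t;
                final int segment = s;
                Callable<Boolean> task = () -> lookup(term, segment);
                futures.add(executor.submit(task));
            }
        }
        int hits = 0;
        try {
            for (Future<Boolean> future : futures) {
                if (future.get()) hits++;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
        return hits;
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            long t0 = System.nanoTime();
            int inline = countHitsInline(4000, 5);
            long t1 = System.nanoTime();
            int pooled = countHitsWithExecutor(4000, 5, executor);
            long t2 = System.nanoTime();
            System.out.printf("inline: %d hits in %d us%n", inline, (t1 - t0) / 1_000);
            System.out.printf("pooled: %d hits in %d us%n", pooled, (t2 - t1) / 1_000);
        } finally {
            executor.shutdown();
        }
    }
}
```

Both variants return the same result. Any difference in elapsed time comes from task creation, queueing and hand-off between threads, which is the kind of cost that dominates when each lookup does almost no work.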
from elasticsearch.
@jimczi Thank you for taking a look at this.
I've already removed named queries from our production query and confirmed it improved response time to the same level as 8.7 with named queries. So I still suspect the named queries at the moment. I'll try to reproduce it with a query that matches documents.
Edit: Regarding performance, #108659 is more important for us, as we haven't figured out a workaround. For this (possible) named query issue, removing named queries worked for us as a workaround.
from elasticsearch.
@jimczi I've updated the reproduction query to match documents in shimpeko/es_named_query_perf@b897f91, and I still see a notable performance difference between 8.7.1 and 8.13.2:
8.7.1
....
--- 3rd RUN ---
{
"took" : 55,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4000,
"relation" : "eq"
},
"max_score" : 11.1074705,
"hits" : [
{
"_index" : "test_index",
"_id" : "emGOgI8BQpKfxT3bI0cz",
"_score" : 11.1074705,
"_source" : {
"test_text" : "atrjmoueyc"
},
"matched_queries" : [
"query = atrjmoueyc"
]
},
{
"_index" : "test_index",
"_id" : "e2GOgI8BQpKfxT3bI0c0",
"_score" : 11.1074705,
"_source" : {
"test_text" : "fmjkwdxgnb"
},
"matched_queries" : [
"query = fmjkwdxgnb"
]
},
{
"_index" : "test_index",
"_id" : "fGGOgI8BQpKfxT3bI0c0",
"_score" : 11.1074705,
"_source" : {
"test_text" : "slaqatgrtw"
},
"matched_queries" : [
"query = slaqatgrtw"
]
},
{
"_index" : "test_index",
"_id" : "fWGOgI8BQpKfxT3bI0c0",
"_score" : 11.1074705,
"_source" : {
"test_text" : "mdvyrihjqq"
},
"matched_queries" : [
"query = mdvyrihjqq"
]
},
{
"_index" : "test_index",
"_id" : "fmGOgI8BQpKfxT3bI0c0",
"_score" : 11.1074705,
"_source" : {
"test_text" : "cvvbunsbyo"
},
"matched_queries" : [
"query = cvvbunsbyo"
]
},
{
"_index" : "test_index",
"_id" : "f2GOgI8BQpKfxT3bI0c0",
"_score" : 11.1074705,
"_source" : {
"test_text" : "aihmsruxby"
},
"matched_queries" : [
"query = aihmsruxby"
]
},
{
"_index" : "test_index",
"_id" : "gGGOgI8BQpKfxT3bI0c0",
"_score" : 11.1074705,
"_source" : {
"test_text" : "lmgsfemmca"
},
"matched_queries" : [
"query = lmgsfemmca"
]
},
{
"_index" : "test_index",
"_id" : "gWGOgI8BQpKfxT3bI0c0",
"_score" : 11.1074705,
"_source" : {
"test_text" : "isatduxwmn"
},
"matched_queries" : [
"query = isatduxwmn"
]
},
{
"_index" : "test_index",
"_id" : "gmGOgI8BQpKfxT3bI0c0",
"_score" : 11.1074705,
"_source" : {
"test_text" : "lvrmulxqyp"
},
"matched_queries" : [
"query = lvrmulxqyp"
]
},
{
"_index" : "test_index",
"_id" : "g2GOgI8BQpKfxT3bI0c0",
"_score" : 11.1074705,
"_source" : {
"test_text" : "bzwcblsdpi"
},
"matched_queries" : [
"query = bzwcblsdpi"
]
}
]
}
}
8.13.2
...
--- 3rd RUN ---
{
"took" : 448,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4000,
"relation" : "eq"
},
"max_score" : 11.1074705,
"hits" : [
{
"_index" : "test_index",
"_id" : "Ma2PgI8BD4MPqc5jWoVf",
"_score" : 11.1074705,
"_source" : {
"test_text" : "atrjmoueyc"
},
"matched_queries" : [
"query = atrjmoueyc"
]
},
{
"_index" : "test_index",
"_id" : "Mq2PgI8BD4MPqc5jWoVg",
"_score" : 11.1074705,
"_source" : {
"test_text" : "fmjkwdxgnb"
},
"matched_queries" : [
"query = fmjkwdxgnb"
]
},
{
"_index" : "test_index",
"_id" : "M62PgI8BD4MPqc5jWoVg",
"_score" : 11.1074705,
"_source" : {
"test_text" : "slaqatgrtw"
},
"matched_queries" : [
"query = slaqatgrtw"
]
},
{
"_index" : "test_index",
"_id" : "NK2PgI8BD4MPqc5jWoVg",
"_score" : 11.1074705,
"_source" : {
"test_text" : "mdvyrihjqq"
},
"matched_queries" : [
"query = mdvyrihjqq"
]
},
{
"_index" : "test_index",
"_id" : "Na2PgI8BD4MPqc5jWoVg",
"_score" : 11.1074705,
"_source" : {
"test_text" : "cvvbunsbyo"
},
"matched_queries" : [
"query = cvvbunsbyo"
]
},
{
"_index" : "test_index",
"_id" : "Nq2PgI8BD4MPqc5jWoVg",
"_score" : 11.1074705,
"_source" : {
"test_text" : "aihmsruxby"
},
"matched_queries" : [
"query = aihmsruxby"
]
},
{
"_index" : "test_index",
"_id" : "N62PgI8BD4MPqc5jWoVg",
"_score" : 11.1074705,
"_source" : {
"test_text" : "lmgsfemmca"
},
"matched_queries" : [
"query = lmgsfemmca"
]
},
{
"_index" : "test_index",
"_id" : "OK2PgI8BD4MPqc5jWoVg",
"_score" : 11.1074705,
"_source" : {
"test_text" : "isatduxwmn"
},
"matched_queries" : [
"query = isatduxwmn"
]
},
{
"_index" : "test_index",
"_id" : "Oa2PgI8BD4MPqc5jWoVg",
"_score" : 11.1074705,
"_source" : {
"test_text" : "lvrmulxqyp"
},
"matched_queries" : [
"query = lvrmulxqyp"
]
},
{
"_index" : "test_index",
"_id" : "Oq2PgI8BD4MPqc5jWoVg",
"_score" : 11.1074705,
"_source" : {
"test_text" : "bzwcblsdpi"
},
"matched_queries" : [
"query = bzwcblsdpi"
]
}
]
}
}
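For readers without access to the reproduction repository, the responses above are consistent with a request of roughly the following shape. This is a sketch under assumptions, not the actual reproduction script. The field name `test_text` and the `query = <text>` naming pattern are taken from the output above. The use of `bool`/`should` is an assumption, and the class and helper names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

public class NamedMatchQueryBuilder {
    // Builds one named match clause. The "_name" value is what appears in "matched_queries".
    // No JSON escaping is done: texts are assumed to be plain lowercase letters.
    static String namedMatch(String field, String text) {
        return String.format(
            "{\"match\":{\"%s\":{\"query\":\"%s\",\"_name\":\"query = %s\"}}}",
            field, text, text);
    }

    // Wraps all clauses in a single bool query (the "should" occurrence type is an assumption).
    static String boolQuery(String field, List<String> texts) {
        String clauses = texts.stream()
            .map(text -> namedMatch(field, text))
            .collect(Collectors.joining(","));
        return "{\"query\":{\"bool\":{\"should\":[" + clauses + "]}}}";
    }

    public static void main(String[] args) {
        // Two of the texts from the response above; the reproduction uses thousands.
        System.out.println(boolQuery("test_text", List.of("atrjmoueyc", "fmjkwdxgnb")));
    }
}
```

The printed string could be sent as the body of a `_search` request against `test_index`. Scaling the list to thousands of entries gives a query of the size discussed in this thread.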
from elasticsearch.
Thank you so much for the investigation. I appreciate it.
I suspect that #108659 is a duplicate of this problem. Are you running lots of term queries (similar to this example) in a single boolean query?
I may have misunderstood something, but this example (the query in this issue) has multiple match queries in a single boolean query, not term queries.
Regarding #108659, again they are match queries (not term queries), but yes, the programmatic (slow) production queries have 100+ multi_match queries in a boolean query. Just FYI, we can still observe a significant difference in the create_weight value with a single match query in a boolean query between 8.7 and 8.13, as shared in #108659.
I can reproduce the same regression without the named queries @shimpeko .
Thank you again for confirming the issue.
I now think that my previous comment "I've already removed named queries from our production query and confirmed it improved response time to the same level as 8.7 with named queries." is not correct. What might have actually happened was that I removed named queries from our production environment, which improved the 99th percentile response time; however, a small number of queries with many match queries remained slow. I thought this was a separate problem and opened another GitHub issue as #108659.
opt-out from apache/lucene#12183
This would really help us if it fixes this issue and #108659. We are considering downgrading to 8.7, but that is a significant task because ES doesn't support downgrades.
from elasticsearch.
@jimczi ^
from elasticsearch.
I may have misunderstood something, but this example (the query in this issue) has multiple match queries in a single boolean query, not term queries.
Those match queries will be converted to term queries (unless a prefix query is used):
https://github.com/elastic/elasticsearch/blob/main/server/src/main/java/org/elasticsearch/index/search/MatchQueryParser.java#L523
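As a rough illustration of that conversion, here is a toy model, not the logic in `MatchQueryParser`. It assumes a simplified analyzer that lowercases and splits on whitespace, and it only describes the resulting query as a string. All names in it are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class MatchToTermSketch {
    // Toy analyzer: lowercase and split on whitespace (real analyzers are configurable).
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!token.isEmpty()) tokens.add(token);
        }
        return tokens;
    }

    // Describes what a match on `field` turns into under this toy model:
    // one token gives a single term query, several tokens give a bool of term queries.
    static String rewrite(String field, String text) {
        List<String> tokens = analyze(text);
        if (tokens.isEmpty()) return "match_none";
        if (tokens.size() == 1) return "term(" + field + ":" + tokens.get(0) + ")";
        List<String> clauses = new ArrayList<>();
        for (String token : tokens) {
            clauses.add("term(" + field + ":" + token + ")");
        }
        return "bool(should=" + clauses + ")";
    }

    public static void main(String[] args) {
        System.out.println(rewrite("test_text", "atrjmoueyc"));
        System.out.println(rewrite("test_text", "Quick Fox"));
    }
}
```

Under this model, each single-word match query in the reproduction contributes exactly one term query, so a boolean query with thousands of match clauses carries thousands of term lookups.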
from elasticsearch.