Describe the bug: This bug was discovered while I was trying to test the

PlaceholderRowExec shown when select from union results. about arrow-datafusion HOT 2 CLOSED

xinlifoobar commented on July 18, 2024

PlaceholderRowExec shown when select from union results.


Comments (2)

xinlifoobar commented on July 18, 2024

This looks like an optimization for empty tables. The result above was produced while both t3 and t4 were empty. After inserting some values, the plan is displayed correctly.

> insert into t3 values (1,2), (2,3)
;
+-------+
| count |
+-------+
| 2     |
+-------+
1 row(s) fetched. 
Elapsed 0.004 seconds.

> explain select count(*) from ((select distinct c1, c2 from t3 order by c1 ) union all (select distinct c2, c1 from t4 order by c1));
+---------------+-------------------------------------------------------------------------------------------+
| plan_type     | plan                                                                                      |
+---------------+-------------------------------------------------------------------------------------------+
| logical_plan  | Aggregate: groupBy=[[]], aggr=[[COUNT(Int64(1)) AS COUNT(*)]]                             |
|               |   Union                                                                                   |
|               |     Projection:                                                                           |
|               |       Sort: t3.c1 ASC NULLS LAST                                                          |
|               |         Projection: t3.c1                                                                 |
|               |           Aggregate: groupBy=[[t3.c1, t3.c2]], aggr=[[]]                                  |
|               |             TableScan: t3 projection=[c1, c2]                                             |
|               |     Projection:                                                                           |
|               |       Sort: t4.c1 ASC NULLS LAST                                                          |
|               |         Projection: t4.c1                                                                 |
|               |           Aggregate: groupBy=[[t4.c2, t4.c1]], aggr=[[]]                                  |
|               |             Projection: t4.c2, t4.c1                                                      |
|               |               TableScan: t4 projection=[c1, c2]                                           |
| physical_plan | AggregateExec: mode=Final, gby=[], aggr=[COUNT(*)]                                        |
|               |   CoalescePartitionsExec                                                                  |
|               |     AggregateExec: mode=Partial, gby=[], aggr=[COUNT(*)]                                  |
|               |       RepartitionExec: partitioning=RoundRobinBatch(14), input_partitions=28              |
|               |         UnionExec                                                                         |
|               |           ProjectionExec: expr=[]                                                         |
|               |             AggregateExec: mode=FinalPartitioned, gby=[c1@0 as c1, c2@1 as c2], aggr=[]   |
|               |               CoalesceBatchesExec: target_batch_size=8192                                 |
|               |                 RepartitionExec: partitioning=Hash([c1@0, c2@1], 14), input_partitions=14 |
|               |                   RepartitionExec: partitioning=RoundRobinBatch(14), input_partitions=1   |
|               |                     AggregateExec: mode=Partial, gby=[c1@0 as c1, c2@1 as c2], aggr=[]    |
|               |                       MemoryExec: partitions=1, partition_sizes=[1]                       |
|               |           ProjectionExec: expr=[]                                                         |
|               |             AggregateExec: mode=FinalPartitioned, gby=[c2@0 as c2, c1@1 as c1], aggr=[]   |
|               |               CoalesceBatchesExec: target_batch_size=8192                                 |
|               |                 RepartitionExec: partitioning=Hash([c2@0, c1@1], 14), input_partitions=1  |
|               |                   AggregateExec: mode=Partial, gby=[c2@0 as c2, c1@1 as c1], aggr=[]      |
|               |                     MemoryExec: partitions=1, partition_sizes=[0]                         |
|               |                                                                                           |
+---------------+-------------------------------------------------------------------------------------------+
2 row(s) fetched. 
Elapsed 0.018 seconds.

> select count(*) from ((select distinct c1, c2 from t3 order by c1 ) union all (select distinct c2, c1 from t4 order by c1));
+----------+
| COUNT(*) |
+----------+
| 2        |
+----------+
1 row(s) fetched. 
Elapsed 0.019 seconds.


alamb commented on July 18, 2024

> This looks like an optimization for empty tables. The result above was produced while both t3 and t4 were empty. After inserting some values, the plan is displayed correctly.

I believe that is correct.

You can see how the optimizer transforms the plan, pass by pass, using EXPLAIN VERBOSE.
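As a rough sketch of that suggestion, the statements below recreate the empty-table case and ask for the verbose plan. The `create table` statements and the `int` column types are assumptions, since the thread does not show how t3 and t4 were defined; the query itself is the one from the comment above.

```sql
-- Assumed schemas: the thread does not show the original table definitions.
create table t3 (c1 int, c2 int);
create table t4 (c1 int, c2 int);

-- With both tables still empty, list the plan after each optimizer pass
-- to find where the union inputs are replaced.
explain verbose
select count(*) from (
    (select distinct c1, c2 from t3 order by c1)
    union all
    (select distinct c2, c1 from t4 order by c1)
);
```

Running the same `explain verbose` again after inserting rows should make it possible to compare the two outputs and see which pass behaves differently once the tables are no longer empty.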

