When running query64 of TPC-DS 10T data, I find a stage has shuffle-written 1.3T of data,b

I find that when I use AQE, I get the wrong statistics

Shuffle read does not read all data completely? (firestorm, CLOSED)

tencent commented on April 28, 2024

Shuffle read does not read all data completely?

from firestorm.

Comments (31)

colinmjj commented on April 28, 2024

Can you share the spark UI for stage IO?

xunxunmimi5577 commented on April 28, 2024

I can't upload pictures in my company

xunxunmimi5577 commented on April 28, 2024

Compared to native Spark, Shuffle Write has the same amount of data, but Firestorm reads very little data during Shuffle Read. The Tasks: Succeeded/Total column in the Spark UI shows only one task with Firestorm, but with native Spark it shows 5000 tasks successfully executed.

colinmjj commented on April 28, 2024

How about the result? Is it the same as the result with native Spark?
We passed the result comparison on 1TB data, but haven't done this with 10TB data.

xunxunmimi5577 commented on April 28, 2024

xunxunmimi5577 commented on April 28, 2024

xunxunmimi5577 commented on April 28, 2024

> How about the result? Is it the same as the result with native Spark? We passed the result comparison on 1TB data, but haven't done this with 10TB data.

I need to confirm this, because we modified the SQL and did not collect the results.

xunxunmimi5577 commented on April 28, 2024

Does Firestorm record partition lengths in MapStatus?

jerqi commented on April 28, 2024

We record the length; AQE needs the metrics.

xunxunmimi5577 commented on April 28, 2024

I find that when I use AQE, I get the wrong statistics.

(Replied by email on 06/21/2022. Subject: Re: [Tencent/Firestorm] Shuffle read does not read all data completely? (Issue #175))

> We record the length, aqe need the metrics.

jerqi commented on April 28, 2024

Could you give me more detailed information?

xunxunmimi5577 commented on April 28, 2024

Does Firestorm for Spark2 not support AQE? I saw that the implementation of the stop() method in RssShuffleWriter (Spark2) seems to fill partitionLengths with a dummy value.

xunxunmimi5577 commented on April 28, 2024

However, Spark2 does support the configuration spark.sql.adaptive.enabled. If it is as mentioned above, does that mean spark.sql.adaptive.enabled can't be set to true?

jerqi commented on April 28, 2024

> Does Firestorm for Spark2 not support AQE? I saw that the implementation of the stop() method in RssShuffleWriter (Spark2) seems to fill partitionLengths with a dummy value.

Spark2 doesn't support AQE.

jerqi commented on April 28, 2024

Open source Spark2 doesn't support AQE, either.

xunxunmimi5577 commented on April 28, 2024

>> Does Firestorm for Spark2 not support AQE? I saw that the implementation of the stop() method in RssShuffleWriter (Spark2) seems to fill partitionLengths with a dummy value.

> Spark2 doesn't support AQE.

But if I set spark.sql.adaptive.enabled=true, I will get the wrong result.

jerqi commented on April 28, 2024

https://spark.apache.org/releases/spark-release-3-0-0.html
AQE is a Spark 3.0 feature.

xunxunmimi5577 commented on April 28, 2024

As far as I know, Spark2 can also use the configuration spark.sql.adaptive.enabled.

xunxunmimi5577 commented on April 28, 2024

Then the ExchangeCoordinator.doEstimationIfNecessary() method needs mapOutputStatistics to determine the number of post-shuffle partitions.
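
To make that dependency concrete, here is a minimal Python sketch (not Spark's Scala code) of how per-map-task partition lengths could be summed into the per-partition byte counts that such a coordinator consumes. The function name `bytes_by_partition` and the sample sizes are hypothetical and chosen only for illustration.

```python
from typing import List


def bytes_by_partition(map_partition_lengths: List[List[int]]) -> List[int]:
    """Sum the recorded length of each reduce partition across all map tasks."""
    if not map_partition_lengths:
        return []
    num_partitions = len(map_partition_lengths[0])
    totals = [0] * num_partitions
    for lengths in map_partition_lengths:
        if len(lengths) != num_partitions:
            raise ValueError("every map task must report the same number of partitions")
        for pid, length in enumerate(lengths):
            totals[pid] += length
    return totals


# Two map tasks, three reduce partitions (made-up sizes in bytes).
real = bytes_by_partition([[100, 0, 50], [30, 20, 10]])

# ASSUMPTION: a writer that fills in the same placeholder for every partition.
# The totals then carry no information about the real data distribution.
dummy = bytes_by_partition([[1, 1, 1], [1, 1, 1]])
```

With real lengths the totals reflect how much data each reduce partition holds; with placeholder lengths every partition looks equally tiny, which is what misleads any size-based estimation downstream.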

colinmjj commented on April 28, 2024

@xunxunmimi5577 For RSS + Spark2, AQE is not supported with the current implementation. This feature was announced in Spark3, so there is no plan to support AQE with Spark2.

jerqi commented on April 28, 2024

It's not an available feature in Spark2. Maybe some configurations were added first, but the implementation isn't complete.

xunxunmimi5577 commented on April 28, 2024

If I use Spark2 + Firestorm + spark.sql.adaptive.enabled=true, partitionStartIndices from ExchangeCoordinator will be [0,200) instead of [0,1), [1,2), ..., so the shuffleReader will only read partition 0. This is the phenomenon described in my issue: there were supposed to be 200 tasks to execute, but only one was executed.
I think users should at least be warned about this.
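
The effect described here can be reproduced with a simplified Python model of size-based partition coalescing. This is a sketch, not Spark's ExchangeCoordinator source: the greedy packing rule, the 64 MB target, and the 1-byte dummy length are assumptions made for illustration, and both function names are hypothetical.

```python
from typing import List, Tuple


def estimate_partition_start_indices(bytes_by_partition: List[int], target_size: int) -> List[int]:
    """Greedily pack adjacent pre-shuffle partitions until the target size would be exceeded."""
    start_indices = [0]
    current = 0
    for i, size in enumerate(bytes_by_partition):
        if i > 0 and current + size > target_size:
            # Adding this partition would overflow the target: start a new post-shuffle partition.
            start_indices.append(i)
            current = size
        else:
            current += size
    return start_indices


def to_ranges(start_indices: List[int], num_partitions: int) -> List[Tuple[int, int]]:
    """Turn start indices into half-open [start, end) ranges of pre-shuffle partitions."""
    ends = start_indices[1:] + [num_partitions]
    return list(zip(start_indices, ends))


TARGET = 64 * 1024 * 1024  # ASSUMPTION: target bytes per post-shuffle partition

# Each pre-shuffle partition already holds a full target's worth of data.
full = to_ranges(estimate_partition_start_indices([TARGET] * 200, TARGET), 200)

# ASSUMPTION: every partition reports a dummy length of 1 byte.
dummy = to_ranges(estimate_partition_start_indices([1] * 200, TARGET), 200)
```

In this model `full` contains 200 ranges, [0,1) through [199,200), while `dummy` collapses to the single range [0,200). A reader that honours only the start of that range would fetch partition 0 alone, which matches the single task reported above.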

jerqi commented on April 28, 2024

> If I use Spark2 + Firestorm + spark.sql.adaptive.enabled=true, partitionStartIndices from ExchangeCoordinator will be [0,200) instead of [0,1), [1,2), ..., so the shuffleReader will only read partition 0. This is the phenomenon described in my issue: there were supposed to be 200 tasks to execute, but only one was executed. I think users should at least be warned about this.

OK, we can check whether ADAPTIVE_EXECUTION_ENABLED is enabled in RssShuffleManager. If true, we can throw an illegal argument exception. Would you like to contribute it?
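
A fail-fast check of this kind might look like the following Python sketch. It is not the Firestorm implementation (which is Java): the function name, the plain dict standing in for the Spark configuration, and the error message are all hypothetical. Only the key spark.sql.adaptive.enabled comes from the discussion above.

```python
ADAPTIVE_EXECUTION_KEY = "spark.sql.adaptive.enabled"


def check_adaptive_execution_disabled(conf: dict) -> None:
    """Raise if adaptive execution is switched on in the given configuration."""
    raw = conf.get(ADAPTIVE_EXECUTION_KEY, "false")
    enabled = str(raw).strip().lower() == "true"
    if enabled:
        # Python analogue of throwing an illegal argument exception.
        raise ValueError(
            f"{ADAPTIVE_EXECUTION_KEY}=true is not supported with this shuffle manager on Spark2; "
            "set it to false"
        )


# Passes silently when the key is absent or false.
check_adaptive_execution_disabled({})
check_adaptive_execution_disabled({ADAPTIVE_EXECUTION_KEY: "false"})
```

Failing at startup trades a confusing wrong result for an immediate, explicit error, which is the behaviour requested in this thread.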

colinmjj commented on April 28, 2024

@xunxunmimi5577 thanks for reporting this. I think this unsupported case should be described in the README.

xunxunmimi5577 commented on April 28, 2024

>> If I use Spark2 + Firestorm + spark.sql.adaptive.enabled=true, partitionStartIndices from ExchangeCoordinator will be [0,200) instead of [0,1), [1,2), ..., so the shuffleReader will only read partition 0. This is the phenomenon described in my issue: there were supposed to be 200 tasks to execute, but only one was executed. I think users should at least be warned about this.

> OK, we can check whether ADAPTIVE_EXECUTION_ENABLED is enabled in RssShuffleManager. If true, we can throw an illegal argument exception. Would you like to contribute it?

I would like to, or maybe you just want to describe it in the README?

xunxunmimi5577 commented on April 28, 2024

Moreover, is it possible to record an array of partitionLengths like Spark3?

jerqi commented on April 28, 2024

>>> If I use Spark2 + Firestorm + spark.sql.adaptive.enabled=true, partitionStartIndices from ExchangeCoordinator will be [0,200) instead of [0,1), [1,2), ..., so the shuffleReader will only read partition 0. This is the phenomenon described in my issue: there were supposed to be 200 tasks to execute, but only one was executed. I think users should at least be warned about this.

>> OK, we can check whether ADAPTIVE_EXECUTION_ENABLED is enabled in RssShuffleManager. If true, we can throw an illegal argument exception. Would you like to contribute it?

> I would like to, or maybe you just want to describe it in the README?

Actually, we want to do two things: add the parameter check in the code, and also extend the documentation.

jerqi commented on April 28, 2024

> Moreover, is it possible to record an array of partitionLengths like Spark3?

It's not an available feature in Spark 2. We won't do it.

xunxunmimi5577 commented on April 28, 2024

jerqi commented on April 28, 2024

Could I close this issue? Is it solved?

xunxunmimi5577 commented on April 28, 2024

I think it's solved. Let me close this issue.
