

JRobTS commented on September 22, 2024

So it appears that this sometimes works, and I traced why:

After a leader election, the coordinator appears to work from a clean slate and is able to delete the older segments. So a side effect of leader election is that old segments get deleted; while a coordinator holds leadership, however, it will not delete them.


zachjsh commented on September 22, 2024

@JRobTS, thanks for submitting this issue. As I recall, the coordinator should loop back around to the older segments that it missed on the way up, once it finds no new unused segments for a given later interval. Are you finding this not to be the case? I believe this happens when the datasource key is removed from the interval map here:

datasourceToLastKillIntervalEnd.remove(dataSource);

That should result in the kill interval for that datasource starting from the beginning of time again.
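The reset behavior described above can be sketched as follows. This is a toy simplification in Python, not Druid's actual Java implementation; only the map name is taken from the snippet above, and the cursor semantics are an assumption based on this thread:

```python
# Hypothetical sketch: the coordinator keeps a per-datasource "cursor"
# (the end of the last killed interval). Each pass only kills unused
# segments past the cursor. When a pass finds nothing past the cursor,
# the key is removed, so the next pass scans from the beginning of time
# and can pick up older segments that were skipped on the way up.

datasource_to_last_kill_interval_end = {}

def run_kill_pass(datasource, unused_segment_ends):
    """Return the segment-end timestamps killed in this pass."""
    cursor = datasource_to_last_kill_interval_end.get(datasource, float("-inf"))
    killable = sorted(e for e in unused_segment_ends if e > cursor)
    if not killable:
        # Found nothing past the cursor: reset, so the next pass
        # starts from the beginning of time again.
        datasource_to_last_kill_interval_end.pop(datasource, None)
        return []
    datasource_to_last_kill_interval_end[datasource] = killable[-1]
    return killable
```

Under these assumptions, a pass that finds nothing is exactly what allows a later pass to see older segments again.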



zachjsh commented on September 22, 2024

@JRobTS The killUnusedSegments duty is tied to the coordinator indexing period (druid.coordinator.period.indexingPeriod); it will not issue kill tasks at a faster rate than the maximum of that config and druid.coordinator.kill.period. What are these configs set to on your cluster? Can you try lowering them to, say, 10 minutes to see if that helps with progress? It seems that in your case newer unused segments are being created at a rate faster than the indexing/kill period.
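The cadence rule above can be illustrated with a small sketch (durations expressed in minutes rather than ISO-8601 periods, for simplicity; the function name is made up for illustration):

```python
def effective_kill_period_minutes(indexing_period, kill_period):
    """Kill tasks are issued no faster than the slower (larger) of the
    coordinator indexing period and the configured kill period."""
    return max(indexing_period, kill_period)

# Example: with the default 30-minute indexing period and kill.period=PT1H,
# kill tasks run at most once per hour.
effective_kill_period_minutes(30, 60)
```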


JRobTS commented on September 22, 2024

We are using the default indexingPeriod of 30 minutes, and we have druid.coordinator.kill.period=PT1H (although the kill actually runs every 1.5 hours; I've logged a separate ticket about that).

Compaction is definitely occurring more frequently than the kill operation, but (from a requirements perspective) this shouldn't stop kill from deleting the unused segments created by the dropForever rule. I have set druid.coordinator.kill.maxSegments=10000, but it only kills around 100 or so segments per run.

I will enable continuous auto-kill by removing druid.coordinator.kill.period and will monitor the result.


JRobTS commented on September 22, 2024

I switched to 10-minute intervals and it is now working: the kill task runs every hour (since I have hourly segments) and picks up both the unused segments from compaction and the unused segments from drop rules.

Using the default indexing and kill periods would allow deep storage to grow forever, but a configuration like the following works:

# Enable auto-delete from deep storage
druid.coordinator.kill.on=true
druid.coordinator.kill.period=PT10M
druid.coordinator.kill.durationToRetain=P1D
druid.coordinator.kill.bufferPeriod=PT6H
druid.coordinator.kill.maxSegments=10000
druid.coordinator.period.indexingPeriod=PT10M

I will stress that this works because the kill task runs more frequently than compaction does, which allows a kill run to find nothing and clear the memory map, so that subsequent kill runs can operate with full context (including drop rules).

The kill job has to find nothing before it will look for everything. :)
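That dynamic can be shown with a toy simulation (my own sketch, not Druid code; it assumes, per the discussion above, that a kill pass only advances past its cursor and that an empty pass triggers the reset):

```python
def count_empty_passes(kill_period, compaction_period, horizon):
    """Count kill passes (minutes apart) that find no unused segments
    newer than the cursor. Compaction creates a batch of new unused
    segments every `compaction_period` minutes. An empty pass is the
    trigger that clears the map and lets old segments be reclaimed."""
    cursor = -1
    empty_passes = 0
    for t in range(kill_period, horizon + 1, kill_period):
        newest = (t // compaction_period) * compaction_period
        if newest > cursor:
            cursor = newest          # always chasing new unused segments
        else:
            empty_passes += 1        # found nothing new: map would reset
    return empty_passes
```

With a 90-minute kill cadence against 30-minute compaction, every pass finds fresh unused segments and the reset never fires; at a 10-minute cadence, most passes come up empty and the coordinator gets its chance to start over from the beginning of time.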

