Git Product home page Git Product logo

sanecleanup's Introduction

sanecleanup

SAP Commerce 1811+

Sensible defaults for data retention and cleanup for SAP Commerce, based on my CX Works article Data Maintenance and Cleanup

How-To

  1. Download the latest release

  2. Unpack to hybris/bin/custom

  3. If possible, disable saved values / change history (ref. help.sap.com, further recommendations in my article)

  4. Add extension to your localextensions.xml

    <extension name="sanecleanup" />
  5. ๐Ÿ”ด Adapt the retention rules to your project requirements ๐Ÿ”ด
    (check the available properties in project.properties)

  6. Build and deploy.
    (The rules will be automatically imported during system update)

    If you get a build error regarding missing types like the example below:

    1. Open sanecleanup-items.xml
    2. Search for the type
    3. Comment-out the whole <itemtype> tag
    invalid index sanecleanup for type _TYPE_ on [...] declared at ((sanecleanup))::YIndex[sanecleanup-items.xml:...] due to missing enclosing type '_TYPE_'
    

WARNING
The very first execution of the retention cron jobs will take a while, depending on how long your poject is already live and if you have cleaned up anything in the past.

Consider performing a one-time cleanup before adding the extension / enabling the retention rules.

Especially the first run of cronJobLogCleanupCronJob will take a very long time, if you have never removed any cronjob log files (type LogFile).
Please consider importing and executing the script job defined in bulkdelete-cronjoblogs.impex before you set up the automated cleanup!
The job will remove all log files except the five most recent logs per CronJob. (Disclaimer: the script was tested on MS SQL / Azure SQL and SAP HANA. It is not guaranteed to work for other databases)

Do I have to clean up?

If have never even thought about that topic - yes!

You can run the following scripts in the administration console to get a quick overview:

Here are some additional queries and "rules of thumb" that help you investigate further:

Type(s)QueryNotes
AbstractRule
SELECT
  COUNT({ar:pk}),
  MIN({ar:modifiedtime}) AS "oldest",
  MAX({ar:modifiedtime}) AS "newest" 
FROM
  {AbstractRule AS ar},
  {RuleStatus AS rs} 
WHERE
  {ar:status} = {rs:pk} 
  AND {rs:code} = 'PUBLISHED' 
  AND {ar:enddate} IS NOT NULL 
  AND {ar:enddate} < getutcdate()

Are there any outdated rules? i.e rules that aren't valid anymore because their enddate is in the past.

Warning: change getutcdate() to your DBMS (for HANA/MySQL: now() )

BusinessProcess
SELECT
  {p:processDefinitionName},
  {s:code} AS "status",
  COUNT({p:pk}) AS "total",
  MIN({p:modifiedTime}) AS "oldest",
  MAX({p:modifiedTime}) AS "newest" 
FROM
  {BusinessProcess AS p 
  LEFT JOIN
    ProcessState AS s 
    ON {p:state} = {s:pk} } 
GROUP BY
  {p:processDefinitionName},
  {s:code} 
ORDER BY
  "total" DESC

Are there too many (let's say > 1000) or very old BusinessProcess in your system?

Also, if a lot of processes are stuck in "RUNNING" / "WAITING", you have to investigate what's wrong. (What is causing your processes to be stuck?)

Cart
SELECT
  {b:uid} AS "BaseSite",
  {u:uid} AS "USER",
  CASE
    WHEN
      {c:saveTime} IS NULL 
    THEN
      'regular' 
    ELSE
      'saved' 
  END
  AS "cart type",
  COUNT({c:pk}) AS "total", 
  MIN({c:modifiedtime}) AS "oldest",
  MAX({c:modifiedtime}) AS "newest" 
FROM
  { Cart AS c 
  LEFT JOIN
    USER AS u 
    ON {c:user} = {u:pk} 
  LEFT JOIN
    BaseSite AS b 
    ON {c:site} = {b:pk} } 
GROUP BY
  {b:uid}, {u:uid}, 
  CASE
    WHEN
      {c:saveTime} IS NULL 
    THEN
      'regular' 
    ELSE
      'saved' 
  END
ORDER BY
  "total" DESC
  • Are there excessive amount of carts per site or per user?
  • Too many saved carts?
  • Stale (= old) carts?
CronJob (auto-generated)
SELECT
  {t:code} AS "CronJob Type",
  COUNT({c:pk}) AS "total",
  MIN({c:modifiedtime}) AS "oldest",
  MAX({c:modifiedtime}) AS "newest" 
FROM
  {CronJob AS c 
  JOIN
    ComposedType AS t 
    ON {c:itemtype} = {t:pk} 
  LEFT JOIN
    TRIGGER AS trg 
    ON {trg:cronjob} = {c:pk} } 
WHERE
  {trg:pk} IS NULL 
  AND {c:code} LIKE '00%' 
  AND {t:code} IN 
  (
    'ImpExImportCronJob',
    'CatalogVersionSyncCronJob',
    'SolrIndexerCronJob' 
  )
GROUP BY
  {t:code} 
ORDER BY
  "total" DESC

Are there too many (>10) outdated, auto-geneated jobs in your system?

CronJobHistory
SELECT
  {cj:code},
  COUNT({h:pk}) AS "total",
  MIN({h:modifiedtime}) AS "oldest",
  MAX({h:modifiedtime}) AS "newest" 
FROM
  {cronjobhistory AS h 
  JOIN
    cronjob AS cj 
    ON {h:cronjob} = {cj:pk} } 
GROUP BY
  {cj:code} 
ORDER BY
  "total" DESC

Is there any job with > 50 histories and/or histories older than an hour?

This cleanup is enabled by default in recent SAP Commerce patch releases, so this query shouldn't find anything.

EmailMessage
SELECT
  {bp:processDefinitionName} AS "source",
  {m:sent},
  COUNT({m:pk}) AS "total",
  MIN({m:modifiedtime}) AS "oldest",
  MAX({m:modifiedtime}) AS "newest" 
FROM
  {EmailMessage AS m 
  LEFT JOIN
    BusinessProcess AS bp 
    ON {m:process} = {bp:pk} } 
GROUP BY
  {bp:processDefinitionName},
  {m:sent} 
ORDER BY
  "total" DESC
  • Are there more than a handful sent/unsent messages?
  • Are there messages that do not belong to any process?
ImpExImportCronJob (distributed impex)
SELECT
  {s:code} AS "status",
  COUNT({i:pk}) AS "total",
  MIN({i:modifiedtime}) AS "oldest",
  MAX({i:modifiedtime}) AS "newest" 
FROM
  {ImpExImportCronJob AS i 
  LEFT JOIN
    CronJobStatus AS s 
    ON {i:status} = {s:pk} } 
WHERE
  {i:code} LIKE 'distributed-impex-%' 
GROUP BY
  {s:code}
  • More than ~10 FINISHED distributed impex jobs?
  • More than a few PAUSED jobs? You may have a faulty distributed impex script.
ImpexMedia
SELECT
  COUNT(*) 
FROM
  {ImpexMedia AS i} 
WHERE
  (
    {i:code} LIKE '0_______' 
    OR {i:code} LIKE 
      'generated impex media - %' 
  )

Are there more than a handful (>100) of generated impex medias?

ImportBatchContent
SELECT
  COUNT({c:pk}) AS "total",
  MIN({c:modifiedTime}) AS "oldest",
  MAX({c:modifiedTime}) AS "newest" 
FROM
  {ImportBatchContent AS c 
  LEFT JOIN
    ImportBatch AS b 
    ON {b:importContentCode} = {c:code} } 
WHERE
  {b:pk} IS NULL

Are there any left-over distributed import batches?

LogFile
SELECT
  COALESCE({cj:code}, '<null>'),
  COUNT({l:pk}) AS "total",
  MIN({l:modifiedtime}) AS "oldest",
  MAX({l:modifiedtime}) AS "newest" 
FROM
  {LogFile AS l 
  LEFT JOIN
    CronJob AS cj 
    ON {l:owner} = {cj:pk} } 
GROUP BY
  {cj:code} 
ORDER BY
  "total" DESC

Are there are cronjob with more than ~10 logs and/or logs older than 14 days? (those are default values for log file retention)

ProcessTaskLog
-- Query tested with MS SQL
-- Adjust the date calculation for 
-- other databases
SELECT
    COUNT({l:pk}) AS "total",
    MIN({l:modifiedtime}) AS "oldest",
    MAX({l:modifiedtime}) AS "newest"
FROM
  {ProcessTaskLog AS l} 
WHERE
  {l:creationTime} < DATEADD( 
    MONTH, 
    -2, 
    GETUTCDATE() 
  )

We recommend customer to BusinessProcess cleanup, which will eventually take care of TaskLogs cleanup. There might be the few scenarios for ProcessTaskLog cleanup:

  1. The customer wants to keep the BusinessProcess for reporting, although we don't recommend it.
  2. The customer might be using the custom task without any business process.
SavedValues,SavedValueEntry
-- total SavedValue / SavedValueEntry
SELECT
  * 
FROM
  (
    {{ 
    SELECT
      'SavedValues' AS "type",
      COUNT({s:pk}) AS "total" 
    FROM
      {savedvalues AS s} }} 
    UNION ALL
    {{ 
    SELECT
      'SavedValueEntry' AS "type",
      COUNT({e:pk}) AS "total" 
    FROM
      {savedvalueentry AS e} }} 
  )
  summary

-- SavedValues per item
SELECT
  {s:modifiedItem} AS "item",
  COUNT({s:pk}) AS "total",
  MIN({s:modifiedtime}) AS "oldest",
  MAX({s:modifiedtime}) AS "newest" 
FROM
  {SavedValues AS s } 
GROUP BY
  {s:modifiedItem} 
ORDER BY
  "total" DESC

-- orphaned SavedValueEntry
-- (there shouldn't be any)
SELECT
  COUNT({e:pk}) AS "total",
  MIN({e:modifiedtime}) AS "oldest",
  MAX({e:modifiedtime}) AS "newest" 
FROM
  {SavedValueEntry AS e 
  LEFT JOIN
    SavedValues AS s 
    ON {e:parent} = {s:pk} } 
WHERE
  {s:pk} IS NULL

A lot of those items accumulated over the project lifetime. If possible, disable storing saved values. (hmc.storing.modifiedvalues.size=0)

SolrIndexOperation
SELECT
  {i:qualifier},
  COUNT({o:pk}) AS "total",
  MIN({o:modifiedTime}) AS "oldest",
  MAX({o:modifiedTime}) AS "newest" 
FROM
  {SolrIndexOperation AS o 
  LEFT JOIN
    SolrIndex AS i 
    ON {o:index} = {i:pk} } 
GROUP BY
  {i:qualifier} 
ORDER BY
  "total" DESC

Too many solr operations (more than ~100 per index)?

StoredHttpSession
SELECT
  COUNT({s:pk}) AS "total",
  MIN({s:modifiedtime}) AS "oldest",
  MAX({s:modifiedtime}) AS "newest" 
FROM
  {StoredHttpSession AS s}

Excessive amount of session? This is hard to generalize as it highly depends on your site's traffic, but if you are near or over 5 digits, it's probably too much.

Simarly, stale sessions (e.g older than a day) don't need to be retained.

TaskCondition
SELECT
  COUNT({tc:pk}),
  MIN({tc:modifiedtime}) AS "oldest",
  MAX({tc:modifiedtime}) AS "newest" 
FROM
  {TaskCondition AS tc } 
WHERE
  {tc:task} IS NULL

Is there an excessive amount of "premature events"? Or very old (older than a a few weeks) events?

Support

Please open an issue describing your problem or your feature request.

Contributing

Any and all pull requests are welcome.
Please describe your change and the motiviation behind it.

sanecleanup's People

Contributors

mpern avatar thendricks0 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

sanecleanup's Issues

excessive-platform-types.groovy execution fails in Groovy Console

I am running Hybris v1811
I included sanecleanup extension and as per guided, I ran excessive-platform-types.groovy prior to running system update for hook execution for resources import.
The groovy did not ran successfully and following stacktrace was shown:

Script execution has failed [reason: org.codehaus.groovy.runtime.typehandling.GroovyCastException: Cannot cast object 'de.hybris.platform.servicelayer.search.impl.DefaultFlexibleSearchService$$EnhancerBySpringCGLIB$$32fc0d45@252d3db9' with class 'de.hybris.platform.servicelayer.search.impl.DefaultFlexibleSearchService$$EnhancerBySpringCGLIB$$32fc0d45' to class 'groovy.lang.Reference']

The other mentioned groovy check-audit.groovy ran successfully.

Links to SAP Article are dead

Hi Markus,

I would love to read your article that you reference multiple times, but the link no longer works.
Is the article available elsewhere, and can you update the links to point at the new url ?

Best regards
Anders Olsen

sanecleanup-items.xml is not a valid XML document

Getting below error with latest code i.e. 3.4.0 version. Version 3.3.0 was working fine.
Invalid content was found starting with element 'include'. One of '{key}' is expected.
We are using commerce cloud 2005.

Logs:
[schemavalidate] hybris/bin/custom/sanecleanup/resources/sanecleanup-items.xml:11:46: cvc-complex-type.2.4.a: Invalid content was found starting with element 'include'. One of '{key}' is expected.
[schemavalidate] hybris/bin/custom/sanecleanup/resources/sanecleanup-items.xml:23:46: cvc-complex-type.2.4.a: Invalid content was found starting with element 'include'. One of '{key}' is expected.
[schemavalidate] hybris/bin/custom/sanecleanup/resources/sanecleanup-items.xml:35:46: cvc-complex-type.2.4.a: Invalid content was found starting with element 'include'. One of '{key}' is expected.
[schemavalidate] hybris/bin/custom/sanecleanup/resources/sanecleanup-items.xml:45:46: cvc-complex-type.2.4.a: Invalid content was found starting with element 'include'. One of '{key}' is expected.
[schemavalidate] hybris/bin/custom/sanecleanup/resources/sanecleanup-items.xml:56:46: cvc-complex-type.2.4.a: Invalid content was found starting with element 'include'. One of '{key}' is expected.
[schemavalidate] hybris/bin/custom/sanecleanup/resources/sanecleanup-items.xml:67:46: cvc-complex-type.2.4.a: Invalid content was found starting with element 'include'. One of '{key}' is expected.
[schemavalidate] hybris/bin/custom/sanecleanup/resources/sanecleanup-items.xml:79:46: cvc-complex-type.2.4.a: Invalid content was found starting with element 'include'. One of '{key}' is expected.
[schemavalidate] hybris/bin/custom/sanecleanup/resources/sanecleanup-items.xml:89:46: cvc-complex-type.2.4.a: Invalid content was found starting with element 'include'. One of '{key}' is expected.
[schemavalidate] hybris/bin/custom/sanecleanup/resources/sanecleanup-items.xml:100:46: cvc-complex-type.2.4.a: Invalid content was found starting with element 'include'. One of '{key}' is expected.
[schemavalidate] hybris/bin/custom/sanecleanup/resources/sanecleanup-items.xml:110:46: cvc-complex-type.2.4.a: Invalid content was found starting with element 'include'. One of '{key}' is expected.
[schemavalidate] hybris/bin/custom/sanecleanup/resources/sanecleanup-items.xml:119:46: cvc-complex-type.2.4.a: Invalid content was found starting with element 'include'. One of '{key}' is expected.
[schemavalidate] hybris/bin/custom/sanecleanup/resources/sanecleanup-items.xml:130:46: cvc-complex-type.2.4.a: Invalid content was found starting with element 'include'. One of '{key}' is expected.
[schemavalidate] hybris/bin/custom/sanecleanup/resources/sanecleanup-items.xml:140:46: cvc-complex-type.2.4.a: Invalid content was found starting with element 'include'. One of '{key}' is expected.
[schemavalidate] hybris/bin/custom/sanecleanup/resources/sanecleanup-items.xml:150:46: cvc-complex-type.2.4.a: Invalid content was found starting with element 'include'. One of '{key}' is expected.
[schemavalidate] hybris/bin/custom/sanecleanup/resources/sanecleanup-items.xml:160:46: cvc-complex-type.2.4.a: Invalid content was found starting with element 'include'. One of '{key}' is expected.
[schemavalidate] hybris/bin/custom/sanecleanup/resources/sanecleanup-items.xml:170:46: cvc-complex-type.2.4.a: Invalid content was found starting with element 'include'. One of '{key}' is expected.
[schemavalidate] hybris/bin/custom/sanecleanup/resources/sanecleanup-items.xml:182:46: cvc-complex-type.2.4.a: Invalid content was found starting with element 'include'. One of '{key}' is expected.
[schemavalidate] hybris/bin/custom/sanecleanup/resources/sanecleanup-items.xml:193:46: cvc-complex-type.2.4.a: Invalid content was found starting with element 'include'. One of '{key}' is expected.
[schemavalidate] hybris/bin/custom/sanecleanup/resources/sanecleanup-items.xml:204:46: cvc-complex-type.2.4.a: Invalid content was found starting with element 'include'. One of '{key}' is expected.

BUILD FAILED

compilation fails in commerce 2105.7

Hi Markus,
With commerce 2105.7 compilation of sanecleanup extension fails because CMSVersionModel._CMSVERSIONGCPROCESS2CMSVERSION doesnt exist anymore (relation CMSVersionGCProcess2CMSVersion does not exist anymore in cms2-items.xml).

ERROR in /hybris/bin/custom/xyz/sanecleanup/src/mpern/sap/cleanup/cms2/CMSVersionGCPerformable.java (at line 152)
        ComposedTypeModel relationType = typeService.getComposedTypeForCode(CMSVersionModel._CMSVERSIONGCPROCESS2CMSVERSION);
                                                                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_CMSVERSIONGCPROCESS2CMSVERSION cannot be resolved or is not a field
----------
1 problem (1 error)

Best Regards,
Joachim

CCV2 Server startup error

Server Startup on CCV2 environment failed with error as below:

      Caused by: org.springframework.beans.FatalBeanException: 
      Context hybris Global Context Factory  couldn't  be created correctly due to, Error creating bean with name 'saneVersionRangeEvaluator' defined in 
      class path resource [sanecleanup-spring.xml]: Unexpected exception during bean creation; nested exception is 
      org.springframework.beans.factory.BeanExpressionException: Expression parsing failed; 
      **nested exception is java.lang.NumberFormatException: For input string: "1-e98b768-feature-HYB-cleanup-extension"**

Checked the build logs from CCV2 and below details might be useful to debug this:

############ BEGIN: Build metadata ############
build.version=2005.8-2005-2202.26-20220405.1-e98b768-feature-HYB-cleanup-extension
project.repository.revision=e98b768
project.repository.branch=feature-HYB-cleanup-extension
product.version=2005.8-2005-2202.26
############ END: Build metadata ############

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.