magento / commerce-data-export
License: Open Software License 3.0
We have a catalog of 150k products, 12k product attributes and 300 attribute sets. We are hitting memory limit errors while reindexing catalog_data_exporter_products.
I've noticed that the method getRawOptionsSelect in Magento\CatalogDataExporter\Model\Provider\Product\AttributeMetadata.php runs a query like the following:
```sql
SELECT
    `a`.*,
    `o`.`option_id` AS `optionId`,
    `v`.`value` AS `optionValue`,
    `s`.`code` AS `storeViewCode`
FROM
    `eav_attribute` AS `a`
    INNER JOIN `eav_entity_type` AS `t` ON t.entity_type_id = a.entity_type_id
    INNER JOIN `eav_attribute_option` AS `o` ON o.attribute_id
    INNER JOIN `eav_attribute_option_value` AS `v` ON o.option_id = v.option_id
    INNER JOIN `store` AS `s` ON v.store_id = s.store_id
WHERE
    (t.entity_table = 'catalog_product_entity')
    AND (a.attribute_code = 'manufacturer_filter');
```
If you look closely, the INNER JOIN on eav_attribute_option lacks a comparison with eav_attribute.attribute_id. The condition `ON o.attribute_id` is true for every option row with a non-zero attribute_id, so the join matches the options of all attributes rather than only those of the requested attribute, and a huge dataset is loaded. We also don't need to retrieve every column of eav_attribute. I've created the following patch for my project, but would like to verify this issue with you as well:
```diff
Index: Model/Provider/Product/AttributeMetadata.php
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/Model/Provider/Product/AttributeMetadata.php b/Model/Provider/Product/AttributeMetadata.php
--- a/Model/Provider/Product/AttributeMetadata.php
+++ b/Model/Provider/Product/AttributeMetadata.php (date 1641919694877)
@@ -65,9 +65,11 @@
     private function getRawOptionsSelect(string $attributeCode) : Select
     {
         return $this->getAttributesSelect($attributeCode)
+            ->reset('columns')
+            ->columns(['source_model', 'attribute_code'], 'a')
             ->join(
                 ['o' => $this->resourceConnection->getTableName('eav_attribute_option')],
-                'o.attribute_id',
+                'a.attribute_id = o.attribute_id',
                 [
                     'optionId' => 'o.option_id',
                 ]
```
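To illustrate why the missing comparison inflates the result set, here is a minimal, self-contained sketch using Python's built-in sqlite3 module. It is not the module's code: the two-table schema and the sample rows are made up for illustration, and SQLite stands in for MySQL (both treat a non-zero integer in an `ON` clause as true). It counts the rows returned for one attribute code with the original join condition and with the patched one.

```python
import sqlite3

# Toy schema (an assumption for illustration): only the two tables
# involved in the faulty join, with a handful of rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE eav_attribute (
        attribute_id   INTEGER PRIMARY KEY,
        attribute_code TEXT
    );
    CREATE TABLE eav_attribute_option (
        option_id    INTEGER PRIMARY KEY,
        attribute_id INTEGER
    );
    INSERT INTO eav_attribute VALUES (1, 'manufacturer_filter'), (2, 'color');
    INSERT INTO eav_attribute_option VALUES
        (10, 1), (11, 1),          -- options of manufacturer_filter
        (20, 2), (21, 2), (22, 2); -- options of color
""")


def count_rows(join_condition: str) -> int:
    """Count joined rows for 'manufacturer_filter' under the given ON clause."""
    sql = (
        "SELECT COUNT(*) FROM eav_attribute AS a "
        f"INNER JOIN eav_attribute_option AS o ON {join_condition} "
        "WHERE a.attribute_code = 'manufacturer_filter'"
    )
    return conn.execute(sql).fetchone()[0]


# Original condition: true for every option with a non-zero attribute_id,
# so the options of ALL attributes are joined to the selected attribute.
broken_rows = count_rows("o.attribute_id")

# Patched condition: only the options that belong to the selected attribute.
fixed_rows = count_rows("a.attribute_id = o.attribute_id")

print(broken_rows, fixed_rows)  # prints "5 2"
```

With the original condition the row count grows with the total number of options across all attributes, which on a catalog with thousands of attributes is consistent with the memory problems described above.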
There's a performance issue with the module magento/module-catalog-data-exporter: the catalog_data_exporter_products indexer takes more than 1 hour to process. When the reindex runs through cron, the indexer takes so long that it pushes the indexers scheduled after it into the backlog (screenshot not reproduced here).
The issue seems to be related to running out of memory when several cron jobs are scheduled at around the same time as the catalog_data_exporter_products indexer.
catalog_data_exporter_products keeps status = Working with multiple items in the backlog. This is especially likely after mass product changes, e.g. adding a catalog price rule that affects multiple products across multiple stores.
catalog_data_exporter_products should complete in much less than 1 hour so that it does not delay the indexers scheduled after it.
catalog_data_exporter_products keeps status = Working with multiple items in the backlog (screenshot not reproduced here).
Fixture generation for both simple and configurable products fails during the CLI command with the error: The entity ID field for the catalog_data_exporter_products_cl table wasn't found. Verify the field and try again.
composer require magento/catalog-service=^3.0.1
bin/magento setup:perf:generate-fixtures setup/performance-toolkit/profiles/ee/medium.xml