GridField Queued Export

Introduction

Allows for large data set exports from a GridField. By using an asynchronous job queue, we avoid running out of PHP memory or exceeding any maximum execution time limits.

The exact limitations of a standard GridField export vary based on the server configuration, server capacity and the complexity of the exported DataObject. As a rough guide, you should consider using this module when more than 1000 records need to be exported. The module should be able to export 10,000 records on a standard server configuration within a few minutes.

Installation

composer require silverstripe/gridfieldqueuedexport

Configuration

Since this component operates on a GridField, you can add it through the GridField config's addComponent() API.

$gridField = GridField::create('Pages', 'All pages', SiteTree::get());
$config = $gridField->getConfig();
$config->addComponent(GridFieldQueuedExportButton::create('buttons-after-left'));

If you want to replace the GridFieldExportButton created by the default GridField configuration, you also need to call removeComponentsByType().

// Find GridField
$gridField = $fields->fieldByName('MyGridField');
$config = $gridField->getConfig();

// Add new component
$oldExportButton = $config->getComponentByType(GridFieldExportButton::class);
$config->addComponent($newExportButton = GridFieldQueuedExportButton::create('buttons-after-left'));

// Set Header and Export columns on new Export Button
$newExportButton->setCsvHasHeader($oldExportButton->getCsvHasHeader()); 
$newExportButton->setExportColumns($oldExportButton->getExportColumns());

// Remove original component
$config->removeComponentsByType(GridFieldExportButton::class);

Note: This module is preconfigured to work with the silverstripe/userforms submission CSV export.

silverstripe-gridfieldqueuedexport's People

Contributors

assertchris, baukezwaan, brettt89, chillu, dependabot[bot], dhensby, dizzystuff, emteknetnz, github-actions[bot], guysartorelli, igor-silverstripe, ishannz, kinglozzer, maxime-rainville, nightjar, raissanorth, robbieaverill, sabina-talipova, scopeynz, ssmarco, wilr

silverstripe-gridfieldqueuedexport's Issues

Export incorrect when QueuedJob takes more than 1 cycle to complete

Thanks for a very useful module!

Situation:
7000+ rows export on a high load machine

Problem:
QueuedJobService has a default memory limit of 256 MB, and its runJob method checks isMemoryTooHigh. In our case, that limit was reached. QueuedJobService appears to handle this by releasing memory and letting the next iteration of ProcessJobQueueTask pick up where the job left off.

It looks like silverstripe-gridfieldqueuedexport cannot handle this situation. In the end I have an export with only the records from the last restart. This is easy to spot because the column headers are missing from the first line.

Then the part I am not sure about:
It looks like the problem might be that silverstripe-gridfieldqueuedexport uses write mode and not append mode for file handling in GenerateCSVJob->getCSVWriter():

$csvWriter = Writer::createFromPath($this->getOutputPath(), 'w');

Maybe the module is not built to account for this situation. In that case I think it is worth mentioning in the README.
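
The difference between the two file modes can be seen with plain PHP file handles. This is a minimal sketch, not the module's code: it uses fopen() and fputcsv() rather than the league/csv Writer quoted above, and the writeBatch() helper and sample rows are hypothetical. It does not establish that switching modes alone would fix the job, since a resumed job would also have to avoid rewriting the header row.

```php
<?php
// Illustration only: plain PHP file handles, not the module's league/csv Writer.
// writeBatch() and the sample rows are hypothetical.

/**
 * Writes one batch of rows to a CSV file using the given fopen() mode.
 */
function writeBatch(string $path, array $rows, string $mode): void
{
    $handle = fopen($path, $mode);
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    foreach ($rows as $row) {
        // Explicit delimiter, enclosure and escape keep behaviour stable across PHP versions
        fputcsv($handle, $row, ',', '"', '');
    }
    fclose($handle);
}

// Mode 'w' on every run: the second run truncates what the first run wrote
$truncated = tempnam(sys_get_temp_dir(), 'csv');
writeBatch($truncated, [['ID', 'Title'], [1, 'Home']], 'w');
writeBatch($truncated, [[2, 'About']], 'w');

// Mode 'a' on the second run: the earlier rows survive the simulated restart
$appended = tempnam(sys_get_temp_dir(), 'csv');
writeBatch($appended, [['ID', 'Title'], [1, 'Home']], 'w');
writeBatch($appended, [[2, 'About']], 'a');

echo count(file($truncated)), "\n"; // 1
echo count(file($appended)), "\n";  // 3
```

The truncated file matches the symptom reported above: only the rows from the last run remain, and the header line is gone.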

Need to implement i18n javascript

This will require a new client/lang folder, an upstream master file on Transifex, a new source file, and an update to the PHP logic to include the necessary JavaScript files conditionally.

Userforms integration documentation

Can you please note in the documentation that out of the box this only auto-applies to UserDefinedForm (pages), and that to apply this to the submissions gridfield of ElementForm requires:

DNADesign\ElementalUserForms\Model\ElementForm:
  extensions:
    - SilverStripe\GridfieldQueuedExport\Extensions\UserFormUseQueuedExportExtension

(Alternatively, would this be an acceptable addition to the module's config.yml?)

PermissionFailure for user after clicking export button

On a particular environment we encountered a permission issue after a server update. The permission is checked in the GridField_URLHandler to make sure the right user is accessing the export.

/src/Forms/GridFieldQueuedExportButton.php

    public function checkExport($gridField, $request = null)
    {
        $id = $request->param('ID');
        $job = QueuedJobDescriptor::get()->filter('Signature', $id)->first();

        
        if ((int)$job->RunAsID !== Security::getCurrentUser()->ID) {
            return Security::permissionFailure();
        }

The problem seems to be that Security::getCurrentUser()->ID is returned as a string, while $job->RunAsID is cast to an int.

Can I make a PR to change this into

        if ((int)$job->RunAsID !== (int)Security::getCurrentUser()->ID) {

Or is there some other voodoo going on, why my CurrentUser is returning a string?

silverstripe/framework: 4.8.0
silverstripe/gridfieldqueuedexport: 2.3.0
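
The effect of the strict comparison described above can be reproduced in isolation. This is a minimal sketch with hypothetical values; it assumes, as the issue reports, that the current user's ID arrives as a string in the affected environment.

```php
<?php
// Hypothetical values: the issue reports the current user's ID arriving as a string.
$runAsId = '42';        // value as read from the job descriptor
$currentUserId = '42';  // ID returned as a string in the affected environment

// Original check: int on the left, string on the right, so !== is always true
$originalCheckFails = (int)$runAsId !== $currentUserId;

// Proposed check: both sides are cast to int before the strict comparison
$proposedCheckFails = (int)$runAsId !== (int)$currentUserId;

var_dump($originalCheckFails); // bool(true): permission failure even for the same user
var_dump($proposedCheckFails); // bool(false)
```

Because !== compares type as well as value, an int never equals a string, which would explain a permission failure for the user who queued the job.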

None of the following templates could be found (namespace issue)

Hi there,
I am running silverstripe-gridfieldqueuedexport 2.1, SilverStripe 4.2 & PHP 7.1.22 and I came across a namespacing issue where the template for GridFieldQueuedExportButton cannot be found after clicking the export button.
None of the following templates could be found: SilverStripe\GridfieldQueuedExport\Forms\GridFieldQueuedExportButton in themes "Array ( [0] => silverstripe/admin:cms-forms [1] => $default ) " SSViewer.php:215
template could not be found when exporting csv

However, the template file is actually located under templates/SilverStripe/GridFieldQueuedExport/Forms/GridFieldQueuedExportButton.ss (note the difference between GridField and Gridfield).

It looks like the namespacing needs to be updated to match GridField. For the meantime, I have moved the template to templates/SilverStripe/GridfieldQueuedExport/Forms/GridFieldQueuedExportButton.ss.

Cheers,
Alex
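
The mismatch comes down to the case of a single letter. The sketch below assumes the template path is derived by turning namespace separators into directory separators, which is what the error message above suggests; the actual lookup logic is not shown in this report.

```php
<?php
// Class name as it appears in the error message (lowercase "f" in Gridfield)
$class = 'SilverStripe\\GridfieldQueuedExport\\Forms\\GridFieldQueuedExportButton';

// Assumption: the lookup path mirrors the namespace, one folder per segment
$expectedPath = 'templates/' . str_replace('\\', '/', $class) . '.ss';

// Path where the reporter found the template on disk (uppercase "F" in GridField)
$actualPath = 'templates/SilverStripe/GridFieldQueuedExport/Forms/GridFieldQueuedExportButton.ss';

var_dump($expectedPath === $actualPath);               // bool(false)
var_dump(strcasecmp($expectedPath, $actualPath) === 0); // bool(true)
```

On a case-sensitive filesystem the two paths name different locations, so the lookup would miss the file even though it differs only by case.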

Investigate: Multi server "export has already been downloaded"

On multi-server environments, the error message "This export has already been downloaded. For security reasons each export can only be downloaded once." sometimes appears.

This is because the queued job has completed and the file is on one server, but not on the server you are currently on, so the file_exists() check in GridFieldQueuedExportButton.php will fail.

The file will sync across, and refreshing the CMS will fix this. However, this isn't obvious to the user.

Dir perms affected by current user mask, prevents file cleanup

https://github.com/silverstripe/silverstripe-gridfieldqueuedexport/blob/master/src/Jobs/GenerateCSVJob.php#L134-L153

mkdir is affected by the current user's mask, which can result in writing the .exports dir (and sub dir/files) with permissions other than the declared 0770 (I'm seeing 750).


https://github.com/silverstripe/silverstripe-gridfieldqueuedexport/blob/master/src/Forms/GridFieldQueuedExportButton.php#L249-L258

This prevents users who only have group permissions from cleaning up the CSV dir/file before downloading the CSV file.

  • On the surface this doesn't affect Live, as the warning is suppressed and the file is sent as normal, but the files aren't cleaned up.
  • On Dev this prevents download of the csv due to both the warning and file contents being output.

This won't affect those who execute the export job and the download action as the same user. But as one is likely handled by cron and the other by Apache, chances are that different users will be involved.

An easy fix here is to use umask(0) when making the dirs.
EDIT: turns out umask is unsafe in a multi-threaded environment and chmod should be used instead.

PR here: #37
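
The chmod approach mentioned in the edit can be sketched as follows. The makeDirWithMode() helper and the temporary path are hypothetical and are not taken from the PR; the result assumes a POSIX filesystem.

```php
<?php
// Sketch only: makeDirWithMode() is a hypothetical helper, not the module's code.

function makeDirWithMode(string $path, int $mode): void
{
    // mkdir() applies the process umask, so the resulting mode may be narrower than $mode
    if (!is_dir($path) && !mkdir($path, $mode, true)) {
        throw new RuntimeException("Cannot create $path");
    }
    // chmod() is not subject to the umask, so the leaf directory gets exactly $mode
    // (parent directories created by the recursive mkdir() are left as they are)
    chmod($path, $mode);
}

$dir = sys_get_temp_dir() . '/export-demo-' . uniqid();
makeDirWithMode($dir, 0770);

clearstatcache();
$perms = fileperms($dir) & 0777;
printf("%04o\n", $perms);
rmdir($dir);
```

Unlike umask(0), this does not change process-wide state, which is the concern raised about multi-threaded environments.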

Nested gridfield compatibility seems broken

I think this may be an issue with nested grid fields.

Upon clicking the queued export button, the page reloads, and simply displays the string SilverStripe\View\ViewableData_Customised

This is occurring on ElementForm submission exports, and any other gridfield I've tried it on within elemental Elements.

Can't bookmark/refresh download page in DataObject

Bookmarking or refreshing the download page works at the Model Admin level but if I have the queued export button on a GridField for a has_many relationship on a data object then refreshing the page doesn't work. I get a blank page except for the string ViewableData_Customised.

Subverting SS4 assets module and potentially weakening the security of the module

Silverstripe Gridfield Queued Export stores CSV files in the assets/.exports dir. These files get deleted once downloaded, but if for some reason, you never download the file, the file will be stuck in limbo forever and never get deleted.

It also completely bypasses all the assets logic and directly writes and unlinks the files. There's some attempt to write a .htaccess file to block direct download of the file, but that method is fallible because your webserver could be configured to ignore .htaccess files, or you might be running your site on NGINX or IIS. The file names are also random, which minimises the risk that someone will stumble on them.

It's arguable whether this is an actual security vulnerability. I guess you need a lot of things to go wrong for the files to be disclosed publicly. It sure is not good security architecture.

At the very least, it's a GDPR problem because the CSV data could be stuck there without a way to delete it.

This is the bit that creates the file.

protected function makeDir($path)
{
    if (!is_dir($path)) {
        // whether to use 'chmod' to override 'mkdir' perms which obey umask
        $ignore_umask = $this->config()->get('ignore_umask');
        // perms mode given to 'mkdir' and 'chmod'
        $permission_mode = $this->config()->get('permission_mode');
        // only permit numeric strings as they work with or without the leading zero
        if (!is_string($permission_mode) || !is_numeric($permission_mode)) {
            throw new Exception("Only string values are allowed for 'permission_mode'");
        }
        // convert from octal to decimal for mkdir
        $permission_mode = octdec($permission_mode);
        // make dir with perms that obey the executing user's umask
        mkdir($path, $permission_mode, true);
        // override perms to ignore user's umask?
        if ($ignore_umask) {
            chmod($path, $permission_mode);
        }
    }
}

protected function getOutputPath()
{
    $base = ASSETS_PATH . '/.exports';
    $this->makeDir($base);
    // Although the string is random, so should be hard to guess, also try and block access directly.
    // Only works in Apache though
    if (!file_exists("$base/.htaccess")) {
        file_put_contents("$base/.htaccess", "Deny from all\nRewriteRule .* - [F]\n");
    }
    $folder = $base . '/' . $this->getSignature();
    $this->makeDir($folder);
    return $folder . '/' . $this->getSignature() . '.csv';
}

/**
 * @return Writer
 */
protected function getCSVWriter()
{
    if (!$this->writer) {
        $csvWriter = Writer::createFromPath($this->getOutputPath(), 'w');
        $csvWriter->setDelimiter($this->Seperator);
        $csvWriter->setNewline("\r\n"); //use windows line endings for compatibility with some csv libraries
        $csvWriter->setOutputBOM(Writer::BOM_UTF8);
        if (!Config::inst()->get(GridFieldExportButton::class, 'xls_export_disabled')) {
            $csvWriter->addFormatter(function (array $row) {
                foreach ($row as &$item) {
                    // [SS-2017-007] Sanitise XLS executable column values with a leading tab
                    if (preg_match('/^[-@=+].*/', $item)) {
                        $item = "\t" . $item;
                    }
                }
                return $row;
            });
        }
        $this->writer = $csvWriter;
    }
    return $this->writer;
}

This is the bit that serves the file and deletes it.

/**
 * @param GridField $gridField
 * @param HTTPRequest $request
 * @return HTTPResponse
 */
public function downloadExport($gridField, $request = null)
{
    $id = $request->param('ID');
    $job = QueuedJobDescriptor::get()->filter('Signature', $id)->first();
    if ((int)$job->RunAsID !== Security::getCurrentUser()->ID) {
        return Security::permissionFailure();
    }
    $now = Date("d-m-Y-H-i");
    $servedName = "export-$now.csv";
    $path = $this->getExportPath($id);
    $content = file_get_contents($path);
    unlink($path);
    rmdir(dirname($path));
    $response = HTTPRequest::send_file($content, $servedName, 'text/csv');
    $response->addHeader('Set-Cookie', 'downloaded_' . $id . '=true; Path=/');
    $response->output();
    exit;
}

Notes

This was initially reported as a security issue. We decided to treat it as a regular issue since there isn't anything directly exploitable.
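
One possible way to address the orphaned-file concern would be a periodic cleanup of exports that were never downloaded. The sketch below is not part of the module: removeStaleExports(), the one-hour threshold and the demo paths are hypothetical, and it assumes only the folder layout produced by getOutputPath() above (one folder per signature holding a single CSV).

```php
<?php
// Sketch only: removeStaleExports() and the age threshold are hypothetical.
// Assumed layout: <base>/<signature>/<signature>.csv

function removeStaleExports(string $base, int $maxAgeSeconds, ?int $now = null): int
{
    $now = $now ?? time();
    $removed = 0;
    // The '*' pattern skips dotfiles, so a .htaccess file in $base is never touched
    foreach (glob($base . '/*', GLOB_ONLYDIR) ?: [] as $folder) {
        $csv = $folder . '/' . basename($folder) . '.csv';
        if (!is_file($csv) || $now - filemtime($csv) < $maxAgeSeconds) {
            continue; // missing or still fresh: leave it alone
        }
        // rmdir() only succeeds once the folder is empty
        if (unlink($csv) && rmdir($folder)) {
            $removed++;
        }
    }
    return $removed;
}

// Demo: one export last modified two hours ago, one just created
$base = sys_get_temp_dir() . '/exports-demo-' . uniqid();
foreach (['old' => time() - 7200, 'new' => time()] as $signature => $mtime) {
    mkdir("$base/$signature", 0770, true);
    touch("$base/$signature/$signature.csv", $mtime);
}

$removedCount = removeStaleExports($base, 3600);
echo $removedCount, "\n"; // 1
```

In a real project such a routine would presumably run from a scheduled job, and the retention period would be a policy decision rather than a fixed value.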

Erroring with elemental-userforms

Apologies in advance, I don't have the specific scenario that caused the error, but I do have the simple fix! While doing SS5 upgrades and reviews, I'm PR'ing back what I can as I go.

Basically updateCMSFields in UserFormUseQueuedExportExtension.php is expecting $gridField = $fields->fieldByName('Root.Submissions.Submissions'); to always return a valid gridfield.

When I use this module along with elemental-userforms, in some scenario (which I can't recall - but I think is as commonplace as on creation of the userforms element) the Root.Submissions.Submissions gridfield doesn't exist, so when $gridField->getConfig() is called on null, an error is thrown.

I'm going to submit a PR, ideally could we just add if (empty($gridField)) return; between the two lines?

I can do more homework here, I'm just hoping that it's enough of a commonsense edit that you don't need me to 😅
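
The proposed guard can be illustrated in isolation. In this sketch, StubGridField, StubFieldList and applyQueuedExport() are hypothetical stand-ins so that it runs without Silverstripe installed; only the field name and the empty() check come from the report above.

```php
<?php
// Sketch only: the stub classes are stand-ins, not the framework's GridField or FieldList.

class StubGridField
{
    public $configRequested = false;

    public function getConfig()
    {
        $this->configRequested = true;
        return $this;
    }
}

class StubFieldList
{
    private $fields;

    public function __construct(array $fields = [])
    {
        $this->fields = $fields;
    }

    public function fieldByName($name)
    {
        // Like the call described in the issue, this yields null when the field is absent
        return $this->fields[$name] ?? null;
    }
}

function applyQueuedExport(StubFieldList $fields)
{
    $gridField = $fields->fieldByName('Root.Submissions.Submissions');
    if (empty($gridField)) {
        return false; // the proposed guard: nothing to configure, so bail out
    }
    $gridField->getConfig(); // safe: $gridField is known to be an object here
    return true;
}

$withoutGrid = applyQueuedExport(new StubFieldList());
$withGrid = applyQueuedExport(new StubFieldList([
    'Root.Submissions.Submissions' => new StubGridField(),
]));
var_dump($withoutGrid, $withGrid); // bool(false) bool(true)
```

Without the guard, the first call would fail with a "Call to a member function getConfig() on null" error.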

Module breaks on Silverstripe 4.12

Situation:

Clean Silverstripe 4.12 install. ModelAdmin with GridField Queued Export button.

Problem:
Job queues fine but always breaks immediately when the task starts to process the queue.
When the project is downgraded to Silverstripe 4.11 all is fine.

Cause:
LeftAndMain now has afterHandleRequest that checks for $this->response->isError()
If an error is found, the current response (a GridFieldQueuedExportButtonResponse) is replaced by a new HTTPResponse representing an error state.
This is problematic because in the constructor of GridFieldQueuedExportButtonResponse a status of 500 is assigned to the response. As far as I can see this status is never altered after that and triggers the new afterHandleRequest functionality.

Fix:
This could be a fix. Perhaps someone can suggest something more elaborate to detect the correct setting of the body on the response:

class GridFieldQueuedExportButtonResponse extends HTTPResponse
{
    /**
     * @var GridField
     */
    protected $gridField;

    public function __construct(GridField $gridField)
    {
        $this->gridField = $gridField;

        if($this->gridField) {
            parent::__construct('', 200);
        }
        else {
            parent::__construct('', 500);
        }
    }
}

Edit:

The specific error I'm seeing when I try to run the job via the CLI is:

Call to a member function getManipulatedList() on null at /var/www/vendor/silverstripe/gridfieldqueuedexport/src/Jobs/GenerateCSVJob.php:395)
