cgross / grunt-dom-munger

Grunt task to read and manipulate HTML with CSS selectors.

License: MIT License

grunt-dom-munger

Read and manipulate HTML documents using CSS selectors.

Use this task to read and transform your HTML documents. Typical use cases include:

Read the references from your script or link tags and pass those to concat, uglify, etc. automatically.
Update HTML to remove script references or anything that is not intended for your production builds.
Add, update, or remove any DOM elements for any reason.

Getting Started

This plugin requires Grunt ~1.1.0 and Node >=10.0.

npm install grunt-dom-munger --save-dev

Once the plugin has been installed, it may be enabled inside your Gruntfile with this line of JavaScript:

grunt.loadNpmTasks('grunt-dom-munger');

The "dom_munger" task

Overview

The dom_munger task reads one or more HTML files and performs one or more operations on them.

grunt.initConfig({
  dom_munger: {
    your_target: {
      options: {
        //You typically would only specify one option per target but they may be combined
        //All options (except callback) can be arrays
        read: [
          {selector:'link',attribute:'href',writeto:'myCssRefs',isPath:true},
          {selector:'script[src]',attribute:'src',writeto:'myJsRefs',isPath:true}
        ],
        remove: '#removeMe',
        update: {selector:'html',attribute:'appmode', value:'production'},
        prefix: {selector:'link',attribute:'href',value:'project-name/'},
        suffix: {selector:'html',attribute:'version',value:'.0.1'},
        append: {selector:'body',html:'<div id="appended">Im being appended</div>'},
        prepend: {selector:'body',html:'<span>Im being prepended</span>'},
        text: {selector:'title',text:'My App'},
        callback: function($){
          $('#sample2').text('Ive been updated via callback');
        }
      },
      src: 'index.html', //could be an array of files
      dest: 'dist/index.html' //optional, if not specified the src file will be overwritten
    },
  },
})

Options

Note: each option (except callback) requires a selector. This can be any valid CSS selector. Also, each option (except callback) can be a single object (or String for remove) or an array of objects/Strings. In this way, one target may perform multiple actions of the same type.
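
As a rough illustration of the "single value or array" rule in the note above, a task could normalize every option before looping over it. The helper name toArray is hypothetical and is not taken from the plugin's source; it only sketches the idea.

```javascript
// Hypothetical helper (an assumption, not the plugin's code): accept a
// single value or an array and always hand back an array.
function toArray(value) {
  if (value === undefined || value === null) {
    return [];
  }
  return Array.isArray(value) ? value : [value];
}

// Both forms are valid `remove` options according to the note above.
const single = toArray('#removeMe');
const many = toArray(['link', 'script']);
```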

options.read

Extract the value of a given attribute from the set of matched elements, then set the values into dom_munger.data.{writeto}. A typical use case is to grab the script references from your HTML file and pass them to concat, uglify, or cssmin.

grunt.initConfig({
  dom_munger: {
    your_target: {
      options: {
        read: {selector:'script',attribute:'src',writeto:'myJsRefs',isPath:true}
      },
      src: 'index.html'
    },
  },
  uglify: {
    dist: {
      src:['other.js','<%= dom_munger.data.myJsRefs %>'],
      dest: 'dist/app.min.js'
    }
  }
})

When isPath is true, the extracted values are assumed to be file references and their path is made relative to the Gruntfile.js rather than the file they're read from. This is usually necessary when passing the values to another grunt task like concat or uglify.
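
The effect of isPath can be pictured with Node's path module. This is only a sketch under the assumption that each reference is joined onto the directory of the HTML file it was read from; relativeToGruntfile is a hypothetical name and the plugin's actual implementation may differ.

```javascript
const path = require('path');

// Hypothetical helper (an assumption, not the plugin's code).
// htmlFile: path of the HTML file, relative to the Gruntfile.
// refs: attribute values read from that file.
function relativeToGruntfile(htmlFile, refs) {
  const htmlDir = path.dirname(htmlFile);
  return refs.map(function (ref) {
    // path.join also normalizes segments such as "..".
    return path.join(htmlDir, ref);
  });
}

const resolved = relativeToGruntfile('app/index.html', ['js/a.js', '../lib/b.js']);
```

With these inputs, the first reference ends up under app/js and the second under lib, which is the form a task like uglify would need when it runs from the Gruntfile's directory.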

options.remove

Removes one or more matched elements.

grunt.initConfig({
  dom_munger: {
    your_target: {
      options: {
        remove: '#removeMe' //remove an element with the id of removeMe
      },
      src: 'index.html',
      dest: 'dist/index.html'
    },
  },
})

options.update

Updates the value of a given attribute for the set of matched elements.

grunt.initConfig({
  dom_munger: {
    your_target: {
      options: {
        update: {selector:'html',attribute:'appmode', value:'production'}, //set appmode="production" on <html>
      },
      src: 'index.html',
      dest: 'dist/index.html'
    },
  },
})

options.prefix

Prepends to the value of a given attribute for the set of matched elements.

grunt.initConfig({
  dom_munger: {
    your_target: {
      options: {
        prefix: {selector:'link',attribute:'href', value:'project-name/'}, //prepend project-name to the href attribute, for example href="project-name/next/path" on <link>
      },
      src: 'index.html',
      dest: 'dist/index.html'
    },
  },
})

options.suffix

Appends to the value of a given attribute for the set of matched elements.

grunt.initConfig({
  dom_munger: {
    your_target: {
      options: {
        suffix: {selector:'html',attribute:'version', value:'.0.1'}, //append .0.1 to the version attribute, for example version="1.0.1" on <html>
      },
      src: 'index.html',
      dest: 'dist/index.html'
    },
  },
})
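
In string terms, prefix and suffix differ only in which side of the existing attribute value the new text lands on. The two functions below are hypothetical and exist purely to illustrate that, using the values from the comments in the examples above.

```javascript
// Hypothetical illustrations (assumptions, not the plugin's code).
function applyPrefix(current, value) {
  return value + current; // new text goes in front
}

function applySuffix(current, value) {
  return current + value; // new text goes at the end
}

const href = applyPrefix('next/path', 'project-name/');
const version = applySuffix('1', '.0.1');
```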

options.append

Appends the content to each matched element.

grunt.initConfig({
  dom_munger: {
    your_target: {
      options: {
        append: {selector:'body',html:'<div id="appended">Im being appended</div>'}
      },
      src: 'index.html',
      dest: 'dist/index.html'
    },
  },
})

options.prepend

Prepends the content to each matched element.

grunt.initConfig({
  dom_munger: {
    your_target: {
      options: {
        prepend: {selector:'body',html:'<span>Im being prepended</span>'}
      },
      src: 'index.html',
      dest: 'dist/index.html'
    },
  },
})

options.text

Updates the text content of the matched elements.

grunt.initConfig({
  dom_munger: {
    your_target: {
      options: {
        text: {selector:'title',text:'My App'} //Updates the <title> to "My App"
      },
      src: 'index.html',
      dest: 'dist/index.html'
    },
  },
})

options.callback

When you feel like busting loose. Set a callback function and use the passed jQuery-style object (provided by cheerio) to do anything you want to the HTML. The second argument to the callback is the name of the file being processed. If the callback function returns false, the source file is assumed to have only been read and no output will be written.

grunt.initConfig({
  dom_munger: {
    your_target: {
      options: {
        callback: function($,file){
          //do anything you want here
        }
      },
      src: 'index.html',
      dest: 'dist/index.html'
    }
  }
})
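
A callback that only reads from the document can return false so that nothing is written back. The sketch below assumes you want to collect each file's title; collectTitle and titles are hypothetical names, and the only calls made on $ are a selector lookup and .text().

```javascript
// Collected results; a hypothetical store for this example.
const titles = [];

// Read-only callback: records the <title> text for each processed file.
function collectTitle($, file) {
  titles.push({ file: file, title: $('title').text() });
  // Returning false signals that the file was only read,
  // so no output is written (as described above).
  return false;
}
```

It would be wired in as callback: collectTitle inside a target's options.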

Quick Note About Ordering

When specifying multiple actions for a single task, the order of the actions is always:

read actions
remove actions
all other actions

This ensures that you can use one task to read script or link tags, then remove them, then append tags containing the concatenated/minified references.
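
The ordering rule can be sketched as a small function. This is not the plugin's code: orderActions is a hypothetical name, and keeping the remaining actions in the order they were given is an assumption, since the text above only fixes the position of read and remove.

```javascript
// Hypothetical sketch: put read first, remove second, everything else after.
function orderActions(optionNames) {
  const priority = ['read', 'remove'];
  const leading = priority.filter(function (name) {
    return optionNames.includes(name);
  });
  const rest = optionNames.filter(function (name) {
    return !priority.includes(name);
  });
  return leading.concat(rest);
}

const order = orderActions(['append', 'remove', 'read', 'callback']);
```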

Full End-to-End Example for Concatenation and Minification

The following is an example config that reads your js and css references from the HTML, concatenates and minifies them, and updates the HTML to point at the new combined files.

This configuration would be run in this order:

grunt dom_munger cssmin uglify

grunt.initConfig({
  dom_munger: {
    main: {
      options: {
        read: [
          {selector:'link',attribute:'href',writeto:'cssRefs',isPath:true},
          {selector:'script',attribute:'src',writeto:'jsRefs',isPath:true}
        ],
        remove: ['link','script'],
        append: [
          {selector:'head',html:'<link href="css/app.full.min.css" rel="stylesheet">'},
          {selector:'body',html:'<script src="js/app.full.min.js"></script>'}
        ]
      },
      src: 'index.html',
      dest: 'dist/index.html'
    }
  },
  cssmin: {
    main: {
      src:'<%= dom_munger.data.cssRefs %>', //use our read css references and concat+min them
      dest:'dist/css/app.full.min.css'
    }
  },
  uglify: {
    main: {
      src: '<%= dom_munger.data.jsRefs %>', //use our read js references and concat+min them
      dest:'dist/js/app.full.min.js'
    }
  }
});
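
To avoid typing the three task names in the right order every time, the sequence could be captured in an alias task. The alias name build and the wrapper function registerBuildAlias are assumptions made for this sketch; grunt.registerTask with a task list is standard Grunt.

```javascript
// Hypothetical helper; call it from your Gruntfile after grunt.initConfig().
function registerBuildAlias(grunt) {
  // dom_munger must run first so dom_munger.data.* is populated
  // before cssmin and uglify expand their templates.
  grunt.registerTask('build', ['dom_munger', 'cssmin', 'uglify']);
}
```

Running grunt build would then execute the tasks in the order shown above.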

Release History

v4.0.0 - Updated to grunt 1.1.0 and bumped min node version to 10.0. Please note that whitespace/line endings may be removed, this is caused by parse5 producing a spec-compliant DOM structure
v3.4.0 - Update task actions ordering. Reads always first, removes second, all others after.
v3.3.0 - All task actions can now be arrays for multiple actions per type.
v3.2.0 - Added second file argument to callback (#15).
v3.1.0 - Prefix and suffix options added. Fixes for issues #8, #10, and #11.
v3.0.0 - Removed jsdom engine as cheerio is as good without needing contextify.
v2.0.0 - Moved to cheerio engine. Upgraded jquery to v2.
v1.0.1 - remove moved to the second to last operation performed (only callback is later).
v1.0.0 - Read task modified to write values to dom_munger.data rather than to write directly to a task config.
v0.1.0 - Initial release.

grunt-dom-munger's Issues

Pass xmlMode option to cheerio

Cheerio transforms all attributes into lower case when it loads with the xmlMode="false" option.

This breaks the svg.viewBox attribute. Ideally, we should be able to pass the xmlMode into grunt-dom-munger.

options.read() get only needed ones

Can I configure options.read to get only the script tags I want?
If I understand correctly, I can only get the src attribute of all script tags.

Html elements replaced automatically

I have a html file in a web project, part of it reads something like:
<body>
<div>
<h4>something</h5><span>something else</span>
</div>

<div>
    something I want to modify
</div>
</body>

Note that h4-h5 mistake, it has been there for a long time and browsers seem to render it correctly.

Yesterday I added dom_munger to my Gruntfile in order to select and modify some elements in the same page, something in another <div> as you see at the bottom. However, when I ran the task, dom_munger replaced the h4-h5 block with something like this:

<h4>something<span>something else</span></h4>

I am very certain that it is dom_munger that changed the tag, since it is the only npm task I added to the build process. I want to know the reason behind this: does the selector assume that the html is written completely correctly? Or does it try to correct the html for me before selecting anything?

Additional Options

This is more a RFC cuz I'm not 100% sure this is necessary.

What do you think about adding a replace, after and before options?

The reason is because we had to replace a LESS import to a CSS import and rather than doing 3 separate updates it was easier to just replace. You can't append/prepend however since a <link> doesn't have children.

Also, theoretically if you have after or before you don't need replace since you can just remove the original element afterwards.

Any plans for having this plugin for gulpjs?

This is a very functional plugin from usability perspective, and I've used it on grunt-powered projects. I do, however, have specific requirements to use this for projects powered with gulp, so was wondering if you have any plans for porting this to gulp. Cheers!

Cheerio converts the XML into lowercase after loading.

When I load XML string into cheerio it converts all nodes (attributes + element node) into lowercase. Everything works fine but post cheerio transformation my XPath query fails because of all lowercase issue. Is there any way we could avoid this behavior of cheerio?

Thanks,
M

How can I remove the original js files after uglify?

Hello:
I'm new to this module. I'm confused about how to remove the now-useless js references. The update task only adds the uglified file to the html, and the original files are preserved. Do I have to wrap the scripts or css in an element with an id?

Best Regards.

Add warning for missing attribute

Just started using dom-munger, started out with the read example and got an error that I wasn't passing strings to path.join. No idea what to do here so I debugged dom_munger.js and found out that 'undefined' was being used in path.join (line 34) for val. Eventually I realized that this was due to a <script> tag in my HTML file that had inline JS rather than a source attribute. Admittedly, had I looked at my HTML sooner, I may have realized this quicker, easily fixed by switching selector to 'script[src]'. However it would have been nice to get some warning at least like "Hey, the attribute you're requesting doesn't exist for this node".

Liking what this plug-in does so far, keep up the good work!

Any thoughts on reordering the options so that remove/update is performed last?

This way people can use DOM that will eventually be removed as a selector BEFORE it's gone.

&& broken

Using
aaa6969: {
  options: {
    prefix: {
      selector: 'div.span4',
      attribute: 'class',
      value: 'aaa6969'
    }
  },
  src: 'documentation/src/ngDocs/index.html',
  dest: 'documentation/src/ngDocs/1.html'
},

ruins my output.
It converts && to escaped entities. Why?

     <div class="dropdown search" ng-class="{open: focused &amp;&amp; bestMatch.rank &gt; 0 &amp;&amp; bestMatch.page != currentPage}">
        <input type="text" ng-model="search" placeholder="search the docs" tabindex="1" accesskey="s" class="search-query" focused="focused">

when it should stay as:

     <div class="dropdown search"
           ng-class="{open: focused && bestMatch.rank > 0 && bestMatch.page != currentPage}">
        <input type="text" ng-model="search" placeholder="search the docs"
               tabindex="1" accesskey="s" class="search-query" focused="focused">

Add a filter option to `read` operation

It would be very useful to filter the text retrieved during read operations by passing a function to be called each time text is extracted.

My use-case is such that I have to manipulate the urls retrieved when scanning the src attribute of some script tags before they are used in any subsequent tasks.

Close tags for link, meta tags

Hi @cgross

Thanks for your good grunt task.
I have an issue with your grunt.
In some cases, I don't want dom-munger to replace the /> in link and meta tags.

How can I do that?

Regards,
Nguyen

Write to file the return of a callback?

I'm not sure if you can't do this or the documentation is just unclear.
Basically what i want to do is to use the callback to select one dom object in the html and then write that to the destination file.

src option isn't detected

I've tried using this module using the following config:

dom_munger: {
  release: {
    options: {
      src: './www_dev/index.html',
      read: {
        selector: 'script[src]',
        attribute: 'src',
        writeto: 'jsFiles',
        isPath: true
      }
    }
  }
},

But found that the processFile function is never run and this.files on line 170 is an empty array (I refer to the dom_munger.js file in the module source). I've tried version 3.3.0 and 3.4.0. I'm not sure is this an issue with my config or a bug. I may get more time to investigate later.

Multiple runs of dom-munger can fail

Hi there, great plugin, think I found an issue though. Let me know if there's a better way to do this but I have a situation where I have to run six different dom-munger operations/targets across three different HTML files, all to separate HTML destinations. The easiest way it seems to do this was to run dom-munger three times, changing necessary variables for the paths. One of the six operations is a read to a property named 'jsFiles'. When dom-munger is run a second time, it sees dom_munger.data (where 'jsFiles' resides) as a target and attempts to run it. This causes the whole grunt file to stop. I think something needs to be implemented where a target named 'data' is skipped.

The quick work around for now is to just explicitly call each of the six dom-munger targets in order for every file as opposed to just blanket running 'dom-munger'.

Allow short or --verbose output

Hey! My recent project has grown to the point where dom_munger updates/inserts a lot of parts (100s) into html. As a result, a lot of unnecessary logs are written to the output, like "Appended to body", which bring no value and get in the way when I need to see logs that were written before munger ran. My suggestion is to use grunt.verbose instead of grunt.log in the places where we insert/update/delete something. This would allow using the --verbose flag when running grunt tasks to see the full output if needed.

File changed/formatted after only reading attributes

I'm using grunt-dom-munger for extracting each value for a certain html attribute throughout my templates. I'm doing this using a callback function which only uses jQuery to read the attribute and writes to a grunt configuration attribute (almost what options.read does, but not exactly).

I'm expecting the files to remain untouched since I'm only reading from them, but it looks like grunt-dom-munger modifies the files as it parses the file? Notably, all newlines added for readability and code formatting are normalised and removed. After running the task, my working directory shows up as modified for all html templates I have, which is not desirable.

Would it be possible to somehow disable this formatting/modification of files when only reading from them?

Resolving relative paths

Hello, this is more of a general grunt question, not an issue with your awesome module.

I want to check if files linked from src attributes of multiple html files exist. Since the src links have relative paths, I need to know which file I am processing.

I want to do (config):

dom_munger: {
  html: {
    options: {
      callback: function($) {
        $('*').each(function() {
          if (this.type === 'script' && this.attribs.src) {
            // check if linked files actually exist
            if (!grunt.file.read(CURRENT_FILEPATH + this.attribs.src)) {
              grunt.fail.fatal('you have linked a file which does not exist');
            }
          }
        });
      }
    }
  }
}

Can you advise how to get the CURRENT_FILEPATH inside the callback function?

grunt.task.current.data.files[0].src does not work because I am passing a glob match pattern such as:
files: [{
  expand: true,
  src: 'something/else/*/.html'
}]

Cheers,

Mike

Array src - writeto variable overwritten

I want to collect all the javascript file references (paths) from all html files [src].

Is there a way to achieve this? Currently, the writeto variable jsRefs is getting overwritten and I am not sure how to get around this.

grunt.initConfig({
        dom_munger: {
            your_target: {
                options: {
                    read: {
                        selector: 'script',
                        attribute: 'src',
                        // Is there a way to instruct jsRefs to be
                        // a multi-dimensional array or object.key
                        // combination per source file holding
                        // the jsRefs Arrays?
                        writeto: 'jsRefs',
                        // <-
                        isPath: true
                    }
                },
                // Going through multiple html files, collecting javascript references
                src: ['html/*.html']
            },
        },
        log: {}
    });

    grunt.registerTask("log", function() {
        var jsFiles = grunt.config.process('<%= dom_munger.data.jsRefs %>');

        // only last processed file with array of paths
        console.dir(jsFiles);
    });

callback not exposing jquery object's full methods

Gives me error: $.each is undefined.
And var $methodsAndParm = $( 'ul.methods > li > h3' ); returns length of 0 when it should be 3

Is $ NOT a fully featured jquery object?
Also $.fn.jquery is undefined because $.fn is undefined.

Help?

Thank you.

        dom_munger: {
            // https://github.com/cgross/grunt-dom-munger
            gst6969: {
                options: {
                    callback: function ( $, file ) {
                        var $methodsAndParm = $( 'ul.methods > li > h3' );

                        grunt.log.writeln  ( 'jq: ' + $.fn );
                        $.each (
                            $methodsAndParm,
                            function (  intIndex, methodAndParm ) {

                                var strHtml = methodAndParm.html();

                                grunt.log.writeln ( strHtml );
                            }
                        );
                    }
                },
                src: [ 'documentation/src/ngDocs/partials/api/*.html' ]
                // dest: 'dist/index.html' // optional, if not specified the src file will be overwritten
            }

Mistake in Full End-to-End Example for Concatenation and Minification

Does your "Full End-to-End Example" have a mistake? The src and dest fields have to be inside the main task, haven't they?

adding attributes to commented element

Hi,
I was wondering whether it is possible to apply attributes to conditional elements in the document that are commented out?

Node v6.9.2 error but works on earlier versions

Updated to node v6.9.2 and now get the following error in a yo generator that includes a Gruntfile.js

Running "dom_munger:read" (dom_munger) task

Processing index.html
Warning: Path must be a string. Received [ 'index.html' ] Use --force to continue.

I've made no changes to the Gruntfile.js and this is the config of dom_munger read object:

            read: {
                options: {
                    read: [
                        {selector: '[build-concat]', attribute: 'src', writeto: 'jsRefs', isPath:true}
                    ]
                },
                src: 'index.html'
            },

Any suggestions?

Adhere to Cheerio's XML mode when xmlMode option is specified

Cheerio exposes an option "xmlMode: true", which should be used to distinguish between writing out HTML ( $.html() ) and XML ( $.xml() ).

Therefore, in method processFile:

updatedContents = $.html();

Should become:

if (options.xmlMode) {
        updatedContents = $.xml();
} else {
        updatedContents = $.html();  
}

Perform `remove` operation after `read`

First off: great plugin!

It would be handy for my use-case if the remove operation was performed after the read.

I'd like to use dom_munger to do the following:

read and store the src attributes of some script tags
remove the same script tags
append a new script tag to the page

Unfortunately, the script tags I want to read are first removed by the remove option.

Using callback could be a workaround but moving the order of operations around would make for a cleaner implementation.

Thanks!

Escape rails tags

I'm working with a index.html.erb file (rails file), and I need to update the src attribute of a script to be something like this:

<script src="<%= path_to_code %>"></script>

I'm using the update option, but when I do so (with value: '<%= path_to_code %>'), it doesn't transfer that string literally to my file. I'm guessing that <%= %> is interpreted by grunt or by this library.

How would I escape this so I can pass it as a string?