sfrancisx / dupfind
Scan JavaScript code and find duplicated sections
Version 1.0.2

dupfind: Scan JavaScript code and find duplicated sections.

Synopsis
--------

    dupfind [file_set ...] [file_pattern ...]

Description
-----------

dupfind will read multiple JS files and look for duplicated segments of code. It matches tokens, so it ignores comments and is insensitive to whitespace. It can do a fuzzy match, which also ignores changes to variable names.

You can define filesets in a configuration file for often-used files. When run with no arguments, default filesets from the configuration file will be scanned. When run with arguments, the arguments are either the names of filesets in the configuration file, or file patterns to match.

Configuration
-------------

You can create a configuration file for dupfind to use. dupfind will search for the configuration file in these locations, in this order:

    ./dupfind.cfg
    ~/.dupfind.cfg
    /home/y/conf/dupfind/dupfind.cfg

If it doesn't find a configuration file, it will search all .js files in or below the current directory.

The configuration file should contain JSON that looks like this:

    {
        min: 30,
        max: 500,
        increment: 10,
        fuzzy: true,
        cpd: false,
        sources: [
            {
                name: "yui",
                def: true,
                root: "/home/y/share/htdocs/yui3",
                directories: [ "build" ],
                include: [ "*.js" ],
                exclude: [ "*/.svn", "*simpleyui.js", "*-[^/]*.js", "*yui.js", "*datatype*" ]
            }
        ]
    }

min: The minimum number of consecutive duplicated tokens required to report the duplication.

max: The maximum number of consecutive tokens to check.

increment: dupfind looks for duplication multiple times. The first time, it looks for 'max' consecutive tokens. It reduces the number by 'increment' and checks again (I guess 'increment' should really be 'decrement'...). It continues until the number being checked is less than 'min'.

fuzzy: 'true' to do a fuzzy match. A fuzzy match ignores changes to variable names.

cpd: 'true' to output in XML (like PMD/CPD).

sources: An array of objects describing the files to scan.
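The interaction between min, max, and increment is easier to see as a list of run lengths. The following is a minimal Python sketch of that schedule as the description reads, not dupfind's actual code; the function name `scan_lengths` and its parameter names are hypothetical.

```python
def scan_lengths(min_tokens, max_tokens, increment):
    """Return the token-run lengths in the order the README says they are checked.

    Hypothetical helper: it starts at max_tokens, steps down by increment,
    and stops once the length drops below min_tokens.
    """
    if increment <= 0:
        raise ValueError("increment must be positive")
    lengths = []
    n = max_tokens
    while n >= min_tokens:
        lengths.append(n)
        n -= increment
    return lengths


# With the sample configuration (min 30, max 500, increment 10) the
# schedule runs from 500 down to 30 in steps of 10.
sample = scan_lengths(30, 500, 10)
print(sample[0], sample[-1], len(sample))
```

Under this reading, if min cannot be reached from max in whole steps, the last length checked is the smallest one still at or above min. For example, min 30, max 55, increment 10 gives 55, 45, 35.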
Each object in the sources array contains:

name: The name of the file set. This name can be provided as an argument on the command line to limit scanning to this fileset.

def: 'true' if this is a default fileset. All default filesets will be scanned when dupfind is executed with no arguments.

root: The root directory for the fileset.

directories: An array of strings. These are subdirectories under the root to scan. If this member isn't present, all subdirectories are scanned.

include: Files to include. This is a DOS style regular expression - '.' means '.', '*' means '.*' and '?' means '.'. The expression is matched against the full path to each regular file (not directories), and it has to match for the file to be scanned.

exclude: Files and directories to exclude. This is also a DOS style regular expression. Matching files are excluded. Matching directories aren't scanned at all (meaning they're not recursed into, either).
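The three translation rules for include and exclude patterns can be sketched in a few lines of Python. This is an illustration only, not dupfind's implementation. The names `dos_pattern_to_regex` and `path_matches` are hypothetical, and two behaviours are assumptions: that characters other than '.', '*' and '?' pass through to the regular expression unchanged (the "[^/]" in the sample exclude list suggests this), and that the pattern has to match the whole path.

```python
import re


def dos_pattern_to_regex(pattern):
    """Translate a DOS-style pattern using the three rules from the README."""
    out = []
    for ch in pattern:
        if ch == ".":
            out.append(r"\.")  # '.' means a literal dot
        elif ch == "*":
            out.append(".*")   # '*' means any run of characters
        elif ch == "?":
            out.append(".")    # '?' means exactly one character
        else:
            out.append(ch)     # assumption: everything else is passed through
    return "".join(out)


def path_matches(pattern, path):
    """Assumption: the translated pattern must match the entire path."""
    return re.fullmatch(dos_pattern_to_regex(pattern), path) is not None


# The sample include pattern accepts a .js file anywhere under the root.
print(path_matches("*.js", "/home/y/share/htdocs/yui3/build/node/node.js"))
# A sample exclude pattern that targets files with a '-' suffix in the name.
print(path_matches("*-[^/]*.js", "/build/node/node-min.js"))
```

Because the expression is matched against the full path, patterns that should apply at any depth start with '*', as every pattern in the sample configuration does.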
PMD/CPD outputs an XML format that would be nice for dupfind to emulate as an option, since that makes for easy Jenkins integration.
Here is the format (run on dupfind itself!):
[findresult-lm] dupfind > ~/Downloads/pmd-bin-5.0-alpha/bin/run.sh cpd --format xml --minimum-tokens 25 --files . --language java
obj.put(name, t.value); break; case Token.LB: ArrayList a = new ArrayList(); first = parseArray(tokens, first, last, a); t = tokens[first]; switch (t.type) { case Token.STRING: case Token.TRUE: case Token.FALSE: for (i = 0; i < cfg.sources.length; i++) { if (cfg.sources[i].def) obj.put(name, a); break; case Token.LC: Hashtable o = new Hashtable(); first = parseObject(tokens, first, last, o); error("Found unexpected } on line %d", t.lineNum); return first; } if (t.type != Token.COMMA) error("Expected comma on line %d", t.lineNum); s = (String)obj.get("fuzzy"); if (s == null || s.compareTo("0") == 0 || s.compareTo("false") == 0)
Here is the link to the CPD source file that does it:
FWIW, to get it to run I had to add the 'saxon9-dom.jar' files to PMD's distro, which I found here:
http://sourceforge.net/projects/saxon/files/Saxon-B/9.1.0.8/saxonb9-1-0-8j.zip/download
THANKS!!! Mark