Code History Mining IntelliJ Plugin

This is a plugin for IntelliJ IDEs that visualizes project source code history. The analysis is based on file-level changes and is therefore programming-language agnostic. You can install it from IDE Settings -> Plugins or download it from the plugin repository.

Some examples of code history visualizations: JUnit, TestNG, Cucumber, Scala, Clojure, Kotlin, Groovy, CoffeeScript, Go, Erlang, Maven, Gradle, Ruby, Ruby on Rails, Node.js, GWT, jQuery, Bootstrap, Aeron, GHC, IntelliJ. The csv files with VCS data for the above visualizations are available on Google Drive.

See also code history miner (a web server and CLI application with the functionality of this plugin).

Why?

There is a lot of interesting data captured in version control systems, yet we rarely look into it. This is an attempt to make analysis of project code history easy enough so that it can be done regularly.

See also the book Your Code as a Crime Scene.

How to use?

  • grab project history from version control into a csv file - the Grab Project History action uses the VCS roots configured in the current project and their checked-out branches. Grabbing is a separate step mainly because code history often contains noise (e.g. automatically updated build system files). Keeping the history in a csv file makes it easier to process with scripts before visualization.
  • visualize code history from the csv file - at this step the history is read from the csv file and visualized in a browser. Every visualization is a self-contained, single-file html page, so it can be saved and shared without external dependencies.
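As an illustration of the kind of script the first step makes possible, here is a minimal Python sketch that drops noisy rows from a grabbed csv file before visualization. It is not part of the plugin. The function name `drop_noise`, the sample rows and the choice of `pom.xml` as noise are made up for this example, and the column positions assume the column order listed under "Code history csv format" with no header row.

```python
import csv
import io

# Column positions, assuming the column order listed under "Code history csv format".
FILE_NAME_BEFORE = 3
FILE_NAME = 4


def drop_noise(src, dst, noisy_file_names):
    """Copy csv rows from src to dst, skipping rows that touch a noisy file.

    noisy_file_names is a set of file names; returns the number of rows kept.
    """
    writer = csv.writer(dst, lineterminator="\n")
    kept = 0
    for row in csv.reader(src):
        # A deleted file only has fileNameBefore set, so check both name columns.
        names = set(row[FILE_NAME_BEFORE:FILE_NAME + 1])
        if names & noisy_file_names:
            continue
        writer.writerow(row)
        kept += 1
    return kept


# Made-up rows, shortened to the first eleven columns for readability.
history = io.StringIO(
    "2020-01-02 10:00:00 +0000,abc1,alice,,pom.xml,,,MODIFICATION,Bump version,10,10\n"
    "2020-01-01 09:00:00 +0000,abc0,bob,,Main.java,,/src,MODIFICATION,Fix bug,20,25\n"
)
cleaned = io.StringIO()
drop_noise(history, cleaned, {"pom.xml"})
print(cleaned.getvalue(), end="")
```

With real files, open both the input and the output with `newline=""`, as the Python `csv` module documentation recommends.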

Grab VCS history

Use "Main menu -> VCS -> Code History Mining" or "alt+shift+H".

You should see this window: screenshot

  • From/To - the date range to grab from VCS. Commits are loaded from version control only if they are not already in the csv file.
  • Save to - the csv file to save history to.
  • Grab history on VCS update - grab history whenever the project is updated from VCS (but no more often than once a day). This grabs history in small chunks, so that it is already up to date when you run a visualization.
  • Grab change size in lines/characters and amount of TODOs - grab the number of lines and characters before/after each commit and the size of the change. This is optional and is used only by some of the visualizations. Note that it requires loading file content, which can slow down history grabbing and reduce IDE responsiveness.

Visualize

By default csv files with history are saved to the "<plugins folder>/code-history-mining" folder. Files from this folder are listed in the plugin menu. Each csv file has a sub-menu with visualizations:

screenshot

When opened in a browser, each visualization has a help button with a short description, e.g. see the visualizations for JUnit.

Misc notes

  • any VCS supported by IntelliJ should work (tested with svn/git/hg)
  • merged commits are grabbed with the date and author of the original commit; the merge commit itself is skipped
  • visualizations use SVG and require a browser with SVG support (any reasonably recent browser)
  • some visualizations might be slow for the long history of a big project (e.g. building the treemap view of commits for a project with 1M LOC and 10 years of history can take a very long time). In this case, filtering the history or splitting it into smaller chunks can help.
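One way to split a long history into smaller chunks is by year. The sketch below is only an illustration and is not part of the plugin. The helper names `split_by_year` and `write_chunk` and the sample rows are made up, and it assumes the csv has no header row and that the first column is revisionDate in the "yyyy-MM-dd HH:mm:ss Z" format described under "Code history csv format".

```python
import csv
import io
from collections import defaultdict

REVISION_DATE = 0  # first column, formatted as "yyyy-MM-dd HH:mm:ss Z"


def split_by_year(src):
    """Group csv rows by the year of their revisionDate; returns {year: [rows]}."""
    chunks = defaultdict(list)
    for row in csv.reader(src):
        if row:  # skip blank lines
            year = row[REVISION_DATE][:4]
            chunks[year].append(row)
    return dict(chunks)


def write_chunk(rows, dst):
    """Write one chunk back out in csv form, keeping the original row order."""
    csv.writer(dst, lineterminator="\n").writerows(rows)


# Made-up rows, shortened to the first eleven columns for readability.
history = io.StringIO(
    "2021-03-01 12:00:00 +0000,c3,alice,,A.java,,/src,MODIFICATION,Tidy up,5,6\n"
    "2020-06-15 08:30:00 +0000,c2,bob,,B.java,,/src,NEW,Add B,0,12\n"
    "2020-01-01 09:00:00 +0000,c1,bob,,A.java,,/src,NEW,Add A,0,5\n"
)
for year, rows in split_by_year(history).items():
    print(year, len(rows))
```

Each chunk can then be written to its own csv file with `write_chunk` and visualized separately.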

Code history csv format

Each commit is broken down into several lines, one line per file changed in the commit. Commits are ordered by time from present to past. For example, two commits from the JUnit csv:

2001-10-02 20:38:22 +0100,0bb3dfe2939cc214ee5e77556a48d4aea9c6396a,kbeck,,IMoney.java,,/junit/samples/money,MODIFICATION,Cleaning up MoneyBag construction,38,42,4,0,0,817,888,71,0,0,0,0
2001-10-02 20:38:22 +0100,0bb3dfe2939cc214ee5e77556a48d4aea9c6396a,kbeck,,Money.java,,/junit/samples/money,MODIFICATION,Cleaning up MoneyBag construction,70,73,3,1,0,1595,1684,86,32,0,0,0
2001-10-02 20:38:22 +0100,0bb3dfe2939cc214ee5e77556a48d4aea9c6396a,kbeck,,MoneyBag.java,,/junit/samples/money,MODIFICATION,Cleaning up MoneyBag construction,140,131,8,4,23,3721,3594,214,154,511,0,0
2001-10-02 20:38:22 +0100,0bb3dfe2939cc214ee5e77556a48d4aea9c6396a,kbeck,,MoneyTest.java,,/junit/samples/money,MODIFICATION,Cleaning up MoneyBag construction,156,141,0,34,0,5187,4785,0,1594,0,0,0
2001-07-09 23:51:53 +0100,ce0bb8f59ea7de1ac3bb4f678f7ddf84fe9388ed,egamma,,.classpath,,,NEW,added .classpath for eclipse,0,6,6,0,0,0,240,240,0,0,0,0
2001-07-09 23:51:53 +0100,ce0bb8f59ea7de1ac3bb4f678f7ddf84fe9388ed,egamma,,.vcm_meta,,,MODIFICATION,added .classpath for eclipse,6,7,1,0,0,199,221,21,0,0,0,0

Columns:

  • revisionDate - in "yyyy-MM-dd HH:mm:ss Z" format with local timezone (see javadoc for details).
  • revision - unique commit id, format depends on VCS.
  • author - committer name from VCS.
  • fileNameBefore - file name before change, empty if file was added or name didn't change.
  • fileName - file name after change, empty if file was deleted.
  • packageNameBefore - file path before change, empty if file was added, path didn't change or file is in root folder.
  • packageName - file path after change, empty if file was deleted or is in root folder.
  • fileChangeType - "NEW", "MODIFICATION", "MOVED" or "DELETED". Renamed or moved files are "MOVED" even if file content has changed.
  • commitMessage - commit message, line breaks are replaced with "\n".
  • linesBefore - number of lines in file before change; "-1" if file is binary or "Grab change size" checkbox is not selected in "Grab Project History" dialog; "-2" if file is too big for IntelliJ to diff.
  • linesAfter - similar to the above.
  • other before/after columns - similar to the above, should be self-explanatory.

The output csv format should be compatible with RFC 4180.
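To show how the format can be consumed outside the plugin, here is a small Python sketch that reads rows with the standard `csv` module and groups them into commits. It rests on several assumptions: the file has no header row, the columns appear in the order listed above, and the rows of one commit are adjacent. The names `read_file_changes` and `group_by_commit` are made up for this example. Negative line counts ("-1", "-2") are mapped to `None` so that they are not mistaken for real sizes.

```python
import csv
import io
from datetime import datetime
from itertools import groupby

# First eleven columns, in the order listed above; later columns are ignored here.
COLUMNS = [
    "revisionDate", "revision", "author", "fileNameBefore", "fileName",
    "packageNameBefore", "packageName", "fileChangeType", "commitMessage",
    "linesBefore", "linesAfter",
]


def read_file_changes(src):
    """Yield one dict per csv row, i.e. per file changed in a commit."""
    for row in csv.reader(src):
        if not row:
            continue  # skip blank lines
        change = dict(zip(COLUMNS, row))
        change["revisionDate"] = datetime.strptime(
            change["revisionDate"], "%Y-%m-%d %H:%M:%S %z")
        for key in ("linesBefore", "linesAfter"):
            value = int(change[key])
            # -1 and -2 mean "not available" rather than a real line count.
            change[key] = None if value < 0 else value
        yield change


def group_by_commit(changes):
    """Group adjacent file changes that share a revision id."""
    return [(revision, list(group))
            for revision, group in groupby(changes, key=lambda c: c["revision"])]


# Three of the JUnit rows shown above.
sample = io.StringIO(
    "2001-10-02 20:38:22 +0100,0bb3dfe2939cc214ee5e77556a48d4aea9c6396a,kbeck,,IMoney.java,,/junit/samples/money,MODIFICATION,Cleaning up MoneyBag construction,38,42,4,0,0,817,888,71,0,0,0,0\n"
    "2001-10-02 20:38:22 +0100,0bb3dfe2939cc214ee5e77556a48d4aea9c6396a,kbeck,,Money.java,,/junit/samples/money,MODIFICATION,Cleaning up MoneyBag construction,70,73,3,1,0,1595,1684,86,32,0,0,0\n"
    "2001-07-09 23:51:53 +0100,ce0bb8f59ea7de1ac3bb4f678f7ddf84fe9388ed,egamma,,.classpath,,,NEW,added .classpath for eclipse,0,6,6,0,0,0,240,240,0,0,0,0\n"
)
for revision, changes in group_by_commit(read_file_changes(sample)):
    print(revision[:7], changes[0]["author"], len(changes), "file(s)")
```

Using `csv.reader` rather than splitting on commas matters because commit messages may contain commas or quotes, which RFC 4180 style quoting handles.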

Acknowledgments

Similar projects
