
codemod's Introduction

codemod


Overview

codemod is a tool/library to assist you with large-scale codebase refactors that can be partially automated but still require human oversight and occasional intervention.

Example: Let's say you're deprecating your use of the <font> tag. From the command line, you might make progress by running:

codemod -m -d /home/jrosenstein/www --extensions php,html \
    '<font *color="?(.*?)"?>(.*?)</font>' \
    '<span style="color: \1;">\2</span>'

For each match of the regex, you'll be shown a colored diff, and asked if you want to accept the change (the replacement of the <font> tag with a <span> tag), reject it, or edit the line in question in your $EDITOR of choice.
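As a rough illustration of what one such match looks like, the Python sketch below applies the same pattern and substitution with the standard re module and prints a unified diff with difflib. The sample HTML line is invented for the example, and the code only approximates what codemod displays; it is not codemod's own implementation.

```python
import difflib
import re

# Same pattern and substitution as the command-line example above.
PATTERN = re.compile(r'<font *color="?(.*?)"?>(.*?)</font>')
SUBSTITUTION = r'<span style="color: \1;">\2</span>'

# Hypothetical input line, invented for this illustration.
before = '<p><font color="red">Warning</font></p>\n'
after = PATTERN.sub(SUBSTITUTION, before)

# Show the proposed change as a unified diff, similar in spirit to the
# per-match diff that codemod asks you to accept, reject, or edit.
diff = list(difflib.unified_diff([before], [after], "before", "after"))
print("".join(diff))
```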

Install

In a virtual environment or as admin user

pip install codemod

or system wide with sudo

sudo -H pip install codemod

Usage

The last two arguments are a regular expression to match and a substitution string, respectively. If you omit the substitution string, you are simply prompted on each match for whether you want to edit it in your editor.

Options (all optional) include:

-m
  Have regex work over multiple lines (e.g. have dot match newlines).  By
  default, codemod applies the regex one line at a time.
-d
  The path whose descendant files are to be explored.  Defaults to current dir.
-i
  Make your search case-insensitive
--start
  A path:line_number-formatted position somewhere in the hierarchy from which
  to begin exploring, or a percentage (e.g. "--start 25%") of the way through
  to start.  Useful if you're divvying up the substitution task across
  multiple people.
--end
  A path:line_number-formatted position somewhere in the hierarchy just
  *before* which we should stop exploring, or a percentage of the way
  through, just before which to end.
--extensions
  A comma-delimited list of file extensions to process. Also supports Unix
  pattern matching.
--include-extensionless
  If set, this will check files without an extension, along with any
  matching file extensions passed in --extensions
--accept-all
  Automatically accept all changes (use with caution)
--default-no
  Set default behavior to reject the change.
--editor
  Specify an editor, e.g. "vim" or "emacs".  If omitted, defaults to $EDITOR
  environment variable.
--count
  Don't run normally.  Instead, just print the number of places in the
  codebase where the 'query' matches.
--test
  Don't run normally.  Instead, just run the unit tests embedded in the
  codemod library.

You can also use codemod for transformations that are much more sophisticated than regular expression substitution. Rather than using the command line, you write Python code that looks like:

import codemod
codemod.Query(...).run_interactive()

See the documentation for the Query class for details.

Background

Announcement by Justin Rosenstein on Facebook Notes, circa December 2008

Part of why most code -- and most software -- sucks so much is that making sweeping changes is hard.

Let's say that a month ago you wrote a function that you -- or your entire company -- have been using frequently. And now you decide to change its name, or change the order of its parameters, or split it up into two separate functions and then have half the call sites use the old one and half the call sites use the new one, or change its return type from a scalar to a structure with additional information. IDEs and standard *nix tools like sed can help, but you typically have to make a trade-off between introducing errors and introducing tedium. The result, all too often, is that we decide (often unconsciously) that the sweeping change just isn't worth it, and leave the undesirable pattern untouched for future versions of ourselves and others to grumble about, while the pattern grows more and more endemic to the code base.

What you really want is to be able to describe an arbitrary transform -- using either regexes in the 80% case or Python code for more complex transformations -- that matches for lines (or sets of lines) of source code and converts them to something more desirable, but then have a tool that will show you each of the change sites one at a time and ask you either to accept the change, reject the change, or manually intervene using your editor of choice.

So, while at Facebook, I wrote a script that does exactly that. codemod.py is a nifty little utility/library to assist with codebase refactors that can be partially automated but still require human oversight and occasional intervention. And, thanks to help from Mr. David Fetterman, codemod is now open source. Check it out (so to speak):

git clone git://github.com/facebook/codemod.git
(previously svn checkout https://codemod.svn.sourceforge.net/svnroot/codemod/trunk codemod)

It's one of those tools where, the more you use it, the more you think of places to use it -- and the more you realize how much you were compromising the quality of your code because reconsidering heavily-used code patterns sounded just too damn annoying. I use it pretty much every day.

Dependencies

  • python2

Credits

Copyright (c) 2007-2008 Facebook.

Created by Justin Rosenstein.

Licensed under the Apache License, Version 2.0.

codemod's People

Contributors

adamjernst, asm89, astonm, beyang, bhageena, daneden, dannixon, davefet, facebook-github-bot, filipallberg, fried, fuchida, georgelesica-wf, ide, jamesgpearce, kastiglione, keyan, meowcoder, modocache, orip, pda, ptarjan, rochacbruno, sbz, shantanu404, snickl, swolchok


codemod's Issues

Fails with backtrace on permission denied.

Traceback (most recent call last):
  File "./codemod.py", line 795, in
    run_interactive(**options)
  File "./codemod.py", line 141, in run_interactive
    _ask_about_patch(patch, editor)
  File "./codemod.py", line 578, in _ask_about_patch
    _save(patch.path, lines)
  File "./codemod.py", line 602, in _save
    file_w = open(path, 'w')
IOError: [Errno 13] Permission denied: '/path/to/file.inc'

It should handle this error gracefully, and continue. (or, potentially, not "see" files that cannot be written to).
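A minimal sketch of the "handle gracefully and continue" behavior requested here. The helper name save_lines is hypothetical and is not codemod's actual _save function. To stay runnable even for a privileged user, the demonstration triggers the failure with a missing directory instead of a permission error; both raise the same exception family.

```python
import os
import tempfile


def save_lines(path, lines):
    """Write lines to path; return False instead of raising if the write fails.

    Hypothetical helper for illustration, not codemod's actual _save().
    """
    try:
        with open(path, "w") as file_w:
            file_w.writelines(lines)
    except (IOError, OSError) as error:
        # Report the problem and let the caller move on to the next patch.
        print("Skipping %s: %s" % (path, error))
        return False
    return True


base = tempfile.mkdtemp()
# Writing into a directory that does not exist fails at open(), like Errno 13 would.
failed = save_lines(os.path.join(base, "missing", "file.inc"), ["<?php\n"])
# Writing into the existing temporary directory succeeds.
saved = save_lines(os.path.join(base, "file.inc"), ["<?php\n"])
```

With a return value like this, the calling loop could skip the file and carry on with the remaining patches.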

Python 3 support

Although I say Python 3, what I actually mean is Python-version-agnostic. Wouldn't it be great?
I mean, Python 3 is the future after all 😄

Why not release it to PyPI and make entry_points?

Hi,

Currently it has to be installed from github source

Why not release it to PyPI and add console_scripts as entry point?

So it allows us to do

pip install codemod
codemod -m -d /home/jrosenstein/www --extensions php,html \
    '<font *color="?(.*?)"?>(.*?)</font>' \
    '<span style="color: \1;">\2</span>'

If any help is needed, I can help with packaging and releasing.

Tag a release?

Hello,
Could you please tag a release here so we could include codemod in Homebrew core?
Thanks!

Unicode crash should be handled gracefully

Traceback (most recent call last):
  File "/opt/homebrew/bin/codemod", line 9, in <module>
    load_entry_point('codemod==1.0.0', 'console_scripts', 'codemod')()
  File "/opt/homebrew/lib/python3.5/site-packages/codemod/base.py", line 1025, in main
    run_interactive(**options)
  File "/opt/homebrew/lib/python3.5/site-packages/codemod/base.py", line 167, in run_interactive
    for patch in suggestions:
  File "/opt/homebrew/lib/python3.5/site-packages/codemod/base.py", line 442, in generate_patches
    lines = list(open(path))
  File "/opt/homebrew/Cellar/python35/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 130: invalid continuation byte

fork in modone

Hey codemod folks-

just wanted to bring to your notice https://github.com/indigoviolet/modone, which is a fork of codemod but delegates all the filesystem traversal and filtering to unix utilities. I think it's probably too large of a behavioral change to attempt to bring upstream (hence the fork), but wanted to point it out anyway.

Move tests into separate module

This suggestion intersects a bit with #64. I think it would make the codebase more readable and extensible if we moved tests out of docstrings and into a separate tests.py module. This would make it easier to write more involved tests such as the one we need to reproduce the bug in #33.

Report filename with invalid encoding

I'm trying to run codemod on my codebase and it is choking with:

Traceback (most recent call last):
  File "/home/ezyang/local/pytorch-tmp-env/bin/codemod", line 8, in <module>
    sys.exit(main())
  File "/home/ezyang/local/pytorch-tmp-env/lib/python3.7/site-packages/codemod/base.py", line 1025, in main
    run_interactive(**options)
  File "/home/ezyang/local/pytorch-tmp-env/lib/python3.7/site-packages/codemod/base.py", line 167, in run_interactive
    for patch in suggestions:
  File "/home/ezyang/local/pytorch-tmp-env/lib/python3.7/site-packages/codemod/base.py", line 442, in generate_patches
    lines = list(open(path))
  File "/home/ezyang/local/pytorch-tmp-env/lib/python3.7/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe3 in position 16: invalid continuation byt

It would be very nice if it told me what file had the bad encoding!
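One way the failing path could be surfaced, sketched for Python 3 under the assumption that files are read as UTF-8 text. The helper name read_lines is hypothetical; the traceback above shows codemod calling list(open(path)) directly.

```python
import os
import tempfile


def read_lines(path, encoding="utf-8"):
    """Return the lines of path, or None if the file cannot be decoded.

    Hypothetical helper for illustration, not part of codemod.
    """
    try:
        with open(path, encoding=encoding) as handle:
            return list(handle)
    except UnicodeDecodeError as error:
        # Name the offending file so the user knows where to look.
        print("Skipping %s: not valid %s (%s)" % (path, encoding, error))
        return None


with tempfile.TemporaryDirectory() as tmp:
    bad_path = os.path.join(tmp, "latin1.txt")
    with open(bad_path, "wb") as handle:
        handle.write(b"caf\xe9\n")  # 0xe9 is Latin-1, invalid as UTF-8 here
    good_path = os.path.join(tmp, "utf8.txt")
    with open(good_path, "wb") as handle:
        handle.write(b"ok\n")
    bad_result = read_lines(bad_path)
    good_result = read_lines(good_path)
```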

PyPI ownership

codemod is released to PyPI https://pypi.python.org/pypi/codemod

But it is under my ownership

I need to know the PyPI username/email to which I should transfer ownership, or which I should add as a maintainer of the package there.

Is there an official Facebook PyPI user?

Allow for obeying of .gitignore

I typically have to run codemod on several directories in my working repository with the -d option. I would run it in the root of my working repository, but it includes some large irrelevant files that slow codemod down. Is there a way to have it obey .gitignore?

Query.start_position has no setter

If I interrupt my codemod session (with ctrl-C) and attempt to restart it, I get the expected prompt

Resume where you left off, at ./my/files/code.py:10 (y/n)?

But when I choose y I get the following exception:

Traceback (most recent call last):
  File "/usr/local/bin/codemod", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/site-packages/codemod/base.py", line 1004, in main
    run_interactive(**options)
  File "/usr/local/lib/python2.7/site-packages/codemod/base.py", line 130, in run_interactive
    query.start_position = bookmark
AttributeError: can't set attribute

Looking at codemod.py there's a start_position = property(get_start_position), but no setter is specified. I can't find record of a setter having ever been there in this repo, but I know this feature used to work for me in some past version of codemod, so something must have changed. This may be an issue across Python versions (my /usr/bin/env python2 is 2.7.12).

If indeed some code is missing, I'm happy to submit a PR to add (re-add?) a setter for Query.start_position, and also one for Query.end_position while I'm at it.
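For reference, a read/write property of the kind proposed here could look like the sketch below. The Query class shown is a stripped-down stand-in written for this illustration, and the string position is a simplification; it is not codemod's real class.

```python
class Query(object):
    """Stripped-down stand-in for codemod's Query class, for illustration only."""

    def __init__(self, start_position=None):
        self._start_position = start_position

    @property
    def start_position(self):
        return self._start_position

    @start_position.setter
    def start_position(self, value):
        # With a setter defined, assignment no longer raises AttributeError.
        self._start_position = value


query = Query()
query.start_position = "./my/files/code.py:10"
```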

"Skip file" feature

Here's a feature request that I can submit a patch for if there is interest from other people. It'd be nice to be able to skip a file (i.e., say 'no' to all suggested changes in a given file).

Motivation: This comes up a lot when I am codemodding a directory with a bunch of vendored libraries in addition to my own code. I know I don't want to make any mods to the vendored libraries.

Argument Parsing

codemod currently uses getopt. But we have optparse and argparse in the stdlib. Moving to argparse would cleanup the codebase a little and provide more control around the argument parsing.
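A minimal sketch of what an argparse-based parser might look like, covering only a subset of the options documented in the README. The function name build_parser and the help strings are made up for this illustration, and none of this is taken from codemod's source.

```python
import argparse


def build_parser():
    """Build a parser for a subset of codemod's documented options (sketch)."""
    parser = argparse.ArgumentParser(prog="codemod")
    parser.add_argument("-m", action="store_true",
                        help="have the regex work over multiple lines")
    parser.add_argument("-d", default=".",
                        help="directory to explore (defaults to current dir)")
    parser.add_argument("-i", action="store_true",
                        help="case-insensitive search")
    parser.add_argument("--extensions", default=None,
                        type=lambda value: value.split(","),
                        help="comma-delimited list of file extensions")
    parser.add_argument("--accept-all", action="store_true",
                        help="automatically accept all changes")
    parser.add_argument("--count", action="store_true",
                        help="only print the number of matches")
    parser.add_argument("match", nargs="?", help="regular expression to match")
    parser.add_argument("subst", nargs="?", help="substitution string")
    return parser


args = build_parser().parse_args(
    ["-m", "-d", "src", "--extensions", "php,html", "foo", "bar"]
)
```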

Split up this monolithic file into a set of modules

With the merging of #63 it is possible to split base.py into a set of modules separating utils, functions and classes, leaving only the main CLI and its arguments in base.py, which could be renamed to cli.py or main.py.

Show a smarter diff

This mostly only applies to usage as a library, although in principle one could hit it with a large multiline regexp. It would be really nice if instead of just showing the entire removed block as red and the entire added block as green, codemod could show a more concise diff. This would be especially useful for codemods that reorder lines, for example by moving a line up a few lines -- right now they show a big unreadable block of red and green, but they could just show a single red and a single green line.

(In an ideal world, they could even allow diff-splitting, like git add -p, but I don't think that's particularly compatible with the current Patch architecture.)

exclude filter

A comma-delimited list of directory and file globs to ignore, à la gitignore or ack, would make this my number-one go-to.

My example is I'm in JS currently, so I want to be able to exclude things like node-modules/, package-lock.json, build/

Replace complete line (including newline) matches only every second line

Hi,
I want to remove every line that matches a specific pattern completely (including newline).
I would expect that [ \t]*require_once[^\n\r]*\r?\n? would match every line that contains require_once.
In fact, only every second line is removed:

Testcase:

$ cat codemodtest/A.php
<?php

require_once 'B.php';
require_once 'C.php';
require_once 'D.php';
require_once 'E.php';

class A {
}

?>


$ codemod.py -m -d codemodtest --extension php,inc '[ \t]*require_once[^\n\r]*\r?\n?' ''

$ cat codemodtest/A.php
<?php

require_once 'C.php';
require_once 'E.php';

class A {
}

?>
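As a cross-check outside codemod, the same pattern can be applied to the same file contents with Python's re.sub. In this sketch all four require_once lines are removed in a single pass, which suggests the regex itself is fine and that the skipped lines come from how matches are iterated and applied. That diagnosis is an assumption and has not been confirmed against codemod's source.

```python
import re

# Contents of codemodtest/A.php from the test case above.
source = (
    "<?php\n"
    "\n"
    "require_once 'B.php';\n"
    "require_once 'C.php';\n"
    "require_once 'D.php';\n"
    "require_once 'E.php';\n"
    "\n"
    "class A {\n"
    "}\n"
    "\n"
    "?>\n"
)

pattern = re.compile(r"[ \t]*require_once[^\n\r]*\r?\n?")

# re.sub replaces every non-overlapping match in one left-to-right scan.
result = pattern.sub("", source)
print(result)
```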

How to test regex

I'm using https://pythex.org with the search pattern  = \((.*)\) => { and the test string Date.now = () => 1; and it does not match. However, when I use that pattern in codemod, it does match. How can I tell codemod to respect the { character?

codemod -m -d lib --extensions js \
    ' = \((.*)\) => {' \
    '(\1) {'
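One possible explanation, assuming -m behaves like Python's re.DOTALL (the option list describes it as having dot match newlines): with that flag, .* can run past the end of the line and reach a later ") => {" elsewhere in the file. The second line of the sample text below is a hypothetical addition used to show the effect.

```python
import re

PATTERN = r" = \((.*)\) => {"

# The first line is from the report; the second line is invented.
text = "Date.now = () => 1;\nconst f = (x) => {\n"

# On the single line alone there is no literal "{", so nothing matches.
single_line = re.search(PATTERN, "Date.now = () => 1;")

# Without DOTALL, "." stops at the newline, so only the second line matches.
line_by_line = re.search(PATTERN, text)

# With DOTALL, the match starts on the first line and spans the newline.
multiline = re.search(PATTERN, text, re.DOTALL)

print(single_line)
print(repr(line_by_line.group(1)))
print(repr(multiline.group(1)))
```

If this is the cause, dropping -m, or replacing .* with a class that excludes newlines such as [^\n]*, would keep the match on one line.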

Windows Support

I'm trying to use this package on windows. There's no mention of the dependency on curses and fcntl in the readme or in a requirements.txt/Pipfile.

I was able to get curses installed from here, but there's no fcntl on that page.

Here's the call I'm using that's asking for it:

$ codemod --extensions rb --count '(\s+)(get|delete|put)' ''
Traceback (most recent call last):
  File "C:\Users\CM022291\AppData\Local\Programs\Python\Python36\Scripts\codemod-script.py", line 11, in <module>
    load_entry_point('codemod==1.0.0', 'console_scripts', 'codemod')()
  File "C:\Users\CM022291\AppData\Local\Programs\Python\Python36\lib\site-packages\pkg_resources\__init__.py", line 565, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "C:\Users\CM022291\AppData\Local\Programs\Python\Python36\lib\site-packages\pkg_resources\__init__.py", line 2631, in load_entry_point
    return ep.load()
  File "C:\Users\CM022291\AppData\Local\Programs\Python\Python36\lib\site-packages\pkg_resources\__init__.py", line 2291, in load
    return self.resolve()
  File "C:\Users\CM022291\AppData\Local\Programs\Python\Python36\lib\site-packages\pkg_resources\__init__.py", line 2297, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "C:\Users\CM022291\AppData\Local\Programs\Python\Python36\lib\site-packages\codemod-1.0.0-py3.6.egg\codemod\__init__.py", line 1, in <module>
    from codemod.base import *  # noqa
  File "C:\Users\CM022291\AppData\Local\Programs\Python\Python36\lib\site-packages\codemod-1.0.0-py3.6.egg\codemod\base.py", line 34, in <module>
    import codemod.terminal_helper as terminal
  File "C:\Users\CM022291\AppData\Local\Programs\Python\Python36\lib\site-packages\codemod-1.0.0-py3.6.egg\codemod\terminal_helper.py", line 11, in <module>
    import fcntl
ModuleNotFoundError: No module named 'fcntl'

Would it be possible to detect windows and use naive terminal handling in that case?

Does not include extensionless files

The issue in #37 will work for named files without an extension, but does not provide an easy way of including all files in a tree that don't have any extension.

Usage should indicate that one of '--extensions' and '--include-extensionless' is required

Currently, if I run codemod.py foo bar, I get the following:

usage: codemod.py [-h] [-m] [-d D] [-i] [--start START] [--end END]
                  [--extensions EXTENSIONS] [--include-extensionless]
                  [--exclude-paths EXCLUDE_PATHS] [--accept-all]
                  [--editor EDITOR] [--count] [--test]
                  [match] [subst]

In most utilities I've used, the usage string does not use square brackets around required parameters.

Problem when commenting code with '--'

Some languages use -- to start comments. codemod seems to interpret a final argument that begins with -- (such as '-- debug') as an invalid parameter and prints the help screen. It should be interpreted as the replacement string instead.

$ codemod.py --extensions lua 'codetocomment' '-- codetocomment'

Workaround for now:

$ codemod.py --extensions lua 'codetocomment' ' -- codetocomment'

Running codemod.py without any params results in an error

It also seems to cause this for a few other seemingly ok inputs.

$ codemod.py
Searching for first instance...
Traceback (most recent call last):
  File "/usr/local/bin/codemod.py", line 853, in
    run_interactive(**options)
  File "/usr/local/bin/codemod.py", line 129, in run_interactive
    for patch in suggestions:
  File "/usr/local/bin/codemod.py", line 357, in generate_patches
    path_list = Query._walk_directory(self.root_directory)
  File "/usr/local/bin/codemod.py", line 395, in _walk_directory
    for root, dirs, files in os.walk(root_directory)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/os.py", line 276, in walk
    names = listdir(top)
TypeError: coercing to Unicode: need string or buffer, NoneType found

auto git-add all changes

I always find myself doing two loops over the edit locations: once to make the edit, and then again with git add -p to add all changes to the git index. It would be pretty convenient to automatically stage all the changes I make.

--count result is incorrect

Hi,

It seems that the --count option returns the correct result minus one. For example if I have only one matching string, it returns zero matches, or if I have 30 matches, it returns 29.

Thanks for building this very useful tool!

Cannot codemod extensionless files

There's no way to specify extensionless files to codemod. This gets in the way of files used by some build systems, like BUILD or BUCK files.

remove line

Hey!

Can anyone tell me how to remove a line as a replacement?

zip_safe=False

As far as I can tell, this module does not include any C extensions or data files, so zip_safe should be True.

Extension groups?

Sometimes I want to modify all instances of a particular string in an Objective-C/Swift codebase. At Facebook, that means --extensions=h,m,mm,swift,BUCK,plist.

Typing those out is a pain--instead, it might be cool to have "extension groups", or aliases for a list of extensions. For example, if we defined a group called ios, these two would be equivalent:

--extensions=h,m,mm,swift,BUCK,DEFS,plist
--extensions=ios

Since ios means different things to different teams, maybe one should be able to register their own custom groups, instead of codemod making a decision as to what ios means for everyone.
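A sketch of how user-registered groups could be expanded before the extension filter is applied. Everything here is hypothetical: the EXTENSION_GROUPS registry, the expand_extensions function, and the idea that a group name can appear wherever an extension can. None of it exists in codemod.

```python
# Hypothetical registry; each team would register its own groups.
EXTENSION_GROUPS = {
    "ios": ["h", "m", "mm", "swift", "BUCK", "DEFS", "plist"],
}


def expand_extensions(spec, groups=None):
    """Expand a comma-delimited spec, replacing group names by their members."""
    if groups is None:
        groups = EXTENSION_GROUPS
    expanded = []
    for name in spec.split(","):
        # A name that is not a registered group is treated as a plain extension.
        for extension in groups.get(name, [name]):
            if extension not in expanded:
                expanded.append(extension)
    return expanded


print(expand_extensions("ios"))
print(expand_extensions("ios,py,h"))
```

Keeping the registry as plain data leaves the meaning of each group to the team that defines it, which is the concern raised above.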

Not sure if this is a good idea, as it might just be adding a complex feature without a great need to--thoughts? 🙌

Match all extensions / --extensions should accept glob arguments

I have a codebase in which I would like to search all files within a certain directory for matches. This directory also has several files without an extension (shell scripts). At the moment, there is no way of specifying this in codemod as far as I'm aware?
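A minimal sketch of how glob-style extension patterns and an extensionless switch could combine, assuming each extension is matched against the file name as the pattern "*." plus the extension. The function name matches_extensions and its parameters are hypothetical; this is not codemod's path filter.

```python
import fnmatch
import os


def matches_extensions(path, extensions, include_extensionless=False):
    """Decide whether path passes a glob-capable extension filter (sketch)."""
    name = os.path.basename(path)
    if os.path.splitext(name)[1] == "":
        # No extension at all: only included when explicitly requested.
        return include_extensionless
    return any(
        fnmatch.fnmatchcase(name, "*." + extension) for extension in extensions
    )


print(matches_extensions("src/types.d.ts", ["d.ts"]))
print(matches_extensions("bin/run", ["*"]))
print(matches_extensions("bin/run", ["*"], include_extensionless=True))
```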

Ignore binary files automatically

Is it possible to add an option to ignore binary files automatically rather than exiting?

For example,

codemod --ignore-binary foo bar

Currently, the only way to avoid stopping at binary files is to specify every file extension known to be text, which is rather tedious. For example, I often do this:

codemod --extension ts,md,js,tsx,jsx,txt,json foo bar

This feature might be related to #105.
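A common heuristic for this kind of check is to treat a file as binary when its first few kilobytes contain a NUL byte. The sketch below shows that heuristic; the function name looks_binary and the sample size are assumptions, and --ignore-binary remains a proposed flag, not an existing one.

```python
import os
import tempfile


def looks_binary(path, sample_size=8192):
    """Heuristic: a NUL byte in the first sample_size bytes means binary."""
    with open(path, "rb") as handle:
        return b"\0" in handle.read(sample_size)


with tempfile.TemporaryDirectory() as tmp:
    text_path = os.path.join(tmp, "notes.md")
    with open(text_path, "wb") as handle:
        handle.write(b"foo bar\n")
    blob_path = os.path.join(tmp, "image.bin")
    with open(blob_path, "wb") as handle:
        handle.write(b"\x89PNG\r\n\x00\x00")
    text_is_binary = looks_binary(text_path)
    blob_is_binary = looks_binary(blob_path)
```

The heuristic misclassifies some text encodings that contain NUL bytes, such as UTF-16, so it is a filter of convenience, not a guarantee.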

Doesn't recognize .d.ts extension

Trying to run codemod on .d.ts files, but it doesn't find any.

I'm running; codemod --extensions 'd.ts,' '(export { default as (\w*) }.*)' 'export const \2: unknown; //\1 'a.

Select default action

Currently the default action for "Accept change" is yes. When making a change where you want to change only a minority of instances, it would be good to be able to set the default to no.

"invalid syntax" line 180

line_transformation = lambda line: None if regex.search(line) else line

It's pointing to the "f" of "if". Any idea how to get this work?
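One possible cause, assuming the script is being run by an interpreter older than Python 2.5: conditional expressions (x if condition else y) were only added in Python 2.5, so earlier versions report a syntax error at the if. An equivalent function that avoids the conditional expression is sketched below; the regex and the sample lines are placeholders.

```python
import re

regex = re.compile(r"require_once")  # placeholder pattern for illustration


def line_transformation(line):
    """Return None to drop a matching line, or the line unchanged."""
    if regex.search(line):
        return None
    return line


kept = line_transformation("class A {\n")
dropped = line_transformation("require_once 'B.php';\n")
```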

ValueError: thrown on lines with //inline comments

eg:

           var tab = new MainTabModel({
                   id: tabId,
                   text: searchDateString,
                   isClosable: true, // presumably useful comment here
                   ...
               });

I've also seen it fail on the line above the one that contains the // comment.

[solved] Installing on Ubuntu [permission problem, use sudo]

Hi this tool sounds so useful and I'm trying to install it,
I don't speak Python, but usually I manage to install Python utilities on my Ubuntu. But seems like this is not the case.
Is there anything needed after running
pip install codemod

To get it running in my terminal?
I see the codemod folder created with some python files in it, should I add something to path to be able to run it?
/home/omid/.local/lib/python2.7/site-packages/codemod

Thank you!

Versioning?

I find this tool extremely handy. Thanks for making it.

I submitted a request for it to be added to Homebrew (an OS X package manager), however their inclusion policy requires real version numbers. See Homebrew/legacy-homebrew#23974

I could tag it in the recipe as 0.1 as referenced by https://github.com/facebook/codemod/blob/master/setup.py#L5

However that seemed like a placeholder and hasn't changed even when new features have been added.

So would you be willing to start tagging releases? Otherwise, I'll send the formula over to https://github.com/Homebrew/homebrew-headonly, but that's a bit of an impediment to getting distribution with new users.

Thanks again!

feature request: undo / go back

When making a large, mostly-straightforward codemod, clicking "enter" in rapid succession is the only tolerable modus operandi.

Sometimes, I see things that look potentially problematic, but my feeble human brain doesn't detect this until my eager pinky has already pressed "enter".

It would be nice to have a "b" (or maybe "u") shortcut to the previous diff and reevaluate.

I'm not sure how straightforward that would be to implement 😄
