17451k / clade

Clade is a tool for extracting information about software build process and source code

License: Apache License 2.0

Languages: CMake 0.23%, C 6.68%, Python 89.03%, Makefile 0.13%, C++ 3.93%

Topics: compilation-database, build-tool, source-code-analysis, callgraph

clade's People

mutilin vikramsubramanian doytsujin kateya liuchaoxd vmordan

clade's Issues

Clade does not support spaces in directory names

BTW, this can result in #42.

Rewrite high-level interface

Right now it is quite confusing and difficult to use, and it lacks some features available in the low-level interface. Docstrings must also be added.

Provide more information about unsupported CIF output

At the moment only the error message "CIF output has unexpected format" is shown on screen. It does not identify the particular errors, and there are no other sources of additional information.

Replace "black list" of options that are passed to CIF with "white list"

Passing an unsupported option to CIF usually results in a failure. It is not possible to specify a complete list of unsupported options, so it would be better to pass only those options that are truly needed.
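
A minimal sketch of the "white list" idea, not Clade's actual code. The prefixes in ALLOWED_PREFIXES and TAKES_VALUE are assumptions chosen for illustration; the issue does not say which options CIF really needs.

```python
# Hypothetical allow-list: the real set of options CIF needs is not given in the issue.
ALLOWED_PREFIXES = ("-I", "-D", "-U", "-include", "-std=")
# Allowed options whose value may be passed as a separate argument.
TAKES_VALUE = {"-I", "-D", "-U", "-include"}


def filter_opts(opts):
    """Keep only options that match the allow-list, together with their values."""
    kept = []
    i = 0
    while i < len(opts):
        opt = opts[i]
        if opt.startswith(ALLOWED_PREFIXES):
            kept.append(opt)
            # "-D X=1" style: the value is the next argument.
            if opt in TAKES_VALUE and i + 1 < len(opts):
                kept.append(opts[i + 1])
                i += 1
        i += 1
    return kept
```

Anything not explicitly listed is dropped, so a new compiler option cannot break CIF by accident.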

Allow calculating check sums for stored sources

Check sums calculated in advance seem to be the best way to easily distinguish different stored sources. Other solutions, such as providing attributes for stored sources, calculating check sums on the fly, or archiving sources and comparing archive sizes, are worse for various reasons.
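
A possible way to calculate such a check sum, as a sketch only. The function name file_checksum and the choice of SHA-256 are assumptions; the issue does not prescribe an algorithm.

```python
import hashlib


def file_checksum(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files are never loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The digest could be computed once, when a source file is stored, and saved next to it.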

Add ability to preprocess intercepted build commands before their execution

Without this it is impossible to properly work with commands that store their arguments in temporary files, like some Microsoft build utilities:

cl.exe @"C:\Users\shchepetkov\AppData\Local\Temp\tmp0a392aa8bcbf41339a14bcbe325dd7d1.rsp"
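
One way such preprocessing could look, as a rough sketch. The helper expand_response_files is hypothetical, and the default encoding is an assumption: real response files may use a different encoding and different quoting rules.

```python
import shlex


def expand_response_files(args, encoding="utf-8"):
    """Replace each @file argument with the arguments stored in that file."""
    expanded = []
    for arg in args:
        if arg.startswith("@") and len(arg) > 1:
            path = arg[1:].strip('"')
            with open(path, encoding=encoding) as f:
                # posix=False keeps backslashes intact, which matters for Windows paths.
                expanded.extend(shlex.split(f.read(), posix=False))
        else:
            expanded.append(arg)
    return expanded
```

Running this before the command is stored would record the real arguments instead of a path to a temporary file that no longer exists after the build.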

Check for file "functions.json" existence

Even when the directory "Functions" exists, Clade still needs to check that the file "functions.json" exists.

Add Python 3.4 support

Improve incorrect arguments handling

Currently, instead of exiting with an error when given an incorrect command line argument, Clade can just hang.

info: expand($) doesn't work

Build indexes for cross referencing

Recently we discussed that cross referencing is a very valuable feature for users. Clade should perform additional code querying for gathering necessary data and generate indexes for source files.

Find a way to get rid of duplicate commands.

Some commands are in fact just wrappers for other commands. For example, on macOS calling gcc can result in the following command stack:

- /usr/bin/gcc ...
- /Library/Developer/CommandLineTools/usr/bin/gcc ...
- /usr/bin/xcrun clang ...
- /Library/Developer/CommandLineTools/usr/bin/clang ...
- /Library/Developer/CommandLineTools/usr/bin/clang -cc1 ...

Generally we are interested only in the first command in such a stack, but currently there is no way to know that these commands are connected to each other, so the corresponding extensions (CC, in this example) process all of them as independent commands. The result is several duplicate commands.

CIF still doesn't work on macOS

The reason is the Linux-specific command line tools used in the cif.c file.

CIF is an optional dependency of Clade used for getting information about source code.

LD extension parses options incorrectly

Example:

    {
        "command":"ld",
        "cwd":"/work/git/linux",
        "id":29309,
        "in":[
            "max-page-size=0x200000",
            "drivers/acpi/.tmp_scan.o"
        ],
        "opts":[
            "-m",
            "elf_x86_64",
            "-z",
            "-r",
            "-T",
            "drivers/acpi/.tmp_scan.ver"
        ],
        "out":"drivers/acpi/scan.o"
    }

max-page-size is definitely not an input file.

Unparsed command:

    {
        "command":[
            "ld",
            "-m",
            "elf_x86_64",
            "-z",
            "max-page-size=0x200000",
            "-r",
            "-o",
            "drivers/acpi/scan.o",
            "drivers/acpi/.tmp_scan.o",
            "-T",
            "drivers/acpi/.tmp_scan.ver"
        ],
        "cwd":"/work/git/linux",
        "id":29309,
        "which":"/usr/bin/ld"
    },
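
The behavior expected here can be sketched as follows. This is an illustration, not the LD extension's code; OPTS_WITH_VALUE is an assumed and incomplete set of ld options that consume the next argument.

```python
# Assumption: a small, incomplete set of ld options that take a separate value.
OPTS_WITH_VALUE = {"-m", "-z", "-T", "-o", "-e", "-L", "-l"}


def parse_ld(argv):
    """Split an ld command line into input files, options and the output file."""
    parsed = {"command": argv[0], "in": [], "opts": [], "out": None}
    args = iter(argv[1:])
    for arg in args:
        if arg in OPTS_WITH_VALUE:
            # Consume the option's value so it is never mistaken for an input file.
            value = next(args, None)
            if arg == "-o":
                parsed["out"] = value
            else:
                parsed["opts"].append(arg)
                if value is not None:
                    parsed["opts"].append(value)
        elif arg.startswith("-"):
            parsed["opts"].append(arg)
        else:
            parsed["in"].append(arg)
    return parsed
```

With the unparsed command above, max-page-size=0x200000 stays attached to "-z" in "opts", and "in" contains only drivers/acpi/.tmp_scan.o.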

Add "cwd" argument to intercept() method of Clade interface class

It should be possible to execute build process without having to manually change current directory.

Add support for multiple output files

For example, a compilation command with the "-c" option and multiple input files has multiple output files as well.

Support make commands that ignore errors

Like the ones that start with "-":

clean:
  -rm -f *.o

Currently clade-intercept exits with error on such make commands, but it should return 0 instead.

Implement proper API for extensions

There are already some API functions inside extension classes (for example, the methods load_cmd_by_id() and load_all_cmds()). We need to add more such methods.

There is another problem: to use these interface methods, you have to manually create an extension object beforehand. Perhaps interface functions should be independent of extension classes.

Evaluate and output progress

I think that for some long operations of Clade, e.g. querying source code, it would be quite easy to evaluate and output progress.

The issue is not very important, since Clade does not run for very long in the first place and is not intended to be invoked often.

Each extension object always creates a tmp dir and does not clean it

The constructor of class Extension contains the following code:

    def __init__(self, work_dir, conf=None, preset="base"):
        ...
        self.temp_dir = tempfile.mkdtemp()

This creates a new directory in /tmp for every extension object, even if the user never calls the parse method, and these directories pile up because they are never removed.
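
One possible fix is to create the directory lazily and remove it explicitly. The sketch below uses a hypothetical helper class LazyTempDir; it is not part of Clade.

```python
import shutil
import tempfile


class LazyTempDir:
    """Create a temporary directory on first use and remove it on cleanup()."""

    def __init__(self):
        self._path = None

    @property
    def path(self):
        # Nothing is created until somebody actually asks for the path.
        if self._path is None:
            self._path = tempfile.mkdtemp()
        return self._path

    def cleanup(self):
        if self._path is not None:
            shutil.rmtree(self._path, ignore_errors=True)
            self._path = None
```

An extension could hold one such object instead of calling tempfile.mkdtemp() in its constructor, and call cleanup() when parsing is finished.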

Change logging implementation in intercept.py

Logging implementation should be the same across all Clade scripts.

Implement "load_all_compilation_cmds" method

The load_all_cmds() method of the CC extension can return linker or assembler commands. This is expected and correct, but sometimes users want to receive only proper compilation commands, so an additional method is required.

clade fails to install on Fedora 28

Installation log:

$ sudo pip3 install -e .
WARNING: Running pip install with root privileges is generally not a good idea. Try `pip3 install --user` instead.
Obtaining file:///home/work/tmp/clade
Requirement already satisfied: ujson in /usr/local/lib64/python3.6/site-packages (from clade==1.0)
Requirement already satisfied: graphviz in /usr/local/lib/python3.6/site-packages (from clade==1.0)
Requirement already satisfied: jinja2 in /usr/local/lib64/python3.6/site-packages (from clade==1.0)
Requirement already satisfied: ply in /usr/lib/python3.6/site-packages (from clade==1.0)
Requirement already satisfied: MarkupSafe>=0.23 in /usr/local/lib64/python3.6/site-packages (from jinja2->clade==1.0)
Installing collected packages: clade
  Found existing installation: clade 1.0
    Can't uninstall 'clade'. No files were found to uninstall.
  Running setup.py develop for clade
Successfully installed clade

Run:

$ clade
Traceback (most recent call last):
  File "/usr/local/bin/clade", line 11, in <module>
    load_entry_point('clade', 'console_scripts', 'clade')()
  File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 476, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2699, in load_entry_point
    raise ImportError("Entry point %r not found" % ((group, name),))
ImportError: Entry point ('console_scripts', 'clade') not found

In some cases libinterceptor does not intercept /usr/bin/as commands

Can be reproduced on the test-project on Ubuntu 18.04 with any version of GCC.

Clade changes loggers of programs that import it

The problem is in the abstract extension that calls logging.basicConfig.
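
A common way to avoid this is to configure a library-specific logger and leave the root logger alone. The sketch below is an illustration under that assumption; the function name get_logger and the message format are made up.

```python
import logging


def get_logger(name="clade"):
    """Return a logger with its own handler, without touching the root logger."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s: %(message)s")
        )
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
        # Do not pass records on to handlers installed by the importing program.
        logger.propagate = False
    return logger
```

Unlike logging.basicConfig, this leaves the handlers and level of the importing program's root logger unchanged.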

Store configuration files for various projects in the repository

Clade (likely) wastes memory while normalizing CIF output

Log: "11:43:51 clade Info: Normalizing CIF output".
System monitor state:

Failed building wheel for clade

pip3 install clade (Ubuntu):
Failed building wheel for clade
error: can't copy 'clade/libinterceptor/lib': doesn't exist or not a regular file
Command "/usr/bin/python3 -u -c "import setuptools, tokenize;file='/tmp/pip-install-0cttqg1y/clade/setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" install --record /tmp/pip-record-1b0ttqqp/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-install-0cttqg1y/clade/

Cannot install Clade into any specific directory

When installing Clade with the command
sudo pip3 install -t ../install .

I do not get a bin directory with the required scripts. Without sudo it does not work at all, because pip refuses to install (an Ubuntu bug, as far as I know):
raise DistutilsOptionError("can't combine user with prefix, "
distutils.errors.DistutilsOptionError: can't combine user with prefix, exec_prefix/home, or install_(plat)base

Problem with temporary directories

As Eugeny explained, os.path.join() discards all preceding components when a later component is an absolute path. Thus, in info.py:148 cif_out is usually assigned the original "cmd_in" + ".o".
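
The effect is easy to reproduce. In the sketch below the paths and the helper join_under are hypothetical; posixpath is used so the result is the same on every platform (on Linux, os.path is posixpath).

```python
import posixpath

tmp_dir = "/tmp/clade"        # hypothetical temporary directory
cmd_in = "/work/src/main.c"   # hypothetical absolute input path

# The absolute second argument makes join() throw away tmp_dir.
joined = posixpath.join(tmp_dir, cmd_in + ".o")


def join_under(base, path):
    """Place path under base even if path is absolute."""
    return posixpath.join(base, path.lstrip("/"))


fixed = join_under(tmp_dir, cmd_in + ".o")
```

Here joined is "/work/src/main.c.o", while fixed is "/tmp/clade/work/src/main.c.o".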

Move all options to the base preset

Create several subclasses of Intercept class to represent different interception methods

The code is starting to look really complicated without it.

Replace default "which_list" values by regular expressions that better describe executable names

Test compatibility with Python 3.4

Currently only compatibility with Python 3.5, 3.6 and 3.7 is tested. Adding 3.4 to the list is a little bit tricky due to issues in Travis.

Change the format of cmds.txt file

At the moment, command arguments are separated by the "||" characters. These characters can also occur in the arguments themselves, so it would be better to structure this file a little differently.
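
The ambiguity, and one possible alternative, can be sketched like this. Storing one JSON array per line is only a suggestion for illustration; dump_cmd and load_cmd are hypothetical names.

```python
import json

args = ["sh", "-c", "test -f a.o || make a.o"]

# Joining with "||" cannot be reversed once an argument itself contains "||".
line = "||".join(args)
broken = line.split("||")


def dump_cmd(args):
    """Serialize the arguments of one command as a single line of JSON."""
    return json.dumps(args)


def load_cmd(line):
    """Restore the argument list from a line written by dump_cmd()."""
    return json.loads(line)


restored = load_cmd(dump_cmd(args))
```

Here broken has four elements instead of three, while restored equals the original list. json.dumps escapes newlines, so each command still fits on one line.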

Update readme

The updated readme should include:

- Build prerequisites
- Installation instructions (both PyPI and source code)
- How to use: command line utilities, importing as a Python module
- Troubleshooting

There may be no declarations in some files when processing function calls during building call graphs

See details at: clade/extensions/callgraph.py:121-122.

BTW, this was mentioned at #45.

Do not hide stack traces

I got the following output:
clade Callgraph: Processing calls
'NoneType' object is not iterable

It is hard to understand what went wrong.
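
A sketch of how the stack trace could be kept, assuming the error is currently caught and only its message is printed. The functions process_calls and run are stand-ins for illustration, not Clade's code.

```python
import logging

logger = logging.getLogger("clade.example")


def process_calls(calls):
    # Stand-in for the real processing; iterating over None raises TypeError.
    return [c.upper() for c in calls]


def run(calls):
    try:
        return process_calls(calls)
    except Exception:
        # logger.exception() logs the message together with the full traceback.
        logger.exception("Processing calls failed")
        raise
```

Calling run(None) logs the complete traceback before the exception is raised again, so the failing line is visible.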

The Linux preset requires additional filters to reject junky CC and LD build commands

While experimenting with the Linux bases, I noticed that load_all_cmds_by_type returns more commands than the implied preset should allow. I do not see these commands in the PDF file, but the method returns them.

For instance, there are commands with an empty "in" files attribute or with .tmp\w+.s among their "in" files.
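
Such a filter could look roughly like this. It assumes commands are dictionaries with an "in" list, as in the parsed ld command shown in another issue; is_junk is a hypothetical name.

```python
import re

# Temporary assembler files such as ".tmp_scan.s", following the pattern from the issue.
TMP_ASM = re.compile(r"\.tmp\w+\.s$")


def is_junk(cmd):
    """Return True for commands with no inputs or with temporary .s inputs."""
    inputs = cmd.get("in", [])
    return not inputs or any(TMP_ASM.search(path) for path in inputs)


cmds = [
    {"id": 1, "in": []},
    {"id": 2, "in": ["kernel/.tmp_fork.s"]},
    {"id": 3, "in": ["kernel/fork.c"]},
]
kept = [cmd for cmd in cmds if not is_junk(cmd)]
```

Only the command with id 3 remains in kept.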

WANTED: tests, dead or alive

Do not blame CIF when it was not run

If Clade does not run CIF at all (e.g. when there are no input files), it still blames CIF for failing on every command. Instead, I would expect errors that name the real cause, e.g. that input files are missing.

Path to file with global variables must be relative to the top build directory

Currently, this path is relative to the directory in which the compilation command was executed.

CIF doesn't print information about exported functions for the Linux kernel

Probably the info: expand(__EXPORT_SYMBOL(sym, sec)) request doesn't work properly.

Removing duplicate lines is too slow

Sometimes CIF outputs really large files: for example, for macro expansions for the whole Linux kernel the size of the output file is approximately 34GB. It consists almost entirely of duplicate lines, which must be removed from the file. Currently this process takes almost an hour, which is unacceptable.
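
Since almost all lines are duplicates, the set of unique lines should be small. One possible approach, shown here as an untested idea with no claim about its actual speed, is a single streaming pass that remembers a digest of every line seen so far. The function name unique_lines is hypothetical.

```python
import hashlib


def unique_lines(src_path, dst_path):
    """Copy src to dst, keeping only the first occurrence of each line."""
    seen = set()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in src:
            # Store a fixed-size digest instead of the line to bound memory per entry.
            key = hashlib.sha256(line).digest()
            if key not in seen:
                seen.add(key)
                dst.write(line)
    return len(seen)
```

The order of first occurrences is preserved, and memory use grows with the number of unique lines rather than with the file size.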
