coccinelle / coccinelle

Source code of the Coccinelle project (mirror of the main Coccinelle repository located at Inria)

Home Page: http://coccinelle.lip6.fr/

License: GNU General Public License v2.0

Makefile 0.77% Shell 0.89% OCaml 86.23% Standard ML 0.51% C 3.89% C++ 0.11% Emacs Lisp 0.40% TeX 2.67% Python 0.45% Perl 0.56% Awk 0.02% M4 0.64% Roff 0.05% HTML 0.01% CSS 0.01% Dockerfile 0.03% SmPL 1.86% ReScript 0.89% Vim Script 0.02% HCL 0.01%

coccinelle's Introduction

                            Coccinelle


Coccinelle allows programmers to easily write complex,
style-preserving source-to-source transformations on C source code,
for instance to perform refactorings.

To install Coccinelle from its source, see the instructions in install.txt.
Once you have installed coccinelle, there is a script 'spatch' in /usr/bin
or /usr/local/bin that invokes the Coccinelle program.

If you want to run Coccinelle without installing it, you can run the
Coccinelle program directly from the download/build directory. You may then
have to set up a few environment variables so that the Coccinelle program
knows where to find its configuration files.
For bash do:

  $ source env.sh

For tcsh do:

  $ source env.csh


You can test coccinelle with:

  $ spatch -sp_file demos/simple.cocci demos/simple.c -o /tmp/new_simple.c

If you haven't installed coccinelle, run ./spatch or ./spatch.opt instead.



If you downloaded the bytecode version of spatch you may first
have to install OCaml (which contains the 'ocamlrun' bytecode interpreter,
the equivalent of 'java', the Java virtual machine, but for OCaml) and then do:

  $ ocamlrun spatch -sp_file demos/simple.cocci demos/simple.c -o /tmp/new_simple.c


For more information on Coccinelle, type 'make docs' and have a look at the
files in the docs/ directory. You may need to install the texlive-fonts-extra
packages from your distribution to compile some of the LaTeX documentation
files.

 ** Runtime dependencies under Debian/Ubuntu**

 - For the OCaml scripting feature in SmPL:
   ocaml-native-compilers or ocaml-nox

 - For the Python scripting feature in SmPL: python3-dev
   Note python3-dev is only a runtime dependency: it is _not_ required for
   building coccinelle.

---------------------------------------------------------------------------

Contributing:

Contributions are welcome.  Please sign your contributions, according to
the following text extracted from Documentation/SubmittingPatches.txt of
the Linux kernel:

The sign-off is a simple line at the end of the explanation for the
patch, which certifies that you wrote it or otherwise have the right to
pass it on as an open-source patch.  The rules are pretty simple: if you
can certify the below:

        Developer's Certificate of Origin 1.1

        By making a contribution to this project, I certify that:

        (a) The contribution was created in whole or in part by me and I
            have the right to submit it under the open source license
            indicated in the file; or

        (b) The contribution is based upon previous work that, to the best
            of my knowledge, is covered under an appropriate open source
            license and I have the right under that license to submit that
            work with modifications, whether created in whole or in part
            by me, under the same open source license (unless I am
            permitted to submit under a different license), as indicated
            in the file; or

        (c) The contribution was provided directly to me by some other
            person who certified (a), (b) or (c) and I have not modified
            it.

        (d) I understand and agree that this project and the contribution
            are public and that a record of the contribution (including all
            personal information I submit with it, including my sign-off) is
            maintained indefinitely and may be redistributed consistent with
            this project or the open source license(s) involved.

then you just add a line saying

        Signed-off-by: Random J Developer <random@developer.example.org>

using your real name (sorry, no pseudonyms or anonymous contributions.)

coccinelle's People

Contributors

a-middelk, cj-tommi-rantala, elfring, fantazio, iagoabal, ihc, jajajasalu2, jtojnar, julialawall, krishchatterjie, lpeak, markusboehme, mashybasker, matthieucan, mcgrof, michelemartone, mmcco, mu-mu-mu, npalix, petersenna, rrhansen, shindere, standby24x7, sylfrena, taskset, tathagataroy1278, thedrlambda, thierry-martinez, xclerc, xvilka

coccinelle's Issues

patching nethack for "-Werror=format-security" compatibility

Hi,

I am trying to use Coccinelle to patch NetHack 3.4.3 so that it compiles with the "-Werror=format-security" flag.

Here is my ".cocci" file to patch nethack.

$ cat fix-pline.cocci 
@rule1@
expression argc1;
@@
- pline(argc1)
+ pline("%s", argc1)

@rule2@
expression fbuf;
expression dfeature;
@@
- if (dfeature) pline(fbuf);
+ pline("%s", fbuf)

However, it fails to patch http://nethackwiki.com/wiki/Source:Invent.c#look_here (line 2217), which says,

if (dfeature) pline(fbuf);

I was expecting Coccinelle to generate a patch which looked like,

$ cat desired.patch
diff --git a/src/invent.c b/src/invent.c
index b9a3683..9b43767 100644
--- a/src/invent.c
+++ b/src/invent.c
@@ -2214,7 +2214,7 @@ boolean picked_some;
        Sprintf(fbuf, "There is %s here.", an(dfeature));

    if (!otmp || is_lava(u.ux,u.uy) || (is_pool(u.ux,u.uy) && !Underwater)) {
-       if (dfeature) pline(fbuf);
+       if (dfeature) pline("%s", fbuf);
        read_engr_at(u.ux, u.uy); /* Eric Backus */
        if (!skip_objects && (Blind || !dfeature))
            You("%s no objects here.", verb);

... but no luck so far. I suspect that all those crazy "ifdef" statements are causing problems.

How do I persuade Coccinelle to generate such a patch?

You can get NetHack 3.4.3 from http://www.nethack.org/v343/download-src.html page.

Begin error message display for Python scripts on a separate line.

I occasionally make programming mistakes while developing SmPL scripts that embed Python.

In such a case, Coccinelle 1.0.0-rc23 displays a message like the following.

Traceback (most recent call last):
  File "<string>", line 15, in <module>
RuntimeError
while running simple python string: from coccinelle import *
…

I would find it a bit easier to read if the display of the Python script started on a separate line, like the following.

Traceback (most recent call last):
  File "<string>", line 15, in <module>
RuntimeError
while running simple python string:
from coccinelle import *
…

Explicit reuse of metavariables from SmPL rule definitions in change scripts

The semantic patch language of the software "Coccinelle 1.0.4" supports transformation rules. Such a SmPL rule is structured into the following two sections.

  1. Declaration and/or definition of metavariables
  2. Source code change specification for the supported programming language (mainly "C")

Each metavariable has an effect on the source code transformation. So far it is applied by directly reusing the chosen identifier. This syntax design is usually convenient: the name space of the rule's declaration section extends into the name space of the change script.
But I see a particular risk of name clashes there. I can imagine use cases where an identifier (or other syntax element) of the target language should not be redefined by a SmPL metavariable by default.

I suggest making it possible to mark the use of metavariables explicitly in change scripts. Which reuse markers would be appropriate here?

A tool like "make" also provides functionality for referencing variables in recipes, in a way that I would like to propose for the semantic patch language.

Support for control flow analysis together with source code layout constraints

I came across a source file where a bit of source code caught my attention: a few goto statements referred to labels directly after them. I tried to fix this implementation detail with the following tiny script for the semantic patch language.

@delete_questionable_jump@
identifier target;
@@
-goto target;
 target:

This simple search pattern works as expected on a reduced test source file. On longer source files, however, it points out further development challenges for the Coccinelle software. It seems that Coccinelle looks only at a basic control flow graph during one stage of its data processing, with the consequence that version "1.0.4" does not record, in an easily reusable way, whether there is any useful control flow that a goto statement jumps over.
The pattern can only be approximated by a combination of three other small SmPL scripts. This approach has a few undesirable effects.

  • The involved code conversion updates source files at places which should ideally be kept unchanged.
  • The involved input files are processed repeatedly. How can this be avoided?

I suggest extending the control flow management data structures with information for this use case so that source code layout can be taken into account better.

Easier construction of regular expressions in SmPL constraints

The current syntax specifies that a string used as a regular expression in a constraint for a metavariable must be enclosed in double quotes.

I would find it a bit more convenient if it were also allowed to enclose such a string in single quotes, or to omit the delimiting quotes completely.

I would also appreciate it if the documentation mentioned whether special characters can be escaped in this context.

Software development challenges around OCaml-SmPL scripting interface with OCaml (>= 4.02.1 ?)

I have been trying to get one of my SmPL scripts running since 2015-07-13:

@initialize:ocaml@
@@
let show_positions f_name typ name_places ex_string ex_ASG e_places =
(* First loop? *)
Printf.printf "%s|%s\n" f_name typ;
(* Second loop? *)
Printf.printf "%s\n" ex_string

@returns@
expression express;
identifier work;
position e_pos, name_pos;
type return_type;
@@
 return_type work@name_pos(...)
 {
  ...
  return express@e_pos;
  ...
 }

@script:ocaml display depends on returns@
f_name << returns.work;
typ << returns.return_type;
(ex_string,ex_ASG) << returns.express;
e_places << returns.e_pos;
name_places << returns.name_pos;
@@
show_positions f_name typ name_places ex_string ex_ASG e_places

I still get an error message like the following with "Coccinelle 1.0.4" and "OCaml 4.02.3-1.1" on an openSUSE system.

elfring@Sonne:~/Projekte/Coccinelle/Probe> spatch.opt -sp-file show_returns2.cocci API-test1.c
init_defs_builtins: /usr/local/lib64/coccinelle/standard.h
Using native version of ocamlc/ocamlopt/ocamldep
ocamlopt.opt -shared -o /tmp/ocaml_cocci_73ee74.cmxs -g -I /usr/lib64/ocaml  -I /usr/local/lib64/coccinelle/ocaml /tmp/ocaml_cocci_73ee74.ml
File "/tmp/ocaml_cocci_73ee74.ml", line 15, characters 6-36:
Error: Unbound value Iteration.add_pending_instance
Fatal error: exception Yes_prepare_ocamlcocci.CompileFailure("/tmp/ocaml_cocci_73ee74.ml")

Another test also does not work as expected with a few OCaml versions so far.

elfring@Sonne:~/Projekte/Coccinelle/1.0.4> COCCINELLE_HOME=$(pwd) scripts/spatch demos/ocaml2.cocci demos/ocaml2.c
init_defs_builtins: /home/elfring/Projekte/Coccinelle/1.0.4/standard.h
Using native version of ocamlc/ocamlopt/ocamldep
ocamlopt.opt -shared -o /tmp/ocaml_cocci_7a4d1f.cmxs -g -I /usr/lib64/ocaml  -I /home/elfring/Projekte/Coccinelle/1.0.4/ocaml /tmp/ocaml_cocci_7a4d1f.ml
File "/tmp/ocaml_cocci_7a4d1f.ml", line 15, characters 6-36:
Error: Unbound value Iteration.add_pending_instance
Fatal error: exception Yes_prepare_ocamlcocci.CompileFailure("/tmp/ocaml_cocci_7a4d1f.ml")

Exclusion of unsupported source code parts

The Coccinelle software supports the analysis and adjustment of source files written in the C programming language. I suggest adding the capability to exclude parts of the source code that cannot be supported so far because of parsing limitations.

Places around a preprocessor symbol like "__cplusplus" show source code where such functionality would be needed.

Extend support for handling of optional source code parts

The Coccinelle software provides the search operator "question mark". If I try out a small script for the semantic patch language like the following on a source file, I get an incomplete analysis result.

show_variable_definitions1.cocci:

@variable_definition@
expression value;
identifier var;
type data_type;
@@
*data_type var
?= value
 ;

optional_initialisation1.c:

#include <stdio.h>

int main(void)
{
 char const message[] = "Test example\n";
 int result;

 result = puts(message);
 return result;
}

Output:

elfring@Sonne:~/Projekte/Coccinelle/janitor> spatch.opt -sp-file show_variable_definitions1.cocci ../Probe/optional_initialisation1.c
init_defs_builtins: /usr/local/lib64/coccinelle/standard.h
HANDLING: ../Probe/optional_initialisation1.c
diff = 
warning: incompatible arity found on line 7
warning: incompatible arity found on line 6
warning: incompatible arity found on line 6
--- ../Probe/optional_initialisation1.c
+++ /tmp/cocci-output-30894-5c6d7c-optional_initialisation1.c
@@ -2,7 +2,6 @@

 int main(void)
 {
- char const message[] = "Test example\n";
  int result;

  result = puts(message);

I hope that the mentioned arity can become compatible. It seems that SmPL disjunctions need to be used instead for the time being.

show_variable_definitions2.cocci:

@variable_definition@
expression value;
identifier var;
type data_type;
@@
(
*data_type var;
|
*data_type var = value;
)

Output:

elfring@Sonne:~/Projekte/Coccinelle/janitor> spatch.opt -sp-file show_variable_definitions2.cocci ../Probe/optional_initialisation1.c
init_defs_builtins: /usr/local/lib64/coccinelle/standard.h
HANDLING: ../Probe/optional_initialisation1.c
diff = 
--- ../Probe/optional_initialisation1.c
+++ /tmp/cocci-output-32184-d244bd-optional_initialisation1.c
@@ -2,8 +2,6 @@

 int main(void)
 {
- char const message[] = "Test example\n";
- int result;

  result = puts(message);
  return result;

I hope that the implementation of the "question mark" operator can also be improved so that duplication of search elements can be avoided in more use cases.

How to fix a stack overflow?

I have tried to display the parsing statistics for the source files of another software project.

elfring@Sonne:~/Projekte/pkg/lokal> OCAMLRUNPARAM=b spatch.opt -parse-c .
init_defs_builtins: /usr/local/share/coccinelle/standard.h
checking stack size (do ulimit -s 50000 if problem)
Fatal error: exception Stack_overflow

How can this problem be resolved?

Exclusion (or selection) for pointer data types by type metavariables

The semantic patch language supports declaring a metavariable with a list of data types ("ctypes"). This functionality makes it possible to express a source code property by matching against a known set of data types.
But I see general difficulties so far. Some pointers need special care in static source code analysis. Thus I am looking for a safer (and more convenient) way to put constraints on pointer data types directly.

I would appreciate it if data types that are not categorised as "generic_ctype" so far were also supported.

Failure in patch generation for the deletion of unnecessary pointer checks?

The following semantic patch works to some degree on the source files of the software "Linux Containers", for example.

@Remove_unnecessary_pointer_checks1@
expression x;
@@
-if (x)
    free(x);

@Remove_unnecessary_pointer_checks2@
identifier x;
@@
-if (x) {
    free(x);
    x = 0;
-}

Now I wonder why the tool "spatch 1.0.0-rc21" does not suggest changes for the following test source code.

#include <stdlib.h>

char* absolute_path(const char *filename)
{
    char *dir, *dirname = 0, *path = 0, *s;
    char *cwd = 0, *twd = 0;

    if (filename == 0 || *filename == '\0' || *filename == '/')
        return 0;

/* Some stuff was deleted here. */

error:
    if (path != 0) {
        free(path);
        path = 0;
    }

end:
    if (dirname != 0)
        free(dirname);
    if (cwd != 0)
        free(cwd);
    if (twd != 0)
        free(twd);

    return path;
}

I hope that the software "Simple X Image Viewer" can be similarly improved.

Safer distinction for identifiers in source files with SmPL

The Coccinelle software supports, to some degree, the generation of patches for source files written in the programming language "C". A domain-specific language called "the semantic patch language" (SmPL) expresses source code transformations in a way that is similar to what reviewers would suggest and average programmers would integrate. A big part of every language deals with the selection of appropriate identifiers. SmPL provides the metavariable type "identifier" for this purpose.

Such a metavariable plays a special role in the specification of desired source code transformations. So far it usually denotes the involved entities without distinguishing between their unprocessed form and the text that the compiler would finally see. Handling a single name space seems convenient for this use case. But it also means that this implementation detail is a systemic risk for name clashes.

I suggest improving this situation considerably so that the probability of unwanted collisions is noticeably reduced.

  • I imagine that the existing data structures and programming interfaces could be extended so that an identifier variable carries additional properties or constraints. I hope that this would then improve context-dependent data processing.
  • The addition of two metavariable types would be another software design option. They would enable the selection of desired names in an unambiguous way.

SmPL constraints are broken for type metavariables.

The manual for the semantic patch language also mentions the operator != in the section "9.2 Metavariables for transformations".

I wonder why the following SmPL script does not work.

@function_implementation_with_non_void_return_type@
identifier func;
type rt != void;
@@
*rt func(...)
 { ... }

Will the specification of such constraints become usable in the near future?

elfring@Sonne:~/Projekte/Coccinelle/Probe> spatch.opt --version && spatch.opt -sp-file find_non-void_function1.cocci /usr/src/linux-stable/drivers/gpu/drm/ast/ast_ttm.c
spatch version 1.0.0-rc22 with Python support and with PCRE support
init_defs_builtins: /usr/local/share/coccinelle/standard.h
77 79
Fatal error: exception Failure("meta: parse error: 
 = File "find_non-void_function1.cocci", line 3, column 8,  charpos = 77
    around = '!=', whole content = type rt != void;
")

Checking printk() calls with SmPL

I have tried out the following tiny scripts for the semantic patch language of the software "Coccinelle 1.0.4".

replace_printk1j.cocci:

@logging@
expression list[number] parameters;
@@
 printk(parameters);

@script:python XYZ@
count << logging.number;
@@
print "count:", count

Test result:

elfring@Sonne:~/Projekte/Linux/next-patched> script_dir=~/Projekte/Coccinelle/janitor && source=drivers/media/tuners/xc5000.c && spatch.opt -sp-file $script_dir/replace_printk1j.cocci $source                            
…
count: 1
count: 2
count: 3

replace_printk2j.cocci:

@initialize:python@
@@
def display_call_data(source, count):
   for place in source:
      print "function:", place.current_element, \
            "| line:", place.line, \
            "| count:", count

@logging@
expression list[number] el;
position pos;
@@
 printk@pos(el);

@script:python XYZ@
count << logging.number;
place << logging.pos;
@@
display_call_data(place, count)

Test result:

elfring@Sonne:~/Projekte/Linux/next-patched> analysis="spatch.opt -sp-file $script_dir/replace_printk2j.cocci $source" && eval $analysis && echo '=====' && eval $analysis | grep function | wc -l 
…
function: xc5000_attach | line: 1466 | count: 1
…
function: xc_load_fw_and_init_tuner | line: 1181 | count: 2
…
=====
…
34

Another check:

elfring@Sonne:~/Projekte/Linux/next-patched> grep --extended-regexp '\bprintk\b' $source | wc -l
36

Should such a source code analysis eventually find this number of function calls?
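
One possible explanation for the difference, offered as an assumption rather than a diagnosis: grep counts every line that contains the word, including lines in comments, string literals and macro definitions, while the SmPL rule only matches printk calls that form a complete statement. The following Python 3 sketch illustrates the distinction on a made-up C fragment. SAMPLE and the helper names are hypothetical and not taken from xc5000.c, and the stripping of comments and strings is deliberately crude.

```python
import re

# Hypothetical C fragment for illustration only.
SAMPLE = '''\
/* printk is used for logging */
#define dprintk(level, fmt, arg...) if (debug >= level) \\
        printk(KERN_INFO "%s: " fmt, "xc5000", ## arg)
static void f(void)
{
        printk(KERN_ERR "failed\\n");
        log("printk appears in a string");
}
'''

def grep_word_lines(text, word):
    """Count lines containing word as a whole word (like grep piped to wc -l)."""
    pattern = re.compile(r'\b' + re.escape(word) + r'\b')
    return sum(1 for line in text.splitlines() if pattern.search(line))

def strip_comments_and_strings(text):
    """Crudely blank out C comments and string literals."""
    pattern = re.compile(r'/\*.*?\*/|//[^\n]*|"(?:\\.|[^"\\])*"', re.DOTALL)
    return pattern.sub(' ', text)

def count_call_sites(text, name):
    """Count occurrences of name followed by '(' outside comments and strings."""
    cleaned = strip_comments_and_strings(text)
    return len(re.findall(r'\b' + re.escape(name) + r'\s*\(', cleaned))

print(grep_word_lines(SAMPLE, "printk"), count_call_sites(SAMPLE, "printk"))
```

On this fragment the line count is higher than the call count, because the comment and the string literal each contribute a line. One of the two remaining call sites sits inside a macro definition, which a statement-level pattern may also skip.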

Storage of value for "--dir" in resulting paths within patches

I tried out a command like the following.

elfring@Sonne:~/Projekte/MLT/lokal> spatch.opt --version && spatch.opt --sp-file ../delete_if4.cocci --dir src > ../delete_if4.diff
spatch version 1.0.0-rc21 with Python support and with PCRE support
…

This semantic patch approach worked mostly as expected. But I wonder about the failure of the following command.

elfring@Sonne:~/Projekte/MLT/lokal> LANG=C git apply ../delete_if4.diff
error: modules/core/consumer_multi.c: No such file or directory
…

The patch can be applied successfully if I change the working directory to the previously used folder.

elfring@Sonne:~/Projekte/MLT/lokal> cd src && git apply ../../delete_if4.diff

I would appreciate not having to remember the string which was passed to the parameter "dir". What do you think about making the involved relative paths more complete (or even absolute)?

I can also work with the option "--in-place" here. But I imagine that the observed situation cannot be circumvented for some other use cases.
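
As a stopgap on the git side, "git apply" accepts the option "--directory=<root>", which prepends <root> to all file names in the patch. The same adjustment can be sketched as a small Python 3 filter. It assumes that the generated patch uses header lines of the form "--- a/<path>" and "+++ b/<path>"; the function name prefix_diff_paths is hypothetical.

```python
import re

# Matches the two file header lines of a unified diff with a/ and b/ prefixes.
# Caveat: a removed content line that happens to start with "-- a/" would look
# the same; this sketch does not track hunk boundaries.
HEADER = re.compile(r'^(--- a/|\+\+\+ b/)(.*)$')

def prefix_diff_paths(diff_text, directory):
    """Insert directory in front of the paths in the diff header lines."""
    directory = directory.strip("/")
    result = []
    for line in diff_text.splitlines():
        match = HEADER.match(line)
        if match:
            line = f"{match.group(1)}{directory}/{match.group(2)}"
        result.append(line)
    return "\n".join(result) + "\n"

sample = (
    "--- a/modules/core/consumer_multi.c\n"
    "+++ b/modules/core/consumer_multi.c\n"
    "@@ -1,1 +1,1 @@\n"
    "-old\n"
    "+new\n"
)
print(prefix_diff_paths(sample, "src"), end="")
```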

Support for specification of additional library directories

I would like to reuse another module for OCaml scripts within semantic patch rules.

The following source code example shows one of my experiments for this purpose.

@initialize:ocaml@
@@
let table = new Omap.c []

let counting dir =
    let add_one key =
        table#add key (try (table#find key) + 1
                       with Not_found -> 1
                      )
    in
    let register name = add_one (String.length name) in
    Array.iter register (Sys.readdir dir)

@find@
identifier fu;
@@
*fu(...)
 { ... }

@script:ocaml collection@
area << virtual.folder;
@@
if area = "" then
   counting "/"
else
   counting area

@finalize:ocaml@
@@
if table#is_empty then
   prerr_endline "No result for this analysis!"
else
   let delimiter = '|' in
   let output ~key:nr ~data:count = Printf.printf "%i%c%i\r\n" nr delimiter count in
   Printf.printf "length%cincidence\r\n" delimiter;
   table#iter output

How can the following error message be resolved?

elfring@Sonne:~/Projekte/Coccinelle/janitor> CAML_LD_LIBRARY_PATH=/home/elfring/Projekte/OCaml/classes/3.05/installed spatch --sp-file list_file_name_lengths3.cocci ../Probe/f-ptr-test1.c
init_defs_builtins: /usr/local/share/coccinelle/standard.h
Using native version of ocamlc/ocamlopt/ocamldep
ocamlfind: Package `omap' not found
Fatal error: exception Failure("hd")

The program "ocamlc" provides the parameter "-I" for the specification of an additional library directory. The program "ocamlfind" supports corresponding functionality by the parameter "-package".
Would such a use case require a similar command line parameter for the program "spatch"?

Support for varying order of type qualifiers

I imagine that C software developers can also be picky about the ordering of type qualifiers. A tiny SmPL script tries to demonstrate a simple adjustment on a small source file.

test_const_order1.c:

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>

void display_message(void)
{
 const char * text_pointer = "Hallo!\n";
 char const text_array[] = "Good luck!\n";
 const char * selected_text = text_pointer;

 if (fputs(selected_text, stdout) < 0)
    exit(errno);
}

move_const1.cocci:

@move_const@
expression initialisation;
@@
-const char * selected_text = initialisation;
+char const * selected_text = text_array;

The software "Coccinelle 1.0.4" does not accept such a change attempt at the moment.

elfring@Sonne:~/Projekte/Coccinelle/janitor> spatch.opt -sp-file move_const1.cocci test_const_order1.c
init_defs_builtins: /usr/local/lib64/coccinelle/standard.h
95 100
Fatal error: exception Failure("plus: parse error: \n = File \"move_const1.cocci\", line 5, column 6,  charpos = 95\n    around = 'const', whole content = +char const * selected_text = text_array;\n")

Support for file names with special characters

I would like to point out that the program "spatch 1.0.0-rc7" has difficulties processing file names that contain special characters like spaces.

elfring@Sonne:~/Projekte/Geany> LANG=C spatch -sp_file 'A BC.cocci' lokal/src/templates.c -debug
init_defs_builtins: /usr/share/coccinelle/standard.h
-----------------------------------------------------------------------
processing semantic patch file: A BC.cocci
with isos from: /usr/share/coccinelle/standard.iso
-----------------------------------------------------------------------
cat: A: No such file or directory
cat: BC.cocci: No such file or directory

HANDLING: lokal/src/templates.c
-----------------------------------------------------------------------
let's go

Support for string literals by another metavariable type

A function like "printk" (from Linux) gets a format string as the first parameter. This format string is often constructed by the concatenation of two substrings.

I would appreciate it if an efficient and safe source code analysis became possible for such parameters. A tiny script like the following seems to indicate further development challenges for the semantic patch language of the software "Coccinelle 1.0.4".

show_printk1.cocci:

@logging@
constant text;
identifier level;
@@
*printk(level
 text, ...);

Test result:

elfring@Sonne:~/Projekte/Linux/next-patched> spatch.opt -sp-file ~/Projekte/Coccinelle/janitor/show_printk1.cocci drivers/media/tuners/xc5000.c
init_defs_builtins: /usr/local/lib64/coccinelle/standard.h
61 65
Fatal error: exception Failure("minus: parse error: \n = File \"/home/elfring/Projekte/Coccinelle/janitor/show_printk1.cocci\", line 6, column 1,  charpos = 61\n    around = 'text', whole content =  text, ...);\n")

I suggest adding direct support for advanced data processing around various string literals by introducing a corresponding metavariable type.

Bogus install-sh is bogus

It turns out that there's a reason that autoconf insists that an install-sh script be provided: not only does AC_PROG_INSTALL fall back to it when no suitable install(1) is found, but AC_PROG_MKDIR_P falls back to it whenever a sufficiently recent GNU mkdir(1) can't be found.

pkg-config is not a reliable way to find Python

Starting with coccinelle-1.0.0-rc11, ./configure --enable-python does not work on OS X, because it expects Python to be registered with pkg-config. Some packages included with OS X are registered with pkg-config (despite pkg-config itself not being included), but Python is not among them. (Python only seems to have started installing pkg-config registrations quite recently.)

It would be better to use python-config, python -c, or specialized Python scripts to extract the necessary information from Python, like the old Perl-based configure script did.

Support for alternative display format of selection results

The semantic patch language can display interesting details from source code through the tool "spatch" when a selection is marked with an asterisk to specify a semantic match. The output is a kind of diff which shows the selected lines as if they were deleted. The minus character is used at the beginning of each affected line for this purpose, while it is also used in diff headers as "---". This means that the data format has to be processed in a context-dependent way. Such peculiarities are usually handled by a tool like "patch". I would like to avoid these challenges for automatic and safe processing.

I suggest offering an alternative display that contains the following fields from the source location information, in a format similar to comma-separated values files.

  • name for a selected source file
  • line number
  • source code fragment from the match

Would you like to support the export of such tabular data in an easy way?
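
Until such an export exists, the requested table can be approximated by post-processing the diff output. The following Python 3 sketch assumes unified diff output like that shown for semantic matches; the function names matches_to_rows and rows_to_csv are hypothetical. It uses the line counts in each hunk header to tell hunk content from file headers, which is the context-dependent processing described above.

```python
import csv
import io
import re

HUNK_HEADER = re.compile(r'^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@')

def matches_to_rows(diff_text):
    """Return (file, line number, fragment) for every removed line."""
    rows = []
    filename = None
    old_line = 0
    old_left = new_left = 0  # lines still expected in the current hunk
    for line in diff_text.splitlines():
        if old_left > 0 or new_left > 0:
            if line.startswith('-'):
                rows.append((filename, old_line, line[1:].strip()))
                old_line += 1
                old_left -= 1
            elif line.startswith('+'):
                new_left -= 1
            else:  # context line (possibly with its leading blank stripped)
                old_line += 1
                old_left -= 1
                new_left -= 1
            continue
        header = HUNK_HEADER.match(line)
        if header:
            old_line = int(header.group(1))
            old_left = int(header.group(2) or 1)
            new_left = int(header.group(4) or 1)
        elif line.startswith('--- '):
            filename = line[4:]
    return rows

def rows_to_csv(rows):
    """Format the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["file", "line", "fragment"])
    writer.writerows(rows)
    return buffer.getvalue()

sample = "\n".join([
    "--- ../Probe/optional_initialisation1.c",
    "+++ /tmp/cocci-output-optional_initialisation1.c",
    "@@ -2,7 +2,6 @@",
    "",
    " int main(void)",
    " {",
    r'- char const message[] = "Test example\n";',
    "  int result;",
    "",
    "  result = puts(message);",
])
print(rows_to_csv(matches_to_rows(sample)), end="")
```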

Support for SmPL disjunctions on every token

The Coccinelle software supports disjunctions to some degree in search patterns of semantic patch language scripts. A parsing error is reported at some places in such scripts where the implementation has technical limitations.

I suggest making SmPL disjunctions work with every token, for a more succinct specification of source code selections.

Configuration or escaping of @ characters for embedded programming language scripts

The Coccinelle software provides data processing for a semantic patch language. This language assigns a specific meaning to some characters, as described by its syntax:


script_code is any code in the chosen scripting language. Parsing of the semantic patch does not check the validity of this code; any errors are first detected when the code is executed. Furthermore, @ should not be use in this code. Spatch scans the script code for the next @ and considers that to be the beginning of the next rule, even if @ occurs within e.g., a comment.


The at symbol thus separates specifications in the SmPL notation from other text parts. Now I am looking for more clarification of the implementation details involved when this special symbol is needed by embedded scripts in languages like Python and OCaml.

Use cases:

  1. Fiddling with mail addresses by regular expressions (for example)
  2. Some application developers would like to pass data for remote database connections.
  3. Another need came along during my software development with SQL database storage. The SQLAlchemy documentation describes tweaks for the pysqlite driver. (I have found out that I will essentially need the support for Python decorators there.)
...
engine = create_engine("sqlite:///myfile.db")

@event.listens_for(engine, "connect")
...

Now I see possibilities like the following to improve such a situation.

  1. Adjustments for the host language
    • Switch to another quotation approach by a special command in a SmPL declaration block
    • Selection of an alternative or longer delimiter by a configuration parameter
    • Include an external source file for a SmPL rule
  2. Changes for the embedded data
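
As a sketch of the second option (changes for the embedded data), the script code can sometimes be written so that no at sign appears in it at all. The example below relies on two general Python facts: a decorator line is shorthand for calling the decorator on the function, and the character can be produced with chr(64). The names listens_for, registry, on_connect and mail_pattern are hypothetical stand-ins and not part of Coccinelle or SQLAlchemy. Whether this covers a given use case is an assumption to check against the script at hand.

```python
import re

# chr(64) yields the at sign without writing the character literally.
AT = chr(64)


def listens_for(registry, name):
    """Hypothetical stand-in for a decorator factory that registers handlers."""
    def register(func):
        registry.setdefault(name, []).append(func)
        return func
    return register


registry = {}


def on_connect(connection):
    """Hypothetical handler; returns a short status text."""
    return "connected: " + connection


# Plain call syntax: same effect as a decorator line above the definition.
on_connect = listens_for(registry, "connect")(on_connect)

# Mail address pattern assembled without a literal at sign.
mail_pattern = re.compile(r"^[\w.+-]+" + AT + r"[\w-]+(?:\.[\w-]+)+$")
```

With a real library the same call form would wrap the library's own decorator factory; the block above only demonstrates the syntax.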

Advanced data processing for whitespace characters

Various source files are structured so that they contain keywords (of a programming language) or special comments. Some information often remains between these elements in the form of whitespace characters or other filler. It influences readability and comprehension to some degree and is occasionally needed for correct data processing. Some software developers care about indentation and the placement of blank lines.

I would appreciate it if the corresponding data processing capabilities could also be improved for the semantic patch language. I therefore suggest adding another metavariable type for this purpose.

Metavariables with the type "virtual" prevent proper initialisation for Python scripts.

I can use the following source code together with the software "Coccinelle 1.0.0-rc23".

@initialize:python@
@@
import sys
import sqlalchemy
sys.stderr.write("\n".join(["Using SQLAlchemy version:",
                            sqlalchemy.__version__]))
sys.stderr.write("\n")
raise RuntimeError
…

The following attempt shows the expected output for a small test message so far.

elfring@Sonne:~/Projekte/Linux/linux-stable/scripts/coccinelle/deletions> spatch.opt -sp-file list_input_parameter_validation2.cocci -dir ~/Projekte/Linux/next-patched/drivers/gpu/drm/msm                     
init_defs_builtins: /usr/local/share/coccinelle/standard.h
Using SQLAlchemy version:
0.9.7
Traceback (most recent call last):
  File "<string>", line 15, in <module>
RuntimeError
while running simple python string:
…

But I was very surprised after trying another code variant.

@initialize:python@
DB_URL << virtual.database_URL;
@@
import sys
import sqlalchemy
sys.stderr.write("\n".join(["Using SQLAlchemy version:",
                            sqlalchemy.__version__]))
sys.stderr.write("\n")
sys.stderr.write("\n".join(["DB-URL:", str(DB_URL)]))
sys.stderr.write("\n")
raise RuntimeError
…

Does the following output match your expectations?

elfring@Sonne:~/Projekte/Linux/linux-stable/scripts/coccinelle/deletions> spatch.opt -sp-file list_input_parameter_validation_template-with_static2.cocci -dir ~/Projekte/Linux/next-patched/drivers/gpu/drm/msm
init_defs_builtins: /usr/local/share/coccinelle/standard.h
HANDLING: /home/elfring/Projekte/Linux/next-patched/drivers/gpu/drm/msm/msm_ringbuffer.c
…
HANDLING: /home/elfring/Projekte/Linux/next-patched/drivers/gpu/drm/msm/hdmi/hdmi.c
Traceback (most recent call last):
  File "<string>", line 3, in <module>
NameError: name 'sys' is not defined
while running simple python string: from coccinelle import *

sys.stderr.write("\n".join(["++++++ 2. session:", str(session)]))
sys.stderr.write("\n")
session.commit()
…

I would appreciate it if this issue could be fixed in the near future.

Determination for the number of named function parameters with SmPL

I wanted to look at some function calls once more, so I tried the following semantic patch language script with "Coccinelle 1.0.4" on another example source file.

list_parameter_numbers1.cocci:

@initialize:python@
@@
import sys
import sqlite3 as SQLite
connection = SQLite.connect(":memory:")
c = connection.cursor()
c.execute("""create table numbers (number integer)""")
delimiter = "|"

def store_number(count):
    """Add an integer to an internal list."""
    c.execute("""insert into numbers (number) values (?)""",
              (count, )
             )

@counting_parameters@
identifier work;
parameter list[number] pl;
type return_type;
@@
 return_type work(pl)
 {
  ...
 }

@script:python collection@
count << counting_parameters.number;
@@
store_number(count)

@finalize:python@
@@
c.execute("""select count(*) nr from numbers""")
result = c.fetchone()

if result[0] > 0:
   c.execute("""create index x on numbers (number)""")
   c.execute("""select number, count(*) nr from numbers group by number""")
   sys.stdout.write(delimiter.join( ("number", "incidence") ))
   sys.stdout.write("\r\n")
   for result in c:
      sys.stdout.write(delimiter.join((str(result[0]),
                                       str(result[1])
                                      )))
      sys.stdout.write("\r\n")
else:
   sys.stderr.write("No result for this analysis!\n")

connection.close()

Test result:

elfring@Sonne:~/Projekte/Linux/next-patched> script_dir=~/Projekte/Coccinelle/Probe/ && source_file=drivers/media/tuners/xc5000.c && spatch.opt -sp-file ${script_dir}list_parameter_numbers1.cocci ${source_file}
…
number|incidence
1|1
2|1
3|1
4|1
5|1

This output is unexpected to me when compared with the useful results that a similar SmPL script generated with "Coccinelle 1.0.0-rc22". I have tried a small adjustment.

A few differences:

--- list_parameter_numbers1.cocci       2014-12-16 11:28:58.463250456 +0100
+++ list_parameter_numbers1c.cocci      2016-01-04 21:38:06.138846387 +0100
@@ -16,15 +16,17 @@
 @counting_parameters@
 identifier work;
 parameter list[number] pl;
+position pos;
 type return_type;
 @@
- return_type work(pl)
+ return_type work@pos(pl)
  {
   ...
  }

 @script:python collection@
 count << counting_parameters.number;
+place << counting_parameters.pos;
 @@
 store_number(count)

Test result:

elfring@Sonne:~/Projekte/Linux/next-patched> spatch.opt -sp-file ${script_dir}list_parameter_numbers1c.cocci ${source_file}
…
number|incidence
1|18
2|22
3|6
4|1
5|1

It seems that connecting the SmPL variable "pos" to a variable "place" in the embedded programming language (here Python) has a remarkable effect on this kind of source code analysis.
Should these numbers also be computed without the extra position variable?

list_parameter_numbers1j.cocci:

@counting@
identifier handle;
parameter list[number] arguments;
type return_type;
@@
 return_type handle(arguments)
 {
  ...
 }

@script:python display@
count << counting.number;
@@
print "count:", count

Test result:

elfring@Sonne:~/Projekte/Linux/next-patched> spatch.opt -sp-file ${script_dir}list_parameter_numbers1j.cocci ${source_file}
…
count: 1
count: 2
count: 3
count: 4
count: 5
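
One possible reading of the two results, offered here as an assumption that nothing in this report confirms, is that the script rule ran once per distinct set of inherited values, so that inheriting a position made every match distinct. The sketch below reproduces only the tally logic of the finalize rule outside spatch. The list of matches is hypothetical sample data, and tally is a helper introduced for illustration.

```python
import sqlite3


def tally(counts):
    """Return (number, incidence) rows for an iterable of parameter counts."""
    connection = sqlite3.connect(":memory:")
    try:
        c = connection.cursor()
        c.execute("create table numbers (number integer)")
        c.executemany("insert into numbers (number) values (?)",
                      ((n,) for n in counts))
        c.execute("select number, count(*) from numbers"
                  " group by number order by number")
        return c.fetchall()
    finally:
        connection.close()


# Hypothetical matches: (function name, number of parameters).
matches = [("f1", 1), ("f2", 1), ("f3", 2), ("f4", 2), ("f5", 2), ("f6", 3)]

# One stored value per matched function.
per_match = tally(n for _, n in matches)

# One stored value per distinct parameter count.
per_distinct_value = tally({n for _, n in matches})
```

Here per_match yields incidences above one, while per_distinct_value yields an incidence of one for every number, which has the same shape as the first test result.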

Complete support for fork-join workflows

Support for direct parallel execution of SmPL scripts was added in version "Coccinelle 1.0.0-rc24". So far, this functionality is usable only under a specific constraint:

…
This option is not compatible with the use of an initialize or finalize rule in the semantic patch.
…

I hope that this limitation can be lifted in the future so that fork-join workflows become directly feasible through this interface as well.

  • Special function calls are occasionally essential for the preparation and finishing of parallel tasks (also for advanced source code analysis).
  • The scope should be distinguished for these actions:
    • SmPL main program - actions as in the use case for serial process execution
    • spatch background processes - task-specific initialisation and clean-up
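
The two scopes listed above can be sketched in plain Python. This is only an illustration of the fork-join shape, not of anything spatch provides: threads from concurrent.futures stand in for the spatch background processes, and analyse, run and the file names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def analyse(path):
    """Hypothetical per-file task with its own set-up and clean-up."""
    state = {"path": path, "findings": []}       # task-specific initialisation
    try:
        state["findings"].append(len(path))      # placeholder for real analysis
        return path, state["findings"]
    finally:
        state.clear()                            # task-specific clean-up


def run(paths, workers=2):
    """Main program: initialise, fork the tasks, join them, then finish."""
    results = {}                                 # main-program initialisation
    with ThreadPoolExecutor(max_workers=workers) as pool:    # fork
        for path, findings in pool.map(analyse, paths):      # join
            results[path] = findings
    return results                               # main-program finalisation


summary = run(["a.c", "bb.c", "ccc.c"])
```

The point of the split is that results is set up and consumed only by the main program, while each task owns and releases its own state.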

-o option fails on single file; "multiple modified files"

Using the latest source (1.0.0-rc19), I'm trying to run this command:

spatch --no-includes --sp-file repr.cocci in.c -o out.c

This is the output I'm getting:

init_defs_builtins: /home/christoph/Software/install/share/coccinelle/standard.h
warning: line 9: should self be a metavariable?
warning: line 22: should self be a metavariable?
(ONCE) Expected tokens self PyObject PyString_FromString PyString_FromFormat
Skipping:in.c
Fatal error: exception Failure("-o can not be applied because there are multiple modified files")

strange behavior with #else

I'm getting a strange parse error with #ifdef/#else/#endif

when I run spatch -sp_file log_erreur.cocci macro.c -verbose_parsing with:

log_erreur.cocci
8<---
@log@
expression exp1;
expression exp2;
expression exp3;
@@

- LOG_Log(exp1, exp2, exp3)
+ LOG_print_f(LOG_ERREUR, exp2, "Erreur : %d", exp3)
8<---

and macro.c:
8<---

#include "errorlog.h"

int init(int a, int b) {
#ifdef FOO
LOG_Log(LOG_FIC_TOTO, 2, 0);
#else
LOG_Log(LOG_FIC_TOTO, 3, 0);
#endif
return 0;
}
8<---

The result is:
init_defs_builtins: /usr/share/coccinelle/standard.h
HANDLING: macro.c
ERROR-RECOV: found sync '}' at line 10
parsing pass2: try again
ERROR-RECOV: found sync '}' at line 10
parse error
= File "macro.c", line 7, column 23, charpos = 124
around = '3', whole content = LOG_Log(LOG_FIC_TOTO, 3, 0);
badcount: 9
bad: #include "errorlog.h"
bad:
bad: int init(int a, int b) {
bad: #ifdef FOO
bad: LOG_Log(LOG_FIC_TOTO, 2, 0);
bad: #else
BAD:!!!!! LOG_Log(LOG_FIC_TOTO, 3, 0);
bad: #endif
bad: return 0;
bad: }

If I add a space just after #else, everything goes well:
init_defs_builtins: /usr/share/coccinelle/standard.h
HANDLING: macro.c
diff =
--- macro.c
+++ /tmp/cocci-output-30724-e588af-macro.c
@@ -2,9 +2,9 @@

 int init(int a, int b) {
 #ifdef FOO
- LOG_Log(LOG_FIC_TOTO, 2, 0);
+ LOG_print_f(LOG_ERREUR, 2, "Erreur : %d", 0);
 #else
- LOG_Log(LOG_FIC_TOTO, 3, 0);
+ LOG_print_f(LOG_ERREUR, 3, "Erreur : %d", 0);
 #endif
 return 0;
 }

Any idea?

Support C++ codebases

Being able to do simple transformations on codebases like LibreOffice would be nice.

Failure in a directory traversal

I tried another source code analysis on a software project.

elfring@Sonne:~/Projekte/MINIX/lokal> date && OCAMLRUNPARAM=b spatch.opt -debug -timeout 12 -sp-file ~/Projekte/Coccinelle/janitor/list_input_parameter_validation3.cocci -dir . > ../list_input_parameter_validation3.txt 2> ../list_input_parameter_validation3-errors.txt ; date
Mi 8. Okt 09:55:47 CEST 2014
Mi 8. Okt 09:59:06 CEST 2014

What do you think about the following message at the end of the referenced log file?

Fatal error: exception Sys_error("Is a directory")
Raised at file "pervasives.ml", line 330, characters 12-32
