google / souper Goto Github PK

A superoptimizer for LLVM IR

License: Apache License 2.0

Shell 0.93% C++ 72.20% C 0.94% Perl 7.05% Python 3.17% CMake 1.45% LLVM 14.07% Dockerfile 0.19%

souper's Introduction

Souper is a superoptimizer for LLVM IR. It uses an SMT solver to help identify missing peephole optimizations in LLVM's midend optimizers.

The architecture and concepts of Souper are described in Souper: A synthesizing superoptimizer.

Requirements

Souper should work on any reasonably modern Linux or OS X machine.

You will need a reasonably modern compiler toolchain. LLVM has instructions on how to get one for Linux: http://llvm.org/docs/GettingStarted.html#getting-a-modern-host-c-toolchain

You will also need CMake to build Souper and its dependencies.

Building Souper

Download and build dependencies:

$ ./build_deps.sh $buildtype $extra_cmake_flags

$buildtype is optional; it defaults to Release and may be set to any LLVM build type. $extra_cmake_flags is optional. It is passed to CMake.

Run CMake from a build directory:

$ mkdir /path/to/souper-build
$ cd /path/to/souper-build
$ cmake -DCMAKE_BUILD_TYPE=$buildtype /path/to/souper

Again, the build type is optional and defaults to Release. In any case it must match the build type used when compiling the dependencies.

Run 'make' from the build directory.
Optionally run 'make check' to run Souper's test suite. To run the test suite under Valgrind, run 'make check LIT_ARGS="-v --vg --vg-leak"' instead. By default the solver is also run under Valgrind. This can be disabled by by adding --vg-arg=--trace-children-skip=/path/to/solver to LIT_ARGS.

Note that GCC 4.8 and earlier have a bug in handling multiline string literals. You should build Souper using GCC 4.9+ or Clang.

Using Souper

After following the above instructions, you will have a Souper executable in /path/to/souper-build/souper and a Clang executable in /path/to/souper/third_party/llvm/$buildtype/bin/clang. You can use the Clang executable to create an LLVM bitcode file like this:

$ /path/to/clang -emit-llvm -c -o /path/to/file.bc /path/to/file.c

For example:

$ /path/to/souper -z3-path=/usr/bin/z3 /path/to/file.bc

Souper will extract SMT queries from the bitcode file and pass them to a solver. Unsatisfiable queries (which represent missed optimization opportunities) will cause Souper to print its internal representation of the optimizable expression along with the shorter expression that refines the original one.

Alternatively, you may immediately let Souper modify the bitcode and let it apply the missed optimization opportunities by using the Souper llvm opt pass. When loaded the pass will automatically register itself to run after LLVM's regular peephole optimizations.

For example:

$ /path/to/clang -Xclang -load -Xclang /path/to/libsouperPass.so \
                 -mllvm -z3-path=/usr/bin/z3 /path/to/file.c

Or to run the pass on its own:

$ /path/to/opt -load /path/to/libsouperPass.so -souper \
               -z3-path=/usr/bin/z3 -o /path/to/file.opt.bc \
               /path/to/file.bc

Or use the drop-in compiler replacements sclang and sclang++:

$ /path/to/configure CC=/path/to/sclang CXX=/path/to/sclang++
$ make

Compilation using Souper can be sped up by caching queries. By default, Souper uses a non-persistent RAM-based cache. The -souper-external-cache flag causes Souper to cache its queries in a Redis database. For this to work, Redis >= 1.2.0 must be installed on the machine where you are running Souper and a Redis server must be listening on the default port (6379).

sclang uses external caching by default since this often gives a substantial speedup for large compilations. This behavior may be disabled by setting the SOUPER_NO_EXTERNAL_CACHE environment variable. Souper's Redis cache does not yet have any support for versioning; you should stop Redis and delete its dump file any time Souper is upgraded.

Disclaimer

Please note that although some of the authors are employed by Google, this is not an official Google product.

souper's People

Contributors

Stargazers

Watchers

Forkers

jeroenk humeafo jubitaneja agra-uni-bremen oagbulut rsas hanumathrao santoshn regehr chenyang78 pcc uikit0 akashaio tari oztc chubbymaggie tempbottle mavenrain dddejan jamesbornholt gnomix pkdevbox joe8bit claudiouzelac mewbak 8l mcanthony vcgato29 hanw zhmz90 alexanderkyte rgov runt18 cnsuhao yang123vc gpakosz halfx amocy-wang runngezhang noname007 jellyr cseq ensemblr qicny nimishb mcgrady1 praveenmunagapati ufwt ezhangle twang15 minzhang20171225 kant zhengyang92 evilning longjohncoder wdv4758h renesugar pranavk fredfeng zhaishuai zyq8709 zilaiyedaren manasij7479 cherouvim darkbuck directorscut82 brettkoonce indigos33k3r lebedevri exhorder googol-lab finnoleary the-mad-pirate vsevolod-livinskij qianxincodesafe yazici jmcabandara shrin18 mshockwave dongaxis gwli kth fvrmatteo kurt-russell hjchen2 ointeractive-depot nystud c0de3 tfeng mbrukman loudy2000 liutongxuan kingking888 test-orgn feiyunwill qiqi1 tralivali1234 aeroowl tzf-omkey alex0freeman

souper's Issues

Build error. OS X 10.11.5

[ 98%] Building CXX object CMakeFiles/souperweb-backend.dir/tools/souperweb-backend.cpp.o
souper/tools/souperweb-backend.cpp:36:53: error: use of undeclared identifier
'getGlobalContext'
parseAssembly(MB->getMemBufferRef(), Err, getGlobalContext())) {
^
1 error generated.

UB bug

I think we have an ugly UB bug. Consider this program:

int printf(const char *, ...);

int foo(int x1, int x2) {
  return
    ((x2 > 0) && (x1 > (2147483647 - x2))) ||
    (x1 && (x2 < 0));
}

int main(void) {
  printf("%x\n", foo(1, -770027279));
  return 0;
}

Hopefully we all agree that it is not undefined and that it prints 1. Souper makes it print 0. Here's the critical transformation:

*** IR Dump After Combine redundant instructions ***
; Function Attrs: nounwind readnone uwtable
define i32 @foo(i32 %x1, i32 %x2) #0 {
  %1 = icmp sgt i32 %x2, 0
  %2 = sub nsw i32 2147483647, %x2
  %3 = icmp slt i32 %2, %x1
  %or.cond = and i1 %1, %3
  br i1 %or.cond, label %6, label %4

; <label>:4                                       ; preds = %0
  %5 = icmp slt i32 %x2, 0
  %not. = icmp ne i32 %x1, 0
  %. = and i1 %5, %not.
  br label %6

; <label>:6                                       ; preds = %4, %0
  %7 = phi i1 [ true, %0 ], [ %., %4 ]
  %8 = zext i1 %7 to i32
  ret i32 %8
}
*** IR Dump After Souper super-optimizer pass ***
; Function Attrs: nounwind readnone uwtable
define i32 @foo(i32 %x1, i32 %x2) #0 {
  %1 = icmp sgt i32 %x2, 0
  %2 = sub nsw i32 2147483647, %x2
  %3 = icmp slt i32 %2, %x1
  %or.cond = and i1 %1, %3
  br i1 %or.cond, label %5, label %4

; <label>:4                                       ; preds = %0
  %not. = icmp ne i32 %x1, 0
  %. = and i1 false, %not.
  br label %5

; <label>:5                                       ; preds = %4, %0
  %6 = phi i1 [ true, %0 ], [ %., %4 ]
  %7 = zext i1 %6 to i32
  ret i32 %7
}

LLVM is introducing UB by speculatively executing the right argument of the first &&, but it turns out to be harmless. Here's how the code works before Souper:

%1 = 0
%2 = poison
%3 = poison
%or.cond = poison
br goes either way
%5 = 1
%not. = 1
%. = 1
%7 = phi 1, 1

So -- despite the UB -- this function must return 1. On the other hand, Souper is optimizing %5 to 0. Unless I'm missing something, this is just wrong. Here's the Souper optimization:

%0:i32 = var                        ; x2
%1:i1 = slt 0:i32, %0               ; 0 < x2
%2:i32 = subnsw 2147483647:i32, %0  ; 2147483647 - x2
%3:i32 = var                        ; x1
%4:i1 = slt %2, %3                  ; (2147483647 - x2) < x1
%5:i1 = and %1, %4                  ; x2 > 0 && x1 > (2147483647 - x2)
pc %5 0:i1                          ;
%6:i1 = slt %0, 0:i32               ; x2 < 0
cand %6 0:i1                        ; WRONG

keep going when the solver stops?

For some time, souper's default behavior has been to continue optimizing a file if the solver times out, but to abort if the solver exits for any other reason. I think perhaps this is the wrong way to do things. In a few hours of Csmith tests, Souper in integer synthesis mode successfully optimized almost zero programs because at some point, the solver would use all RAM and be stopped by the OS, and then souper would give up.

Should we have souper keep going when the solver dies? The only drawback I can think of is that this can hide bugs. For example, if we mis-print an SMTLIB query, we'll miss optmizations and see no overt symptom. It may be possible to distinguish OOM crashes from other errors, but I've played that game before and don't like it.

unsupported tests?

I'm getting 6 unsupported tests on my Mac, is this a bug? Of course I'm giving Souper a TEST_SOLVER (or else more tests would be failing).

regehr@johnregehrs-MacBook-Air:build{master}$ make check
[  2%] Built target profileRuntime
[ 45%] Built target souperPass
[ 47%] Built target souperSMTLIB2
[ 71%] Built target kleeExpr
[ 73%] Built target souperInst
[ 80%] Built target souperExtractor
[ 82%] Built target souperTool
[ 84%] Built target souperKVStore
[ 86%] Built target souperParser
[ 89%] Built target souper
[ 91%] Built target parser-test
[ 93%] Built target souper-check
[ 95%] Built target extractor_tests
[ 97%] Built target inst_tests
[100%] Built target parser_tests
Testing Time: 17.90s
  Expected Passes    : 109
  Unsupported Tests  : 6
[100%] Built target check

Assertion `l->getWidth()==r->getWidth() && "type mismatch"' failed.

Take this file:

class x0 {
  int *x1;
  int *x2;

public:
  int operator!=(x0) { return 0 != x1; }
  void operator++() {
    ++x1;
    ++x2;
  }
}

x3;
void x4() {
  for (x0 x5; x3 != x5; ++x3)
    ;
}

and compile like this:

$SOUPER/third_party/llvm/Debug/bin/clang++ -Xclang -load -Xclang $SOUPER/build/libsouperPass.so -mllvm -stp-path=/usr/local/bin/stp -c small.cpp -O3

Or just run Souper directly on this file:

; ModuleID = 'small.bc'
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%class.x0 = type { i32*, i32* }

@x3 = global %class.x0 zeroinitializer, align 16

; Function Attrs: nounwind uwtable
define void @_Z2x4v() #0 {
entry:
  %0 = load i32** getelementptr inbounds (%class.x0* @x3, i64 0, i32 0), align 16, !tbaa !1
  %cmp.i2 = icmp eq i32* %0, null
  br i1 %cmp.i2, label %for.end, label %for.inc.lr.ph

for.inc.lr.ph:                                    ; preds = %entry
  %.pre = load <2 x i32*>* bitcast (%class.x0* @x3 to <2 x i32*>*), align 16, !tbaa !6
  br label %for.inc

for.inc:                                          ; preds = %for.inc, %for.inc.lr.ph
  %1 = phi <2 x i32*> [ %.pre, %for.inc.lr.ph ], [ %2, %for.inc ]
  %2 = getelementptr <2 x i32*> %1, <2 x i64> <i64 1, i64 1>
  store <2 x i32*> %2, <2 x i32*>* bitcast (%class.x0* @x3 to <2 x i32*>*), align 16, !tbaa !6
  %3 = ptrtoint <2 x i32*> %2 to <2 x i64>
  %4 = bitcast <2 x i64> %3 to i128
  %trunc = trunc i128 %4 to i64
  %cmp.i = icmp eq i64 %trunc, 0
  br i1 %cmp.i, label %for.end.loopexit, label %for.inc

for.end.loopexit:                                 ; preds = %for.inc
  br label %for.end

for.end:                                          ; preds = %for.end.loopexit, %entry
  ret void
}

attributes #0 = { nounwind uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.ident = !{!0}

!0 = metadata !{metadata !"clang version 3.5.0 (213048)"}
!1 = metadata !{metadata !2, metadata !3, i64 0}
!2 = metadata !{metadata !"_ZTS2x0", metadata !3, i64 0, metadata !3, i64 8}
!3 = metadata !{metadata !"any pointer", metadata !4, i64 0}
!4 = metadata !{metadata !"omnipotent char", metadata !5, i64 0}
!5 = metadata !{metadata !"Simple C/C++ TBAA"}
!6 = metadata !{metadata !3, metadata !3, i64 0}

possible constant synthesis bug

The LLVM below should permit a constant to be synthesized, but it doesn't. In particular, I'd have expected %5 to be zero here.

%0:i32 = var
%1:i32 = var
%2:i1 = eq %0, %1
pc %2 1:i1
%3:i32 = addnsw 1:i32, %0
%4:i32 = addnsw 1:i32, %1
%5:i32 = subnsw %3, %4
infer %5

LLVM:

; ModuleID = 'vn10test.c'
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; Function Attrs: nounwind uwtable
define void @vnum_test10(i32* nocapture %data) #0 {
  %1 = load i32* %data, align 4, !tbaa !1
  %2 = add nsw i32 %1, 1
  %3 = getelementptr inbounds i32* %data, i64 1
  %4 = load i32* %3, align 4, !tbaa !1
  %5 = add nsw i32 %4, 1
  %6 = add nsw i32 %5, %2
  %7 = getelementptr inbounds i32* %data, i64 2
  store i32 %6, i32* %7, align 4, !tbaa !1
  %8 = icmp eq i32 %1, %4
  br i1 %8, label %9, label %13

; <label>:9                                       ; preds = %0
  %10 = sub nsw i32 %2, %5
  %11 = mul nsw i32 %10, 21
  %12 = getelementptr inbounds i32* %data, i64 3
  store i32 %11, i32* %12, align 4, !tbaa !1
  br label %13

; <label>:13                                      ; preds = %9, %0
  ret void
}

attributes #0 = { nounwind uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.ident = !{!0}

!0 = metadata !{metadata !"clang version 3.6.0 (trunk 223586)"}
!1 = metadata !{metadata !2, metadata !2, i64 0}
!2 = metadata !{metadata !"int", metadata !3, i64 0}
!3 = metadata !{metadata !"omnipotent char", metadata !4, i64 0}
!4 = metadata !{metadata !"Simple C/C++ TBAA"}

Store error flag in the Redis cache during inference

When using cache_infer, it would great to store a flag if an error has occurred during inference including the error message (e.g. Protocol error). Similarly, cache_dump should extended to retrieve those erroneous optimizations and their error codes.

No-wrap constraints should apply only if the relevant branch is taken

@regehr reported this incorrect optimization.

C code:

int printf(const char *, ...);
int a = -7L, b;
int main() {
  int c = a;
  b = c > 0 && 1 > 2147483647 - c ?: 7 + c;
  if (b)
    printf("%d\n", 0);
  return 0;
}

Souper IR:

%0 = block
%1:i32 = var
%2:i32 = subnsw 2147483647:i32, %1
%3:i1 = slt %2, 1:i32
%4:i1 = phi %0, 0:i1, %3
%5:i32 = zext %4
%6:i32 = addnsw 7:i32, %1
%7:i32 = select %4, %5, %6
%8:i1 = eq 0:i32, %7
cand %8 0:i1

What I think is going on here is that there is no value for %1 that simultaneously does not overflow the computation for %2 (an overflow requires %1 to be <= 0) and yields a zero value for %6. The program's control flow dictates that %2 is unreachable if %1 <= 0 (i.e. the no-wrap constraint for %2 should not have an effect), and phi nodes are defined to propagate poison values (such as overflowing nsw results) through only the taken predecessor [1]. However, the current implementation of *nsw applies the no-wrap constraint to the entire query, so the solver concludes that %6 must be non-zero. This leads it to conclude that %8 must be zero.

I think that what is needed is a way to make no-wrap constraints apply only if the relevant branch is taken. In this specific example, the solver would be able to pick the first predecessor of the phi node and let %1 = -7 to show that %8 may be non-zero.

[1] http://llvm.org/docs/LangRef.html#poisonvalues

make check fails on mac

The command below, which is executed as part of "make check", fails on my mac. This used to work -- perhaps the latest big xcode upgrade broke something. My unixy tools are installed via homebrew, which is up to date.

$ /Users/regehr/souper-jubi/build/souper -stp-path=/usr/local/bin/stp -check /Users/regehr/souper-jubi/build/test/Extractor/Output/vector-gep.ll.tmp
dyld: Library not loaded: libhiredis.0.11.dylib
  Referenced from: /Users/regehr/souper-jubi/build/souper
  Reason: image not found
Trace/BPT trap: 5

The library in question is where I'd expect it to be, but for whatever reason Souper isn'f finding it.

$ find . -name libhiredis.0.11.dylib
./third_party/hiredis/install/lib/libhiredis.0.11.dylib

impact of synthesis on the opt pass

If we synthesize one or more instructions, the opt pass will have to create them. We'll end up with a little tree-walker in Pass.cpp that is sort of the opposite of the one found in Inst.cpp/Candidates.cpp. This doesn't sound too difficult. We also need to connect the tree of generated code back to the LLVM function that is being optimized. It is not obvious to me that Souper currently stores enough information to support this. It seems to me that we'll need to store the originating LLVM Instruction for each Souper instruction, not just the origin of the replacement (as we currently do). Does that sound right?

Add GitHub repo description

Would be nice to see it in the list at https://github.com/google

Add instruction and KLEE expression cache

Currently, for each infer() call we scan the target instruction recursively for candidates, but any subsequent infer() calls don't reuse previous scanning results. Similarly, we don't reuse translated instructions in KLEEBuilder. We should eventually add a cache.

unbounded recursion in extractor

One of the extractor's termination conditions is hitting a phi that is marked as a loop header. When it encounters an irreducible loop for which LLVM doesn't detect the loop header, it keeps following edges until crashing from stack overflow. Optimized compile of this code will produce such a loop:

class A
{
    int *m_fn1 () const;
};
int a;
int fn1 ();
int *
A::m_fn1 () const const
{
    a && fn1 ();
    for (int K; K; ++K)
        a && fn1 ();
    return 0;
}

Here's the LLVM:

; ModuleID = 'foo.bc'
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%class.A = type { i8 }

@a = global i32 0, align 4

; Function Attrs: nounwind uwtable
define noalias i32* @_ZNK1A5m_fn1Ev(%class.A* nocapture readnone %this) #0 align 2 {
entry:
  %0 = load i32* @a, align 4, !tbaa !1
  %tobool = icmp eq i32 %0, 0
  br i1 %tobool, label %for.body, label %land.rhs

land.rhs:                                         ; preds = %entry
  %call = tail call i32 @_Z3fn1v() #2
  br label %for.bodythread-pre-split

for.bodythread-pre-split:                         ; preds = %for.inc, %land.rhs
  %K.010.ph = phi i32 [ undef, %land.rhs ], [ %inc, %for.inc ]
  %.pr = load i32* @a, align 4, !tbaa !1
  %phitmp = icmp eq i32 %.pr, 0
  br label %for.body

for.body:                                         ; preds = %for.bodythread-pre-split, %entry
  %1 = phi i1 [ %phitmp, %for.bodythread-pre-split ], [ true, %entry ]
  %K.010 = phi i32 [ %K.010.ph, %for.bodythread-pre-split ], [ undef, %entry ]
  br i1 %1, label %for.inc, label %land.rhs5

land.rhs5:                                        ; preds = %for.body
  %call6 = tail call i32 @_Z3fn1v() #2
  br label %for.inc

for.inc:                                          ; preds = %land.rhs5, %for.body
  %inc = add nsw i32 %K.010, 1
  %tobool3 = icmp eq i32 %inc, 0
  br i1 %tobool3, label %for.end, label %for.bodythread-pre-split

for.end:                                          ; preds = %for.inc
  ret i32* null
}

declare i32 @_Z3fn1v() #1

attributes #0 = { nounwind uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #2 = { nounwind }

!llvm.ident = !{!0}

!0 = metadata !{metadata !"clang version 3.5.0 (213048)"}
!1 = metadata !{metadata !2, metadata !2, i64 0}
!2 = metadata !{metadata !"int", metadata !3, i64 0}
!3 = metadata !{metadata !"omnipotent char", metadata !4, i64 0}
!4 = metadata !{metadata !"Simple C/C++ TBAA"}

Souper output: group phi instructions of the same block

Currently, souper does not group phi instructions of the same block, e.g.

%0 = block 2
%1:i8 = phi %0, 0:i8, 1:i8
%2:i1 = ne 0:i8, %1
%3:i1 = xor 1:i1, %2
%4:i64 = var
%5:i64 = var
%6:i64 = phi %0, 0:i64, %5
%7:i1 = ult %4, %6
%8:i1 = or %3, %7
pc %8 0:i1
cand %7 0:i1

To enhance the readability and ease the translation to LLVM, it might be better to print as follows:

%0 = block 2
%1:i64 = var
%2:i64 = var
%3:i8 = phi %0, 0:i8, 1:i8
%4:i64 = phi %0, 0:i64, %2
%5:i1 = ne 0:i8, %3
%6:i1 = xor 1:i1, %5
%7:i1 = ult %1, %4
%8:i1 = or %6, %7
pc %8 0:i1
cand %7 0:i1

illegal sext

Raimondas noticed this today. From the LLVM code below, Souper extracts this IR:

%0:i64 = var
%1:i64 = sext 1:i64
%2:i64 = mul 4:i64, %1
%3:i64 = add %0, %2
%4:i64 = var
%5:i1 = ult %3, %4
cand %5 0:i1

But the sext here is illegal: LLVM requires that the extended type is at least one bit bigger than the argument.

; ModuleID = 'small.bc'
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%class.x5 = type { i8 }
%class.x1 = type { i8 }

; Function Attrs: nounwind uwtable
define void @_ZNK2x52x6ER2x1(%class.x5* nocapture readnone %this, %class.x1* nocapture nonnull readnone) #0 align 2 {
  %x7 = alloca %class.x1, align 1
  call void @_ZN2x1C2Ei(%class.x1* %x7, i32 0) #2
  %2 = call i32* @_ZN2x12x2Ev(%class.x1* %x7) #2
  %3 = getelementptr inbounds i32* %2, i64 1
  %4 = call i32* @_ZN2x12x3Ev(%class.x1* %x7) #2
  %5 = icmp ult i32* %3, %4
  br i1 %5, label %_ZN2x1ixEi.exit, label %6

; <label>:6                                       ; preds = %1
  call void @_Z2x0i(i32 0) #2
  br label %_ZN2x1ixEi.exit

_ZN2x1ixEi.exit:                                  ; preds = %6, %1
  ret void
}

declare i32* @_ZN2x12x2Ev(%class.x1*) #1

declare i32* @_ZN2x12x3Ev(%class.x1*) #1

declare void @_Z2x0i(i32) #1

declare void @_ZN2x1C2Ei(%class.x1*, i32) #1

attributes #0 = { nounwind uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #2 = { nounwind }

!llvm.ident = !{!0}

!0 = metadata !{metadata !"clang version 3.5.0 (213048)"}

Build fails with missing terminating " character

I am trying to build souper on my Fedora 23 box. I followed the instructions and the build fails with:

Scanning dependencies of target souperInst
[  1%] Building CXX object CMakeFiles/souperInst.dir/lib/Inst/Inst.cpp.o
[  2%] Linking CXX static library libsouperInst.a
[  2%] Built target souperInst
Scanning dependencies of target kleeExpr
[  4%] Building CXX object CMakeFiles/kleeExpr.dir/third_party/klee/lib/Expr/Constraints.cpp.o
[  5%] Building CXX object CMakeFiles/kleeExpr.dir/third_party/klee/lib/Expr/ExprBuilder.cpp.o
[  6%] Building CXX object CMakeFiles/kleeExpr.dir/third_party/klee/lib/Expr/Expr.cpp.o
[  8%] Building CXX object CMakeFiles/kleeExpr.dir/third_party/klee/lib/Expr/ExprEvaluator.cpp.o
[  9%] Building CXX object CMakeFiles/kleeExpr.dir/third_party/klee/lib/Expr/ExprPPrinter.cpp.o
[ 11%] Building CXX object CMakeFiles/kleeExpr.dir/third_party/klee/lib/Expr/ExprSMTLIBPrinter.cpp.o
[ 12%] Building CXX object CMakeFiles/kleeExpr.dir/third_party/klee/lib/Expr/ExprUtil.cpp.o
[ 13%] Building CXX object CMakeFiles/kleeExpr.dir/third_party/klee/lib/Expr/ExprVisitor.cpp.o
[ 15%] Building CXX object CMakeFiles/kleeExpr.dir/third_party/klee/lib/Expr/Lexer.cpp.o
[ 16%] Building CXX object CMakeFiles/kleeExpr.dir/third_party/klee/lib/Expr/Parser.cpp.o
[ 18%] Building CXX object CMakeFiles/kleeExpr.dir/third_party/klee/lib/Expr/Updates.cpp.o
[ 19%] Linking CXX static library libkleeExpr.a
[ 19%] Built target kleeExpr
Scanning dependencies of target souperInfer
[ 20%] Building CXX object CMakeFiles/souperInfer.dir/lib/Infer/InstSynthesis.cpp.o
[ 22%] Linking CXX static library libsouperInfer.a
[ 22%] Built target souperInfer
Scanning dependencies of target souperExtractor
[ 23%] Building CXX object CMakeFiles/souperExtractor.dir/lib/Extractor/Candidates.cpp.o
[ 25%] Building CXX object CMakeFiles/souperExtractor.dir/lib/Extractor/KLEEBuilder.cpp.o
[ 26%] Building CXX object CMakeFiles/souperExtractor.dir/lib/Extractor/Solver.cpp.o
[ 27%] Linking CXX static library libsouperExtractor.a
[ 27%] Built target souperExtractor
Scanning dependencies of target souperKVStore
[ 29%] Building CXX object CMakeFiles/souperKVStore.dir/lib/KVStore/KVStore.cpp.o
[ 30%] Linking CXX static library libsouperKVStore.a
[ 30%] Built target souperKVStore
Scanning dependencies of target souperParser
[ 31%] Building CXX object CMakeFiles/souperParser.dir/lib/Parser/Parser.cpp.o
[ 33%] Linking CXX static library libsouperParser.a
[ 33%] Built target souperParser
Scanning dependencies of target souperSMTLIB2
[ 34%] Building CXX object CMakeFiles/souperSMTLIB2.dir/lib/SMTLIB2/Solver.cpp.o
[ 36%] Linking CXX static library libsouperSMTLIB2.a
[ 36%] Built target souperSMTLIB2
Scanning dependencies of target souperTool
[ 37%] Building CXX object CMakeFiles/souperTool.dir/lib/Tool/CandidateMapUtils.cpp.o
[ 38%] Linking CXX static library libsouperTool.a
[ 38%] Built target souperTool
Scanning dependencies of target souperClangTool
[ 40%] Building CXX object CMakeFiles/souperClangTool.dir/lib/ClangTool/Actions.cpp.o
[ 41%] Linking CXX static library libsouperClangTool.a
[ 41%] Built target souperClangTool
Scanning dependencies of target clang-souper
[ 43%] Building CXX object CMakeFiles/clang-souper.dir/tools/clang-souper.cpp.o
[ 44%] Linking CXX executable clang-souper
[ 44%] Built target clang-souper
Scanning dependencies of target extractor_tests
[ 45%] Building CXX object CMakeFiles/extractor_tests.dir/unittests/Extractor/ExtractorTests.cpp.o
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:109:241: warning: missing terminating " character
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:109:3: error: missing terminating " character
   ASSERT_TRUE(extractFromIR(R"m(
   ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:110:13: error: stray ‘@’ in program
 define void @f(i32 %p, i32 %q) {
             ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:118:1: error: stray ‘\’ in program
 )m"));
 ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:129:241: warning: missing terminating " character
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:129:3: error: missing terminating " character
   ASSERT_TRUE(extractFromIR(R"m(
   ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:130:13: error: stray ‘@’ in program
 define void @f(i32 %p, i32 %q) {
             ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:135:1: error: stray ‘\’ in program
 )m"));
 ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:144:241: warning: missing terminating " character
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:144:3: error: missing terminating " character
   ASSERT_TRUE(extractFromIR(R"m(
   ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:145:13: error: stray ‘@’ in program
 define void @f(i32 %p, i32 %q) {
             ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:162:1: error: stray ‘\’ in program
 )m"));
 ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:170:241: warning: missing terminating " character
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:170:3: error: missing terminating " character
   ASSERT_TRUE(extractFromIR(R"m(
   ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:171:13: error: stray ‘@’ in program
 define void @f(i32 %p, i32 %q) {
             ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:184:1: error: stray ‘\’ in program
 )m"));
 ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:186:237: warning: missing terminating " character
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:186:3: error: missing terminating " character
   EXPECT_TRUE(hasCandidate(R"c(%0:i32 = var ; phi
   ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:190:1: error: stray ‘\’ in program
 )c"));
 ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:195:241: warning: missing terminating " character
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:195:3: error: missing terminating " character
   ASSERT_TRUE(extractFromIR(R"m(
   ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:196:13: error: stray ‘@’ in program
 define void @f(i1 %p, i1 %q, i1 %r, i1 %s) {
             ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:219:1: error: stray ‘\’ in program
 )m"));
 ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:221:237: warning: missing terminating " character
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:221:3: error: missing terminating " character
   EXPECT_TRUE(hasCandidate(R"c(%0:i1 = var ; p
   ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:231:1: error: stray ‘\’ in program
 )c"));
 ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:233:237: warning: missing terminating " character
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:233:3: error: missing terminating " character
   EXPECT_TRUE(hasCandidate(R"c(%0:i1 = var ; s
   ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:237:1: error: stray ‘\’ in program
 )c"));
 ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp: In member function ‘virtual void ExtractorTest_Simple_Test::TestBody()’:
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:110:1: error: ‘define’ was not declared in this scope
 define void @f(i32 %p, i32 %q) {
 ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:109:28: error: return-statement with a value, in function returning 'void' [-fpermissive]
   ASSERT_TRUE(extractFromIR(R"m(
                            ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:109:29: error: expected ‘;’ before ‘)’ token
   ASSERT_TRUE(extractFromIR(R"m(
                             ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:109:29: error: expected primary-expression before ‘)’ token
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp: In member function ‘virtual void ExtractorTest_Nsw_Test::TestBody()’:
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:130:1: error: ‘define’ was not declared in this scope
 define void @f(i32 %p, i32 %q) {
 ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:129:28: error: return-statement with a value, in function returning 'void' [-fpermissive]
   ASSERT_TRUE(extractFromIR(R"m(
                            ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:129:29: error: expected ‘;’ before ‘)’ token
   ASSERT_TRUE(extractFromIR(R"m(
                             ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:129:29: error: expected primary-expression before ‘)’ token
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp: In member function ‘virtual void ExtractorTest_PhiCond_Test::TestBody()’:
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:145:1: error: ‘define’ was not declared in this scope
 define void @f(i32 %p, i32 %q) {
 ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:144:28: error: return-statement with a value, in function returning 'void' [-fpermissive]
   ASSERT_TRUE(extractFromIR(R"m(
                            ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:144:29: error: expected ‘;’ before ‘)’ token
   ASSERT_TRUE(extractFromIR(R"m(
                             ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:144:29: error: expected primary-expression before ‘)’ token
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp: In member function ‘virtual void ExtractorTest_PhiLoop_Test::TestBody()’:
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:171:1: error: ‘define’ was not declared in this scope
 define void @f(i32 %p, i32 %q) {
 ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:170:28: error: return-statement with a value, in function returning 'void' [-fpermissive]
   ASSERT_TRUE(extractFromIR(R"m(
                            ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:170:29: error: expected ‘;’ before ‘)’ token
   ASSERT_TRUE(extractFromIR(R"m(
                             ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:170:29: error: expected primary-expression before ‘)’ token
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:186:46: error: expected primary-expression before ‘(’ token
   EXPECT_TRUE(hasCandidate(R"c(%0:i32 = var ; phi
                                              ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:187:1: error: expected primary-expression before ‘%’ token
 %1:i32 = add 1:i32, %0
 ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:187:21: error: expected primary-expression before ‘%’ token
 %1:i32 = add 1:i32, %0
                     ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:188:20: error: expected primary-expression before ‘%’ token
 %2:i1 = eq 42:i32, %1
                    ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp: In member function ‘virtual void ExtractorTest_PathCondition_Test::TestBody()’:
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:196:1: error: ‘define’ was not declared in this scope
 define void @f(i1 %p, i1 %q, i1 %r, i1 %s) {
 ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:195:28: error: return-statement with a value, in function returning 'void' [-fpermissive]
   ASSERT_TRUE(extractFromIR(R"m(
                            ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:195:29: error: expected ‘;’ before ‘)’ token
   ASSERT_TRUE(extractFromIR(R"m(
                             ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:195:29: error: expected primary-expression before ‘)’ token
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:221:46: error: expected primary-expression before ‘(’ token
   EXPECT_TRUE(hasCandidate(R"c(%0:i1 = var ; p
                                              ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:222:1: error: expected primary-expression before ‘%’ token
 %1:i1 = var ; q
 ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:222:15: error: ‘q’ was not declared in this scope
 %1:i1 = var ; q
               ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:225:15: error: ‘r’ was not declared in this scope
 %3:i1 = var ; r
               ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:234:1: error: ‘pc’ was not declared in this scope
 pc %0 1:i1
 ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:233:46: error: expected primary-expression before ‘(’ token
   EXPECT_TRUE(hasCandidate(R"c(%0:i1 = var ; s
                                              ^
/home/julian.stecklina/src/souper/unittests/Extractor/ExtractorTests.cpp:235:18: error: expected primary-expression before ‘%’ token
 %1:i1 = eq 1:i1, %0
                  ^
CMakeFiles/extractor_tests.dir/build.make:62: recipe for target 'CMakeFiles/extractor_tests.dir/unittests/Extractor/ExtractorTests.cpp.o' failed
make[2]: *** [CMakeFiles/extractor_tests.dir/unittests/Extractor/ExtractorTests.cpp.o] Error 1
CMakeFiles/Makefile2:154: recipe for target 'CMakeFiles/extractor_tests.dir/all' failed
make[1]: *** [CMakeFiles/extractor_tests.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

compile error on mac

This problem has existed for a while, just getting around to reporting it now. This file:

klee/include/klee/util/ExprHashMap.h

contains this "tr1" stuff, which interferes with compilation on Mac but not Linux:

#include <tr1/unordered_map>

In the latest klee, this stuff is protected by a libcpp version check. So perhaps we just want to bump klee version.

select is wrong when there is undefined behavior

I'll just paste in a mail that I sent to the llvmdev list. The solution is for Raimondas to adapt his phi+UB patch to do something analogous for select, I think.

In the section about poison values, the LLVM language reference manual says:

"Values other than phi nodes depend on their operands."

This implies that a select instruction's output can be poisoned by its not-selected argument value. If select were poisoned only by its selected argument, we would expect this fact to be mentioned specifically, as it is for phi.

Next I'll show how poisoning by the not-selected value can introduce a problem. Take this C program:

int printf(const char *, ...);
int foo(int x0) {
  int x3 = x0 >> 27;
  int x4 = x3 - 27;
  int x2 = x4 ? x4 : (1 >> x4);
  int x1 = x2 != 0;
  return x1;
}
int arg;
int main (void) {
  int x1 = foo(arg);
  printf("%x\n", x1);
  return 0;
}

This program has no UB and it should print "1'.

Clang at -O2 turns foo() into this:

define i32 @foo(i32 %x0) #0 {
entry:
  %shr = ashr i32 %x0, 27
  %sub = add nsw i32 %shr, -27
  %tobool = icmp ne i32 %sub, 0
  %shr1 = lshr i32 1, %sub
  %cond = select i1 %tobool, i32 %sub, i32 %shr1
  %cmp = icmp ne i32 %cond, 0
  %conv = zext i1 %cmp to i32
  ret i32 %conv
}

Although the translation is very straightforward, there's a problem: %shr1 is a poison value and although it is not selected, it propagates through the rest of the function and poisons the return value.

Obviously things are not going well if the source program contains no UB and the emitted code returns a poison value. Either there has been a miscompilation or else the LLVM documentation should be changed to make it clear that the select instruction only has a value dependence on the value that is selected.

The immediate problem here is that both Souper and Alive are happy for foo() to return 0 instead of 1. Souper actually does this optimization -- legally, of course, according to the reference manual.

Error undefined symbol FoldingSetImpl

The command I'm using:

$ /path/to/opt -load /path/to/souper/build/libsouperPass.so -souper -z3-path=/path/to/z3/bin/z3 solveme.bc -o opt_solveme.bc

The error I'm getting:

Error opening '/path/to/souper/build/libsouperPass.so':  /path/to/souper/build/libsouperPass.so: undefined symbol: _ZN4llvm14FoldingSetImpl6anchorEv

I get the same error with the clang command listed in the google/souper README. Running souper seems to output the optimizations I want but I can't them to apply to the bitcode. The drop in sclang compiles the bitcode but I do not see the optimizations when disassembling.

This is the optimization that souper spits out:

; Static profile 1
; Function: do_something
%0:i32 = var
%1:i32 = and 7:i32, %0
%2:i32 = or 7:i32, %1
%3:i32 = mul %2, %2
%4:i32 = mul 2:i32, %3
%5:i32 = or 4:i32, %1
%6:i32 = mul %5, %5
%7:i32 = mul 83:i32, %6
%8:i1 = eq %4, %7
cand %8 0:i1

If I'm reading it correctly, it optimized all that stuff to 0. That is exactly what I was hoping it would do.

internal cache should index by pointer

Currently the internal cache is indexed by IR string. The other day Peter suggested it could be sped up by using an Inst * as the index. This strategy demands that we never delete an Inst *, which I believe is already the case. What else is required to make this work?

This modification is interesting because it's one of the few opportunities we have to speed up the case where the external cache always hits. In contrast, the performance of the cold-cache case seems less interesting since we can spend arbitrary amounts of time waiting for solvers, trying different synthesis strategies, etc.

Remove n-ary associative operator from ExprBuilder::addPathConditions

I noticed that there is still one place where an instruction with an n-ary associative operator, namely and, is created:

https://github.com/google/souper/blob/master/lib/Extractor/Candidates.cpp#L494

This should be removed.

@chenyang78 could you please have a look at it?

OOM in getUBPhiPaths

While building current LLVM, getUBPhiPaths recurses to depth ~1000, eventually exhausting RAM on the machine I'm using. I've read the code and we already have some attempts to use less resources in this part of Souper. My current thought is that we simply permit Souper to fail (by bailing on this particular expression) when such a large depth of phi nodes is found, what do people think?

teaching souper about memory

I'm wondering whether there are some cases where it's easy and profitable to teach Souper about memory. For example, below is a reasonably fast popcount that uses table lookup. It does not seem that difficult to recognize an inbounds GEP followed by a load from a constant array, and then construct a map from address to values satisfying the load.

; ModuleID = 'popcount2.c'
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

@BitsSetTable256 = internal unnamed_addr constant [256 x i8] c"\00\01\01\02\01\02\02\03\01\02\02\03\02\03\03\04\01\02\02\03\02\03\03\04\02\03\03\04\03\04\04\05\01\02\02\03\02\03\03\04\02\03\03\04\03\04\04\05\02\03\03\04\03\04\04\05\03\04\04\05\04\05\05\06\01\02\02\03\02\03\03\04\02\03\03\04\03\04\04\05\02\03\03\04\03\04\04\05\03\04\04\05\04\05\05\06\02\03\03\04\03\04\04\05\03\04\04\05\04\05\05\06\03\04\04\05\04\05\05\06\04\05\05\06\05\06\06\07\01\02\02\03\02\03\03\04\02\03\03\04\03\04\04\05\02\03\03\04\03\04\04\05\03\04\04\05\04\05\05\06\02\03\03\04\03\04\04\05\03\04\04\05\04\05\05\06\03\04\04\05\04\05\05\06\04\05\05\06\05\06\06\07\02\03\03\04\03\04\04\05\03\04\04\05\04\05\05\06\03\04\04\05\04\05\05\06\04\05\05\06\05\06\06\07\03\04\04\05\04\05\05\06\04\05\05\06\05\06\06\07\04\05\05\06\05\06\06\07\05\06\06\07\06\07\07\08", align 16

; Function Attrs: nounwind readnone uwtable
define i32 @popcount7(i32 %v) #0 {
  %1 = and i32 %v, 255
  %2 = zext i32 %1 to i64
  %3 = getelementptr inbounds [256 x i8]* @BitsSetTable256, i64 0, i64 %2
  %4 = load i8* %3, align 1, !tbaa !1
  %5 = zext i8 %4 to i32
  %6 = lshr i32 %v, 8
  %7 = and i32 %6, 255
  %8 = zext i32 %7 to i64
  %9 = getelementptr inbounds [256 x i8]* @BitsSetTable256, i64 0, i64 %8
  %10 = load i8* %9, align 1, !tbaa !1
  %11 = zext i8 %10 to i32
  %12 = add nuw nsw i32 %11, %5
  %13 = lshr i32 %v, 16
  %14 = and i32 %13, 255
  %15 = zext i32 %14 to i64
  %16 = getelementptr inbounds [256 x i8]* @BitsSetTable256, i64 0, i64 %15
  %17 = load i8* %16, align 1, !tbaa !1
  %18 = zext i8 %17 to i32
  %19 = add nuw nsw i32 %12, %18
  %20 = lshr i32 %v, 24
  %21 = zext i32 %20 to i64
  %22 = getelementptr inbounds [256 x i8]* @BitsSetTable256, i64 0, i64 %21
  %23 = load i8* %22, align 1, !tbaa !1
  %24 = zext i8 %23 to i32
  %25 = add nuw nsw i32 %19, %24
  ret i32 %25
}

attributes #0 = { nounwind readnone uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.ident = !{!0}

!0 = metadata !{metadata !"clang version 3.5.0 (tags/RELEASE_350/final)"}
!1 = metadata !{metadata !2, metadata !2, i64 0}
!2 = metadata !{metadata !"omnipotent char", metadata !3, i64 0}
!3 = metadata !{metadata !"Simple C/C++ TBAA"}

crash bug

regehr@regehr-M51AC:reduce$ cat small.c
a, c;
volatile b;
fn1();
fn2() {
  for (; fn1 != 1;) {
    long *d = a = b < 0 <= fn2;
    c = *d;
  }
}
regehr@regehr-M51AC:reduce$ ~/souper-regehr/third_party/llvm/Debug/bin/clang -O -c -emit-llvm small.c -w
regehr@regehr-M51AC:reduce$ ~/souper-regehr/build/souper -z3-path=/usr/bin/z3 small.bc
souper: /home/regehr/souper-regehr/lib/Extractor/KLEEBuilder.cpp:486: ref<klee::Expr> (anonymous namespace)::ExprBuilder::getUBInstCondition(): Assertion `PredExpr.size() && "there must be path predicates for the UBs"' failed.
0  souper          0x0000000000a0b31e llvm::sys::PrintStackTrace(_IO_FILE*) + 46
1  souper          0x0000000000a0b5fb
2  souper          0x0000000000a0cd5e
3  libpthread.so.0 0x00007f858087f340
4  libc.so.6       0x00007f857faa6bb9 gsignal + 57
5  libc.so.6       0x00007f857faa9fc8 abort + 328
6  libc.so.6       0x00007f857fa9fa76
7  libc.so.6       0x00007f857fa9fb22
8  souper          0x000000000054fe9d
9  souper          0x000000000054f79a souper::GetCandidateExprForReplacement(std::vector<souper::InstMapping, std::allocator<souper::InstMapping> > const&, souper::InstMapping) + 410
10 souper          0x0000000000595cd6 souper::AddToCandidateMap(std::map<std::string, souper::CandidateMapEntry, std::less<std::string>, std::allocator<std::pair<std::string const, souper::CandidateMapEntry> > >&, souper::CandidateReplacement const&) + 70
11 souper          0x000000000056ce78 souper::AddModuleToCandidateMap(souper::InstContext&, souper::ExprBuilderContext&, std::map<std::string, souper::CandidateMapEntry, std::less<std::string>, std::allocator<std::pair<std::string const, souper::CandidateMapEntry> > >&, llvm::Module*) + 376
12 souper          0x00000000005445d4 main + 660
13 libc.so.6       0x00007f857fa91ec5 __libc_start_main + 245
14 souper          0x0000000000544274
Stack dump:
0.  Program arguments: /home/regehr/souper-regehr/build/souper -z3-path=/usr/bin/z3 small.bc 
Aborted (core dumped)

UB not handled correctly on the RHS

Souper is almost ready to do real instruction synthesis. One obstacle is its failure to share the PrintContext between the LHS and RHS parsing and printing functions. I'll have that fixed soon. The final obstacle relates to the role of UB. On the LHS, it makes optimization easier. On the RHS, it makes optimization more difficult. In a nutshell, add nsw x, y can be transformed to add x, y, but not the other way around. The solution is to teach Souper to do a real refinement check between the LHS and RHS. I believe that after Raimondas's UB work over the last couple of months, this isn't too difficult. The desired behavior (this is from an email Peter sent me a while ago) is:

LHS UB assertion => (LHS == RHS) && RHS UB assertion

Return smallest optimizations when costs are ignored

The expectation here is that souper finds the slt instead of the longer version.

%0:i8 = var
%1:i1 = slt %0, 0:i8
infer %1
%2:i8 = lshr %0, 7:i8
%3:i1 = trunc %2
result %3

To implement this, we have to disable longer wirings for the current number of components, and, if a smaller version is not found, re-enable them in the next round and so on.

Add KnownBits support for souper2llvm

how to do synthesis

Code synthesis is the next major piece of work for Souper. For each iN in a program, Souper wants to see if it can be computed more cheaply. These are its building blocks:

synthesized constants, including undef
variables from the LHS
instructions
Souper already knows how to do 1 (though not undef yet). 2 is easy. 3 is the main trick. Here's how we'll do it.

We start out with a cost function. Although you can imagine many ways to do this, to a first approximation the cost of a RHS will be the number of instructions. So 1 and 2 in the list above represent zero-cost replacements. The set of cost-1 replacements is just the collection of LLVM instructions that input one or more synthesized constants and LHS variables, and output the desired iN, under the constraint that the LLVM type system is happy. Mostly the type system just wants bitwidths to match up, so for example if we're synthesizing an i32, there's no way that an icmp instruction can produce it. Another constraint is that we don't want to try to synthesize anything that will trivially constant-fold, such as a zext with a constant input or an add with two constant inputs.

Given a cost function, we batch up solutions in such a way that our first query will succeed iff there exists a cost-0 replacement, our second query will succeed iff there's a cost-1 replacement, etc. We issue these queries sequentially and stop when we find a solution (which is guaranteed to be the cheapest one) or else we get bored. It is easy to see that as the cost increases, the number of possible solutions grows exponentially. We need a way to prevent the query size from exploding. This slide deck outlines an encoding that will probably work for us:

http://www.eecs.berkeley.edu/~jha/pres/pldi11.pdf

The idea is to push the exponential search into the solver (which is doing a huge search anyway, so why not get it to consider a few more bits of search space).

cache design

At the moment Souper has two kinds of cache entry:

Query entries where the key is the LHS+RHS of a valid replacement and the value is empty.
Profile entries where the key is a concatenation of the string "profile\n", the srcloc with a newline, and the valid replacement. The value is the dynamic count.

These kinds of cache entry can be left alone or one or both could be eliminated and/or combined with the following new kind of entry.

To support inference, we need a new kind of cache entry where the key is the query LHS. I was thinking to structure the RHS of these entries like this, with each field except the last occupying its own line (the last field occupies as many lines as it needs):

a mandatory result code indicating at least valid/invalid, but also leaving room to indicate error conditions such as solver timeout, disagreement amongst solvers, etc.
zero or more srclocs combined with static counts (that is, number of times this replacement has fired at compile time, as opposed to runtime), with each one perhaps occupying its own line
if the result code indicates validity, a mandatory query RHS

Feedback appreciated.

blockpc is unsound

The behavior of the test case below is changed by Souper even with UB turned off. Of course it could be an LLVM bug exposed by Souper, but we need to take a look.

$ cat small.c
int printf(const char *, ...);
int x0, x1, x2, x3, x4, x5;
int main() {
  if (x1 <= 0 == 0) {
    x2 = x5 = x0 >= 2 ?: 0 < x0;
    if (x1 >= x5)
      x3 = 0;
  }
  x4 = x2;
  x1 || printf("x\n");
  return 0;
}
$ /home/regehr/f/souper-regehr/third_party/llvm/Release/bin/clang -O3 small.c -o small2 -Xclang -load -Xclang /home/regehr/f/souper-regehr/build/libsouperPass.so -mllvm -z3-path=/usr/bin/z3 -mllvm -solver-timeout=15 -mllvm -souper-debug -mllvm -souper-external-cache -mllvm -souper-exploit-ub=false 

; Listing applied replacements for main
; Using solver: Z3 + external cache + internal cache

; Listing applied replacements for main
; Using solver: Z3 + external cache + internal cache

; Listing applied replacements for main
; Using solver: Z3 + external cache + internal cache

; Replacing "  %12 = icmp slt i32 %1, %11"
; from ""
; with "i1 false" in:
%0 = block 2
%1:i32 = var
%2:i1 = slt %1, 1:i32
blockpc %0 0 %2 0:i1
%3:i32 = var
%4:i1 = slt 1:i32, %3
blockpc %0 0 %4 0:i1
blockpc %0 1 %2 0:i1
%5:i1 = slt 0:i32, %3
%6:i32 = zext %5
%7:i32 = zext %4
%8:i32 = phi %0, %6, %7
%9:i1 = slt %1, %8
cand %9 0:i1

; Replacing "  %15 = icmp eq i32 %1, 0"
; from ""
; with "i1 false" in:
%0 = block 3
%1:i32 = var
%2:i1 = slt %1, 1:i32
blockpc %0 0 %2 1:i1
%3 = block 2
blockpc %3 0 %2 0:i1
blockpc %3 1 %2 0:i1
%4:i32 = var
%5:i1 = slt 0:i32, %4
%6:i32 = zext %5
%7:i1 = slt 1:i32, %4
%8:i32 = zext %7
%9:i32 = phi %3, %6, %8
%10:i1 = slt %1, %9
blockpc %0 2 %10 0:i1
%11:i1 = eq 0:i32, %1
cand %11 0:i1

; Listing applied replacements for main
; Using solver: Z3 + external cache + internal cache

; Listing applied replacements for main
; Using solver: Z3 + external cache + internal cache
$ ./small2
$ clang small.c -o small2
$ ./small2
x

souper doesn't exploit assume()

Souper should find the optimization here, but doesn't. Perhaps LLVM drops the branch to unreachable before Souper can use it, or perhaps we somehow fail to get the appropriate path condition out of it.

static inline void assume (int x) {
  if (!x) __builtin_unreachable();
}

int odd (int x) {
  return x & 1;
}

int even (int x) {
  return (x & 1) == 0;
}

int add1 (int x, int y) {
  assume(odd(x));
  assume(odd(y));
  return even(x + y);
}

crash bug with dynamic profiling

Does anyone have time to take a look at this one? This crash comes from compiling gcc from SPEC CPU 2006 with dynamic optimization profiling. The crash happens in InstCombine, not in Souper itself -- so this may not be our bug.

$ cat small.c
enum e { x1, x2, x3 } x4;

int x7(int);

void x5(void) {
  enum e x6 = x4 > 0;
  switch (x6) {
  case 1:
    x6 = x3;
    break;
  case 0:
    x6 = 1;
  }
  (x6 == 1 || x6 == x3) && (x6 == 1 || x6 == x3) && x4 && x7(x6);
}
$ /home/regehr/f/souper-regehr/third_party/llvm/Debug/bin/clang -c -w -O small.c -Xclang -load -Xclang /home/regehr/f/souper-regehr/build/libsouperPass.so -mllvm -z3-path=/usr/bin/z3 -mllvm -souper-exploit-ub=false -mllvm -souper-dynamic-profile
clang-3.7: ../include/llvm/Support/Casting.h:223: typename std::enable_if<!is_simple_type<Y>::value, typename cast_retty<X, const Y>::ret_type>::type llvm::cast(const Y &) [X = llvm::PHINode, Y = llvm::ilist_iterator<llvm::Instruction>]: Assertion `isa<X>(Val) && "cast<Ty>() argument of incompatible type!"' failed.
#0 0x165cdce llvm::sys::PrintStackTrace(_IO_FILE*) /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../lib/Support/Unix/Signals.inc:422:15
#1 0x165db7b PrintStackTraceSignalHandler(void*) /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../lib/Support/Unix/Signals.inc:481:1
#2 0x1660143 SignalHandler(int) /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../lib/Support/Unix/Signals.inc:198:60
#3 0x7f233ae2b340 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x10340)
#4 0x7f233a041cc9 gsignal (/lib/x86_64-linux-gnu/libc.so.6+0x36cc9)
#5 0x7f233a0450d8 abort (/lib/x86_64-linux-gnu/libc.so.6+0x3a0d8)
#6 0x7f233a03ab86 (/lib/x86_64-linux-gnu/libc.so.6+0x2fb86)
#7 0x7f233a03ac32 (/lib/x86_64-linux-gnu/libc.so.6+0x2fc32)
#8 0xd92916 _ZN4llvm4castINS_7PHINodeENS_14ilist_iteratorINS_11InstructionEEEEENSt9enable_ifIXntsr14is_simple_typeIT0_EE5valueENS_10cast_rettyIT_KS6_E8ret_typeEE4typeERS9_ /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../include/llvm/Support/Casting.h:224:10
#9 0x139baf3 llvm::InstCombiner::visitPHINode(llvm::PHINode&) /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../lib/Transforms/InstCombine/InstCombinePHI.cpp:870:12
#10 0x13265d8 llvm::InstVisitor<llvm::InstCombiner, llvm::Instruction*>::visitPHI(llvm::PHINode&) /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../include/llvm/IR/Instruction.def:164:1
#11 0x13252d7 llvm::InstVisitor<llvm::InstCombiner, llvm::Instruction*>::visit(llvm::Instruction&) /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../include/llvm/IR/Instruction.def:164:1
#12 0x132476c llvm::InstCombiner::run() /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../lib/Transforms/InstCombine/InstructionCombining.cpp:2695:22
#13 0x1325abc combineInstructionsOverFunction(llvm::Function&, llvm::InstCombineWorklist&, llvm::AssumptionCache&, llvm::TargetLibraryInfo&, llvm::DominatorTree&, llvm::DataLayout const*, llvm::LoopInfo*) /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../lib/Transforms/InstCombine/InstructionCombining.cpp:2941:9
#14 0x132af3c (anonymous namespace)::InstructionCombiningPass::runOnFunction(llvm::Function&) /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../lib/Transforms/InstCombine/InstructionCombining.cpp:3015:3
#15 0x12203ed llvm::FPPassManager::runOnFunction(llvm::Function&) /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../lib/IR/LegacyPassManager.cpp:1538:23
#16 0x2151dac (anonymous namespace)::CGPassManager::RunPassOnSCC(llvm::Pass*, llvm::CallGraphSCC&, llvm::CallGraph&, bool&, bool&) /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../lib/Analysis/IPA/CallGraphSCCPass.cpp:151:20
#17 0x215174e (anonymous namespace)::CGPassManager::RunAllPassesOnSCC(llvm::CallGraphSCC&, llvm::CallGraph&, bool&) /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../lib/Analysis/IPA/CallGraphSCCPass.cpp:416:16
#18 0x215104b (anonymous namespace)::CGPassManager::runOnModule(llvm::Module&) /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../lib/Analysis/IPA/CallGraphSCCPass.cpp:472:18
#19 0x1220db9 (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../lib/IR/LegacyPassManager.cpp:1616:23
#20 0x12209ae llvm::legacy::PassManagerImpl::run(llvm::Module&) /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../lib/IR/LegacyPassManager.cpp:1723:16
#21 0x1221381 llvm::legacy::PassManager::run(llvm::Module&) /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../lib/IR/LegacyPassManager.cpp:1756:3
#22 0x1cd1f75 (anonymous namespace)::EmitAssemblyHelper::EmitAssembly(clang::BackendAction, llvm::raw_ostream*) /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../tools/clang/lib/CodeGen/BackendUtil.cpp:635:5
#23 0x1cd1902 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::raw_ostream*) /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../tools/clang/lib/CodeGen/BackendUtil.cpp:652:3
#24 0x1cb81f3 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../tools/clang/lib/CodeGen/CodeGenAction.cpp:178:7
#25 0x2984eeb clang::ParseAST(clang::Sema&, bool, bool) /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../tools/clang/lib/Parse/ParseAST.cpp:151:3
#26 0x188d489 clang::ASTFrontendAction::ExecuteAction() /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../tools/clang/lib/Frontend/FrontendAction.cpp:537:1
#27 0x1cb770d clang::CodeGenAction::ExecuteAction() /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../tools/clang/lib/CodeGen/CodeGenAction.cpp:726:1
#28 0x188cc40 clang::FrontendAction::Execute() /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../tools/clang/lib/Frontend/FrontendAction.cpp:441:7
#29 0x184bebb clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../tools/clang/lib/Frontend/CompilerInstance.cpp:803:7
#30 0x19df998 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../tools/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:222:8
#31 0x9460f7 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../tools/clang/tools/driver/cc1_main.cpp:110:3
#32 0x93ba43 ExecuteCC1Tool(llvm::ArrayRef<char const*>, llvm::StringRef) /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../tools/clang/tools/driver/driver.cpp:369:5
#33 0x93aa17 main /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/../tools/clang/tools/driver/driver.cpp:415:5
#34 0x7f233a02cec5 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21ec5)
#35 0x93a4c4 _start (/mnt/ssd/f/souper-regehr/third_party/llvm/Debug/bin/clang-3.7+0x93a4c4)
Stack dump:
0.  Program arguments: /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/bin/clang-3.7 -cc1 -triple x86_64-unknown-linux-gnu -emit-obj -disable-free -main-file-name small.c -mrelocation-model static -mthread-model posix -fmath-errno -masm-verbose -mconstructor-aliases -munwind-tables -fuse-init-array -target-cpu x86-64 -momit-leaf-frame-pointer -dwarf-column-info -coverage-file /home/regehr/tmp/build_base_gcc43-64bit.0000/small.c -resource-dir /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/bin/../lib/clang/3.7.0 -internal-isystem /usr/local/include -internal-isystem /mnt/ssd/f/souper-regehr/third_party/llvm/Debug/bin/../lib/clang/3.7.0/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -O2 -w -fdebug-compilation-dir /home/regehr/tmp/build_base_gcc43-64bit.0000 -ferror-limit 19 -fmessage-length 100 -mstackrealign -fobjc-runtime=gcc -fdiagnostics-show-option -fcolor-diagnostics -vectorize-loops -vectorize-slp -load /home/regehr/f/souper-regehr/build/libsouperPass.so -mllvm -z3-path=/usr/bin/z3 -mllvm -souper-exploit-ub=false -mllvm -souper-dynamic-profile -o small.o -x c small.c 
1.  <eof> parser at end of file
2.  Per-module optimization passes
3.  Running pass 'CallGraph Pass Manager' on module 'small.c'.
4.  Running pass 'Combine redundant instructions' on function '@x5'
clang-3.7: error: unable to execute command: Aborted
clang-3.7: error: clang frontend command failed due to signal (use -v to see invocation)
clang version 3.7.0 (trunk 228761)
Target: x86_64-unknown-linux-gnu
Thread model: posix
clang-3.7: note: diagnostic msg: PLEASE submit a bug report to http://llvm.org/bugs/ and include the crash backtrace, preprocessed source, and associated run script.
clang-3.7: note: diagnostic msg: 
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-3.7: note: diagnostic msg: /tmp/small-3cf596.c
clang-3.7: note: diagnostic msg: /tmp/small-3cf596.sh
clang-3.7: note: diagnostic msg: 

********************
$

Add synthesis documentation

Explain in README a collection of command line options that give good basic synthesis functionality

Extend souper2llvm with LHS/RHS instruction printing

There should be two options, -print-lhs and -print-rhs.

Implement GEP instruction

The GEP shouldn't be unrolled using Souper instructions, but rather in the KLEEBuilder backend.

sclang does not support atomics

When I try to initiate a cmake build of LLVM using the sclang and sclang++ drivers, cmake complains that they do not support std::atomic:

-- Performing Test HAVE_CXX_ATOMICS_WITHOUT_LIB
-- Performing Test HAVE_CXX_ATOMICS_WITHOUT_LIB - Failed
-- Looking for __atomic_fetch_add_4 in atomic
-- Looking for __atomic_fetch_add_4 in atomic - found
-- Performing Test HAVE_CXX_ATOMICS_WITH_LIB
-- Performing Test HAVE_CXX_ATOMICS_WITH_LIB - Failed
CMake Error at cmake/modules/CheckAtomic.cmake:33 (message):
  Host compiler must support std::atomic!
Call Stack (most recent call first):
  cmake/config-ix.cmake:296 (include)
  CMakeLists.txt:409 (include)


-- Configuring incomplete, errors occurred!
See also "/Users/aarzee/llvmbuildsouper/CMakeFiles/CMakeOutput.log".
See also "/Users/aarzee/llvmbuildsouper/CMakeFiles/CMakeError.log".

missed optimizations due to missing path conditions on switch cases

Here's an example that Souper misses, that I feel it should be able to get:

int foo (int a) {
  int c;
  switch (a) {
  case 0:  c = 10+a; break;
  case 1:  c = 9+a;  break;
  case 2:  c = 8+a;  break;
  case 3:  c = 7+a;  break;
  case 4:  c = 6+a;  break;
  case 5:  c = 5+a;  break;
  case 6:  c = 4+a;  break;
  case 7:  c = 3+a;  break;
  case 8:  c = 2+a;  break;
  case 9:  c = 1+a;  break;
  case 10: c = 0+a;  break;
  default: c = 10;
  }
  return c==10;
}

Both LLVM and GCC change this into "return 1" but if we compile it using only SROA we get this:

define i32 @foo(i32 %a) #0 {
entry:
  switch i32 %a, label %sw.default [
    i32 0, label %sw.bb
    i32 1, label %sw.bb1
    i32 2, label %sw.bb3
    i32 3, label %sw.bb5
    i32 4, label %sw.bb7
    i32 5, label %sw.bb9
    i32 6, label %sw.bb11
    i32 7, label %sw.bb13
    i32 8, label %sw.bb15
    i32 9, label %sw.bb17
    i32 10, label %sw.bb19
  ]

sw.bb:                                            ; preds = %entry
  %add = add nsw i32 10, %a
  br label %sw.epilog

sw.bb1:                                           ; preds = %entry
  %add2 = add nsw i32 9, %a
  br label %sw.epilog

sw.bb3:                                           ; preds = %entry
  %add4 = add nsw i32 8, %a
  br label %sw.epilog

sw.bb5:                                           ; preds = %entry
  %add6 = add nsw i32 7, %a
  br label %sw.epilog

sw.bb7:                                           ; preds = %entry
  %add8 = add nsw i32 6, %a
  br label %sw.epilog

sw.bb9:                                           ; preds = %entry
  %add10 = add nsw i32 5, %a
  br label %sw.epilog

sw.bb11:                                          ; preds = %entry
  %add12 = add nsw i32 4, %a
  br label %sw.epilog

sw.bb13:                                          ; preds = %entry
  %add14 = add nsw i32 3, %a
  br label %sw.epilog

sw.bb15:                                          ; preds = %entry
  %add16 = add nsw i32 2, %a
  br label %sw.epilog

sw.bb17:                                          ; preds = %entry
  %add18 = add nsw i32 1, %a
  br label %sw.epilog

sw.bb19:                                          ; preds = %entry
  %add20 = add nsw i32 0, %a
  br label %sw.epilog

sw.default:                                       ; preds = %entry
  br label %sw.epilog

sw.epilog:                                        ; preds = %sw.default, %sw.bb19, %sw.bb17, %sw.bb15, %sw.bb13, %sw.bb11, %sw.bb9, %sw.bb7, %sw.bb5, %sw.bb3, %sw.bb1, %sw.bb
  %c.0 = phi i32 [ 10, %sw.default ], [ %add20, %sw.bb19 ], [ %add18, %sw.bb17 ], [ %add16, %sw.bb15 ], [ %add14, %sw.bb13 ], [ %add12, %sw.bb11 ], [ %add10, %sw.bb9 ], [ %add8, %sw.bb7 ], [ %add6, %sw.bb5 ], [ %add4, %sw.bb3 ], [ %add2, %sw.bb1 ], [ %add, %sw.bb ]
  %cmp = icmp eq i32 %c.0, 10
  %conv = zext i1 %cmp to i32
  ret i32 %conv
}

attributes #0 = { nounwind uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.ident = !{!0}

!0 = metadata !{metadata !"clang version 3.6.0 (219188)"}

So far so good. But here is Souper's version:

; Function: foo
%0 = block
%1:i32 = var
%2:i32 = addnsw 0:i32, %1
%3:i32 = addnsw 1:i32, %1
%4:i32 = addnsw 2:i32, %1
%5:i32 = addnsw 3:i32, %1
%6:i32 = addnsw 4:i32, %1
%7:i32 = addnsw 5:i32, %1
%8:i32 = addnsw 6:i32, %1
%9:i32 = addnsw 7:i32, %1
%10:i32 = addnsw 8:i32, %1
%11:i32 = addnsw 9:i32, %1
%12:i32 = addnsw 10:i32, %1
%13:i32 = phi %0, 10:i32, %2, %3, %4, %5, %6, %7, %8, %9, %10, %11, %12
%14:i1 = eq 10:i32, %13
infer %14

The desired conclusion cannot be reached because the path condition guarding the switch cases haven't been extracted.

Documentation for infer and result

docs/InstRef.rst makes no mention of infer or result.

missed optimization

This one assumes that we take Yang's fix to #154.

The program below has two addition functions that do overflow checking. They are exactly equivalent. Souper is able to prove that check_result() never throws an assertion but for some reason it does not prove that check_overflow() never throws an assertion. Can someone look into it?

Result is independent of solver choice and I'm compiling like this:

~/f/souper-yang/build/sclang -O3 add.c -Wall -S -emit-llvm

C code:

#include <limits.h>
#include <stdint.h>
#include <assert.h>
#include <stdio.h>

static const int BITS = CHAR_BIT * sizeof(int16_t);

static int checked_add_2(int16_t a, int16_t b, int16_t *rp) {
  int16_t r = (uint16_t)a + (uint16_t)b;
  *rp = r;
  return (a < 0 && b < 0 && r >= 0) || (a >= 0 && b >= 0 && r < 0);
}

static int checked_add_3(int16_t a, int16_t b, int16_t *rp) {
  uint16_t ur = (uint16_t)a + (uint16_t)b;
  uint16_t sr = ur >> (BITS - 1);
  uint16_t sa = (uint16_t)a >> (BITS - 1);
  uint16_t sb = (uint16_t)b >> (BITS - 1);
  *rp = ur;
  return (sa && sb && !sr) || (!sa && !sb && sr);
}

void check_result(long x, long y) {
  int16_t r1, r2;
  checked_add_2(x, y, &r1);
  checked_add_3(x, y, &r2);
  assert(r1 == r2);
}

void check_overflow(long x, long y) {
  int16_t r1, r2;
  int o1 = checked_add_2(x, y, &r1);
  int o2 = checked_add_3(x, y, &r2);
  assert(o1 == o2);
}

int main (void)
{
  long x, y, c;
  c = 0;
  for (x=INT16_MIN; x<=INT16_MAX; x++) {
    for (y=INT16_MIN; y<=INT16_MAX; y++) {
      check_result(x, y);
      check_overflow(x, y);
      c++;
    }
  }
  printf ("ran 0x%lx tests\n", c);
  return 0;
}

Build problems

After ./update_deps.sh my terminal crashed, so the llvm-config location was not in the shell env. It was not documented in the README that I actually have to do cmake -DCMAKE_BUILD_TYPE=Debug -DLLVM_CONFIG_EXECUTABLE=/path_to_souper/third_party/llvm/Debug/bin/llvm-config /path_to_souper/ To make it work. This is needed for every time I want to build Souper from scratch (i.e. rm -rf build, mkdir build, cd build, cmake .., make). So, it should be there as an example, since it must be used multiple times and even the first time if it's e.g. not done from the same terminal.

Also, it may be good to warn the users that "Debug" is the default build in update_deps.sh. clang is souper slow (pun intended) in Debug mode.

Enforce UB constraints on the RHS in synthesis mode

Prevent following optimizations:

%0:i32 = var
%1:i1 = ne 0:i32, %0
%2:i32 = select %1, 2:i32, 0:i32
infer %2
%3:i32 = zext %1
%4:i32 = shl %3, %3
result %4

"make check" doesn't support valgrind?

The LLVM testing guide says to do this to run the regression tests under Valgrind:

make check LIT_ARGS="-v --vg --vg-leak"

However this doesn't seem to work in Souper. It just runs the test suite. Probably we can just grab a bit of makefile/lit support from LLVM and this will work. I haven't looked into it yet.

Assertion `RHS.VAL != 0 && "Divide by zero?"' failed.

The crash is in LLVM but called from our code (via Klee).

regehr@regehr-M51AC:reduce_crash_189061$ cat small.c
void fn1(void) {
  short a;
  1 / a ?: 1;
}
regehr@regehr-M51AC:reduce_crash_189061$ $SOUPER/third_party/llvm/Release/bin/clang -c small.c -w
regehr@regehr-M51AC:reduce_crash_189061$ $SOUPER/third_party/llvm/Release/bin/opt -sccp -inline -sroa small.bc -o small2.bc
regehr@regehr-M51AC:reduce_crash_189061$ $SOUPER/third_party/llvm/Release/bin/opt -load ${SOUPER}/build/libsouperPass.so -souper -external-cache-souper -stp-path=/usr/local/bin/stp -solver-timeout=10 small2.bc
WARNING: You're attempting to print out a bitcode file.
This is inadvisable as it may cause display problems. If
you REALLY want to taste LLVM bitcode first-hand, you
can force output with the `-f' option.

opt: /home/regehr/f/souper-regehr/third_party/llvm/lib/Support/APInt.cpp:1848: llvm::APInt llvm::APInt::udiv(const llvm::APInt &) const: Assertion `RHS.VAL != 0 && "Divide by zero?"' failed.
0  opt              0x0000000000d5c975 llvm::sys::PrintStackTrace(_IO_FILE*) + 37
1  opt              0x0000000000d5d21b
2  libpthread.so.0  0x00007f9331b11340
3  libc.so.6        0x00007f9330d38bb9 gsignal + 57
4  libc.so.6        0x00007f9330d3bfc8 abort + 328
5  libc.so.6        0x00007f9330d31a76
6  libc.so.6        0x00007f9330d31b22
7  opt              0x0000000000d0df25 llvm::APInt::udiv(llvm::APInt const&) const + 1573
8  opt              0x0000000000d128cf llvm::APInt::sdiv(llvm::APInt const&) const + 415
9  libsouperPass.so 0x00007f93326fd7bf klee::ConstantExpr::SDiv(klee::ref<klee::ConstantExpr> const&) + 31
10 libsouperPass.so 0x00007f93326fb29b klee::SDivExpr::create(klee::ref<klee::Expr> const&, klee::ref<klee::Expr> const&) + 91
11 libsouperPass.so 0x00007f9332720b87
12 libsouperPass.so 0x00007f93327202bc
13 libsouperPass.so 0x00007f93327215fa
14 libsouperPass.so 0x00007f93327202bc
15 libsouperPass.so 0x00007f933271e180 souper::GetCandidateExprForReplacement(std::vector<souper::InstMapping, std::allocator<souper::InstMapping> > const&, souper::InstMapping) + 288
16 libsouperPass.so 0x00007f933271887d souper::AddToCandidateMap(std::map<std::string, souper::CandidateMapEntry, std::less<std::string>, std::allocator<std::pair<std::string const, souper::CandidateMapEntry> > >&, souper::CandidateReplacement const&) + 45
17 libsouperPass.so 0x00007f933272fc1b
18 opt              0x0000000000a5e35b llvm::FPPassManager::runOnFunction(llvm::Function&) + 539
19 opt              0x0000000000a5e5cb llvm::FPPassManager::runOnModule(llvm::Module&) + 43
20 opt              0x0000000000a5eb47 llvm::legacy::PassManagerImpl::run(llvm::Module&) + 967
21 opt              0x0000000000535ea9 main + 6297
22 libc.so.6        0x00007f9330d23ec5 __libc_start_main + 245
23 opt              0x0000000000524da2
Stack dump:
0.  Program arguments: /home/regehr/f/souper-regehr/third_party/llvm/Release/bin/opt -load /home/regehr/f/souper-regehr/build/libsouperPass.so -souper -external-cache-souper -stp-path=/usr/local/bin/stp -solver-timeout=10 small2.bc 
1.  Running pass 'Function Pass Manager' on module 'small2.bc'.
2.  Running pass 'Souper super-optimizer pass' on function '@fn1'
Aborted (core dumped)

possible bug

I think we have a nasty bug. The checked_add_4() function in the code below is -- I believe -- a solid implementation of a 32-bit addition that checks for signed overflow. Souper proves that it always returns 0, which is nonsense. Here's the optimization:

%0 = block 3
%1:i32 = var
%2:i1 = slt 0:i32, %1
blockpc %0 0 %2 1:i1
%3:i32 = subnsw 2147483647:i32, %1
%4:i32 = var
%5:i1 = slt %3, %4
blockpc %0 0 %5 1:i1
%6:i1 = slt %1, 0:i32
blockpc %0 1 %6 1:i1
%7:i32 = subnsw 2147483648:i32, %1
%8:i1 = slt %4, %7
blockpc %0 1 %8 1:i1
%9:i32 = phi %0, 1:i32, 1:i32, 0:i32
cand %9 0:i32

I believe something has gone wrong here. The full code is below. It is supposed to print "-202 1", but when compiled with souper it prints "-202 0".

#define MIN ((int)(1U<<31))
#define MAX ((int)((1U<<31)-1))

int checked_add_4(int a, int b, int *rp) {
  if (b > 0 && a > MAX - b) {
    *rp = (a + MIN) + (b + MIN);
    return 1;
  }
  if (b < 0 && a < MIN - b) {
    *rp = (a - MIN) + (b - MIN);
    return 1;
  }
  *rp = a + b;
  return 0;
}

#ifdef TEST

#include <limits.h>
#include <assert.h>
#include <stdio.h>

int main (void)
{
  assert(MIN==INT_MIN);
  assert(MAX==INT_MAX);
  volatile int x = INT_MAX - 100;
  volatile int y = INT_MAX - 100;
  int r, o;
  o = checked_add_4(x, y, &r);
  printf ("%d %d\n", r, o);
  return 0;
}

#endif

Bump KLEE

and update @rsas's KLEE repo.

end-to-end test of redis subsystem

I was going to write an end-to-end test of Souper's use of redis, but I'm not totally sure what to test. Probably what I'll do is:

run souper over a large .bc file that comes from building LLVM
run souper-check on all optimizations that Redis has cached

This is easy but I'm not sure about the test oracle. It's tedious to specify the exact set of optimizations found by Souper, so maybe we can just assert that a certain number of optimizations is found. But of course Souper is allowed to get better over time, for example by finding optimizations that it missed earlier. Should it be a test fail when souper finds more optimizations? My initial opinion is that Souper won't get fundamentally better in this sense very often, so it's fine to fail the test and insist that someone update the count of expected optimizations. Reasonable?

possible blockpc bug

I'd have expected that blockpc would let Souper optimize the code below, but it doesn't. The thing that Souper is supposed to see is that %8 == 10.

; ModuleID = 'cprop8test.c'
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; Function Attrs: nounwind uwtable
define void @cprop_test8(i32* nocapture %data) #0 {
  %1 = load i32* %data, align 4, !tbaa !1
  %2 = icmp eq i32 %1, 0
  br i1 %2, label %5, label %3

; <label>:3                                       ; preds = %0
  %4 = getelementptr inbounds i32* %data, i64 1
  store i32 4, i32* %4, align 4, !tbaa !1
  br label %7

; <label>:5                                       ; preds = %0
  %6 = getelementptr inbounds i32* %data, i64 2
  store i32 3, i32* %6, align 4, !tbaa !1
  br label %7

; <label>:7                                       ; preds = %5, %3
  %j.0 = phi i32 [ 4, %3 ], [ 7, %5 ]
  %k.0 = phi i32 [ 6, %3 ], [ 3, %5 ]
  %8 = add nuw nsw i32 %k.0, %j.0
  %9 = mul nsw i32 %8, 21
  %10 = getelementptr inbounds i32* %data, i64 3
  store i32 %9, i32* %10, align 4, !tbaa !1
  ret void
}

attributes #0 = { nounwind uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.ident = !{!0}

!0 = metadata !{metadata !"clang version 3.6.0 (trunk 223586)"}
!1 = metadata !{metadata !2, metadata !2, i64 0}
!2 = metadata !{metadata !"int", metadata !3, i64 0}
!3 = metadata !{metadata !"omnipotent char", metadata !4, i64 0}
!4 = metadata !{metadata !"Simple C/C++ TBAA"}

Fix -souper-exploit-ub=false mode in synthesis

Currently, the -souper-exploit-ub=false mode doesn't work in synthesis and should not be used. The yet-to-be-found fix is not trivial, thus, it was not addressed in 21ed266.

parser bug?

Take this query:

%0:i8 = var
%1:i1 = eq %0, 1
%2:i1 = eq %0, 2
%3:i1 = eq %0, 4
%4:i1 = eq %0, 8
%5:i1 = eq %0, 16
%6:i1 = eq %0, 32
%7:i1 = eq %0, 64
%8:i1 = eq %0, 128
%9:i1 = or %1, %2, %3, %4, %5, %6, %7, %8
infer %9

%20:i8 = ctpop %0
%21:i1 = eq %20, 1
result %21

Parse and print it, and the result is:

%0:i8 = var
%1:i1 = eq 1:i8, %0
%2:i1 = eq 2:i8, %0
%3:i1 = eq 4:i8, %0
%4:i1 = eq 16:i8, %0
%5:i1 = eq 32:i8, %0
%6:i1 = eq 64:i8, %0
%7:i1 = eq 128:i8, %0
%8:i1 = eq 248:i8, %0
%9:i1 = or %1, %2, %3, %4, %5, %6, %7, %8
%10:i8 = ctpop %0
%11:i1 = eq 1:i8, %10
cand %9 %11

canonicalization opportunties

Operands of instructions like "and" and "or" seem to be mostly sorted, but they aren't always sorted. Sorting them would give a bit of extra canonicalization of IR, with benefits for caching.

As Raimondas observed, operands of phi nodes are neither sorted nor deduplicated. We see some heavy duplication of phi operands sometimes while building LLVM. Deduplication would give a small readability win, a small cache win, and also some speedup for the new phi+UB code.