svf-tools / svf
Static Value-Flow Analysis Framework for Source Code
Home Page: http://svf-tools.github.io/SVF/
License: Other
Hello,
I am very new to SVF. Somewhat naively, I took a quick spin with it on the LLVM 3.8 code base (as shipped, so to speak). Liking the code base and its capabilities, I then started migrating SVF to the LLVM 4.0 release. Perhaps unsurprisingly, I ran into a few problems. The main one concerns GraphTraits, which was one of the things that attracted me to SVF: in 4.0 it has some changes that are marked with a FIXME:
typedef typename GTraits::NodeRef NodeRef;
typedef typename GTraits::nodes_iterator node_iterator;
typedef typename GTraits::ChildIteratorType child_iterator;
DOTTraits DTraits;
static_assert(std::is_pointer<NodeRef>::value,
"FIXME: Currently GraphWriter requires the NodeRef type to be "
"a pointer.\nThe pointer usage should be moved to "
"DOTGraphTraits, and removed from GraphWriter itself.");
(that’s in the GraphWriter.h from LLVM 4.0 release).
So, is there anywhere, a paper perhaps, that you could refer me towards for understanding some of the details as well as the high level implementation of the graphing algorithms being used? I come from a numerical background so I’m used to sparse solvers such as GMRES, etc, but the LLVM implementation (while I’m somewhat familiar with a few parts of LLVM) is somewhat new to me - I’ve generally re-used the SCC and other graph related algorithms without needing to dive in headfirst - that said, that looks like it’s on tap for my weekend :)
Another question I had was in the Memory separation capabilities, does this require disjoint regions? It seems to read as if “no”? But then that does limit any typical Formal Methods applications where analysis (that I’ve seen at least) typically requires disjoint regions. For example I’ve seen several FM related LLVM papers replacing PHI nodes with simplifications.
Thanks!
Hello,
Maybe this is a silly question: after constructing the SVFG, how do I get all the Defs for each value in the IR?
I only found getDefSVFGNode, but it returns a single node. Since this is a conservative static analysis, several instructions may write to the same memory location. How can I get all of these instructions?
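One plausible way to collect every definition, assuming the graph links each store to the loads it may reach (possibly through phi-like merge nodes), is to walk incoming edges backwards from the node of interest and gather every store encountered. The sketch below does this on a toy graph. ToyNode, Kind, and reachingStores are hypothetical names for illustration and are not SVF types or APIs.

```cpp
#include <map>
#include <queue>
#include <set>
#include <vector>

enum class Kind { Store, Load, Phi, Other };

// Toy stand-in for a value-flow node (hypothetical, not an SVF class).
struct ToyNode {
  Kind kind;
  std::vector<unsigned> preds;  // IDs of nodes with an edge into this node
};

// Walks incoming edges backwards from `start` and returns every store
// that can reach it. The walk stops at a store and continues through
// any other node kind (for example a phi).
std::set<unsigned> reachingStores(const std::map<unsigned, ToyNode> &graph,
                                  unsigned start) {
  std::set<unsigned> stores;
  std::set<unsigned> visited{start};
  std::queue<unsigned> work;
  work.push(start);
  while (!work.empty()) {
    unsigned id = work.front();
    work.pop();
    auto it = graph.find(id);
    if (it == graph.end())
      continue;
    for (unsigned pred : it->second.preds) {
      if (!visited.insert(pred).second)
        continue;  // already handled
      auto p = graph.find(pred);
      if (p != graph.end() && p->second.kind == Kind::Store)
        stores.insert(pred);  // a definition: record it and stop here
      else
        work.push(pred);      // keep walking backwards
    }
  }
  return stores;
}
```

With a real SVFG the same loop would iterate over the node's incoming edges instead of the preds vector.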
Thanks!
The two wiki pages that describe using SVF to write
seem to be out of sync with some of the code. I am guessing there will be no further updates, is this the case? If I could figure out what's what, I would be glad to contribute back.
For example, if its argument is "_Z8printtttPj". (Original name "printttt(unsigned int*)")
"cppUtil::DemangledName cppUtil::demangle " cannot demangle it.
Using latest master (0800cd1), I just added 'valgrind' before the invocation of 'wpa' in PTABen's run.sh
and am seeing the following:
In particular:
fi_tests/spec_tests/gap.c
@@@analyzing fi_tests/spec_tests/gap.c with testwpa.sh
==331== Memcheck, a memory error detector
==331== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==331== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==331== Command: /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa -ander -vgep=true -stat=false fi_tests/spec_tests/gap.opt
==331==
==331== Invalid read of size 4
==331== at 0x4B477B: ConstraintGraph::moveInEdgesToRepNode(ConstraintNode*, ConstraintNode*) (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x471E33: Andersen::mergeNodeToRep(unsigned int, unsigned int) (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x483098: AndersenWaveDiff::mergeNodeToRep(unsigned int, unsigned int) (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x47061B: Andersen::mergeSccNodes(unsigned int, llvm::SparseBitVector<128u>&) (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x47344B: Andersen::mergeSccCycle() (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x4755A0: Andersen::SCCDetect() (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x47083C: Andersen::analyze(llvm::Module&) (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x42CF22: WPAPass::runPointerAnalysis(llvm::Module&, unsigned int) (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x42D565: WPAPass::runOnModule(llvm::Module&) (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x61302E: llvm::legacy::PassManagerImpl::run(llvm::Module&) (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x417D6D: main (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== Address 0x66eaa58 is 24 bytes inside a block of size 72 free'd
==331== at 0x4C2C2EB: operator delete(void*) (in /nix/store/cl1jd45s910gq4jzsd0irnis14p2vmj4-valgrind-3.12.0/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==331== by 0x4B40E7: ConstraintGraph::removeDirectEdge(ConstraintEdge*) (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x4B477A: ConstraintGraph::moveInEdgesToRepNode(ConstraintNode*, ConstraintNode*) (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x471E33: Andersen::mergeNodeToRep(unsigned int, unsigned int) (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x483098: AndersenWaveDiff::mergeNodeToRep(unsigned int, unsigned int) (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x47061B: Andersen::mergeSccNodes(unsigned int, llvm::SparseBitVector<128u>&) (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x47344B: Andersen::mergeSccCycle() (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x4755A0: Andersen::SCCDetect() (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x47083C: Andersen::analyze(llvm::Module&) (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x42CF22: WPAPass::runPointerAnalysis(llvm::Module&, unsigned int) (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x42D565: WPAPass::runOnModule(llvm::Module&) (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x61302E: llvm::legacy::PassManagerImpl::run(llvm::Module&) (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== Block was alloc'd at
==331== at 0x4C2B22F: operator new(unsigned long) (in /nix/store/cl1jd45s910gq4jzsd0irnis14p2vmj4-valgrind-3.12.0/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==331== by 0x4AF478: ConstraintGraph::addNormalGepCGEdge(unsigned int, unsigned int, LocationSet const&) (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x4B06C5: ConstraintGraph::buildCG() (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x4771DC: Andersen::initialize(llvm::Module&) (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x4707C6: Andersen::analyze(llvm::Module&) (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x42CF22: WPAPass::runPointerAnalysis(llvm::Module&, unsigned int) (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x42D565: WPAPass::runOnModule(llvm::Module&) (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x61302E: llvm::legacy::PassManagerImpl::run(llvm::Module&) (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
==331== by 0x417D6D: main (in /nix/store/cwki8ybl3g876zia6m1m7g577rxpzpw1-SVF-3.8.1-2017.05.01/bin/wpa)
(apologies for lack of debug info in those traces)
If you could confirm whether this matches in your build/version that would be useful.
Let me know if you need any more information!
Hello!
I'm currently using the SVF analysis for a project and I ran into an issue when querying the analysis
about newly inserted instructions. In short, I get the following error whenever I try to check
whether the newly inserted instruction (AllocaInst) aliases with another value:
SVF/include/MemoryModel/MemModel.h:554:
SymID SymbolTableInfo::getValSym(const llvm::Value*): Assertion `iter!=valSymMap.end() &&"value sym not found"' failed.
Below is a test pass that should throw this error for any loop containing a load.
Weirdly, this only happens if I compile Svf & the pass in Debug mode, but not if I build it in
MinSizeRel mode.
In general I believe that I need to update the pointer analysis after having modified the code, however, I'm not sure how to properly update the analysis. The test pass below contains some of my failed trials of naively re-running it (see commented lines).
What am I doing wrong? How do I properly update or re-run the analysis? And do you know why this
only happens when compiled in Debug mode?
#include "llvm/Analysis/LoopPass.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/IR/IRBuilder.h"
#include "MemoryModel/PointerAnalysis.h"
#include "WPA/Andersen.h"
using namespace llvm;
namespace {
struct TestSvf : public LoopPass {
  static char ID;
  TestSvf() : LoopPass(ID) {}
  virtual void getAnalysisUsage(AnalysisUsage &AU) const {
    AU.addRequired<LoopInfoWrapperPass>();
  }
  virtual bool runOnLoop(Loop *L, LPPassManager &LPM);
};
}
bool TestSvf::runOnLoop(Loop *L, LPPassManager &LPM) {
  // Create AA before changing loop
  Module &M = *L->getHeader()->getParent()->getParent();
  AndersenWaveDiff *AA = AndersenWaveDiff::createAndersenWaveDiff(M);
  // Change loop: add an alloca instruction
  BasicBlock *H = L->getHeader();
  IRBuilder<> Builder(&*(H->getFirstInsertionPt()));
  AllocaInst *Alloca = Builder.CreateAlloca(Type::getInt1Ty(getGlobalContext()), 0, "test_alloca");
  // Try to recalculate AA
  // 1st try: AA->analyze(M);
  // 2nd try:
  // AndersenWaveDiff::releaseAndersenWaveDiff();
  // AA = AndersenWaveDiff::createAndersenWaveDiff(M);
  // 3rd try:
  // delete(AA);
  // AA = AndersenWaveDiff::createAndersenWaveDiff(M);
  // Find first load in loop
  for (BasicBlock *BB : L->getBlocks()) {
    for (Instruction &I : *BB) {
      if (LoadInst *Load = dyn_cast<LoadInst>(&I)) {
        Value *LoadedVal = Load->getPointerOperand();
        // Check AA information between loaded value & alloca instruction
        // throws error
        errs() << AA->alias(Alloca, LoadedVal) << "\n";
      }
    }
  }
  return false;
}
char TestSvf::ID = 0;
static RegisterPass<TestSvf> X("test-svf", "TestSvf_pass", false, true);
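On the Debug-only failure: the message has the format printed by a C assert, and assert expands to nothing when NDEBUG is defined, which CMake's default MinSizeRel and Release flags do for GCC and Clang. So a non-Debug build most likely skips the check rather than passing it, and whatever the lookup returns afterwards should not be trusted. The sketch below illustrates the mechanism with a toy table. assertionsEnabled and lookupSym are hypothetical names, not SVF code.

```cpp
#include <map>

// Reports whether assert() is active in this translation unit.
constexpr bool assertionsEnabled() {
#ifdef NDEBUG
  return false;  // typical for CMake Release / MinSizeRel builds
#else
  return true;   // typical for CMake Debug builds
#endif
}

// Checked lookup: reports a missing key to the caller instead of relying
// on an assert that disappears in non-Debug builds.
bool lookupSym(const std::map<const void *, unsigned> &table,
               const void *value, unsigned &idOut) {
  auto it = table.find(value);
  if (it == table.end())
    return false;  // value was never registered, e.g. created after the table was built
  idOut = it->second;
  return true;
}
```

In this toy model, a value created after the table was built is simply absent, which matches the shape of the "value sym not found" message.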
When I ran the pointer analysis, some of the source locations were printed repeatedly, like this:
!!Target NodeID 16735 [<__cleanup_mnt> Source Loc: in line: 1072 file:
/home/workspace/llvmlinux/targets/x86_64/src/linux/fs/namespace.cin line: 1072 file:
/home/workspace/llvmlinux/targets/x86_64/src/linux/fs/namespace.cin line: 1072 file:
...
...
/home/workspace/llvmlinux/targets/x86_64/src/linux/fs/namespace.cin line: 1072 file:
/home/workspace/llvmlinux/targets/x86_64/src/linux/fs/namespace.c]
Is this a bug? Please, can you check this?
In the following code:
int f() { char buf[10] = {0,}; int x = buf[0]; int y = buf[1]; return x + y; }
The GEP instructions that correspond to x and y both have constant indices,
but when we check aliasing, we get MayAlias.
In both cases, the LocationSet of the GepObjPN is constant and equal to 0.
Is this a bug, or does SVF always handle buffers in an index-insensitive way?
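One reading of this observation, offered as an assumption rather than a statement about SVF internals, is that array subscripts are not counted when a field offset is computed, so every element of buf maps to the same abstract location. The toy model below shows how that would make buf[0] and buf[1] indistinguishable while struct fields stay distinct. Step and abstractOffset are hypothetical names.

```cpp
#include <vector>

// One step of an access path: either an array subscript or a struct field index.
struct Step {
  bool isArrayIndex;
  unsigned index;
};

// Adds up struct field indices and ignores array subscripts,
// i.e. a field-sensitive but array-insensitive abstraction.
unsigned abstractOffset(const std::vector<Step> &path) {
  unsigned offset = 0;
  for (const Step &s : path)
    if (!s.isArrayIndex)
      offset += s.index;
  return offset;
}
```

Under this model both accesses in f() get offset 0, so an alias query on them cannot answer NoAlias.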
Can you tell me how I can get SPEC CPU2000? What is the website?
Don't mean to nag, but just a reminder that merging in SUPA implementation would be great and much appreciated! 👍
The SUPA website suggests it's on your TODO list. Any idea when that will happen? Or is there anything blocking the effort that perhaps folks (such as myself) could help with?
Thanks!
A few functions and strings are using "Souce" instead of "Source". Guessing this is a typo, so made a page and a pull req.
The implementation of lazy cycle detection is described as 'incomplete' in Andersen.h,
is the implemented approach documented somewhere?
If not, could this be documented?
I suppose I'm unsure in what way what's implemented is "lazy cycle detection", is it?
Hi, I referred to https://github.com/unsw-corg/SVF/wiki/Write-a-source-sink-analyzer
In my case, I have two types of sources.
My target program runs in a loop, and a class variable flows out via sinks.
This class variable is computed from itself, arithmetic operations, and an external source.
In this case, is there a way to designate the class variable as a source?
From my observation, some nodes contain variable names. So if I could get hold of such a node, there may be a way to do this, but I am not sure.
Note that class variable can be changed directly via arithmetic operations (e.g., classA.a = classA.a + 30)
Hi, I just got to know this library. I need a flow-sensitive intra-procedural analysis and a flow-insensitive, context-insensitive inter-procedural analysis. Can anyone give some tips on how to implement these two analyses based on this tool? This may save a lot of time. Thanks!
Hi,
I am trying to detect all indirect function calls via points-to analysis.
However, from what I observe, some of them are missing.
I have tried Andersen, AndersenWave, AndersenWaveDiff, AndersenWaveDiffWithType, and AndersenLCD with LLVM 4.0.
When I tried Andersen, I used the following code.
Andersen* pt = new Andersen();
pt->analyze(*module);
Then, at each call site, I used the following code.
if(pt->hasIndCSCallees((CallInst*)cinst)){ // indirect
  set <const llvm::Function*> indFuncSet = pt->getIndCSCallees((CallInst*)cinst);
  for(auto ifs = indFuncSet.begin(); ifs != indFuncSet.end(); ++ifs){
    calleeFunc = *ifs; //This is an indirect callee candidate
    ......
  }
}
Am I doing something wrong?
If so, could you tell me the most accurate way to detect all indirect callees?
Hi,
What are "black hole" and "variant GEP" edges (black hole especially) and what roles do they play in the precision/performance of the various analyses?
Hi,
Firstly, Awesome project! Reading through the source code I had a question about this enum:
SVF/include/MemoryModel/PointerAnalysis.h
Line 58 in 3038078
A number of analyses I'm interested in are listed. However, it seems like these are yet to be implemented? Or is there source code available somewhere else that hasn't been merged?
Hi,
I recently tested SVF (commit 5355fc2). Great piece of work from my point of view!
Unfortunately I have problems using the API correctly, and I would be pleased if you could guide me a little.
I initialize SVF with the following instructions:
bool runOnModule(Module &m) override {
FlowSensitive* fspa = FlowSensitive::createFSWPA(m);
SVFG* svfg = fspa->getSVFG();
PAG* pag = fspa->getPAG();
PTACallGraph* ptaCallGraph = fspa->getPTACallGraph();
Later I obtain an llvm::CallSite and want to access the SVFGNodes corresponding to the arguments of that CallSite with
if (svfg->hasActualINSVFGNodes(callSite)) { // why never true?
  auto set = svfg->getActualINSVFGNodes(callSite);
  int i = 0;
  for(auto it = set.begin(); it!=set.end(); ++it) {
    errs() << "param No. " << i++;
    errs() << "node id: " << *it << "\n";
  }
} else {
  errs() << "no actual INSVFGNodes\n";
}
But SVFG::callSiteToActualINMap (include/MSSA/SVFG.h:100) is empty every time. What am I missing here? Are my initialization steps wrong?
I attached the code of my LLVM Pass as well as source code and LLVM IR of the module under test.
Logger.zip
See LoggerOO.cpp: my ultimate goal is to track back the value of parameter 1 of Logger::log2() [line 75] so that SVF reports its value either originates as return value of Encryptor::encrypt() [line 69] or as output parameter of assign() [line 71]
It would be nice if you could help me with this.
Thank you.
Hi,
I have seen that SVF provides the data flow from a pointer's definition to its uses.
Is there any API in SVF to get the data flow between uses of a pointer?
For example, for use-after-free I want to know the flow from the malloc to the free, and to later uses.
I do not know whether we can derive the flow from the free to a subsequent use.
Thanks!
Hi,
I observed that after running a pointer analysis (AndersenWaveDiff), the number of instructions in an LLVM module increases. From what I gather, the analysis itself never changes the module it analyzes. Is that true, or did I miss something? Another possibility I considered is that SVF registers itself as an LLVM alias analysis and another transformation then uses its more precise results to emit different code. However, I'm not running any other pass after the SVF analysis. Any other ideas?
Hi,
I've been using WPA in SVF to analyze a library using Andersen's algorithm. The library I'm using is musl-libc version 1.1.15 since it can be compiled using LLVM.
I notice that, in musl libc, there is an indirect call from function vfprintf to sn_write which is not captured in WPA's output. Particularly, when a program invokes vsnprintf, it prepares a FILE struct with a pointer to sn_write function. "vsnprintf" then issues a direct call to vfprintf with a pointer to this struct as an argument. Finally, vfprintf invokes sn_write at an indirect callsite.
Source codes for vsnprintf and vfprintf.
Attached musl.tar.gz contains bitcode file and LLVM assembly file of musl generated by LLVM gold plugin.
Thank you for your help.
Hi.
I'm trying to do points-to analysis on the Linux kernel with llvmlinux and your SVF tool.
I built vmlinux.bc and ran the wpa tool, but wpa terminated abnormally and printed only the message 'Killed'.
I think the problem is caused by running out of memory.
Have you tried analyzing the Linux kernel? Do you have any tips for me?
I built vmlinux.bc by
My wpa command is
Thank you for great tools and it will be very helpful if you give me comments.
Is Saber's pointer analysis flow-sensitive, field-sensitive, and context-sensitive (interprocedural), or not?
I have read some documents about the gold plugin and LTO, but it is still not clear to me.
As far as I understand, the reason we use the LLVM gold plugin is that, with clang -flto, I would have to give all bitcode/object files as arguments together. Am I right?
Based on that guess, I tried to follow your suggestion.
But building the Linux kernel does not look trivial. I tried a few things, described below, and all of them failed.
I first tried to build a vmlinux bitcode file with llvmlinux project.
I used a script as a CC instead of clang or gcc.
It emits both bitcode files and object files for all clang command.
I think it was successful so far. So I was able to generate bitcode files for individual C codes.
And then I ran clang -flto $(find . -name '*.bc') -o $OUTPUT.
I knew there might be some missing files but I wanted to see what would happen.
It generated a lot of multiple definition errors as follows.
...
/usr/bin/ld: error: /tmp/intel_audio-5ce78a.o: multiple definition of 'intel_audio_codec_disable'
/usr/bin/ld: /tmp/built-in-02b6ce.o: previous definition here
/usr/bin/ld: error: /tmp/intel_audio-5ce78a.o: multiple definition of 'intel_init_audio'
/usr/bin/ld: /tmp/built-in-02b6ce.o: previous definition here
/usr/bin/ld: error: /tmp/intel_audio-5ce78a.o: multiple definition of 'i915_audio_component_init'
/usr/bin/ld: /tmp/built-in-02b6ce.o: previous definition here
/usr/bin/ld: error: /tmp/intel_audio-5ce78a.o: multiple definition of 'i915_audio_component_cleanup'
/usr/bin/ld: /tmp/built-in-02b6ce.o: previous definition here
...
I also tried to build it in the way suggested in quickstart-for-using-lto-with-autotooled-projects without using another project.
It generated following errors.
...
In file included from arch/x86/kernel/asm-offsets.c:8:
In file included from include/linux/crypto.h:24:
In file included from include/linux/slab.h:14:
In file included from include/linux/gfp.h:5:
include/linux/mmzone.h:345:22: error: use of undeclared identifier 'MAX_NR_ZONES'; did you mean
'__MAX_NR_ZONES'?
long lowmem_reserve[MAX_NR_ZONES];
^~~~~~~~~~~~
__MAX_NR_ZONES
...
I found another project, linux-misc, whose purpose is building an LTO-enabled Linux kernel (based on gcc).
But combining that project with llvmlinux is a bit confusing to me, and I think it will not work properly.
If you have experience analyzing the Linux kernel, please advise me on how to link the Linux kernel properly.
Thanks!
The current code is remarkably close to being free of these singletons but requires some careful work to remove them while preserving the overall architecture. I tried locally but it was a bit of a mess O:).
Is this something you could look at?
I'm trying to use the SVF pointer alias analysis to partition all abstract memory objects into disjoint sets. So I'm doing something like:
for(auto& idToType : *pag) {
  if(ObjPN* opn = dyn_cast<ObjPN>(idToType.second)) {
    unsigned nodeId = idToType.first;
    PointsTo& ptsToOrIsPointedTo = _pta->getPts(nodeId);
    ptsToOrIsPointedTo |= _pta->getRevPts(nodeId);
    ptsToOrIsPointedTo &= memObjects;
    if(!ptsToOrIsPointedTo.empty()) {
      ptsToOrIsPointedTo.set(nodeId);
      auto foundElem = std::find_if(disjointObjects.begin(), disjointObjects.end(),
          [&ptsToOrIsPointedTo](const PointsTo& e)
          {return e.intersects(ptsToOrIsPointedTo);});
      if( foundElem == disjointObjects.end()) {
        disjointObjects.push_front(ptsToOrIsPointedTo);
      } else {
        *foundElem |= ptsToOrIsPointedTo;
      }
    }
  }
}
However when considering something like:
char* p = cond ? "hello" : "world";
char* q = "some other string";
I get a disjoint set like [ 1(constObjId) idOfP idOfQ ], which makes sense because SVF considers all constants as a single object. So I tried hacking a bit and changed isConstantObjSym to always return false. That kind of gave me the right result, producing 2 disjoint sets: [idOfP idOfhello idOfWorld] and [idOfQ idOfsomeotherstring]. But it seems that it has assigned multiple ids to the constant string.
I don't understand SVF very well, so I'm wondering if there is a deeper reason (aside from performance?) why constants are all a single object?
Thanks!
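A side note on the partitioning loop in this post: find_if merges a new set only into the first existing set it intersects, so if the new set overlaps two existing sets, those two stay separate. A union-find structure avoids that. The sketch below is a minimal version on plain integer IDs, assuming every ID is smaller than numNodes. DisjointSets and partitionObjects are hypothetical helpers, and the pts map is a toy stand-in for the getPts/getRevPts results, not an SVF type.

```cpp
#include <map>
#include <numeric>
#include <set>
#include <utility>
#include <vector>

// Minimal union-find over node IDs 0..n-1.
class DisjointSets {
public:
  explicit DisjointSets(unsigned n) : parent(n) {
    std::iota(parent.begin(), parent.end(), 0u);
  }
  unsigned find(unsigned x) {
    while (parent[x] != x) {
      parent[x] = parent[parent[x]];  // path halving
      x = parent[x];
    }
    return x;
  }
  void unite(unsigned a, unsigned b) { parent[find(a)] = find(b); }

private:
  std::vector<unsigned> parent;
};

// Groups IDs that are connected through any points-to relation.
// `pts` maps an ID to the IDs it points to.
std::vector<std::set<unsigned>> partitionObjects(
    unsigned numNodes, const std::map<unsigned, std::vector<unsigned>> &pts) {
  DisjointSets ds(numNodes);
  for (const auto &entry : pts)
    for (unsigned target : entry.second)
      ds.unite(entry.first, target);

  // Collect the members of each class, keyed by its representative.
  std::map<unsigned, std::set<unsigned>> groups;
  for (unsigned id = 0; id < numNodes; ++id)
    groups[ds.find(id)].insert(id);

  std::vector<std::set<unsigned>> result;
  for (auto &g : groups)
    result.push_back(std::move(g.second));
  return result;
}
```

Unlike the loop in the post, this version also reports unrelated objects as singleton sets; filter those out if they are not wanted.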
How can I get the results of the pointer analysis? I want to analyze the results programmatically (e.g. in a struct that stores the information); I don't need the .dot file.
If I want to write a flow- and field-insensitive pointer analysis, can you tell me the details? I don't know how to build and run the file that I write.
Hi, I am trying to understand and check SVF capabilities.
However, I found some error with a toy code.
I also attached error messages.
--------------------------- Code ------------------------------------------------
#include <iostream>
#include <string>
using namespace std;
class Profile
{
public:
void printProfile()
{
cout << "Name : " << _name.c_str() << endl;
cout << "Phone Number : " << _phoneNumber.c_str() <<endl;
}
void setName(string name)
{
_name = name;
}
void setPhoneNumber(string phoneNumber)
{
_phoneNumber = phoneNumber;
}
private:
string _name;
string _phoneNumber;
};
int main()
{
Profile myProfile;
myProfile.setName("Hong");
myProfile.setPhoneNumber("012319562");
myProfile.printProfile();
return 0;
}
------------------------ Error message --------------------------------------
Writing 'ander_svfg.dot'...
#0 0x0000000000e7870b llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/SVF/build/bin/wpa+0xe7870b)
#1 0x0000000000e78a20 PrintStackTraceSignalHandler(void*) (/SVF/build/bin/wpa+0xe78a20)
#2 0x0000000000e7706d llvm::sys::RunSignalHandlers() (/SVF/build/bin/wpa+0xe7706d)
#3 0x0000000000e78181 SignalHandler(int) (/SVF/build/bin/wpa+0xe78181)
#4 0x00007fe76b13b330 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x10330)
#5 0x000000000045ed78 MRVer::getSSAVersion() const (/SVF/build/bin/wpa+0x45ed78)
#6 0x0000000000463d4f llvm::DOTGraphTraits<SVFG*>::getCompleteNodeLabel(SVFGNode*, SVFG*) (/SVF/build/bin/wpa+0x463d4f)
#7 0x0000000000462e08 llvm::DOTGraphTraits<SVFG*>::getNodeLabel(SVFGNode*, SVFG*) (/SVF/build/bin/wpa+0x462e08)
#8 0x000000000047bec9 llvm::GraphWriter<SVFG*>::writeNode(SVFGNode*) (/SVF/build/bin/wpa+0x47bec9)
#9 0x0000000000479fe7 llvm::GraphWriter<SVFG*>::writeNodes() (/SVF/build/bin/wpa+0x479fe7)
#10 0x00000000004769a3 llvm::GraphWriter<SVFG*>::writeGraph(std::string const&) (/SVF/build/bin/wpa+0x4769a3)
#11 0x000000000047095d llvm::raw_ostream& llvm::WriteGraph<SVFG*>(llvm::raw_ostream&, SVFG* const&, bool, llvm::Twine const&) (/SVF/build/bin/wpa+0x47095d)
#12 0x000000000046a10b void llvm::GraphPrinter::WriteGraphToFile<SVFG*>(llvm::raw_ostream&, std::string const&, SVFG* const&, bool) (/SVF/build/bin/wpa+0x46a10b)
#13 0x000000000045c593 SVFG::dump(std::string const&, bool) (/SVF/build/bin/wpa+0x45c593)
#14 0x0000000000409bc6 WPAPass::runPointerAnalysis(llvm::Module&, unsigned int) (/SVF/build/bin/wpa+0x409bc6)
#15 0x0000000000409964 WPAPass::runOnModule(llvm::Module&) (/SVF/build/bin/wpa+0x409964)
#16 0x00000000007b92b6 (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) (/SVF/build/bin/wpa+0x7b92b6)
#17 0x00000000007b9a20 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/SVF/build/bin/wpa+0x7b9a20)
#18 0x00000000007b9c61 llvm::legacy::PassManager::run(llvm::Module&) (/SVF/build/bin/wpa+0x7b9c61)
#19 0x000000000040772b main (/SVF/build/bin/wpa+0x40772b)
#20 0x00007fe76a34ef45 __libc_start_main /build/eglibc-oGUzwX/eglibc-2.19/csu/libc-start.c:321:0
#21 0x0000000000407289 _start (/SVF/build/bin/wpa+0x407289)
Stack dump:
0. Program arguments: wpa -ander -svfg -dump-svfg c++.bc
Hello,
I want to use PAG and SVFG graphs to collect instructions reachability information. Particularly for every function I need to find instructions reachable by each of function arguments. Do I understand it correctly that first I’ll need to extend PAG to include all the instructions as now PAGBuilder doesn’t process all types of instructions. Then traverse extended PAG to find instructions reachable from function arguments. Will PAG reflect aliasing information for pointers, or do I need to use SVFG for it?
Thanks!
Hi,
Do you have any map from an SVFGNode or PAGNode to an LLVM instruction, or vice versa?
I'd like to use the result in SVF to other task.
Thanks
Hi, yulei,
Is there any way to map an LLVM IR Instruction to an SVFGNode?
I have found pag->getValueNode and pag->getPAGNode, with which
we can obtain a PAGNode from LLVM IR.
But I don't know how to obtain an SVFGNode.
Also, are the NodeIDs in the PAG and the SVFG different?
Thanks
Hello,
After I have installed SVF as https://github.com/SVF-tools/SVF/wiki/Setup-Guide-(CMake),
I run the example in https://github.com/SVF-tools/SVF/wiki/Analyze-a-Simple-C-Program .
When we run wpa -ander -svfg swap.bc,
the program outputs some information, but at last it reported the abort as follows:
wpa: $SVF_Home/lib/CUDD/cuddTable.c:343: cuddAllocNode: Assertion `((ptruint) mem & (sizeof(DdNode) - 1)) == 0' failed.
How can I solve this problem? Thanks.
Dear SVF authors,
I followed your tutorial to build and install the LLVM gold plugin on Ubuntu 14.04 (one difference from yours is that I built the plugin under LLVM 3.4, since my LLVM pass is based on 3.4). I successfully built binutils and LLVMgold.so and installed them to /usr/bin and /usr/lib, respectively. However, when I tried the example code from the official site, the error "ar: /usr/lib/bfd-plugins/libLTO.a: invalid ELF header" occurred when ar q a.a a.o is run.
I checked the ELF header of libLTO.a using readelf and nothing seems abnormal. I would much appreciate any clue about the error. Thanks very much!
$ readelf -h /usr/lib/bfd-plugins/libLTO.a
File: /usr/lib/bfd-plugins/libLTO.a(LTODisassembler.cpp.o)
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: REL (Relocatable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x0
Start of program headers: 0 (bytes into file)
Start of section headers: 856 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 0 (bytes)
Number of program headers: 0
Size of section headers: 64 (bytes)
Number of section headers: 24
Section header string table index: 21
File: /usr/lib/bfd-plugins/libLTO.a(lto.cpp.o)
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: REL (Relocatable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x0
Start of program headers: 0 (bytes into file)
Start of section headers: 174848 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 0 (bytes)
Number of program headers: 0
Size of section headers: 64 (bytes)
Number of section headers: 2561
Section header string table index: 2558
LLVM deprecated autoconf in favor of CMake around the 3.8 release, which means projects must move to CMake in order to be used with newer versions of LLVM.
Mostly filing issue to track the status of CMake support in SVF :).
Any plans on adding it in order to support continued use with LLVM?
Hello,
I am interested in figuring out which instance of a memory allocation function (such as malloc) in a program could have allocated the memory pointed to by pointers that are passed as arguments to certain functions. Is it possible to extract this information at the source code level with SVF? I understand that SVF is implemented on top of the LLVM IR, but I am curious to know if there is any way to extract the underlying source code/AST information.
Hi,
Does this tool support Objective-C? Is there a plan for it?
E.g.
typedef struct NODE{
  int data;
  int *c;
  struct NODE *next;
}NODE;
void swap(){
  NODE ni;
  ni.c = (int *)malloc(sizeof(int));
  ni.data = 1;
}
I can get the IR like this
define void @swap() #0 !dbg !10 {
%1 = alloca %struct.NODE, align 8
call void @llvm.dbg.declare(metadata %struct.NODE* %1, metadata !13, metadata !DIExpression()), !dbg !21
%2 = call noalias i8* @malloc(i64 4) #3, !dbg !22
%3 = bitcast i8* %2 to i32*, !dbg !23
%4 = getelementptr inbounds %struct.NODE, %struct.NODE* %1, i32 0, i32 1, !dbg !24
store i32* %3, i32** %4, align 8, !dbg !25
%5 = getelementptr inbounds %struct.NODE, %struct.NODE* %1, i32 0, i32 0, !dbg !26
store i32 1, i32* %5, align 8, !dbg !27
ret void, !dbg !28
}
The instruction is %2 = call noalias i8* @malloc(i64 4) #3, !dbg !22. How can I determine whether the heap object allocated by this malloc is a struct? Can you give me some code? Thank you!
I am trying to get points to information of a dynamically created object.
Source code:
void indirect_allocator(char **ptr, int s) {
if(*ptr == NULL) {
*ptr = malloc(s);
}
}
int main() {
int h;
static char *global_ptr;
scanf("%d", &h);
if(h < 4) {
indirect_allocator(&global_ptr, h*3);
} else {
indirect_allocator(&global_ptr, h*5);
}
global_ptr[0] = 'f';
global_ptr[1] = 'q';
global_ptr[2] = 'a' + h;
global_ptr[3] = '\0';
printf("%s", global_ptr);
}
Here the goal is to identify that the global_ptr[*] instructions in the main function are aliases to the object allocated by the malloc call in the indirect_allocator function.
I am using the following code:
PointerAnalysis* currPta = new AndersenWaveDiffWithType();
currPta->analyze(svfModule);
// get the PAG
PAG *currentPAG = currPta->getPAG();
// Get the top-level variable
GlobalVariable *targetGlobVar = targetModule->getGlobalVariable("main.global_ptr", true);
// get points_to
NodeID targetNode = currentPAG->getValueNode(targetGlobVar);
PointsTo& objs = currPta->getPts(targetNode);
// here objs contains only one node, which is right i.e., @main.global_ptr = internal global i8* null, align 8
// Now, let's get objects pointed by this node.
PointsTo& objs2 = currPta->getPts(objs.find_first());
// here objs2 contains only one node, which is correct i.e., %call = call i8* @malloc(i64 %conv)
// Now, when I try to get all aliases to the malloced object i.e., (the above %call..)
std::set<NodeID> targetAliases;
targetAliases.clear();
for (NodeBS::iterator nIter = currPta->getAllValidPtrs().begin();
nIter != currPta->getAllValidPtrs().end();
++nIter) {
if(currPta->alias(*nIter, objs2.find_first()) != NoAlias) {
if(targetAliases.find(*nIter) == targetAliases.end()) {
targetAliases.insert(*nIter);
}
}
}
// I get NO aliases..
I expect to see the load and getelementptr instructions of the main function, but I do not see anything.
Am I missing something here?
Attached are the source file global_ptr_head_obj.c and the corresponding bitcode file (global_ptr_head_obj.mem2reg.bc) on which I am trying to run the analysis.
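One possible reading of the empty result, offered as a guess and not a confirmed diagnosis: if alias(a, b) compares the points-to sets of its two arguments, then passing the ID of the heap object itself compares against whatever that object points to, which is empty for a plain char buffer. The question being asked is really "which pointers have this object in their points-to set". The sketch below illustrates the difference on a toy model. NodeID, PointsToMap, mayAlias, pointersTo and the node numbering are all hypothetical stand-ins and are not SVF types or calls.

```cpp
#include <cassert>
#include <map>
#include <set>

// Toy stand-ins for analysis results; these are NOT SVF types.
using NodeID = unsigned;
using PointsToSet = std::set<NodeID>;
using PointsToMap = std::map<NodeID, PointsToSet>;

// Returns the points-to set of `n`, or an empty set if none is recorded.
PointsToSet ptsOf(const PointsToMap &pts, NodeID n) {
    auto it = pts.find(n);
    return it == pts.end() ? PointsToSet{} : it->second;
}

// Alias query modelled as "do the two points-to sets intersect?".
bool mayAlias(const PointsToMap &pts, NodeID a, NodeID b) {
    const PointsToSet pa = ptsOf(pts, a);
    const PointsToSet pb = ptsOf(pts, b);
    for (NodeID o : pa)
        if (pb.count(o)) return true;
    return false;
}

// Containment query: every node whose points-to set contains `obj`.
std::set<NodeID> pointersTo(const PointsToMap &pts, NodeID obj) {
    std::set<NodeID> result;
    for (const auto &entry : pts)
        if (entry.second.count(obj)) result.insert(entry.first);
    return result;
}

// Hypothetical numbering loosely mirroring the example above:
//   1 = value node of the global, 2 = the global's memory object,
//   3 = the malloc'ed heap object, 4 = a load of the global,
//   5 = a getelementptr on the loaded pointer.
PointsToMap exampleGraph() {
    return {{1, {2}}, {2, {3}}, {4, {3}}, {5, {3}}};
}

void demoContainment() {
    const PointsToMap g = exampleGraph();
    // The heap object (3) has no points-to set of its own, so an
    // intersection-based alias query against it finds nothing.
    assert(!mayAlias(g, 4, 3));
    // Two pointers that both target the heap object do alias.
    assert(mayAlias(g, 4, 5));
    // The containment query finds everything that points at object 3.
    assert((pointersTo(g, 3) == std::set<NodeID>{2, 4, 5}));
}
```

In this toy model the load and getelementptr nodes only show up through the containment query, or through an alias query between two pointer nodes, never through an alias query against the object ID.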
Hi,
When I use saber to try use-after-free detection, I get a segmentation fault.
The command is : ./saber -uaf $file.bc
I guess that I need to use other options together with it, but I don't know which ones.
Do you have any documents about this?
Is it possible to dump Program Dependence Graph (PDG) of an entrypoint (a method in a bitcode file) to a DOT file?
Thanks!
Hello again,
I created a pass that uses your tool and integrated it into LLVM, but I can't figure out how to call WPAPass::runOnModule().
I want to call this function to get the PTDataTy for flow-sensitive pointer analysis.
But this function requires an svfModule and I have no idea how to instantiate it.
Any ideas?..
Is there any way to get all the pointers that alias a given pointer or object (e.g., an alloca)?
The only way I can think of is to get all the pointers in the IR, call getPts on each, and compute the intersections.
Is there an easier or recommended way?
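The intersection idea can avoid comparing every pair of pointers by first inverting the points-to relation into an object-to-pointers index. This is a minimal sketch on a toy model, not SVF's API: NodeID, PointsToMap, buildReverseIndex and aliasesOf are hypothetical names, and the maps stand in for results that would come from calling getPts on each pointer.

```cpp
#include <cassert>
#include <map>
#include <set>

// Toy stand-ins for analysis results; these are NOT SVF types.
using NodeID = unsigned;
using PointsToMap = std::map<NodeID, std::set<NodeID>>;

// Invert pointer -> objects into object -> pointers in a single pass.
PointsToMap buildReverseIndex(const PointsToMap &pts) {
    PointsToMap rev;
    for (const auto &entry : pts)
        for (NodeID obj : entry.second)
            rev[obj].insert(entry.first);
    return rev;
}

// All pointers sharing at least one target with `p`, excluding `p` itself.
std::set<NodeID> aliasesOf(const PointsToMap &pts, const PointsToMap &rev,
                           NodeID p) {
    std::set<NodeID> result;
    auto it = pts.find(p);
    if (it == pts.end()) return result;
    for (NodeID obj : it->second) {
        auto r = rev.find(obj);
        if (r != rev.end())
            result.insert(r->second.begin(), r->second.end());
    }
    result.erase(p);
    return result;
}

void demoReverseIndex() {
    // Hypothetical IDs: pointers 10, 11, 12; objects 100, 101.
    const PointsToMap pts = {{10, {100}}, {11, {100, 101}}, {12, {101}}};
    const PointsToMap rev = buildReverseIndex(pts);
    assert((aliasesOf(pts, rev, 10) == std::set<NodeID>{11}));
    assert((aliasesOf(pts, rev, 11) == std::set<NodeID>{10, 12}));
}
```

The index is built once, so each later query costs time proportional to the size of the queried pointer's points-to set plus the size of the answer, instead of one intersection per pointer in the module.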
Hello,
I wrote an LLVM pass that requires yours, so I theoretically have access to every object I need in your tool. Do you know if there is a way to get the points-to set of a value at a given instruction (or block) of an LLVM program?
Thanks!
Hello, I'm trying to integrate your work as an LLVM pass so that another pass can reuse its results.
I tried to build it with LLVM 6, but during the build I get several errors like:
llvm-6.0.0.src/include/llvm/Util/DataFlowUtil.h:210:85: error: wrong number of template arguments (1, should be 2) class IteratedDominanceFrontier: public llvm::DominanceFrontierBase<llvm::BasicBlock> {
OR
error: ‘iterator’ does not name a type iterator getIDFSet(llvm::BasicBlock *B) {
Any ideas?
You say we can get the image SVF.ova and use it to test SPEC2000 and other benchmarks, but I can't connect to the image SVF.ova, and I would like to know why.
Thank you!
Hi, I'm using the default AndersenWaveDiff solver to run pointer analysis on C programs, but I found that the result is not precise because many nodes collapse to field-insensitive. From reading the code, this seems to be related to PWC nodes. Could you give me more information about PWC nodes, e.g., when they are created? Also, is there a way to prevent nodes from collapsing to field-insensitive, or any other way to make the result more precise? Thanks!
Hello, while reading the source code of PointerAnalysis, I don't quite understand the meaning of DummyValPN & DummyObjPN. The technical documentation says that
... represents an introduced dummy node to achieve field sensitivity when handling external library calls (e.g., memcpy, where pointers (LLVM Values) that point to the fields of an struct do not explicitly appear at an instruction)
So in a field-insensitive analysis, can we entirely ignore these two nodes when trying to get the pts for each node?
Hi,
I checked the -leak functionality in Saber.
I found the function "isInAWrapper", but I don't know what situation it handles.
For example, the following does not appear to be the case it handles.
char * malloc_wrap(int n){
char* ptr = malloc(n);
return ptr;
}
Can you give me an example to illustrate its application?
Thanks!
I wrote a pass that leverages SVF, but I ran into some problems when running it.
Initially I had the following line in my CMake file:
target_link_libraries(${PROJECT_NAME} LLVMSvf LLVMCudd ${llvm_libs})
This is similar to what I saw in the WPA tool of SVF, but it causes the following linking errors. The reason is that WPA is an executable, and LLVMSvf and LLVMCudd are static libraries.
/usr/bin/ld: error: /mnt/data/Research/Library/SVF/build/lib/CUDD/libLLVMCudd.a(cuddExact.c.o): requires dynamic R_X86_64_PC32 reloc against 'free' which may overflow at runtime; recompile with -fPIC
/usr/bin/ld: error: /mnt/data/Research/Library/SVF/build/lib/CUDD/libLLVMCudd.a(cuddAnneal.c.o): requires dynamic R_X86_64_PC32 reloc against 'cuddNextLow' which may overflow at runtime; recompile with -fPIC
/usr/bin/ld: error: /mnt/data/Research/Library/SVF/build/lib/CUDD/libLLVMCudd.a(cuddLinear.c.o): requires dynamic R_X86_64_32 reloc which may overflow at runtime; recompile with -fPIC
/usr/bin/ld: error: /mnt/data/Research/Library/SVF/build/lib/CUDD/libLLVMCudd.a(cuddWindow.c.o): requires dynamic R_X86_64_PC32 reloc against 'cuddSwapInPlace' which may overflow at runtime; recompile with -fPIC
/usr/bin/ld: error: /mnt/data/Research/Library/SVF/build/lib/CUDD/libLLVMCudd.a(cuddGenetic.c.o): requires dynamic R_X86_64_PC32 reloc against 'st_lookup_int' which may overflow at runtime; recompile with -fPIC
...
So I changed the CMake file to:
target_link_libraries(${PROJECT_LIB_NAME} Svf Cudd ${llvm_libs})
This can successfully generate the .so file for my pass. However, when I ran the pass using opt, I got another error:
opt: CommandLine Error: Option 'bitcode-mdindex-threshold' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options.
Has anybody encountered the same problem?
Hi,
I'm currently working with your awesome SVF analysis. During some tests with the SVFG, I found a simple test case that doesn't make sense to me.
Why does the SVFG in Test1 have a path from "local_x" into the second "init" call?
int *global_p1; // NodeID= 1
int *global_p2; // NodeID= 2
void init(int **pp, int *x) {
*pp = x;
}
void delete(int **pp) {
*pp = NULL;
}
void test1() {
int local_x = 1; // NodeID= 10
int local_y = 2; // NodeID= 11
init(&global_p1, &local_x);
init(&global_p2, &local_y);
delete(&global_p2);
}
void test2() {
int local_x = 1;// NodeID= 10
int local_y = 2;// NodeID= 11
int *local_p1; // NodeID= 12
int *local_p2; // NodeID= 13
init(&local_p1, &local_x);
init(&local_p2, &local_y);
delete(&local_p2);
}
int main() {
/*** Test 1:
* There will be a path from the "local_x" into the second call of "init",
* (10 -> 60 -> 20 -> 16 -> 21 -> 53 -> 21 -> ...)
* but "local_x" was only written into "global_p1" and not into "global_p2".
*/
//test1();
/*** Test 2:
* Writing "local_x" into "local_p1" creates the SVFG, which I would
* have expected also in Test1.
*/
//test2();
return 0;
}
These are the results I get with: saber -leak -dump-svfg main.bc
SVFG of Test1: