
ccstokener's Introduction

0. Introduction

  • ccstokener: the clone detection tool.
  • 8projects: four Java open source projects and four C open source projects.
  • java-label-pairs: pairs.zip contains the true pairs and false pairs sampled from the four Java open source projects. astnn.zip and tbccd.zip contain the clone pairs organized in the formats used by ASTNN and TBCCD.

1. Install Dependency

CCStokener runs under Python 3 and requires the pandarallel package, so install it first. After clone detection finishes, the detected clone pairs are stored in the results directory.

pip3 install pandarallel
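
Before running the tool, it can help to confirm that the dependency is visible to the interpreter you intend to use. The following is a minimal sketch using only the standard library; the helper name missing_dependencies is hypothetical and not part of CCStokener.

```python
import importlib.util


def missing_dependencies(modules=("pandarallel",)):
    """Return the names of modules that cannot be found in this environment.

    Hypothetical helper for illustration; it only looks the modules up
    and does not import or run them.
    """
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = missing_dependencies()
if missing:
    print("Missing packages: " + ", ".join(missing) + " (install them with pip3)")
else:
    print("All required packages were found")
```

If pandarallel is reported as missing even after installation, pip3 and the Python interpreter are probably pointing at different environments.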

2. How to run CCStokener to perform clone detection

Our tool, CCStokener, is in the ccstokener directory. It currently supports clone detection for C and Java. The entry point is runner.py.

python runner.py -i /path/to/dataset -m common/bcb -l java/c [-t 0.6]
  • -i: path to the dataset.
  • -m: detection mode. Use bcb when detecting on BigCloneBench; otherwise use common.
  • -l: language of the dataset. java and c are currently supported.
  • -t: similarity threshold for clone pairs (optional). If it is not set, the default threshold is used.

Example: to perform clone detection on BigCloneBench, run CCStokener as follows.

python runner.py -i /home/datasets/BigCloneEval-master/ijadataset/bcb_reduced -m bcb -l java
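
When detection is run repeatedly, for example over several datasets or thresholds, the command line can be assembled from a script. The sketch below only wraps the documented options (-i, -m, -l, -t); the function names and the tool_dir default are assumptions for illustration, and nothing here reflects how runner.py works internally.

```python
import subprocess
import sys


def build_runner_command(dataset, mode, language, threshold=None):
    """Build the argument list for runner.py from the documented options."""
    if mode not in ("common", "bcb"):
        raise ValueError("mode must be 'common' or 'bcb'")
    if language not in ("java", "c"):
        raise ValueError("language must be 'java' or 'c'")
    command = [sys.executable, "runner.py",
               "-i", str(dataset), "-m", mode, "-l", language]
    # -t is optional; leaving it out lets the tool use its own default.
    if threshold is not None:
        command += ["-t", str(threshold)]
    return command


def run_detection(dataset, mode, language, threshold=None, tool_dir="ccstokener"):
    """Run runner.py from tool_dir (assumed to be the ccstokener directory)."""
    command = build_runner_command(dataset, mode, language, threshold)
    return subprocess.run(command, cwd=tool_dir, check=True)


# Show the command without executing it.
print(" ".join(build_runner_command("/path/to/dataset", "common", "c", 0.6)))
```

Calling run_detection with the same arguments would execute the command and raise an error if runner.py exits with a non-zero status.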

3. How to use CCStokener to verify whether clone pairs are true or not

CCStokener can also verify whether given clone pairs are true clones. First write the clone pairs to a file, then call verify.py as follows.

python verify.py -i /path/to/dataset -l java/c
  • -i: path to the directory containing the clone pair files.
  • -l: language of the clone pairs, either java or c.
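
Because -i must point at a directory here, a small pre-flight check can catch a wrong path before verification starts. This is a minimal sketch under that assumption; build_verify_command is a hypothetical helper, and the format of the clone pair files themselves is not covered.

```python
import sys
from pathlib import Path


def build_verify_command(pairs_dir, language):
    """Build the argument list for verify.py after checking its inputs."""
    if language not in ("java", "c"):
        raise ValueError("language must be 'java' or 'c'")
    directory = Path(pairs_dir)
    # verify.py expects a directory of clone pair files, so fail early otherwise.
    if not directory.is_dir():
        raise NotADirectoryError(str(directory) + " is not a directory")
    return [sys.executable, "verify.py", "-i", str(directory), "-l", language]


# Uses the current directory only as a stand-in for a real clone pair directory.
print(" ".join(build_verify_command(".", "java")))
```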
