
repo_info_extractor's People

Contributors

alexwayfer, alimgiray, brunolm, codersrankorg, crhraban, exphoenee, fearless-spider, ferki, gapercoco, gentoid, itnelo, jj, kassane, kasztp, kitswas, klarkc, matfax, mattgenious, nibba2018, nolimits4web, peti2001, rabxly, shank318, siphalor, smortex, spasma, thisaruguruge, twodcube, vhraban, vutny

repo_info_extractor's Issues

missing info in installation instructions for Linux

The README tells OSX users to install pip but omits this step for Linux, where it is equally necessary.
Please add this line to the Linux instructions:

sudo apt install python-pip

Of course, alternative instructions should be provided for non-Debian-based distros (something like "yum install python-pip"; sorry, I don't use RedHat-based distros).

No module named git

I've installed the tool and I'm trying to run run.sh, but I get the following error:

Traceback (most recent call last):
  File "src/main.py", line 2, in <module>
    import git
ImportError: No module named git

I've also installed gitpython:

pip install gitpython
sudo apt-get install python3-git

I'm using Ubuntu 19.04 with python 3.7

Extract libraries from Go

Introduction

We started to implement the external library recognition feature.
It means the repo_info_extractor will be able to recognize the used libraries and not just the programming language.

How does it work?

Regexes are used to recognize library imports.
This library detection must be done language by language since each language has a different syntax to import libraries.
We created a JavaScript extractor as the first working library extractor.
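
As a rough illustration of the regex approach applied to Go, the sketch below collects import paths from both single-line imports and parenthesized import blocks. The function name extract_go_imports and the three patterns are assumptions made for this example; they are not taken from the project's extractor.

```python
import re

# Hypothetical patterns for illustration; the real extractor may differ.
# Single-line form: import "fmt"  or  import alias "path"
SINGLE_IMPORT = re.compile(r'^\s*import\s+(?:[\w.]+\s+)?"([^"]+)"', re.MULTILINE)
# Block form: import ( ... )
IMPORT_BLOCK = re.compile(r'^\s*import\s*\((.*?)\)', re.MULTILINE | re.DOTALL)
QUOTED_PATH = re.compile(r'"([^"]+)"')


def extract_go_imports(source):
    """Return the sorted, de-duplicated import paths found in Go source text."""
    found = set(SINGLE_IMPORT.findall(source))
    for block in IMPORT_BLOCK.findall(source):
        found.update(QUOTED_PATH.findall(block))
    return sorted(found)


example = '''
package main

import "fmt"
import (
    "os"
    log "github.com/sirupsen/logrus"
)
'''
print(extract_go_imports(example))
```

A regex like this does not understand comments or string literals, so an import-looking line inside a comment would also be counted.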

Requirements

Extract libraries from Java

Introduction

We started to implement the external library recognition feature.
It means the repo_info_extractor will be able to recognize the used libraries and not just the programming language.

How does it work?

Regexes are used to recognize library imports.
This library detection must be done language by language since each language has a different syntax to import libraries.
We created a JavaScript extractor as the first working library extractor.

Requirements

Traceback error on Mac through Docker

Traceback (most recent call last):
  File "src/main.py", line 2, in <module>
    from init import initialize
  File "/app/src/init.py", line 2, in <module>
    import git
  File "/opt/venv/lib/python3.7/site-packages/git/__init__.py", line 38, in <module>
    from git.exc import *                       # @NoMove @IgnorePep8
  File "/opt/venv/lib/python3.7/site-packages/git/exc.py", line 9, in <module>
    from git.compat import UnicodeMixin, safe_decode, string_types
  File "/opt/venv/lib/python3.7/site-packages/git/compat.py", line 16, in <module>
    from gitdb.utils.compat import (
ModuleNotFoundError: No module named 'gitdb.utils.compat'
Finished

Extract libraries from C#

Introduction

We started to implement the external library recognition feature.
It means the repo_info_extractor will be able to recognize the used libraries and not just the programming language.

How does it work?

Regexes are used to recognize library imports.
This library detection must be done language by language since each language has a different syntax to import libraries.
We created a JavaScript extractor as the first working library extractor.

Requirements

Permissions error

> ./install.sh
...
Collecting smmap2>=2.0.0 (from gitdb2>=2.0.0->GitPython==2.1.11->-r requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/55/d2/866d45e3a121ee15a1dc013824d58072fd5c7799c9c34d01378eb262ca8f/smmap2-2.0.5-py2.py3-none-any.whl
Installing collected packages: smmap2, gitdb2, GitPython, prompt-toolkit, regex, whaaaaat, requests-toolbelt
Could not install packages due to an EnvironmentError: [Errno 13] Permission denied: '/usr/lib/python3.7/site-packages/smmap2-2.0.5.dist-info'

Arch Linux, pip installed as package: https://www.archlinux.org/packages/extra/any/python-pip/

Auto update

At the beginning of the script, we should check whether a newer version is available; if so, the script should offer to update itself.
It should show the first lines of the new commits to give a hint about what has been fixed or improved.
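
One possible shape for this check, sketched under the assumption that the script runs from a Git checkout whose remote is named origin and whose main branch is master. The helper names parse_oneline_log and pending_updates are hypothetical.

```python
import subprocess


def parse_oneline_log(log_output, limit=5):
    """Turn `git log --oneline` output into a list of commit subject lines."""
    subjects = []
    for line in log_output.splitlines():
        parts = line.split(' ', 1)  # "<short hash> <subject>"
        if len(parts) == 2:
            subjects.append(parts[1].strip())
    return subjects[:limit]


def pending_updates(repo_dir, remote='origin', branch='master'):
    """Fetch the remote and return subjects of commits the local copy lacks.

    The remote and branch names are assumptions, not project settings.
    """
    subprocess.check_call(['git', 'fetch', remote], cwd=repo_dir)
    output = subprocess.check_output(
        ['git', 'log', '--oneline', 'HEAD..{}/{}'.format(remote, branch)],
        cwd=repo_dir,
    )
    return parse_oneline_log(output.decode('utf-8', errors='replace'))
```

If pending_updates returns a non-empty list, the script could print the subjects and ask the user whether to run git pull before continuing.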

Error executing the "python src\main.py path\to\repo" command on windows

Hi

When executing the "python src\main.py path\to\my_repo" command on windows

I get the following result:

Traceback (most recent call last):
  File "src\main.py", line 4, in <module>
    from export_result import ExportResult
  File "D:\projects\Coders Rank\repo_info_extractor\src\export_result.py", line 5, in <module>
    from upload import uploadRepo
  File "D:\projects\Coders Rank\repo_info_extractor\src\upload.py", line 11
    r = requests.post(url, files=files)
    ^
TabError: inconsistent use of tabs and spaces in indentation

Can somebody help me? Thanks.

Windows 10
Python 3.7.4
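
A TabError like this means some lines in the file are indented with tabs and others with spaces. The sketch below is one way to find the offending lines; the function name indentation_report and the sample text are made up for illustration and are not the contents of upload.py.

```python
def indentation_report(source):
    """Return (tab_lines, space_lines): 1-based numbers of lines whose
    leading whitespace contains a tab or a space, respectively."""
    tab_lines, space_lines = [], []
    for number, line in enumerate(source.splitlines(), start=1):
        indent = line[:len(line) - len(line.lstrip(' \t'))]
        if '\t' in indent:
            tab_lines.append(number)
        if ' ' in indent:
            space_lines.append(number)
    return tab_lines, space_lines


# Made-up sample: line 2 is indented with a tab, line 3 with spaces.
sample = "def upload(url, files):\n\tsession = None\n    r = requests.post(url, files=files)\n"
tabs, spaces = indentation_report(sample)
print("tab-indented lines:", tabs)
print("space-indented lines:", spaces)
```

If both lists are non-empty, the file mixes the two styles. The standard library's tabnanny module (python -m tabnanny path\to\file.py) performs a similar check.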

Extract libraries from C

Introduction

We started to implement the external library recognition feature.
It means the repo_info_extractor will be able to recognize the used libraries and not just the programming language.

How does it work?

Regexes are used to recognize library imports.
This library detection must be done language by language since each language has a different syntax to import libraries.
We created a JavaScript extractor as the first working library extractor.
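
For C, the regex approach could look roughly like the sketch below, which matches both #include <header> and #include "header". The pattern and the name extract_c_includes are assumptions for illustration only.

```python
import re

# Hypothetical pattern: tolerates whitespace before and after the '#'.
INCLUDE = re.compile(r'^[ \t]*#[ \t]*include[ \t]*[<"]([^>"]+)[>"]', re.MULTILINE)


def extract_c_includes(source):
    """Return header names in the order they appear in C source text."""
    return INCLUDE.findall(source)


example = '#include <stdio.h>\n#include "mylib/util.h"\nint main(void) { return 0; }\n'
print(extract_c_includes(example))
```

Mapping a header such as stdio.h to a library name would be a separate step that this sketch does not attempt.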

Requirements

Support Perl module detection

Most of my open source activity is related to Perl, and most of it was quite obviously missing from the results on my profile. It turned out that files with the .pm extension are not recognized as Perl modules, and thus cannot be counted toward Perl experience.

There are probably other Perl aspects missing, but this could be a first step towards a more complete experience. I plan to send an initial pull request to add support for this.
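
To make the idea concrete, here is a minimal sketch of extension-based detection with .pm added. The table and the function name detect_language are hypothetical; the project's real mapping may be structured differently.

```python
import os

# Hypothetical mapping for illustration; not the project's actual table.
EXTENSION_TO_LANGUAGE = {
    '.pl': 'Perl',
    '.pm': 'Perl',  # Perl modules, the case this issue is about
    '.py': 'Python',
    '.js': 'JavaScript',
}


def detect_language(path):
    """Return the language for a file path, or None if the extension is unknown."""
    _, ext = os.path.splitext(path)
    return EXTENSION_TO_LANGUAGE.get(ext.lower())


print(detect_language('lib/Rex/Commands.pm'))
```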

Add cython language support

Cython is a programming language that aims to be a superset of the Python programming language, designed to give C-like performance with code that is written mostly in Python with optional additional C-inspired syntax. Cython is a compiled language that is typically used to generate CPython extension modules. (Source: Wikipedia)

The *.pyx file extension seems to be used for Cython source files.

Traceback error

I got this error after running the analyze command (Windows 10):

[20/11/2019 17:55:06] Copying the repository to a temporary location, this can take a while...
Traceback (most recent call last):
  File "src\main.py", line 46, in <module>
    main()
  File "src\main.py", line 37, in main
    initialize(args.directory, args.skip_obfuscation, args.output, args.parse_libraries, args.email, args.skip_upload)
  File "C:\Users\nihad\AndroidStudioProjects\repo_info_extractor\src\init.py", line 70, in initialize
    libs = al.get_libraries()
  File "C:\Users\nihad\AndroidStudioProjects\repo_info_extractor\src\analyze_libraries.py", line 33, in get_libraries
    shutil.copytree(self.basedir, tmp_repo_path, symlinks=True)
  File "C:\ProgramData\Anaconda3\lib\shutil.py", line 368, in copytree
    raise Error(errors)
shutil.Error: [('C:\\Users\\nihad\\AndroidStudioProjects\\REPO_NAME\\android\\vendor\\bundle\\ruby\\2.3.0\\gems\\fastlane-2.123.0\\fastlane\\swift\\FastlaneSwiftRunner\\FastlaneSwiftRunner.xcodeproj\\project.xcworkspace\\xcuserdata\\josh.xcuserdatad\\UserInterfaceState.xcuserstate', 'C:\\Users\\nihad\\AppData\\Local\\Temp\\38febda8-c011-4c53-8eae-a549cc2c2f87\\android\\vendor\\bundle\\ruby\\2.3.0\\gems\\fastlane-2.123.0\\fastlane\\swift\\FastlaneSwiftRunner\\FastlaneSwiftRunner.xcodeproj\\project.xcworkspace\\xcuserdata\\josh.xcuserdatad\\UserInterfaceState.xcuserstate', "[Errno 2] No such file or directory: 'C:\\\\Users\\\\nihad\\\\AppData\\\\Local\\\\Temp\\\\38febda8-c011-4c53-8eae-a549cc2c2f87\\\\android\\\\vendor\\\\bundle\\\\ruby\\\\2.3.0\\\\gems\\\\fastlane-2.123.0\\\\fastlane\\\\swift\\\\FastlaneSwiftRunner\\\\FastlaneSwiftRunner.xcodeproj\\\\project.xcworkspace\\\\xcuserdata\\\\josh.xcuserdatad\\\\UserInterfaceState.xcuserstate'")]

ImportError: No module named git

I get an error when running ./run.sh path/to/repository:

Traceback (most recent call last):
  File "src/main.py", line 2, in <module>
    import git
ImportError: No module named git

I tried installing different packages, but could not find the right one.
Can you tell me which one I should install?

Extract languages from Kotlin

Introduction

We started to implement the external library recognition feature.
It means the repo_info_extractor will be able to recognize the used libraries and not just the programming language.

How does it work?

Regexes are used to recognize library imports.
This library detection must be done language by language since each language has a different syntax to import libraries.
We created a JavaScript extractor as the first working library extractor.

Requirements

"changedFiles" always empty

I tried to export 3 repositories, one small and two bigger ones.
In every case, all commits.changedFiles entries in the .json file are empty arrays.

Python 3.7.4

Easy to use docker image

There should be a docker image if somebody doesn't want to struggle with the Python and package installation.

It should work something like this:
docker run -it -v ./:/src image-name ./run.sh
After this, the JSON output will be in the repo's directory.

Dart language support

Is it possible to add support for the Dart language (Flutter)? I am currently trying to get stats with your scripts, and they finish without any errors, but I cannot find the "repo_data.json.zip" file.

OS: Windows.

After running python src\main.py path\to\repo, I get this output:

(base) C:\Users\nihad\AndroidStudioProjects\repo_info\repo_info_extractor>python src\main.py C:\Users\nihad\AndroidStudioProjects\flutter\PROJECT_NAME
Analyzing repo under C:\Users\nihad\AndroidStudioProjects\flutter\PROJECT_NAME...
[============================================================] 100.0% ...Analyzing commits
? The following contributors were found in the repository. Select which ones you are. (With SPACE you can select more than one) [Nihad Delic -> EMAIL]
[29/10/2019 11:36:40] Filtering commits by emails . ['EMAIL']
[29/10/2019 11:36:40] Copying the repository to a temporary location, this can take a while...
[29/10/2019 11:36:58] Finished copying the repository

I also want to say: great job, guys.

stderr: 'fatal: not a git repository (or any parent up to mount point /)

> ./run.sh ~/Projects/private_project/
Search repos on /home/alex/Projects/private_project/, 1 folders deep 
Found 1 repos under /home/alex/Projects/Work/private_project/
 
Analyzing repo under /home/alex/Projects/private_project ...
[============================================================] 100.0% ...Analyzing commits
? The following contributors were found in the repository.             Select which ones you are. (With SPACE you can select more tha
[22/07/2020 02:33:00] Filtering commits by emails:  ['[email protected]', '[email protected]']
[22/07/2020 02:33:00] Copying the repository to a temporary location, this can take a while...
[22/07/2020 02:35:27] Finished copying the repository to /tmp/396f02a3-89df-4273-bed4-3ea513cfb81b
Traceback (most recent call last):
  File "src/main.py", line 58, in <module>
    main()
  File "src/main.py", line 46, in main
    initialize(args.directory, args.skip_obfuscation, args.output,
  File "/home/alex/Projects/repo_info_extractor/src/init.py", line 93, in initialize
    libs = al.get_libraries()
  File "/home/alex/Projects/repo_info_extractor/src/analyze_libraries.py", line 47, in get_libraries
    repo = self._initialize_repository(tmp_repo_path)
  File "/home/alex/Projects/repo_info_extractor/src/analyze_libraries.py", line 177, in _initialize_repository
    repo.git.clean('-fd')
  File "/home/alex/.local/lib/python3.8/site-packages/git/cmd.py", line 542, in <lambda>
    return lambda *args, **kwargs: self._call_process(name, *args, **kwargs)
  File "/home/alex/.local/lib/python3.8/site-packages/git/cmd.py", line 1005, in _call_process
    return self.execute(call, **exec_kwargs)
  File "/home/alex/.local/lib/python3.8/site-packages/git/cmd.py", line 822, in execute
    raise GitCommandError(command, status, stderr_value, stdout_value)
git.exc.GitCommandError: Cmd('git') failed due to: exit code(128)
  cmdline: git clean -fd
  stderr: 'fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).'
Finished

> ls /home/alex/Projects/private_project
branches/       config       FETCH_HEAD  hooks/  info/  objects/   packed-refs
COMMIT_EDITMSG  description  HEAD        index   logs/  ORIG_HEAD  refs/

ModuleNotFoundError: No module named 'gitdb.utils.compat'

When I run ./run-docker.sh pathToRepo I get this error:

Traceback (most recent call last):
  File "src/main.py", line 2, in <module>
    from init import initialize
  File "/app/src/init.py", line 2, in <module>
    import git
  File "/opt/venv/lib/python3.7/site-packages/git/__init__.py", line 38, in <module>
    from git.exc import *                       # @NoMove @IgnorePep8
  File "/opt/venv/lib/python3.7/site-packages/git/exc.py", line 9, in <module>
    from git.compat import UnicodeMixin, safe_decode, string_types
  File "/opt/venv/lib/python3.7/site-packages/git/compat.py", line 16, in <module>
    from gitdb.utils.compat import (
ModuleNotFoundError: No module named 'gitdb.utils.compat'
Finished

Syntax mixing

I may be wrong, but why are you mixing Python 2 and Python 3 syntax in upload.py?
In the function showError(), one part uses print err while another uses print("Unable ...").
So I changed it to Python 3 syntax, only to find raw_input().lower() in questions.py. Am I missing something?

Extract libraries from PHP

Introduction

We started to implement the external library recognition feature.
It means the repo_info_extractor will be able to recognize the used libraries and not just the programming language.

How does it work?

Regexes are used to recognize library imports.
This library detection must be done language by language since each language has a different syntax to import libraries.
We created a JavaScript extractor as the first working library extractor.

Requirements

set_commit_stats never runs

Expected Behavior

run main.py with directory argument
get a zipped json file containing contribution information

Actual Behavior

run main.py with directory argument

get a zipped file containing contribution information, except that the changedFiles array of every commit is empty, like this:

            "authorEmail": "373ffeff04c386969edac0edc2cf2628", 
            "isDuplicated": false, 
            "authorName": "fa20e926ccd1e69da76a9bb5caeb826b", 
            "parents": [
                "545522fdab9c25a1ce8a8ef5c6dfbee6804e115c"
            ], 
            "commitHash": "ae3e69f87557e3825f84db37860019d2a213183e", 
            "isMerge": false, 
            "changedFiles": [], 
            "createdAt": "2019-06-17 14:48:20"

it seems that set_commit_stats is never run

Steps to Reproduce the Problem

  1. run main.py with valid git repository as argument

Specifications

  • Platform: Windows 10 1903 build 18932.1000
  • Python: 2.7.16

Finish run.bat

We want the script to be easy to use, which is why we introduced run.bat and run.sh.

run.sh works perfectly; however, run.bat, which is supposed to be the Windows version, is empty.

Expected behaviour:
run.bat --help: shows the CLI help
run.bat: error, too few arguments
run.bat c:\path\to\repo: extracts data from the repository
run.bat c:\path\to\repo --output=..\test.json: extracts data from the repo into test.json in the parent directory
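
Since run.bat would presumably just forward its arguments to the Python entry point, the expected behaviour can be sketched on the Python side with argparse. The tracebacks elsewhere on this page show args.directory and args.output being used; the parser below, including the default output path, is an assumed reconstruction and not the project's actual code.

```python
import argparse


def build_parser():
    """Build a parser matching the behaviour listed above (an assumed sketch)."""
    parser = argparse.ArgumentParser(
        prog='run',
        description='Extract data from a local Git repository.',
    )
    parser.add_argument('directory', help='path to the repository')
    # The default below is an assumption for illustration.
    parser.add_argument('--output', default='./repo_data.json',
                        help='where to write the JSON result')
    return parser


args = build_parser().parse_args([r'c:\path\to\repo', r'--output=..\test.json'])
print(args.directory, args.output)
```

With a parser like this, --help prints the usage text, and calling it with no arguments exits with an error about the missing directory argument.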

Forks of a repo may get higher score than the main one

As a follow-up from #122 (comment):

I noticed a strange thing though. I get score for the forks of a repo as well, often more than I get for the canonical one. Not sure if that's normal or expected (or even related).

I noticed it with the RexOps/Rex repo, which is scored as 22.3 experience points as of this moment.

It has an sdondley/Rex fork, which is a community fork scored as 52.2. I expected ~0 since it's not my fork/repo.

404s on token page

When I press ENTER at the "Do you want to upload the result to your profile automatically? [Y/n]" prompt, it opens a new tab in my browser which always shows this (I've reproduced it on all private repos where I used this script):
image

Continuous Integration Testing

As I noticed, you're using Docker for CI testing. It would be helpful, however, to have at least one build check in pull requests. Maybe coverage as well, so that missing tests don't go unnoticed.

If you're interested, I could create an example GitHub action workflow on my fork that you could adopt. GitHub actions have the advantage that they come with integrated build checks and caching support as well. This would speed up build times and give immediate feedback for flawed pull requests.

Windows 10 issue

I'm just copying the error output here in case it helps.

Initialization...
Initialization...
Initialization...
Initialization...
Initialization...
Traceback (most recent call last):
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<string>", line 1, in <module>
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\spawn.py", line 105, in spawn_main
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
    exitcode = _main(fd)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\spawn.py", line 114, in _main
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\spawn.py", line 114, in _main
    prepare(preparation_data)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\spawn.py", line 225, in prepare
    prepare(preparation_data)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\spawn.py", line 225, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
    run_name="__mp_main__")
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\runpy.py", line 263, in run_path
    run_name="__mp_main__")
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\runpy.py", line 96, in _run_module_code
    pkg_name=pkg_name, script_name=fname)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\runpy.py", line 85, in _run_code
    mod_name, mod_spec, pkg_name, script_name)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\webserver\www\repo_info_extractor\src\main.py", line 22, in <module>
    exec(code, run_globals)
  File "C:\webserver\www\repo_info_extractor\src\main.py", line 22, in <module>
    ar.get_commit_stats()
  File "C:\webserver\www\repo_info_extractor\src\analyze_repo.py", line 44, in get_commit_stats
    ar.get_commit_stats()
  File "C:\webserver\www\repo_info_extractor\src\analyze_repo.py", line 44, in get_commit_stats
    pool = mp.Pool(mp.cpu_count())
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\context.py", line 119, in Pool
    context=self.get_context())
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\pool.py", line 176, in __init__
    self._repopulate_pool()
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\pool.py", line 241, in _repopulate_pool
    w.start()
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\process.py", line 112, in start
    pool = mp.Pool(mp.cpu_count())
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\context.py", line 119, in Pool
    self._popen = self._Popen(self)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\context.py", line 322, in _Popen
    context=self.get_context())
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\pool.py", line 176, in __init__
    return Popen(process_obj)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\popen_spawn_win32.py", line 33, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
    self._repopulate_pool()
    _check_not_importing_main()
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\pool.py", line 241, in _repopulate_pool
Traceback (most recent call last):
    is not going to be frozen to produce an executable.''')
    w.start()
  File "<string>", line 1, in <module>
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
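
The traceback shows the pool being created while main.py is still being imported (main.py line 22 calls ar.get_commit_stats() at module level). On Windows, each worker re-imports the main module, so it tries to create a pool of its own. A minimal sketch of the idiom the error message asks for follows; count_characters and analyze_all are stand-in names, not functions from the project.

```python
import multiprocessing as mp


def count_characters(text):
    """Stand-in for per-commit work; defined at module level so workers can import it."""
    return len(text)


def analyze_all(items):
    # Creating the pool inside a function keeps it out of import-time code.
    pool = mp.Pool(mp.cpu_count())
    try:
        return pool.map(count_characters, items)
    finally:
        pool.close()
        pool.join()


if __name__ == '__main__':
    # Child processes re-import this module on Windows; the guard stops
    # them from starting pools of their own.
    mp.freeze_support()
    print(analyze_all(['a', 'bb', 'ccc']))
```

Applied to the project, this would mean moving the top-level calls in main.py into a main() function that is only invoked under the guard.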

Support extracting imported Perl modules as libraries

Perl has two slightly different ways to import code from other modules: require and use. If I understand correctly, we'd have to extract the module names from these statements in order to allow CodersRank to treat them as libraries.

Part of the complexity is that both import methods have quite a few different formats. I'm not sure if regular expressions are the best approach to parse all the variations, but my initial attempts seem to be able to handle at least the most important (common?) forms.

If this is an interesting direction, I'm willing to prepare a pull request with at least a starting point.
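
As a starting point for discussion, here is a sketch of how the most common use and require forms could be matched. It is written in Python on the assumption that the extractor is implemented there; the pattern, the short pragma list, and the name extract_perl_modules are illustrative only.

```python
import re

# Matches "use Foo::Bar ..." and "require Foo::Bar ..." at the start of a line.
# Version statements such as "use 5.010;" are skipped because the name must
# start with a letter or underscore.
USE_OR_REQUIRE = re.compile(
    r'^\s*(?:use|require)\s+([A-Za-z_]\w*(?:::\w+)*)', re.MULTILINE)

# A small, incomplete set of pragmas that should not count as libraries.
PRAGMAS = {'strict', 'warnings', 'utf8', 'feature', 'constant', 'lib'}


def extract_perl_modules(source):
    """Return imported module names in order of first appearance."""
    modules = []
    for name in USE_OR_REQUIRE.findall(source):
        if name not in PRAGMAS and name not in modules:
            modules.append(name)
    return modules


example = (
    "use strict;\n"
    "use warnings;\n"
    "use 5.010;\n"
    "use File::Spec;\n"
    "use List::Util qw(first);\n"
    "require Rex::Commands;\n"
)
print(extract_perl_modules(example))
```

Forms such as require with a quoted file name, or modules loaded through use parent and use base, are not handled by this sketch.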

Extract libraries from Swift

Introduction

We started to implement the external library recognition feature.
It means the repo_info_extractor will be able to recognize the used libraries and not just the programming language.

How does it work?

Regexes are used to recognize library imports.
This library detection must be done language by language since each language has a different syntax to import libraries.
We created a JavaScript extractor as the first working library extractor.

Requirements

Cannot extract data from microsoft/vscode

When you try to calculate the private repo zip from https://github.com/c the script dies with an error:

Traceback (most recent call last):
  File "src/main.py", line 20, in <module>
    ar.create_commits_entity_from_branch(branch.name)
  File "[...]/repo_info_extractor/src/analyze_repo.py", line 34, in create_commits_entity_from_branch
    self.commit_list[commit.hexsha] = Commit(commit.author.name, commit.author.email, commit.committed_datetime, commit.hexsha, commit.parents, branch, self.skip_obfuscation)
  File "[...]/repo_info_extractor/src/entity/commit.py", line 91, in __init__
    self.obfuscate()
  File "[...]/repo_info_extractor/src/entity/commit.py", line 103, in obfuscate
    md5_hash.update(self.author_email.encode('utf-8'))
AttributeError: 'NoneType' object has no attribute 'encode'

This happens when a commit has an author with no email or name.
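
One way to guard against this, sketched as a standalone helper. The traceback shows the email being MD5-hashed after encode('utf-8'); the function below and its choice to treat a missing value as an empty string are assumptions, not the project's fix.

```python
import hashlib


def obfuscate(value):
    """Return the MD5 hex digest of value, treating a missing value as ''."""
    if value is None:
        value = ''
    return hashlib.md5(value.encode('utf-8')).hexdigest()


print(obfuscate(None) == obfuscate(''))
```

An alternative would be to skip such commits entirely; which behaviour is preferable depends on how the results are used downstream.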

Module name git unable to find

Python in system - 3.8
Windows Machine

Requirement.txt -
GitPython==3.1.0
prompt_toolkit==1.0.14
whaaaaat==0.5.2
requests
requests_toolbelt
nose2
Pygments
numpy

Repository extraction does not work

I followed each of the steps, then gave the path to a private repository that I have stored locally. Processing started, and then this appeared again and again:

image

Mac installation & usage

Got this working on a Mac with:

$ git clone https://github.com/codersrankOrg/repo_info_extractor.git
$ cd repo_info_extractor

Install pip (from https://stackoverflow.com/a/38043109/174172) & other requirements:

$ curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
$ python get-pip.py
$ sudo pip install -r requirements.txt
$ sudo ./install.sh
$ ./run.sh path/to/repository
$ ls -al ./repo_data.json

Exclude popular dependency directories

In some repositories, the dependencies are also included. With simple string matching on the path we can exclude them.
E.g.:

  • node_modules (node.js)
  • vendor (Go)
  • etc
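
A minimal sketch of the path check, assuming the directory names listed above. The set EXCLUDED_DIRS and the function is_dependency_path are hypothetical names.

```python
# Directory names taken from the list above; extend as needed.
EXCLUDED_DIRS = {'node_modules', 'vendor'}


def is_dependency_path(path):
    """Return True if any component of the path is a known dependency directory."""
    parts = path.replace('\\', '/').split('/')
    return any(part in EXCLUDED_DIRS for part in parts)


print(is_dependency_path('web/node_modules/react/index.js'))
```

Matching on whole path components rather than substrings avoids excluding a directory such as vendor_tools by accident.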

Extract libraries from Python

Introduction

We started to implement the external library recognition feature.
It means the repo_info_extractor will be able to recognize the used libraries and not just the programming language.

How does it work?

Regexes are used to recognize library imports.
This library detection must be done language by language since each language has a different syntax to import libraries.
We created a JavaScript extractor as the first working library extractor.
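
For Python itself, the regex approach could look roughly like this. The two patterns and the name extract_python_libraries are assumptions for illustration; relative imports are skipped on purpose because they are not external libraries.

```python
import re

# Hypothetical patterns; the real extractor may use different ones.
IMPORT_LINE = re.compile(r'^[ \t]*import[ \t]+(.+)$', re.MULTILINE)
FROM_LINE = re.compile(
    r'^[ \t]*from[ \t]+([A-Za-z_][\w.]*)[ \t]+import\b', re.MULTILINE)


def extract_python_libraries(source):
    """Return the sorted top-level package names imported by the source text."""
    libs = set()
    for names in IMPORT_LINE.findall(source):
        names = names.split('#')[0]  # drop trailing comments
        for part in names.split(','):
            tokens = part.split()  # "numpy as np" -> ["numpy", "as", "np"]
            if tokens:
                libs.add(tokens[0].split('.')[0])  # "os.path" -> "os"
    for module in FROM_LINE.findall(source):
        libs.add(module.split('.')[0])
    return sorted(libs)


example = (
    "import os, sys\n"
    "import numpy as np  # numeric helpers\n"
    "from requests.adapters import HTTPAdapter\n"
    "from . import helpers\n"
)
print(extract_python_libraries(example))
```

This does not distinguish standard-library modules from third-party ones; that would need a separate lookup.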

Requirements

Extract libraries from C++

Introduction

We started to implement the external library recognition feature.
It means the repo_info_extractor will be able to recognize the used libraries and not just the programming language.

How does it work?

Regexes are used to recognize library imports.
This library detection must be done language by language since each language has a different syntax to import libraries.
We created a JavaScript extractor as the first working library extractor.

Requirements

SyntaxError: invalid syntax

I am trying to analyse a repo, but I get an error.
I tried with 2 different folders.

I use: python src\main.py C:\Users\Johan\Desktop\TEMP\repos\app-master

Traceback (most recent call last):
  File "src\main.py", line 2, in <module>
    from init import initialize
  File "C:\Users\Johan\Desktop\TEMP\codersrank\src\init.py", line 6, in <module>
    from analyze_libraries import AnalyzeLibraries
  File "C:\Users\Johan\Desktop\TEMP\codersrank\src\analyze_libraries.py", line 101
    print(timed_message, *argv)
                         ^
SyntaxError: invalid syntax

gets stuck on Initialization...

The tool runs well on just about all my private repositories.

I have one which is over four years old, and the tool gets stuck at the text "Initialization...".
It's been over an hour.
What can I do?

I can't give access to the repository as it's private. Is there a way to debug this? Maybe a parameter which displays the commands being run so I can see on which it is getting stuck?

Much thanks!

Xtend files not getting detected

Hello,

Xtend is a statically-typed programming language which translates to comprehensible Java source code.
Please make an adjustment to detect xtend files.

Thanks
Nagaraj

Count in all branches

Hi, it would be good if the extractor counted commits from all branches of the repo, not only master, or if the branch could be set via a parameter.

Thx!
