codersrank-org / repo_info_extractor

Use this script to extract data from your private repo. This data is used to calculate your score. https://codersrank.io

License: MIT License

Go 99.45% Makefile 0.55%

hacktoberfest

repo_info_extractor's Issues

Off-Topic: Patch `perpage` limit

https://profile.codersrank.io/leaderboard/developer?page=1&perpage=10000

Missing info in installation instructions for Linux

I see that the README suggests installing pip on OSX but does not do so for Linux (where it is equally necessary).
You should add this line to the Linux instructions:

sudo apt install python-pip

Of course, alternative instructions should be provided for non-Debian-based distros (something like "yum install python-pip"; sorry, I don't use RedHat-based distros).

Extract libraries from Go

Introduction

We started to implement the external library recognition feature.
It means the repo_info_extractor will be able to recognize the used libraries and not just the programming language.

How does it work?

Regexes are used to recognize library imports.
This library detection must be done language by language since each language has a different syntax to import libraries.
We created a JavaScript extractor as the first working library extractor.

Requirements

Create a new extractor for Go.
Place the new file here: https://github.com/codersrank-org/repo_info_extractor/tree/master/src/language
Implement the def extract_libraries(files): function. The input is an array of paths to the files. The output is an array of the recognized libraries.
Add unit tests as well, here: https://github.com/codersrank-org/repo_info_extractor/tree/master/test
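
As a rough illustration of the regex approach described above, here is a minimal sketch of what a Go extractor could look like. Only the extract_libraries(files) signature comes from the requirements; the three patterns and the extract_from_source helper are hypothetical, not the project's actual code, and they cover only plain, aliased, and parenthesized imports.

```python
import re

# Hypothetical patterns for Go import statements (assumption, not project code).
SINGLE_IMPORT = re.compile(r'^\s*import\s+(?:[\w.]+\s+)?"([^"]+)"', re.MULTILINE)
IMPORT_BLOCK = re.compile(r'^\s*import\s*\((.*?)\)', re.MULTILINE | re.DOTALL)
BLOCK_ENTRY = re.compile(r'^\s*(?:[\w.]+\s+)?"([^"]+)"', re.MULTILINE)


def extract_from_source(source):
    """Return the sorted, de-duplicated import paths found in Go source text."""
    libraries = set(SINGLE_IMPORT.findall(source))
    for block in IMPORT_BLOCK.findall(source):
        libraries.update(BLOCK_ENTRY.findall(block))
    return sorted(libraries)


def extract_libraries(files):
    """Take a list of file paths and return the recognized libraries."""
    libraries = set()
    for path in files:
        with open(path, encoding="utf-8", errors="ignore") as handle:
            libraries.update(extract_from_source(handle.read()))
    return sorted(libraries)
```

Keeping the text-level helper separate from the file-reading function makes the unit tests simpler, since they can pass source strings directly instead of creating files.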

Extract libraries from Java

Introduction

We started to implement the external library recognition feature.
It means the repo_info_extractor will be able to recognize the used libraries and not just the programming language.

How does it work?

Requirements

Create a new extractor for Java.
Place the new file here: https://github.com/codersrank-org/repo_info_extractor/tree/master/src/language
Implement the def extract_libraries(files): function. The input is an array of paths to the files. The output is an array of the recognized libraries.
Add unit tests as well, here: https://github.com/codersrank-org/repo_info_extractor/tree/master/test

Where is the temporary data stored that is copied during a private repo's commit calculation?

Hi,

When using the Docker CLI to generate the JSON commit data, it shows this: "Copying the repository to a temporary location."

Where is this location? Is it deleted automatically?

Otherwise, we will have to delete it ourselves.

Traceback error on Mac through Docker

Traceback (most recent call last):
  File "src/main.py", line 2, in <module>
    from init import initialize
  File "/app/src/init.py", line 2, in <module>
    import git
  File "/opt/venv/lib/python3.7/site-packages/git/__init__.py", line 38, in <module>
    from git.exc import *                       # @NoMove @IgnorePep8
  File "/opt/venv/lib/python3.7/site-packages/git/exc.py", line 9, in <module>
    from git.compat import UnicodeMixin, safe_decode, string_types
  File "/opt/venv/lib/python3.7/site-packages/git/compat.py", line 16, in <module>
    from gitdb.utils.compat import (
ModuleNotFoundError: No module named 'gitdb.utils.compat'
Finished

Add (fix) contributing documentation

I guess, you forgot something:

Extract libraries from C#

Introduction

We started to implement the external library recognition feature.
It means the repo_info_extractor will be able to recognize the used libraries and not just the programming language.

How does it work?

Requirements

Create a new extractor for C#.
Place the new file here: https://github.com/codersrank-org/repo_info_extractor/tree/master/src/language
Implement the def extract_libraries(files): function. The input is an array of paths to the files. The output is an array of the recognized libraries.
Add unit tests as well, here: https://github.com/codersrank-org/repo_info_extractor/tree/master/test

Permissions error

> ./install.sh
...
Collecting smmap2>=2.0.0 (from gitdb2>=2.0.0->GitPython==2.1.11->-r requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/55/d2/866d45e3a121ee15a1dc013824d58072fd5c7799c9c34d01378eb262ca8f/smmap2-2.0.5-py2.py3-none-any.whl
Installing collected packages: smmap2, gitdb2, GitPython, prompt-toolkit, regex, whaaaaat, requests-toolbelt
Could not install packages due to an EnvironmentError: [Errno 13] Permission denied: '/usr/lib/python3.7/site-packages/smmap2-2.0.5.dist-info'

Arch Linux, pip installed as package: https://www.archlinux.org/packages/extra/any/python-pip/

Auto update

At the beginning of the script, we should check whether a newer version is available; if so, it should offer to auto-update the script.
It should show the first lines of the commits to give a hint of what was fixed or improved.

Error executing the "python src\main.py path\to\repo" command on Windows

When executing the "python src\main.py path\to\my_repo" command on Windows, I get the following result:

Traceback (most recent call last):
  File "src\main.py", line 4, in <module>
    from export_result import ExportResult
  File "D:\projects\Coders Rank\repo_info_extractor\src\export_result.py", line 5, in <module>
    from upload import uploadRepo
  File "D:\projects\Coders Rank\repo_info_extractor\src\upload.py", line 11
    r = requests.post(url, files=files)
                                      ^
TabError: inconsistent use of tabs and spaces in indentation

Can somebody help me? Thanks.

Windows 10
Python 3.7.4

Extract libraries from C

Introduction

We started to implement the external library recognition feature.
It means the repo_info_extractor will be able to recognize the used libraries and not just the programming language.

How does it work?

Requirements

Create a new extractor for C.
Place the new file here: https://github.com/codersrank-org/repo_info_extractor/tree/master/src/language
Implement the def extract_libraries(files): function. The input is an array of paths to the files. The output is an array of the recognized libraries.
Add unit tests as well, here: https://github.com/codersrank-org/repo_info_extractor/tree/master/test

Support Perl module detection

Most of my open source activity is related to Perl, and most of it was quite obviously missing from the results on my profile. It turned out that files with the .pm extension are not recognized as Perl modules and thus cannot be counted toward Perl experience.

There are probably other Perl aspects missing, but this could be a first step towards a more complete experience. I plan to send an initial pull request to add support for this.
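
One way such a fix could look is an extension lookup that treats .pm the same as .pl. This is only a sketch: the EXTENSION_TO_LANGUAGE table and the detect_language function are hypothetical names, and the project's real language mapping may be organized differently.

```python
import os

# Hypothetical extension table (assumption); only the Perl entries are shown.
EXTENSION_TO_LANGUAGE = {
    ".pl": "Perl",
    ".pm": "Perl",  # Perl modules, the case described above
}


def detect_language(path):
    """Return the language for a file path, or None if the extension is unknown."""
    _, extension = os.path.splitext(path)
    return EXTENSION_TO_LANGUAGE.get(extension.lower())
```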

Add Cython language support

Cython is a programming language that aims to be a superset of the Python programming language, designed to give C-like performance with code that is written mostly in Python with optional additional C-inspired syntax. Cython is a compiled language that is typically used to generate CPython extension modules. (Wikipedia)

The *.pyx file extension seems to be used for Cython source files.

Error when analyzing a private repository

PermissionError: [WinError 5] Acceso denegado:

Traceback error

I got this error after running the analyze command (Windows 10):

[20/11/2019 17:55:06] Copying the repository to a temporary location, this can take a while...
Traceback (most recent call last):
  File "src\main.py", line 46, in <module>
    main()
  File "src\main.py", line 37, in main
    initialize(args.directory, args.skip_obfuscation, args.output, args.parse_libraries, args.email, args.skip_upload)
  File "C:\Users\nihad\AndroidStudioProjects\repo_info_extractor\src\init.py", line 70, in initialize
    libs = al.get_libraries()
  File "C:\Users\nihad\AndroidStudioProjects\repo_info_extractor\src\analyze_libraries.py", line 33, in get_libraries
    shutil.copytree(self.basedir, tmp_repo_path, symlinks=True)
  File "C:\ProgramData\Anaconda3\lib\shutil.py", line 368, in copytree
    raise Error(errors)
shutil.Error: [('C:\\Users\\nihad\\AndroidStudioProjects\\REPO_NAME\\android\\vendor\\bundle\\ruby\\2.3.0\\gems\\fastlane-2.123.0\\fastlane\\swift\\FastlaneSwiftRunner\\FastlaneSwiftRunner.xcodeproj\\project.xcworkspace\\xcuserdata\\josh.xcuserdatad\\UserInterfaceState.xcuserstate', 'C:\\Users\\nihad\\AppData\\Local\\Temp\\38febda8-c011-4c53-8eae-a549cc2c2f87\\android\\vendor\\bundle\\ruby\\2.3.0\\gems\\fastlane-2.123.0\\fastlane\\swift\\FastlaneSwiftRunner\\FastlaneSwiftRunner.xcodeproj\\project.xcworkspace\\xcuserdata\\josh.xcuserdatad\\UserInterfaceState.xcuserstate', "[Errno 2] No such file or directory: 'C:\\\\Users\\\\nihad\\\\AppData\\\\Local\\\\Temp\\\\38febda8-c011-4c53-8eae-a549cc2c2f87\\\\android\\\\vendor\\\\bundle\\\\ruby\\\\2.3.0\\\\gems\\\\fastlane-2.123.0\\\\fastlane\\\\swift\\\\FastlaneSwiftRunner\\\\FastlaneSwiftRunner.xcodeproj\\\\project.xcworkspace\\\\xcuserdata\\\\josh.xcuserdatad\\\\UserInterfaceState.xcuserstate'")]

ImportError: No module named git

I get an error when running ./run.sh path/to/repository:

Traceback (most recent call last):
  File "src/main.py", line 2, in <module>
    import git
ImportError: No module named git

I tried installing different packages but could not find the right one.
Can you tell me which one I should install?

Extract libraries from Kotlin

Introduction

We started to implement the external library recognition feature.
It means the repo_info_extractor will be able to recognize the used libraries and not just the programming language.

How does it work?

Requirements

Create a new extractor for Kotlin.
Place the new file here: https://github.com/codersrank-org/repo_info_extractor/tree/master/src/language
Implement the def extract_libraries(files): function. The input is an array of paths to the files. The output is an array of the recognized libraries.
Add unit tests as well, here: https://github.com/codersrank-org/repo_info_extractor/tree/master/test

"changedFiles" always empty

I tried to export 3 repositories: one small and two bigger ones.
In the .json file, commits.changedFiles is an empty array for every commit.

Python 3.7.4

Easy to use docker image

There should be a docker image if somebody doesn't want to struggle with the Python and package installation.

It should work something like this:
docker run -it -v ./:/src image-name ./run.sh
After this, the JSON output will be in the repo's directory.

Upload the result automatically

Improve the repo extractor (https://github.com/codersrank-org/repo_info_extractor) to give the user an option to upload the result automatically. This option should be available after selecting the email address.

Extracting all data from all the private repo when personal token is provided

Now:
Extract data one-by-one for every private repo

Feature Request:
Extract data from all private repos in a single go

Different emails and limiting commit processing up to a certain date

How can I provide multiple emails for the same user across different private repos and their commits, for example [email protected] and [email protected]?
Secondly, is it possible to limit processing to, for example, one year back for huge repos?

Dart language support

Is it possible for you to add support for the Dart language (Flutter)? I am currently trying to get stats from your scripts, and the run finishes without any errors, but I cannot find the "repo_data.json.zip" file.

OS: Windows.

After this command: python src\main.py path\to\repo
I am getting this message:

(base) C:\Users\nihad\AndroidStudioProjects\repo_info\repo_info_extractor>python src\main.py C:\Users\nihad\AndroidStudioProjects\flutter\PROJECT_NAME
Analyzing repo under C:\Users\nihad\AndroidStudioProjects\flutter\PROJECT_NAME...
[============================================================] 100.0% ...Analyzing commits
? The following contributors were found in the repository. Select which ones you are. (With SPACE you can select more than one) [Nihad Delic -> EMAIL]
[29/10/2019 11:36:40] Filtering commits by emails . ['EMAIL']
[29/10/2019 11:36:40] Copying the repository to a temporary location, this can take a while...
[29/10/2019 11:36:58] Finished copying the repository

I also want to say: great job, guys.

stderr: 'fatal: not a git repository (or any parent up to mount point /)

> ./run.sh ~/Projects/private_project/
Search repos on /home/alex/Projects/private_project/, 1 folders deep 
Found 1 repos under /home/alex/Projects/Work/private_project/
 
Analyzing repo under /home/alex/Projects/private_project ...
[============================================================] 100.0% ...Analyzing commits
? The following contributors were found in the repository.             Select which ones you are. (With SPACE you can select more tha
[22/07/2020 02:33:00] Filtering commits by emails:  ['[email protected]', '[email protected]']
[22/07/2020 02:33:00] Copying the repository to a temporary location, this can take a while...
[22/07/2020 02:35:27] Finished copying the repository to /tmp/396f02a3-89df-4273-bed4-3ea513cfb81b
Traceback (most recent call last):
  File "src/main.py", line 58, in <module>
    main()
  File "src/main.py", line 46, in main
    initialize(args.directory, args.skip_obfuscation, args.output,
  File "/home/alex/Projects/repo_info_extractor/src/init.py", line 93, in initialize
    libs = al.get_libraries()
  File "/home/alex/Projects/repo_info_extractor/src/analyze_libraries.py", line 47, in get_libraries
    repo = self._initialize_repository(tmp_repo_path)
  File "/home/alex/Projects/repo_info_extractor/src/analyze_libraries.py", line 177, in _initialize_repository
    repo.git.clean('-fd')
  File "/home/alex/.local/lib/python3.8/site-packages/git/cmd.py", line 542, in <lambda>
    return lambda *args, **kwargs: self._call_process(name, *args, **kwargs)
  File "/home/alex/.local/lib/python3.8/site-packages/git/cmd.py", line 1005, in _call_process
    return self.execute(call, **exec_kwargs)
  File "/home/alex/.local/lib/python3.8/site-packages/git/cmd.py", line 822, in execute
    raise GitCommandError(command, status, stderr_value, stdout_value)
git.exc.GitCommandError: Cmd('git') failed due to: exit code(128)
  cmdline: git clean -fd
  stderr: 'fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).'
Finished

> ls /home/alex/Projects/private_project
branches/       config       FETCH_HEAD  hooks/  info/  objects/   packed-refs
COMMIT_EDITMSG  description  HEAD        index   logs/  ORIG_HEAD  refs/

ModuleNotFoundError: No module named 'gitdb.utils.compat'

When I run ./run-docker.sh pathToRepo I get this error:

Traceback (most recent call last):
  File "src/main.py", line 2, in <module>
    from init import initialize
  File "/app/src/init.py", line 2, in <module>
    import git
  File "/opt/venv/lib/python3.7/site-packages/git/__init__.py", line 38, in <module>
    from git.exc import *                       # @NoMove @IgnorePep8
  File "/opt/venv/lib/python3.7/site-packages/git/exc.py", line 9, in <module>
    from git.compat import UnicodeMixin, safe_decode, string_types
  File "/opt/venv/lib/python3.7/site-packages/git/compat.py", line 16, in <module>
    from gitdb.utils.compat import (
ModuleNotFoundError: No module named 'gitdb.utils.compat'
Finished

Syntax mixing

I may be wrong, but why are you mixing Python 2 and Python 3 syntax in upload.py?
In the showError() function, one part uses print err while another uses print("Unable ...").
So I changed it to Python 3 syntax, only to find raw_input().lower() in questions.py. Am I missing something?

Extract libraries from PHP

Introduction

We started to implement the external library recognition feature.
It means the repo_info_extractor will be able to recognize the used libraries and not just the programming language.

How does it work?

Requirements

Create a new extractor for PHP. Support PHP5 and PHP7 too.
Place the new file here: https://github.com/codersrank-org/repo_info_extractor/tree/master/src/language
Implement the def extract_libraries(files): function. The input is an array of paths to the files. The output is an array of the recognized libraries.
Add unit tests as well, here: https://github.com/codersrank-org/repo_info_extractor/tree/master/test

set_commit_stats never runs

Expected Behavior

run main.py with directory argument
get a zipped json file containing contribution information

Actual Behavior

run main.py with directory argument

get a zipped file containing contribution information except all commits are missing elements in the changedFiles array like this:

            "authorEmail": "373ffeff04c386969edac0edc2cf2628", 
            "isDuplicated": false, 
            "authorName": "fa20e926ccd1e69da76a9bb5caeb826b", 
            "parents": [
                "545522fdab9c25a1ce8a8ef5c6dfbee6804e115c"
            ], 
            "commitHash": "ae3e69f87557e3825f84db37860019d2a213183e", 
            "isMerge": false, 
            "changedFiles": [], 
            "createdAt": "2019-06-17 14:48:20"

it seems that set_commit_stats is never run

Steps to Reproduce the Problem

run main.py with valid git repository as argument

Specifications

Platform: Windows 10 1903 build 18932.1000
Python: 2.7.16

Finish run.bat

We want the script to be easy to use, which is why we introduced run.bat and run.sh.

run.sh works perfectly; however, run.bat is empty. It is supposed to be the Windows version.

Expected behaviour:
run.bat --help: shows the CLI help
run.bat: Error too few arguments.
run.bat c:\path\to\repo: extracts data from the repository
run.bat c:\path\to\repo --output=..\test.json: extracts data from the repo to test.json in the parent directory
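
The expected behaviour above maps closely onto what Python's argparse provides, so run.bat could simply forward its arguments to the Python entry point. The sketch below shows the argument handling only. The directory and --output names match the examples above, while build_parser and the default output path are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser with one required positional and an optional --output."""
    parser = argparse.ArgumentParser(
        prog="run.bat",
        description="Extract data from a local repository.",
    )
    parser.add_argument("directory", help="path to the repository")
    # Default output location is an assumption, not the project's setting.
    parser.add_argument("--output", default="./repo_data.json",
                        help="where to write the result")
    return parser
```

With this parser, --help prints the generated help text, and calling it without a directory exits with a usage error, which covers the first two expected cases.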

Forks of a repo may get higher score than the main one

As a follow-up from #122 (comment):

I noticed a strange thing though. I get score for the forks of a repo as well, often more than I get for the canonical one. Not sure if that's normal or expected (or even related).

I noticed it with the RexOps/Rex repo, which is scored as 22.3 experience points as of this moment.

It has an sdondley/Rex fork, which is a community fork scored as 52.2. I expected ~0 since it's not my fork/repo.

404s on token page

When I press ENTER at the "Do you want to upload the result to your profile automatically? [Y/n]" prompt, it opens a new tab in my browser which always shows this (I've reproduced it on all private repos where I used this script):

Continuous Integration Testing

As I noticed, you're using Docker for CI testing. It would be helpful, however, to have at least one build check in pull requests. Maybe coverage as well, so that missing tests don't go unnoticed.

If you're interested, I could create an example GitHub action workflow on my fork that you could adopt. GitHub actions have the advantage that they come with integrated build checks and caching support as well. This would speed up build times and give immediate feedback for flawed pull requests.

Windows 10 issue

I'm just copying the error output here in case it helps.

Initialization...
Initialization...
Initialization...
Initialization...
Initialization...
Traceback (most recent call last):
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<string>", line 1, in <module>
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\spawn.py", line 105, in spawn_main
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
    exitcode = _main(fd)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\spawn.py", line 114, in _main
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\spawn.py", line 114, in _main
    prepare(preparation_data)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\spawn.py", line 225, in prepare
    prepare(preparation_data)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\spawn.py", line 225, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
    run_name="__mp_main__")
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\runpy.py", line 263, in run_path
    run_name="__mp_main__")
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\runpy.py", line 96, in _run_module_code
    pkg_name=pkg_name, script_name=fname)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\runpy.py", line 85, in _run_code
    mod_name, mod_spec, pkg_name, script_name)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\webserver\www\repo_info_extractor\src\main.py", line 22, in <module>
    exec(code, run_globals)
  File "C:\webserver\www\repo_info_extractor\src\main.py", line 22, in <module>
    ar.get_commit_stats()
  File "C:\webserver\www\repo_info_extractor\src\analyze_repo.py", line 44, in get_commit_stats
    ar.get_commit_stats()
  File "C:\webserver\www\repo_info_extractor\src\analyze_repo.py", line 44, in get_commit_stats
    pool = mp.Pool(mp.cpu_count())
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\context.py", line 119, in Pool
    context=self.get_context())
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\pool.py", line 176, in __init__
    self._repopulate_pool()
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\pool.py", line 241, in _repopulate_pool
    w.start()
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\process.py", line 112, in start
    pool = mp.Pool(mp.cpu_count())
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\context.py", line 119, in Pool
    self._popen = self._Popen(self)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\context.py", line 322, in _Popen
    context=self.get_context())
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\pool.py", line 176, in __init__
    return Popen(process_obj)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\popen_spawn_win32.py", line 33, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
    self._repopulate_pool()
    _check_not_importing_main()
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\pool.py", line 241, in _repopulate_pool
Traceback (most recent call last):
    is not going to be frozen to produce an executable.''')
    w.start()
  File "<string>", line 1, in <module>
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
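
The RuntimeError above names the usual remedy: create the pool only under an if __name__ == '__main__': guard. A minimal sketch of that idiom follows; square and main are placeholder names, not functions from the project.

```python
import multiprocessing as mp


def square(value):
    """Stand-in for per-commit work; defined at module level so workers can import it."""
    return value * value


def main():
    # Creating the pool inside main() keeps it out of module import time.
    with mp.Pool(2) as pool:
        print(pool.map(square, range(5)))


if __name__ == "__main__":
    # On Windows, child processes re-import this module; the guard stops
    # them from trying to create pools of their own.
    mp.freeze_support()  # only has an effect in frozen Windows executables
    main()
```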

Support extracting imported Perl modules as libraries

Perl has two slightly different ways to import code from other modules: require and use. If I understand correctly, we'd have to extract the module names from these statements in order to allow CodersRank to treat them as libraries.

Part of the complexity is that both import methods come in quite a few different formats. I'm not sure regular expressions are the best approach for parsing all the variations, but my initial attempts seem to handle at least the most important (common?) forms.

If this is an interesting direction, I'm willing to prepare a pull request with at least a starting point.
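
To make the idea concrete, here is a small sketch that handles only the most common forms (use Module::Name; with or without an import list, and require Module::Name;). The pattern, the PRAGMAS list, and the extract_perl_modules name are all assumptions for illustration; string-form require, version statements, and dynamic imports are deliberately ignored.

```python
import re

# Hypothetical pattern for the common forms (assumption, not project code):
#   use Module::Name;  use Module::Name qw(...);  require Module::Name;
# The lookahead skips version statements such as "use 5.010;" or "use v5.10;".
PERL_IMPORT = re.compile(
    r'^\s*(?:use|require)\s+(?!v?\d)([A-Za-z_]\w*(?:::\w+)*)',
    re.MULTILINE,
)

# Pragmas are compiler directives rather than libraries (partial list).
PRAGMAS = {"strict", "warnings", "utf8", "lib", "parent", "base", "constant", "vars"}


def extract_perl_modules(source):
    """Return the sorted module names imported by Perl source text."""
    modules = {name for name in PERL_IMPORT.findall(source) if name not in PRAGMAS}
    return sorted(modules)
```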

Extract libraries from Swift

Introduction

We started to implement the external library recognition feature.
It means the repo_info_extractor will be able to recognize the used libraries and not just the programming language.

How does it work?

Requirements

Create a new extractor for Swift.
Place the new file here: https://github.com/codersrank-org/repo_info_extractor/tree/master/src/language
Implement the def extract_libraries(files): function. The input is an array of paths to the files. The output is an array of the recognized libraries.
Add unit tests as well, here: https://github.com/codersrank-org/repo_info_extractor/tree/master/test

Cannot extract data from microsoft/vscode

When you try to calculate the private repo zip from https://github.com/c the script dies with an error:

Traceback (most recent call last):
  File "src/main.py", line 20, in <module>
    ar.create_commits_entity_from_branch(branch.name)
  File "[...]/repo_info_extractor/src/analyze_repo.py", line 34, in create_commits_entity_from_branch
    self.commit_list[commit.hexsha] = Commit(commit.author.name, commit.author.email, commit.committed_datetime, commit.hexsha, commit.parents, branch, self.skip_obfuscation)
  File "[...]/repo_info_extractor/src/entity/commit.py", line 91, in __init__
    self.obfuscate()
  File "[...]/repo_info_extractor/src/entity/commit.py", line 103, in obfuscate
    md5_hash.update(self.author_email.encode('utf-8'))
AttributeError: 'NoneType' object has no attribute 'encode'

This happens when a commit has an author with no email or name.
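
A defensive version of the hashing step could treat a missing value as an empty string before encoding it. The sketch below is an assumption about how that might look; the standalone obfuscate function is hypothetical and is not the project's Commit.obfuscate method.

```python
import hashlib


def obfuscate(value):
    """Return the MD5 hex digest of value, treating a missing value as empty."""
    # Assumption: a commit without an author email or name is hashed as "".
    text = value if value is not None else ""
    return hashlib.md5(text.encode("utf-8")).hexdigest()
```

Whether hashing an empty string is the right fallback, as opposed to skipping the commit, is a design decision for the maintainers.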

Unable to find module named git

Python in system - 3.8
Windows Machine

Requirement.txt -
GitPython==3.1.0
prompt_toolkit==1.0.14
whaaaaat==0.5.2
requests
requests_toolbelt
nose2
Pygments
numpy

SyntaxError: invalid syntax on mac

I got this error while trying to execute run.sh on a Mac.

Limit the number of selected email addresses

If too many emails are selected here:

then the server side may have problems processing them.

Limit the number of selectable emails to 50.
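
A check like this could run right after the selection prompt. It is only a sketch: the limit of 50 comes from the request above, while MAX_SELECTED_EMAILS and validate_selection are hypothetical names.

```python
MAX_SELECTED_EMAILS = 50  # limit proposed above


def validate_selection(selected_emails, limit=MAX_SELECTED_EMAILS):
    """Return the selection unchanged, or raise if it exceeds the limit."""
    if len(selected_emails) > limit:
        raise ValueError(
            "Select at most {} email addresses ({} selected).".format(
                limit, len(selected_emails)))
    return selected_emails
```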

Repository extraction does not work

I followed each of the steps, then gave the path to a private repository that I have stored locally. It started processing, and then this started to appear again and again:

Mac installation & usage

Got this working on a Mac with:

$ git clone https://github.com/codersrankOrg/repo_info_extractor.git
$ cd repo_info_extractor

Install pip (from https://stackoverflow.com/a/38043109/174172) & other requirements:

$ curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
$ python get-pip.py
$ sudo pip install -r requirements.txt

$ sudo ./install.sh
$ ./run.sh path/to/repository
$ ls -al ./repo_data.json

Exclude popular dependency directories

In some repositories, the dependencies are also included. With simple string matching on the path we can exclude them.
E.g.:

node_modules (node.js)
vendor (Go)
etc
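
A sketch of such a filter is below. It compares whole directory names rather than raw substrings, so a file such as vendors.js is not excluded by accident. EXCLUDED_DIRECTORIES and is_dependency_path are hypothetical names, and the set contains only the two directories named above.

```python
from pathlib import PurePosixPath

# Only the two directories named above; a real list would be longer.
EXCLUDED_DIRECTORIES = {"node_modules", "vendor"}


def is_dependency_path(path):
    """True if any directory component of the path is an excluded name."""
    parts = PurePosixPath(path.replace("\\", "/")).parts
    # The last component is the file name, so only the directories are checked.
    return any(part in EXCLUDED_DIRECTORIES for part in parts[:-1])
```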

Extract libraries from Python

Introduction

We started to implement the external library recognition feature.
It means the repo_info_extractor will be able to recognize the used libraries and not just the programming language.

How does it work?

Requirements

Create a new extractor for Python. Support Python2 and Python3 too.
Place the new file here: https://github.com/codersrank-org/repo_info_extractor/tree/master/src/language
Implement the def extract_libraries(files): function. The input is an array of paths to the files. The output is an array of the recognized libraries.
Add unit tests as well, here: https://github.com/codersrank-org/repo_info_extractor/tree/master/test

Extract libraries from C++

Introduction

We started to implement the external library recognition feature.
It means the repo_info_extractor will be able to recognize the used libraries and not just the programming language.

How does it work?

Requirements

Create a new extractor for C++.
Place the new file here: https://github.com/codersrank-org/repo_info_extractor/tree/master/src/language
Implement the def extract_libraries(files): function. The input is an array of paths to the files. The output is an array of the recognized libraries.
Add unit tests as well, here: https://github.com/codersrank-org/repo_info_extractor/tree/master/test

SyntaxError: invalid syntax

I tried to analyse a repo, but I get an error.
I tried with 2 different folders.

I use: python src\main.py C:\Users\Johan\Desktop\TEMP\repos\app-master

Traceback (most recent call last):
  File "src\main.py", line 2, in <module>
    from init import initialize
  File "C:\Users\Johan\Desktop\TEMP\codersrank\src\init.py", line 6, in <module>
    from analyze_libraries import AnalyzeLibraries
  File "C:\Users\Johan\Desktop\TEMP\codersrank\src\analyze_libraries.py", line 101
    print(timed_message, *argv)
                         ^
SyntaxError: invalid syntax

Gets stuck on Initialization...

The tool runs well on just about all my private repositories.

I have one which is over four years old, and the tool gets stuck at the text "Initialization...".
It's been over an hour.
What can I do?

I can't give access to the repository as it's private. Is there a way to debug this? Maybe a parameter which displays the commands being run so I can see on which it is getting stuck?

Much thanks!

Xtend files not getting detected

Hello,

Xtend is a statically-typed programming language which translates to comprehensible Java source code.
Please make an adjustment to detect xtend files.

Thanks
Nagaraj

Count in all branches

Hi, it would be good if the extractor counted commits from all branches of the repo, not only master, or if the branch could be set via a parameter.

Thx!
