codrep's People

Contributors

chenzimin, egor-bogomolov, egorbu, epicfaace, martinezmatias, monperrus, tdurieux, vmarkovtsev

codrep's Issues

Submission process for intermediate ranking (deadline July 4th 2018)

  • from now until July 4th, you send us your solution:
    • by adding us as collaborators to your GitHub/GitLab/Bitbucket project (preferred, so that we can discuss directly on the issue tracker)
    • or by emailing an archive with the code (source or binary) and instructions on how to compile it or install its dependencies
  • your program must run locally: it cannot use the network to download data or query a server. The network will be cut during the evaluation on the hidden dataset.
  • we execute your tool on the hidden dataset
  • if we have problems compiling or executing your code, we work through them with you incrementally.

Machine used for evaluation: Ubuntu 18.04 LTS, Intel CPU at 2299 MHz, 16 GB RAM

Don't hesitate to comment here about the process.

Discussion about baselines for CodRep

Hi,
Thanks a lot for organizing this :) I hope you don't mind the drive-by issue submission: I would like to suggest three additional baselines that are strong but reasonable:

  • Random prediction over the lines where the code still parses after the replacement;
  • The line that is most similar to the line being added (e.g. max % of common tokens between the lines);
  • A combination of the above.

The reason I am suggesting this is that these baselines seem to be easy "hacks" that achieve reasonable performance without any machine learning.
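
The second suggested baseline could be sketched roughly as follows. This is only an illustration and not code from the competition repository: the crude regex tokenizer, the Jaccard-style overlap score, and the names `tokens` and `most_similar_line` are all assumptions made for the example.

```python
import re

# Crude tokenizer (an assumption): identifiers, numbers, or any single symbol.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokens(line):
    """Split a line into a set of crude lexical tokens."""
    return set(TOKEN.findall(line))


def most_similar_line(new_line, old_lines):
    """Return the 1-based number of the line sharing the largest
    fraction of tokens with new_line, or None if there are no lines."""
    target = tokens(new_line)
    best_index, best_score = None, -1.0
    for index, line in enumerate(old_lines, start=1):
        candidate = tokens(line)
        union = target | candidate
        # Fraction of common tokens (Jaccard overlap); empty lines score 0.
        score = len(target & candidate) / len(union) if union else 0.0
        if score > best_score:
            best_index, best_score = index, score
    return best_index


source = [
    "int total = 0;",
    "for (int i = 0; i < n; i++) {",
    "    total += values[i];",
    "}",
]
print(most_similar_line("for (int i = 0; i <= n; i++) {", source))
```

Ties go to the earliest line here; combining this with the "still parses" filter would require a parser for the target language, which the sketch leaves out.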

Participant %6: source{d}

Created for the source{d} team. We plan to keep track of our approaches and solutions in this issue.

Add explanation about hidden dataset

Hello,
Thank you for a great competition!
Could you add more information about the hidden dataset mentioned here?
Off-topic: I recommend a Kaggle-style approach: publish the test dataset without solutions, so that you can receive predictions, publish a public score, and compute a private score.

Participant %4: @tdurieux, INRIA

Hi all,

I just did a quick naive solution based on string distance:

Dataset   Perfect Match   In Top 10      Recall   Loss
Bench 1   3791            4322 (98%)     0.86     0.13615878141899027
Bench 2   9910            10805 (97%)    0.89     0.10263617900182995
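
For readers unfamiliar with these columns, here is a hedged sketch of how such counts might be derived from ranked predictions. It is not the code behind the numbers above; reading "Perfect Match" as "top-ranked line is correct" and "Recall" as that count divided by the number of tasks is an assumption, and `summarize` is a hypothetical helper.

```python
def summarize(ranked_predictions, correct_lines, k=10):
    """Count perfect matches and top-k hits.

    ranked_predictions: one list of candidate line numbers per task, best first.
    correct_lines: the true line number for each task, in the same order.
    """
    total = len(correct_lines)
    pairs = list(zip(ranked_predictions, correct_lines))
    # A perfect match means the best-ranked candidate is the true line.
    perfect = sum(1 for ranked, truth in pairs if ranked and ranked[0] == truth)
    # A top-k hit means the true line appears among the first k candidates.
    in_top_k = sum(1 for ranked, truth in pairs if truth in ranked[:k])
    return {
        "perfect_match": perfect,
        "in_top_k": in_top_k,
        "recall": perfect / total if total else 0.0,
    }


example = summarize([[3, 1, 2], [5, 4], [7, 8]], [3, 4, 9])
print(example)
```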

Participant %2: Allamanis et al., Microsoft Research

Hi @mallamanis!

According to the interesting points discussed in #13, you may submit a proposal (and we do hope so :-), so here is your participant wall!

The idea is to post here findings that are specific to your solution and to tease with the corresponding scores on Dataset1 and Dataset2.

Note that it's also perfectly OK to open other issues.

evaluate.py does not accept empty line numbers

As stated in the README:
"Your program does not have to predict something for all input files, if there is no clear answer, simply don't output anything, the error computation takes that into account, more information about this in Loss function below."

This is a bit ambiguous: does it mean that the output line should be skipped altogether, or that one could output just the filename with no line number?
In either case, it should probably be either fixed in evaluate.py or clarified in the README.

The latter does not work (full path omitted):
echo "CodRep-competition/Datasets/Dataset1/Tasks/2703.txt 35" | python evaluate.py
Total files: 34096
Average line error: 0.999970671046 (the lower, the better)
Recall@1: 2.93289535429e-05 (the higher, the better)
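
The two numbers above are consistent with exactly one correct prediction out of 34096 files, assuming that an exact prediction contributes an error of 0 and every file without a prediction contributes the maximum error of 1 (that scoring rule is an assumption inferred from the README quote, not checked against evaluate.py). A quick check of the arithmetic:

```python
total_files = 34096
correct = 1  # the single prediction given on stdin, assumed to be exact

# Share of files whose predicted line is exactly right.
recall_at_1 = correct / total_files

# Assumed scoring: error 0 for the exact hit, error 1 for each missing prediction.
average_error = (total_files - correct) * 1.0 / total_files

print(recall_at_1, average_error)
```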

echo "CodRep-competition/Datasets/Dataset1/Tasks/2703.txt" | python evaluate.py
Traceback (most recent call last):
  File "evaluate.py", line 183, in <module>
    main()
  File "evaluate.py", line 168, in main
    prediction = inputs[1]
IndexError: list index out of range

echo "CodRep-competition/Datasets/Dataset1/Tasks/2703.txt " | python evaluate.py
Traceback (most recent call last):
  File "evaluate.py", line 183, in <module>
    main()
  File "evaluate.py", line 168, in main
    prediction = inputs[1]
IndexError: list index out of range
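
One possible way to tolerate such lines is sketched below. This is not the actual code of evaluate.py; `parse_prediction` is a hypothetical helper, and it assumes each input line is a whitespace-separated path followed by an optional line number (so paths containing spaces would not work).

```python
def parse_prediction(line):
    """Parse one line of program output.

    Returns None for a blank line, otherwise (path, line_number),
    where line_number is None when no usable prediction was given.
    """
    fields = line.split()
    if not fields:
        return None
    path = fields[0]
    if len(fields) < 2:
        # Filename only: treat it as "no prediction" instead of crashing.
        return path, None
    try:
        return path, int(fields[1])
    except ValueError:
        # A non-numeric second field is also treated as "no prediction".
        return path, None


print(parse_prediction("Tasks/2703.txt 35"))
print(parse_prediction("Tasks/2703.txt "))
```

With this kind of parsing, a line with only a filename would be scored the same way as a file that is missing from the output entirely.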
