Comments (7)

xing-hu commented on August 19, 2024

I reorganized the data and outputs by research question (RQ). I think this makes comparison more convenient.

from emse-deepcom.

ZhichaoOuyang commented on August 19, 2024

> I updated the data and outputs by different RQs. I think it would be more convenient for comparison.

Thank you for your answer. By the way, can the previous dataset still be used?
I see that although the size of the dataset has not changed, its content is different. I have already done further processing on your earlier dataset, and it would be cumbersome to reprocess. Also, train.token.ast in your current training set has one line fewer than train.token.code.
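
For anyone who wants to reproduce the line-count mismatch reported above, here is a minimal sketch. It assumes the two files are plain UTF-8 text with one sample per line; the helper names `count_lines` and `check_parallel` are hypothetical and not part of the emse-deepcom repository.

```python
def count_lines(path):
    """Count lines in a text file without loading it all into memory."""
    with open(path, "r", encoding="utf-8") as handle:
        return sum(1 for _ in handle)


def check_parallel(paths):
    """Return ({path: line_count}, all_equal) for a group of parallel files.

    Assumption: files are aligned line by line, so equal line counts are a
    necessary (not sufficient) condition for the samples to match up.
    """
    counts = {str(p): count_lines(p) for p in paths}
    return counts, len(set(counts.values())) == 1


# Example call (the paths are placeholders for wherever the dataset lives):
# counts, aligned = check_parallel(["train.token.code", "train.token.ast"])
# print(counts, "aligned" if aligned else "MISMATCH")
```

If the counts differ, the two files cannot be zipped together safely, because every sample after the missing line would be paired with the wrong AST.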

xing-hu commented on August 19, 2024

The difference between the two versions of the dataset is the index of the samples. In theory, it shouldn't affect the performance of your model. If you want to compare directly, I recommend this version, as I provided the output of my model in the folder "output".
Thanks for pointing out the issue in train.token.ast. I will check it.

ZhichaoOuyang commented on August 19, 2024

I compared the old and new datasets and found that some samples in the new dataset are not in the old one. Shouldn't only the order have been shuffled? Were some data samples changed as well? If so, I may still need to update my dataset for comparison.
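
A comparison like the one described above can be sketched as follows. This is only an illustration, assuming each dataset file holds one sample per line; the function names and the example paths are hypothetical. It uses a multiset comparison so that a pure reordering reports no differences, while added or removed samples are counted.

```python
from collections import Counter


def load_samples(path):
    """Read one sample per line into a multiset, ignoring line order."""
    with open(path, "r", encoding="utf-8") as handle:
        return Counter(line.rstrip("\n") for line in handle)


def compare_datasets(old_path, new_path):
    """Return (only_in_old, only_in_new) as counts of unmatched samples.

    Counter subtraction keeps only positive counts, so duplicates are
    handled correctly and a shuffled copy yields (0, 0).
    """
    old = load_samples(old_path)
    new = load_samples(new_path)
    only_old = old - new
    only_new = new - old
    return sum(only_old.values()), sum(only_new.values())


# Example call (placeholder paths for the two dataset versions):
# print(compare_datasets("old/train.token.code", "new/train.token.code"))
```

A result of `(0, 0)` means the two files contain the same samples in a different order; anything else means samples were actually replaced.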

xing-hu commented on August 19, 2024

Some samples may change when the data is reshuffled from the large codebase.
I have updated train.token.ast and hope the new one is more convenient for comparison.

xing-hu commented on August 19, 2024

I have also made the old version of the dataset available again, along with the output on it. I hope both datasets can help your research.

9ayhub commented on August 19, 2024

@ZhichaoOuyang

Hello, I have been able to run this code successfully using the dataset provided by the author, but the complete data processing code does not seem to be included: I can't use the existing data processing code to generate the files the author provides.

Were you able to preprocess the data successfully? If so, could you share your code with me?
