abacaj / code-eval
Run evaluation on LLMs using human-eval benchmark
License: MIT License
@abacaj My environment is an NVIDIA TX2. When I use the codecarbon package to get GPU information, it cannot find the GPU:
[codecarbon INFO @ 21:03:55] [setup] RAM Tracking...
[codecarbon INFO @ 21:03:55] [setup] GPU Tracking...
[codecarbon INFO @ 21:03:55] No GPU found.
[codecarbon INFO @ 21:03:55] [setup] CPU Tracking...
[codecarbon WARNING @ 21:03:55] No CPU tracking mode found. Falling back on CPU constant mode.
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
[codecarbon WARNING @ 21:03:55] We saw that you have a ARMv8 Processor rev 1 (v8l) but we don't know it. Please contact us.
[codecarbon INFO @ 21:03:55] CPU Model on constant consumption mode: ARMv8 Processor rev 1 (v8l)
[codecarbon INFO @ 21:03:55] >>> Tracker's metadata:
[codecarbon INFO @ 21:03:55] Platform system: Linux-5.10.104-tegra-aarch64-with-glibc2.17
[codecarbon INFO @ 21:03:55] Python version: 3.8.13
[codecarbon INFO @ 21:03:55] CodeCarbon version: 2.3.4
[codecarbon INFO @ 21:03:55] Available RAM : 6.329 GB
[codecarbon INFO @ 21:03:55] CPU count: 6
[codecarbon INFO @ 21:03:55] CPU model: ARMv8 Processor rev 1 (v8l)
[codecarbon INFO @ 21:03:55] GPU count: None
[codecarbon INFO @ 21:03:55] GPU model: None
But torch.cuda.is_available() returns True, so I want to know whether `codecarbon` can support the TX2 device. Looking forward to your reply.
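If I understand correctly, codecarbon detects NVIDIA GPUs through NVML (via pynvml), and Jetson/Tegra boards such as the TX2 expose CUDA but not NVML, which would explain why PyTorch sees the GPU while codecarbon does not. A minimal sketch to confirm the discrepancy (assuming pynvml is installed):

```python
# Check whether the GPU is visible to CUDA vs. NVML.
import torch

print("CUDA available:", torch.cuda.is_available())  # True on the TX2

try:
    import pynvml
    pynvml.nvmlInit()
    print("NVML device count:", pynvml.nvmlDeviceGetCount())
    pynvml.nvmlShutdown()
except Exception as e:  # NVML is typically unavailable on Jetson/Tegra
    print("NVML not available:", e)
```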
Why am I getting low scores on llama-2-13b (pass@1: 3.05%, pass@10: 19.51%)? Are you applying any additional prompt formatting in this setup, or are the scores related to batch decoding? My setup requires generating the samples sequentially; I can't perform batch decoding (see the sketch below for what I mean).
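Sampling completions one at a time with the same decoding parameters should be statistically equivalent to batch decoding, so batching alone shouldn't change pass@k much. A minimal sketch of sequential (batch-size 1) sampling with transformers; the model id and decoding parameters here are illustrative, not this repo's defaults:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-13b-hf"  # illustrative choice
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "def fibonacci(n):\n"
inputs = tok(prompt, return_tensors="pt").to(model.device)

samples = []
for _ in range(10):  # one completion per generate call, no batching
    out = model.generate(
        **inputs, do_sample=True, temperature=0.2,
        top_p=0.95, max_new_tokens=256,
    )
    # strip the prompt tokens, keep only the completion
    samples.append(tok.decode(out[0][inputs["input_ids"].shape[1]:],
                              skip_special_tokens=True))
```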
@abacaj After evaluating llama2-7b, I only get a single file, eval.json. How can I get results like the following:
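If eval.json follows the OpenAI human-eval samples format (one JSON record per line with "task_id" and "completion" fields), the pass@k numbers can be computed with the human-eval harness itself; a sketch under that assumption:

```python
from human_eval.evaluation import evaluate_functional_correctness

# Scores the samples against the HumanEval problems and writes a
# per-sample pass/fail results file alongside the input.
results = evaluate_functional_correctness("eval.json", k=[1, 10])
print(results)  # e.g. {'pass@1': ..., 'pass@10': ...}
```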
I'm keeping https://github.com/ErikBjare/are-copilots-local-yet up-to-date, and would love to see some codellama numbers given it's now SOTA :)
Suggestion: add support for https://github.com/THUDM/CodeGeeX2, which was just released; according to the published numbers it reaches a pass@1 of 35.9.
I got only 9.7% for llama2-7B-chat on human-eval using your script
{'pass@1': 0.0975609756097561}
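For reference, these pass@k numbers come from the unbiased estimator in the Codex/HumanEval paper, where n is the number of samples generated per task and c the number that pass; a short sketch:

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: 1 - C(n-c, k) / C(n, k), computed stably."""
    if n - c < k:
        return 1.0
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# With one sample per task, pass@1 is just the fraction of tasks solved:
# 0.0975609... = 16 / 164, i.e. 16 of the 164 HumanEval tasks passed.
print(16 / 164)  # 0.0975609756097561
```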