open-compass / opencompass Public

Issues: open-compass/opencompass

124 Open 274 Closed

Issues list

internlm2-7B-base: abnormal CMMLU evaluation results

#1281 opened Jul 1, 2024 by poisonwine

2 tasks done

[Feature] What's the difference between mbpp and deprecated_mbpp?

#1280 opened Jun 29, 2024 by noforit

1 task

[Bug] With multiple GPUs, GPU 7 uses 30 GB+ more memory than the other cards

#1277 opened Jun 26, 2024 by dhcode-cpp

2 tasks done

[Bug] The flames-scorer for flames fails to load correctly

#1275 opened Jun 25, 2024 by shiroko98

2 tasks done

[Bug] The qwen1.5-7B base model scores only about 2.6 on the math test set, far below the officially reported result

#1274 opened Jun 25, 2024 by 1moye

2 tasks done

[Bug] Error when testing a dataset with gpt3.5 as described in the documentation

#1270 opened Jun 24, 2024 by luckfu

2 tasks done

[Feature] Improve the Documentation for Subjective Evaluation

#1269 opened Jun 24, 2024 by tonysy

1 of 3 tasks

Has anyone configured the mmlu_pro dataset? Please share your code

#1262 opened Jun 20, 2024 by wll-design

1 task

[Bug] When I attempted to perform the agent evaluation, the console returned an error: "AttributeError: 'OpenAI' object has no attribute 'chat'".

#1259 opened Jun 20, 2024 by CaptainJi

2 tasks done

[Feature] Why does my evaluation hang here right after starting?

#1258 opened Jun 20, 2024 by GEK1

1 task

[Bug] Find scikit-learn version conflict in requirements/runtime.txt and requirements/extra.txt

#1256 opened Jun 19, 2024 by BIGWangYuDong

[Feature] Cached Dataset load

#1254 opened Jun 18, 2024 by Chensem

1 task

[Bug] Evaluating the llama3 8b base model on the ARC-C PPL dataset gives an accuracy of only 41, which is abnormal

#1253 opened Jun 18, 2024 by linboyang

2 tasks done

[Bug] This function appears to be buggy: it only parses the code between [BEGIN] and [DONE], but the first code a base model outputs does not start with [BEGIN].

#1251 opened Jun 17, 2024 by linboyang

2 tasks done

[Bug] Adding a dataset fails

#1245 opened Jun 15, 2024 by YanxingLiu

2 tasks done

meta-llama/Meta-Llama-3-8B-Instruct evaluation results are not consistent with Hugging Face's official results

#1243 opened Jun 13, 2024 by hzgdeerHo

2 tasks done

[Feature] Difficulty in Evaluating Custom Models with OpenCompass

#1239 opened Jun 13, 2024 by jiangjiadi

1 task

[Bug] ValueError: not enough values to unpack

#1235 opened Jun 11, 2024 by rangehow

2 tasks done

[Bug] Installation is not supported on Python 3.10 and above

#1234 opened Jun 10, 2024 by rangehow

2 tasks done

[Bug] Passing trust_remote_code=True will be mandatory to load this dataset from the next major release of datasets.

#1233 opened Jun 9, 2024 by chairmanQi

2 tasks done

[Bug] When testing on gen datasets, even if the output is empty or incorrect, unexpected scores can be obtained

#1232 opened Jun 7, 2024 by chairmanQi

2 tasks done

opencompass public leaderboard update [Feature]

#1231 opened Jun 7, 2024 by cobraheleah

1 task

Needle-in-a-haystack dataset initialization error (Failed to get opencompass.datasets.needlebench.origin.NeedleBenchOriginDataset)

#1229 opened Jun 6, 2024 by macheng6

2 tasks done

[Bug] Running PyTorch Qwen-7B-Chat with ARC-c ppl on CPU gives poor results

#1226 opened Jun 5, 2024 by FlexLaughing

2 tasks done

[Bug] Which version of the dataset should be selected when evaluating the Llama3 model?

#1223 opened Jun 3, 2024 by bullw

2 tasks done
