Commit ee2968c

add codemmlu leaderboard
1 parent 816b5d3 commit ee2968c

1 file changed, 13 additions and 7 deletions

leaderboards/codemmlu/index.html

Lines changed: 13 additions & 7 deletions
@@ -117,9 +117,9 @@ <h3 class="fw-light text-nowrap">
 alt="blog"
 class="img-fluid"
 /></a>
-<a href="https://arxiv.org/html/2406.11927v1"
+<a href="https://arxiv.org/abs/2410.01999v1#:~:text=View%20a%20PDF%20of%20the%20paper%20titled%20CodeMMLU:%20A%20Multi-Task"
 ><img
-src="https://img.shields.io/badge/2406.11927-red?style=for-the-badge&label=arXiv"
+src="https://img.shields.io/badge/2410.01999-red?style=for-the-badge&label=arXiv"
 alt="leaderboard"
 class="img-fluid"
 /></a>
@@ -131,13 +131,13 @@ <h3 class="fw-light text-nowrap">
 /></a>
 </div>
 <div class="d-flex flex-row justify-content-center gap-3">
-<a href="https://github.com/FSoft-AI4Code/RepoExec"
+<a href="https://github.com/FSoft-AI4Code/CodeMMLU"
 ><img
 src="https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white"
 alt="github"
 class="img-fluid"
 /></a>
-<a href="https://github.com/FSoft-AI4Code/RepoExec/blob/master/paper/main.pdf"
+<a href="https://arxiv.org/abs/2410.01999v1#:~:text=View%20a%20PDF%20of%20the%20paper%20titled%20CodeMMLU:%20A%20Multi-Task"
 ><img
 src="https://img.shields.io/badge/📝 paper-%23121011.svg?style=for-the-badge"
 alt="paper"
@@ -175,12 +175,12 @@ <h3>📝 Notes</h3>
 <ol>
 <li>
 Evaluated using
-<a href="https://github.com/FSoft-AI4Code/RepoExec"
->RepoExec</a
+<a href="https://github.com/FSoft-AI4Code/CodeMMLU"
+>CodeMMLU</a
 >
 </li>
 <li>
-Models are ranked according to Pass@1 using greedy decoding.
+Models are ranked according to Accuracy using greedy decoding.
 </li>
 <!-- <li>
 <i>Complete</i> vs <i>Instruct</i>:
@@ -220,6 +220,12 @@ <h3>🤗 More Leaderboards</h3>
 benchmarks and leaderboards, such as:
 <div class="inline-block mt-3">
 <ol>
+<li>
+<a
+href="https://repoexec.github.io/"
+>RepoExec Leaderboard</a
+>
+</li>
 <li>
 <a
 href="https://bigcode-bench.github.io/"
