Score breakdown

Evaluating Large Language Models Trained on Code

paper-0131 · paper · 2021

Mark Chen et al.

Introduces Codex, showing that LLMs can write code, the capability that transformed software work.

Academic, score -0.2039

| Metric | Status | Value | Norm. | Weight | Contribution | Source | Confidence | Provenance |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| citation_count | present | 1428.0 | 0.006424 | 0.5 | 0.003212 | OpenAlex | high | link |
| library_holdings | missing | recorded as missing, penalized by rule, never imputed | | | −0.1 | recorded as missing; penalized by rule, never imputed | | |
| readership_persistence | present | 6.0 | 0.357143 | 0.05 | 0.017857 | OpenAlex | medium | link |
| syllabus_adoptions | missing | recorded as missing, penalized by rule, never imputed | | | −0.125 | recorded as missing; penalized by rule, never imputed | | |

Broad Influence, score -0.0559

| Metric | Status | Value | Norm. | Weight | Contribution | Source | Confidence | Provenance |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| citation_count | present | 1428.0 | 0.006424 | 0.2 | 0.001285 | OpenAlex | high | link |
| library_holdings | missing | recorded as missing, penalized by rule, never imputed | | | −0.125 | recorded as missing; penalized by rule, never imputed | | |
| readership_persistence | present | 6.0 | 0.357143 | 0.4 | 0.142857 | OpenAlex | medium | link |
| syllabus_adoptions | missing | recorded as missing, penalized by rule, never imputed | | | −0.075 | recorded as missing; penalized by rule, never imputed | | |

Governance Practitioner, score -0.2877

| Metric | Status | Value | Norm. | Weight | Contribution | Source | Confidence | Provenance |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| citation_count | present | 1428.0 | 0.006424 | 0.25 | 0.001606 | OpenAlex | high | link |
| library_holdings | missing | recorded as missing, penalized by rule, never imputed | | | −0.15 | recorded as missing; penalized by rule, never imputed | | |
| readership_persistence | present | 6.0 | 0.357143 | 0.1 | 0.035714 | OpenAlex | medium | link |
| syllabus_adoptions | missing | recorded as missing, penalized by rule, never imputed | | | −0.175 | recorded as missing; penalized by rule, never imputed | | |

A rank is not a verdict on intrinsic worth. It is a transparent output of declared evidence, weights, and missing-data rules at a specific release date.
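
The arithmetic behind the three scores can be checked by hand. The sketch below is an illustration, not the site's actual implementation. It assumes, based only on the published numbers, that each present metric contributes its normalized value times its weight, that each missing metric contributes its fixed penalty, and that a lens score is the plain sum of those contributions. The names `NORMS`, `LENSES`, and `lens_score` are hypothetical.

```python
# Normalized values for the metrics marked "present" (copied from the tables).
NORMS = {"citation_count": 0.006424, "readership_persistence": 0.357143}

# Per-lens weights for present metrics and rule-based penalties for missing ones.
LENSES = {
    "Academic": {
        "weights": {"citation_count": 0.5, "readership_persistence": 0.05},
        "penalties": {"library_holdings": -0.1, "syllabus_adoptions": -0.125},
    },
    "Broad Influence": {
        "weights": {"citation_count": 0.2, "readership_persistence": 0.4},
        "penalties": {"library_holdings": -0.125, "syllabus_adoptions": -0.075},
    },
    "Governance Practitioner": {
        "weights": {"citation_count": 0.25, "readership_persistence": 0.1},
        "penalties": {"library_holdings": -0.15, "syllabus_adoptions": -0.175},
    },
}


def lens_score(norms, weights, penalties):
    """Sum weighted contributions and missing-data penalties (assumed rule)."""
    # Present metrics: contribution = normalized value * weight.
    contributions = {metric: norms[metric] * w for metric, w in weights.items()}
    # Missing metrics: the penalty is used as-is; no value is imputed.
    contributions.update(penalties)
    return round(sum(contributions.values()), 4)


scores = {
    name: lens_score(NORMS, cfg["weights"], cfg["penalties"])
    for name, cfg in LENSES.items()
}

for name, score in scores.items():
    print(f"{name}: {score}")
```

Under these assumptions the sums match the published scores of -0.2039, -0.0559, and -0.2877. In every lens the two missing-data penalties outweigh the positive contributions, which is why all three scores are negative.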

Disagree with this rank or a number? Challenge it with your evidence. Every challenge gets a public identifier and a published resolution.