Score breakdown

Deep Reinforcement Learning from Human Preferences

paper-0096 · paper · 2017

Paul F. Christiano et al.

Learning reward from human comparisons; the seed of RLHF.

Academic, score -0.1953

| Metric | Status | Value | Norm. | Weight | Contribution | Source | Confidence | Provenance |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| citation_count | present | 508.0 | 0.002282 | 0.5 | 0.001141 | OpenAlex | high | link |
| library_holdings | missing | recorded as missing, penalized by rule, never imputed | | | −0.1 | recorded as missing; penalized by rule, never imputed | | |
| readership_persistence | present | 9.0 | 0.571429 | 0.05 | 0.028571 | OpenAlex | medium | link |
| syllabus_adoptions | missing | recorded as missing, penalized by rule, never imputed | | | −0.125 | recorded as missing; penalized by rule, never imputed | | |

Broad Influence, score 0.0290

| Metric | Status | Value | Norm. | Weight | Contribution | Source | Confidence | Provenance |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| citation_count | present | 508.0 | 0.002282 | 0.2 | 0.000456 | OpenAlex | high | link |
| library_holdings | missing | recorded as missing, penalized by rule, never imputed | | | −0.125 | recorded as missing; penalized by rule, never imputed | | |
| readership_persistence | present | 9.0 | 0.571429 | 0.4 | 0.228571 | OpenAlex | medium | link |
| syllabus_adoptions | missing | recorded as missing, penalized by rule, never imputed | | | −0.075 | recorded as missing; penalized by rule, never imputed | | |

Governance Practitioner, score -0.2673

| Metric | Status | Value | Norm. | Weight | Contribution | Source | Confidence | Provenance |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| citation_count | present | 508.0 | 0.002282 | 0.25 | 0.000571 | OpenAlex | high | link |
| library_holdings | missing | recorded as missing, penalized by rule, never imputed | | | −0.15 | recorded as missing; penalized by rule, never imputed | | |
| readership_persistence | present | 9.0 | 0.571429 | 0.1 | 0.057143 | OpenAlex | medium | link |
| syllabus_adoptions | missing | recorded as missing, penalized by rule, never imputed | | | −0.175 | recorded as missing; penalized by rule, never imputed | | |
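
The page does not state its scoring formula, but the numbers above are consistent with a simple reading: each present metric contributes its normalized value times its weight, each missing metric contributes a fixed negative penalty, and the profile score is the sum. The sketch below checks that reading against the three profiles. The function and dictionary names are hypothetical and chosen only for illustration; the figures are copied from the tables.

```python
# Assumption: score = sum(norm * weight for present metrics) + sum(penalties for missing metrics).
# This formula is inferred from the published numbers, not stated by the source.

# Present metrics are (normalized value, weight); missing metrics carry a fixed penalty.
PROFILES = {
    "Academic": {
        "present": {
            "citation_count": (0.002282, 0.5),
            "readership_persistence": (0.571429, 0.05),
        },
        "missing_penalties": {"library_holdings": -0.1, "syllabus_adoptions": -0.125},
    },
    "Broad Influence": {
        "present": {
            "citation_count": (0.002282, 0.2),
            "readership_persistence": (0.571429, 0.4),
        },
        "missing_penalties": {"library_holdings": -0.125, "syllabus_adoptions": -0.075},
    },
    "Governance Practitioner": {
        "present": {
            "citation_count": (0.002282, 0.25),
            "readership_persistence": (0.571429, 0.1),
        },
        "missing_penalties": {"library_holdings": -0.15, "syllabus_adoptions": -0.175},
    },
}


def profile_score(present, missing_penalties):
    """Sum weighted contributions of present metrics and penalties of missing ones."""
    weighted = sum(norm * weight for norm, weight in present.values())
    penalties = sum(missing_penalties.values())
    return weighted + penalties


for name, rows in PROFILES.items():
    print(f"{name}: {profile_score(**rows):.4f}")
```

Under this reading, the two missing metrics dominate the Academic and Governance Practitioner scores, while the heavier readership weight keeps the Broad Influence score slightly positive.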

A rank is not a verdict on intrinsic worth. It is a transparent output of declared evidence, weights, and missing-data rules at a specific release date.

Disagree with this rank or a number? Challenge it with your evidence. Every challenge gets a public identifier and a published resolution.