Papers - The AI Canon

All 162 seed papers. Seed status marks a paper as a candidate, not as canonical. Papers with harvested evidence link to their breakdown; the rest are marked "no evidence yet", which is a declared coverage gap, not a score of zero.

#	Paper	Summary	Year	Venue	Evidence
0001	A Logical Calculus of the Ideas Immanent in Nervous Activity	First mathematical model of the neuron; the conceptual origin of neural networks.	1943	Bulletin of Mathematical Biophysics	scored
0002	As We May Think	The memex vision; founding document of augmenting human intellect with machines.	1945	The Atlantic	no evidence yet
0003	A Mathematical Theory of Communication	Created information theory; the quantitative substrate of all machine learning.	1948	Bell System Technical Journal	scored
0004	Computing Machinery and Intelligence	Posed 'can machines think', proposed the imitation game; the field's founding question.	1950	Mind	scored
0005	A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence	Coined 'artificial intelligence' and framed the research programme.	1955	Proposal (Dartmouth)	no evidence yet
0006	The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain	First trainable neural classifier; launched learning machines and their first hype cycle.	1958	Psychological Review	scored
0007	Some Studies in Machine Learning Using the Game of Checkers	Coined 'machine learning'; first self-improving game program.	1959	IBM Journal	scored
0008	Programs with Common Sense	The Advice Taker; founding statement of logic-based AI and knowledge representation.	1959	Mechanisation of Thought Processes	no evidence yet
0009	Man-Computer Symbiosis	The augmentation-versus-automation agenda that still structures AI debates.	1960	IRE Transactions on Human Factors	scored
0010	Steps Toward Artificial Intelligence	Early synthesis of search, learning, and planning as the components of AI.	1961	Proceedings of the IRE	no evidence yet
0011	GPS: A Program That Simulates Human Thought	General Problem Solver; means-ends analysis and the symbolic cognition paradigm.	1961	Lernende Automaten / RAND	scored
0012	Fuzzy Sets	Founded fuzzy logic; a major non-probabilistic approach to reasoning under vagueness.	1965	Information and Control	scored
0013	ELIZA: A Computer Program for the Study of Natural Language Communication	First chatbot; revealed the human tendency to project understanding onto machines.	1966	Communications of the ACM	no evidence yet
0014	Some Philosophical Problems from the Standpoint of Artificial Intelligence	Introduced the frame problem and situation calculus.	1969	Machine Intelligence 4	scored
0015	STRIPS: A New Approach to the Application of Theorem Proving to Problem Solving	The planning formalism that dominated automated planning for decades.	1971	Artificial Intelligence	no evidence yet
0016	Adaptation in Natural and Artificial Systems (foundational monograph)	Genetic algorithms; evolution as a computational search paradigm.	1975	University of Michigan Press	no evidence yet
0017	Maximum Likelihood from Incomplete Data via the EM Algorithm	The EM algorithm; workhorse of latent-variable estimation.	1977	Journal of the Royal Statistical Society B	scored
0018	Minds, Brains, and Programs	The Chinese Room argument; the canonical philosophical attack on strong AI.	1980	Behavioral and Brain Sciences	scored
0019	Neocognitron: A Self-Organizing Neural Network Model	The architectural ancestor of convolutional networks.	1980	Biological Cybernetics	no evidence yet
0020	Neural Networks and Physical Systems with Emergent Collective Computational Abilities	Hopfield networks; energy-based associative memory that revived the field (2024 Nobel).	1982	PNAS	no evidence yet
0021	Self-Organized Formation of Topologically Correct Feature Maps	Self-organizing maps; unsupervised topology-preserving learning.	1982	Biological Cybernetics	scored
0022	A Learning Algorithm for Boltzmann Machines	Stochastic networks and unsupervised learning of internal representations.	1985	Cognitive Science	scored
0023	A Robust Layered Control System for a Mobile Robot	Subsumption architecture; behaviour-based robotics against the symbolic mainstream.	1986	IEEE Journal of Robotics and Automation	scored
0024	Learning Representations by Back-Propagating Errors	Made backpropagation the standard training algorithm; the engine of all deep learning.	1986	Nature	scored
0025	Fusion, Propagation, and Structuring in Belief Networks	Bayesian networks; principled probabilistic reasoning in AI (Turing Award work).	1986	Artificial Intelligence	scored
0026	Learning to Predict by the Methods of Temporal Differences	TD learning; the core idea of modern RL.	1988	Machine Learning	scored
0027	Multilayer Feedforward Networks Are Universal Approximators	Proved neural nets can approximate any function; the theoretical licence for the field.	1989	Neural Networks	scored
0028	A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition	Made HMMs the standard sequential model for two decades of speech and NLP.	1989	Proceedings of the IEEE	no evidence yet
0029	The Symbol Grounding Problem	Named the problem of how symbols acquire meaning; resurfaces in every LLM debate.	1990	Physica D	scored
0030	Finding Structure in Time	Simple recurrent networks; sequence learning before LSTMs.	1990	Cognitive Science	scored
0031	Intelligence Without Representation	Manifesto for embodied intelligence; the strongest internal critique of GOFAI.	1991	Artificial Intelligence	scored
0032	Q-Learning	Model-free value learning with convergence proof.	1992	Machine Learning	scored
0033	Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning	REINFORCE; the original policy-gradient method.	1992	Machine Learning	no evidence yet
0034	Causal Diagrams for Empirical Research	The do-calculus; the formal foundation of modern causal inference.	1995	Biometrika	no evidence yet
0035	Support-Vector Networks	SVMs; the dominant classifier of the pre-deep-learning era.	1995	Machine Learning	scored
0036	Temporal Difference Learning and TD-Gammon	Self-play RL reaches expert backgammon; the proof of concept for everything later.	1995	Communications of the ACM	no evidence yet
0037	Regression Shrinkage and Selection via the Lasso	L1 regularization; sparse models across statistics and ML.	1996	Journal of the Royal Statistical Society B	scored
0038	Long Short-Term Memory	Solved vanishing gradients for sequences; powered a decade of speech and language AI.	1997	Neural Computation	scored
0039	A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting	AdaBoost; proved weak learners can be combined into strong ones.	1997	Journal of Computer and System Sciences	scored
0040	No Free Lunch Theorems for Optimization	No algorithm wins on all problems; a standing caution against universal claims.	1997	IEEE Transactions on Evolutionary Computation	scored
0041	Gradient-Based Learning Applied to Document Recognition	LeNet; the canonical demonstration that CNNs work on real tasks.	1998	Proceedings of the IEEE	scored
0042	The PageRank Citation Ranking: Bringing Order to the Web	Link-based ranking; the algorithm that organized the web's information.	1998	Stanford Technical Report	scored
0043	Random Forests	The most-used classical ML algorithm in applied practice.	2001	Machine Learning	scored
0044	Statistical Modeling: The Two Cultures	Named the split between data modeling and algorithmic prediction; prophetic for ML's rise.	2001	Statistical Science	scored
0045	Greedy Function Approximation: A Gradient Boosting Machine	Gradient boosting; the backbone of tabular ML to this day.	2001	Annals of Statistics	scored
0046	BLEU: A Method for Automatic Evaluation of Machine Translation	The metric that made MT progress measurable, virtues and pathologies included.	2002	ACL	no evidence yet
0047	Latent Dirichlet Allocation	Topic models; a decade of probabilistic text analysis.	2003	Journal of Machine Learning Research	scored
0048	A Neural Probabilistic Language Model	Word embeddings plus neural language modeling; the LLM lineage starts here.	2003	Journal of Machine Learning Research	scored
0049	A Fast Learning Algorithm for Deep Belief Nets	Layer-wise pretraining; the paper that relaunched 'deep' learning.	2006	Neural Computation	scored
0050	Reducing the Dimensionality of Data with Neural Networks	Deep autoencoders in Science; put deep learning back on the mainstream agenda.	2006	Science	scored
0051	Universal Intelligence: A Definition of Machine Intelligence	The formal definition behind much AGI discourse.	2007	Minds and Machines	no evidence yet
0052	The Basic AI Drives	Instrumental convergence; why capable agents acquire resources and resist shutdown.	2008	AGI Conference	no evidence yet
0053	Matrix Factorization Techniques for Recommender Systems	The Netflix-Prize-era standard for collaborative filtering.	2009	IEEE Computer	scored
0054	ImageNet: A Large-Scale Hierarchical Image Database	The dataset that made the deep learning revolution measurable.	2009	CVPR	scored
0055	The Unreasonable Effectiveness of Data	Data beats cleverness; the empirical creed of the scaling era, stated early.	2009	IEEE Intelligent Systems	scored
0056	ImageNet Classification with Deep Convolutional Neural Networks	AlexNet; the result that started the modern deep learning era.	2012	NeurIPS	no evidence yet
0057	Deep Neural Networks for Acoustic Modeling in Speech Recognition	Deep nets replace decades of speech-recognition engineering.	2012	IEEE Signal Processing Magazine	scored
0058	Fairness Through Awareness	The formal-fairness research programme begins.	2012	ITCS	scored
0059	A Few Useful Things to Know About Machine Learning	The most-shared practical wisdom paper in ML.	2012	Communications of the ACM	scored
0060	Efficient Estimation of Word Representations in Vector Space	word2vec; cheap, composable word meaning ('king - man + woman').	2013	ICLR Workshop	scored
0061	Dropout: A Simple Way to Prevent Neural Networks from Overfitting	The defining regularization trick of the era.	2014	Journal of Machine Learning Research	scored
0062	Generative Adversarial Networks	GANs; adversarial training created the modern generative-media era.	2014	NeurIPS	scored
0063	Auto-Encoding Variational Bayes	VAEs; probabilistic deep generative modeling.	2014	ICLR	scored
0064	Intriguing Properties of Neural Networks	Discovered adversarial examples; founded ML security.	2014	ICLR	scored
0065	GloVe: Global Vectors for Word Representation	The other standard embedding; count-based meets predictive.	2014	EMNLP	scored
0066	Sequence to Sequence Learning with Neural Networks	Encoder-decoder; end-to-end sequence transduction.	2014	NeurIPS	scored
0067	Very Deep Convolutional Networks for Large-Scale Image Recognition	VGG; depth as the design principle.	2015	ICLR	no evidence yet
0068	Going Deeper with Convolutions	GoogLeNet/Inception; efficient depth via multi-scale modules.	2015	CVPR	scored
0069	Batch Normalization: Accelerating Deep Network Training	Made very deep networks trainable in practice.	2015	ICML	no evidence yet
0070	Adam: A Method for Stochastic Optimization	The default optimizer of deep learning; among the most-cited papers in CS.	2015	ICLR	scored
0071	Deep Learning (review)	The field's self-definition by its three Turing-Award founders.	2015	Nature	no evidence yet
0072	U-Net: Convolutional Networks for Biomedical Image Segmentation	The default segmentation architecture, far beyond medicine.	2015	MICCAI	no evidence yet
0073	Distilling the Knowledge in a Neural Network	Knowledge distillation; the basis of model compression.	2015	NeurIPS Workshop	scored
0074	Explaining and Harnessing Adversarial Examples	FGSM; explained and weaponized adversarial fragility.	2015	ICLR	scored
0075	Neural Machine Translation by Jointly Learning to Align and Translate	Introduced attention; the mechanism that became everything.	2015	ICLR	scored
0076	Human-Level Control Through Deep Reinforcement Learning	DQN on Atari; deep RL is born.	2015	Nature	scored
0077	Hidden Technical Debt in Machine Learning Systems	Why ML systems rot in production; founding text of MLOps.	2015	NeurIPS	no evidence yet
0078	XGBoost: A Scalable Tree Boosting System	The implementation that made gradient boosting ubiquitous in practice.	2016	KDD	no evidence yet
0079	Deep Residual Learning for Image Recognition	ResNet; skip connections enabled networks of arbitrary depth.	2016	CVPR	scored
0080	You Only Look Once: Unified, Real-Time Object Detection	YOLO; real-time detection as a single network pass.	2016	CVPR	scored
0081	WaveNet: A Generative Model for Raw Audio	Neural audio generation; transformed speech synthesis.	2016	arXiv	no evidence yet
0082	Mastering the Game of Go with Deep Neural Networks and Tree Search	AlphaGo; the cultural turning point of the deep learning era.	2016	Nature	scored
0083	End-to-End Training of Deep Visuomotor Policies	Pixels-to-torques; deep learning enters robotic control.	2016	Journal of Machine Learning Research	no evidence yet
0084	Concrete Problems in AI Safety	Turned AI safety into a concrete ML research agenda.	2016	arXiv	no evidence yet
0085	Why Should I Trust You? Explaining the Predictions of Any Classifier	LIME; model-agnostic local explanation.	2016	KDD	no evidence yet
0086	Big Data's Disparate Impact	The canonical legal analysis of algorithmic discrimination.	2016	California Law Review	no evidence yet
0087	Machine Bias	The COMPAS investigation; algorithmic injustice becomes front-page news.	2016	ProPublica	no evidence yet
0088	Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings	Demonstrated social bias embedded in learned representations.	2016	NeurIPS	no evidence yet
0089	Equality of Opportunity in Supervised Learning	Defined equalized odds; a standard fairness criterion.	2016	NeurIPS	no evidence yet
0090	The Ethics of Algorithms: Mapping the Debate	The standard map of algorithmic-ethics concerns.	2016	Big Data & Society	no evidence yet
0091	Deep Learning with Differential Privacy	DP-SGD; private training as a practical method.	2016	CCS	scored
0092	Semi-Supervised Classification with Graph Convolutional Networks	GCNs; the breakthrough that mainstreamed graph deep learning.	2017	ICLR	scored
0093	Attention Is All You Need	The Transformer; the architecture of the modern AI era.	2017	NeurIPS	scored
0094	Mastering the Game of Go Without Human Knowledge	AlphaGo Zero; superhuman play from self-play alone.	2017	Nature	scored
0095	Proximal Policy Optimization Algorithms	PPO; the workhorse algorithm, later the engine of RLHF.	2017	arXiv	no evidence yet
0096	Deep Reinforcement Learning from Human Preferences	Learning reward from human comparisons; the seed of RLHF.	2017	NeurIPS	no evidence yet
0097	A Unified Approach to Interpreting Model Predictions	SHAP; game-theoretic attribution, the industry default.	2017	NeurIPS	scored
0098	Inherent Trade-Offs in the Fair Determination of Risk Scores	Proved popular fairness definitions are mutually incompatible.	2017	ITCS	no evidence yet
0099	Semantics Derived Automatically from Language Corpora Contain Human-Like Biases	Bias in embeddings, demonstrated with psychometric rigor.	2017	Science	scored
0100	Membership Inference Attacks Against Machine Learning Models	Showed models leak whether your data was in the training set.	2017	IEEE S&P	no evidence yet
0101	Communication-Efficient Learning of Deep Networks from Decentralized Data	Federated learning; training without centralizing data.	2017	AISTATS	no evidence yet
0102	World Models	Agents learning inside their own learned simulators; ancestor of today's world-model agenda.	2018	NeurIPS	no evidence yet
0103	Improving Language Understanding by Generative Pre-Training	GPT-1; generative pretraining as the recipe.	2018	OpenAI Technical Report	no evidence yet
0104	A General Reinforcement Learning Algorithm That Masters Chess, Shogi, and Go Through Self-Play	AlphaZero; one algorithm, three games.	2018	Science	no evidence yet
0105	The Mythos of Model Interpretability	Disciplined the field's vocabulary about what 'interpretable' means.	2018	ACM Queue / CACM	no evidence yet
0106	Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification	Audit that changed commercial face-analysis products and founded audit culture.	2018	FAT*	no evidence yet
0107	Counterfactual Explanations Without Opening the Black Box	Linked explanation methods to GDPR; the legal-technical bridge.	2018	Harvard Journal of Law & Technology	no evidence yet
0108	The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation	The first broad misuse-threat assessment across digital, physical, political security.	2018	arXiv	no evidence yet
0109	BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding	Pretrain-finetune became the default NLP paradigm.	2019	NAACL	no evidence yet
0110	Language Models are Unsupervised Multitask Learners	GPT-2; scale yields zero-shot task behaviour, and the first staged-release debate.	2019	OpenAI Technical Report	no evidence yet
0111	Grandmaster Level in StarCraft II Using Multi-Agent Reinforcement Learning	AlphaStar; RL in a real-time, partially observed strategy game.	2019	Nature	no evidence yet
0112	Risks from Learned Optimization in Advanced Machine Learning Systems	Mesa-optimization and deceptive alignment; core inner-alignment concepts.	2019	arXiv	no evidence yet
0113	Stop Explaining Black Box Machine Learning Models for High Stakes Decisions	Argues for inherently interpretable models where stakes are high.	2019	Nature Machine Intelligence	no evidence yet
0114	Model Cards for Model Reporting	The documentation standard for released models.	2019	FAT*	no evidence yet
0115	Dissecting Racial Bias in an Algorithm Used to Manage the Health of Populations	Showed a deployed health algorithm systematically disadvantaged Black patients.	2019	Science	no evidence yet
0116	Energy and Policy Considerations for Deep Learning in NLP	Put training cost and carbon on the research agenda.	2019	ACL	no evidence yet
0117	The Global Landscape of AI Ethics Guidelines	Mapped 84 guidelines; showed convergence on principles and divergence on practice.	2019	Nature Machine Intelligence	no evidence yet
0118	The Bitter Lesson	Compute-leveraging general methods beat human-knowledge engineering; the era's most-quoted essay.	2019	Essay (incompleteideas.net)	no evidence yet
0119	On the Measure of Intelligence	Skill-acquisition efficiency as the definition of intelligence; basis of the ARC benchmark.	2019	arXiv	no evidence yet
0120	Denoising Diffusion Probabilistic Models	Made diffusion the dominant image-generation paradigm.	2020	NeurIPS	no evidence yet
0121	NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis	Learned 3D scene representation; new field of neural rendering.	2020	ECCV	no evidence yet
0122	Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer	T5; everything is text-to-text.	2020	Journal of Machine Learning Research	no evidence yet
0123	Language Models are Few-Shot Learners	GPT-3; in-context learning and the scaling thesis made undeniable.	2020	NeurIPS	no evidence yet
0124	Scaling Laws for Neural Language Models	Loss as a power law of compute, data, parameters; the industry's planning document.	2020	arXiv	no evidence yet
0125	Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks	RAG; grounding generation in retrieved evidence.	2020	NeurIPS	no evidence yet
0126	Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model	MuZero; planning without knowing the rules.	2020	Nature	no evidence yet
0127	Zoom In: An Introduction to Circuits	Founded mechanistic interpretability: studying networks like organisms.	2020	Distill	no evidence yet
0128	Closing the AI Accountability Gap: Defining an End-to-End Framework for Internal Algorithmic Auditing	The reference framework for internal AI audit practice.	2020	FAT*	no evidence yet
0129	An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale	ViT; transformers displace convolutions in vision.	2021	ICLR	no evidence yet
0130	Learning Transferable Visual Models From Natural Language Supervision	CLIP; vision-language alignment underpinning multimodal AI.	2021	ICML	no evidence yet
0131	Evaluating Large Language Models Trained on Code	Codex; LLMs write code, the capability that transformed software work.	2021	arXiv	no evidence yet
0132	Unsolved Problems in ML Safety	The mainstream-ML framing of robustness, monitoring, alignment, systemic safety.	2021	arXiv	no evidence yet
0133	Datasheets for Datasets	Provenance documentation for training data.	2021	Communications of the ACM	no evidence yet
0134	On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?	The defining critique of the LLM paradigm, and the paper behind Google's Gebru affair.	2021	FAccT	no evidence yet
0135	Extracting Training Data from Large Language Models	LLMs memorize and can regurgitate training data.	2021	USENIX Security	no evidence yet
0136	Highly Accurate Protein Structure Prediction with AlphaFold	Solved a fifty-year grand challenge; the Nobel-recognized proof of AI for science.	2021	Nature	no evidence yet
0137	The Hardware Lottery	Research directions win because hardware favors them; a structural critique of progress.	2021	Communications of the ACM	no evidence yet
0138	High-Resolution Image Synthesis with Latent Diffusion Models	Stable Diffusion; open-weights image generation at consumer scale.	2022	CVPR	no evidence yet
0139	Training Compute-Optimal Large Language Models	Chinchilla; rebalanced the field toward data-optimal training.	2022	NeurIPS	no evidence yet
0140	LoRA: Low-Rank Adaptation of Large Language Models	Made finetuning large models affordable; standard adaptation method.	2022	ICLR	no evidence yet
0141	Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity	Mixture-of-experts at scale; the sparse path to frontier capability.	2022	Journal of Machine Learning Research	no evidence yet
0142	Chain-of-Thought Prompting Elicits Reasoning in Large Language Models	Stepwise prompting unlocks latent reasoning; opened the reasoning-model agenda.	2022	NeurIPS	no evidence yet
0143	Emergent Abilities of Large Language Models	Named (and contested) the phenomenon of capability jumps with scale.	2022	TMLR	no evidence yet
0144	Training Language Models to Follow Instructions with Human Feedback	InstructGPT; RLHF at scale, the technique behind aligned chat models.	2022	NeurIPS	no evidence yet
0145	Constitutional AI: Harmlessness from AI Feedback	Alignment via explicit principles and AI feedback; reduced dependence on human labeling.	2022	arXiv	no evidence yet
0146	Toy Models of Superposition	Why features share neurons; the core obstacle to reading networks.	2022	Anthropic / Transformer Circuits	no evidence yet
0147	Discovering Faster Matrix Multiplication Algorithms with Reinforcement Learning	AlphaTensor; AI finds new mathematics.	2022	Nature	no evidence yet
0148	Segment Anything	Promptable foundation model for segmentation.	2023	ICCV	no evidence yet
0149	ReAct: Synergizing Reasoning and Acting in Language Models	Reason-act loops; the template for LLM agents.	2023	ICLR	no evidence yet
0150	Toolformer: Language Models Can Teach Themselves to Use Tools	Self-supervised tool use; agents calling APIs.	2023	NeurIPS	no evidence yet
0151	LLaMA: Open and Efficient Foundation Language Models	The open-weights line that created today's open-model ecosystem.	2023	arXiv	no evidence yet
0152	GPT-4 Technical Report	The frontier capability report; also the moment training details went dark.	2023	arXiv	no evidence yet
0153	Sparks of Artificial General Intelligence: Early Experiments with GPT-4	The most-debated capability claim of the era; framed the AGI-proximity argument.	2023	arXiv	no evidence yet
0154	Mamba: Linear-Time Sequence Modeling with Selective State Spaces	The leading post-Transformer architecture candidate.	2023	arXiv	no evidence yet
0155	Universal and Transferable Adversarial Attacks on Aligned Language Models	Automated jailbreaks; alignment as an attack surface.	2023	arXiv	no evidence yet
0156	Towards Monosemanticity: Decomposing Language Models with Dictionary Learning	Sparse autoencoders extract human-legible features from LLMs.	2023	Anthropic / Transformer Circuits	no evidence yet
0157	Frontier AI Regulation: Managing Emerging Risks to Public Safety	The reference proposal for frontier-model regulatory architecture.	2023	arXiv	no evidence yet
0158	GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models	First systematic occupational-exposure estimates for LLMs.	2023	arXiv (later Science)	no evidence yet
0159	Sleeper Agents: Training Deceptive LLMs That Persist Through Safety Training	Empirical evidence that deceptive behaviour can survive standard safety training.	2024	arXiv	no evidence yet
0160	Managing Extreme AI Risks Amid Rapid Progress	Consensus statement by senior researchers on frontier-risk preparedness.	2024	Science	no evidence yet
0161	Accurate Structure Prediction of Biomolecular Interactions with AlphaFold 3	Extended structure prediction to complexes and drug-relevant interactions.	2024	Nature	no evidence yet
0162	DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning	Open reasoning model from China that reset assumptions about cost and access.	2025	arXiv	no evidence yet