NLP Papers
These are, in my opinion, the most important transformer papers that anyone working with Transformers should know. I also highly recommend Efficient Transformers: A Survey, a nice summary by folks at Google.
AMBERT: A Pre-trained Language Model with Multi-Grained Tokenization
Authors: Xinsong Zhang, Hang Li
ByteDance AI Lab
Year: August 2020
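The core idea is to encode the same input at two granularities, for example words and phrases for English, with shared encoder parameters. A toy sketch of the tokenization side as I read the paper (the PHRASES lexicon below is a hypothetical stand-in; AMBERT derives its coarse-grained vocabulary from corpus statistics):

```python
# Sketch of AMBERT-style multi-grained tokenization (my own illustration,
# not the released code): the same sentence is tokenized at a fine-grained
# level (words) and a coarse-grained level (phrases), and both sequences
# are fed to encoders with shared parameters.

# Hypothetical phrase lexicon; the real coarse-grained vocabulary is
# learned from corpus statistics.
PHRASES = {("new", "york"), ("ice", "cream")}

def fine_grained(sentence: str) -> list[str]:
    return sentence.lower().split()

def coarse_grained(sentence: str) -> list[str]:
    tokens, out, i = sentence.lower().split(), [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in PHRASES:
            out.append(tokens[i] + "_" + tokens[i + 1])  # merge into one phrase token
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

print(fine_grained("I love New York ice cream"))
# ['i', 'love', 'new', 'york', 'ice', 'cream']
print(coarse_grained("I love New York ice cream"))
# ['i', 'love', 'new_york', 'ice_cream']
```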
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (T5)
Authors: Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu
Google
Year: July 2020
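T5's central move is to cast every task, including classification and regression, as mapping an input string to an output string, all trained with one maximum-likelihood objective. A minimal sketch of that framing (the prefixes follow the paper's examples; the elided targets are abbreviations, not real data):

```python
# T5-style text-to-text framing: every task becomes string -> string, so a
# single encoder-decoder model with one objective covers all of them.
examples = [
    ("translate English to German: That is good.", "Das ist gut."),
    ("cola sentence: The course is jumping well.", "not acceptable"),
    ("stsb sentence1: ... sentence2: ...", "3.8"),  # even regression is a string
    ("summarize: state authorities dispatched emergency crews ...", "..."),
]

for source, target in examples:
    # No task-specific heads: the decoder simply generates the target text.
    print(f"{source!r} -> {target!r}")
```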
Pre-training via Paraphrasing (MARGE)
Authors: Mike Lewis, Marjan Ghazvininejad, Gargi Ghosh, Armen Aghajanyan, Sida Wang, Luke Zettlemoyer
Facebook
Year: June 2020
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
Authors: Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning
Google and Stanford
Year: March 2020
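The replaced-token-detection idea: a small generator fills in masked positions, and the discriminator labels every position as original or replaced, so the loss covers all tokens rather than only the ~15% BERT masks. A toy sketch of the data side (models omitted; generator_sample is a hypothetical stand-in for a trained masked LM):

```python
import random

# Sketch of ELECTRA's replaced-token-detection setup, data side only: a
# generator proposes tokens at masked positions, and the discriminator is
# trained to label every position as original (0) or replaced (1).
def make_rtd_example(tokens, generator_sample, mask_prob=0.15):
    corrupted, labels = [], []
    for tok in tokens:
        if random.random() < mask_prob:
            sampled = generator_sample(tok)            # generator fills the mask
            corrupted.append(sampled)
            labels.append(0 if sampled == tok else 1)  # a correct guess counts as original
        else:
            corrupted.append(tok)
            labels.append(0)
    return corrupted, labels

# Hypothetical stand-in for a trained masked-LM generator: sometimes guesses right.
vocab = ["the", "a", "chef", "cooked", "ate", "meal"]
gen = lambda tok: tok if random.random() < 0.5 else random.choice(vocab)

print(make_rtd_example(["the", "chef", "cooked", "the", "meal"], gen))
```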
Generalization through Memorization: Nearest Neighbor Language Models
Authors: Urvashi Khandelwal, Omer Levy, Dan Jurafsky, Luke Zettlemoyer, Mike Lewis
Facebook and Stanford
Presentation at ACL 2020: “Beyond BERT” by Mike Lewis
Year: Feb 2020
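The model needs no extra training: at inference time the LM's softmax is interpolated with a distribution built from the k nearest neighbors of the current context in a datastore of (context key, next token) pairs. A NumPy sketch of the interpolation rule, with the datastore lookup itself omitted:

```python
import numpy as np

# Sketch of the kNN-LM inference rule:
#   p(w | x) = lam * p_kNN(w | x) + (1 - lam) * p_LM(w | x)
# where p_kNN aggregates retrieved neighbors weighted by softmax(-distance).
def knn_lm_probs(p_lm, neighbor_tokens, neighbor_dists, vocab_size, lam=0.25):
    weights = np.exp(-neighbor_dists)
    weights /= weights.sum()
    p_knn = np.zeros(vocab_size)
    for tok, w in zip(neighbor_tokens, weights):
        p_knn[tok] += w                 # neighbors vote for their next token
    return lam * p_knn + (1 - lam) * p_lm

p_lm = np.array([0.7, 0.1, 0.1, 0.1])   # base LM distribution
neighbors = np.array([2, 2, 3])          # retrieved next-token ids
dists = np.array([0.1, 0.2, 1.5])        # distances in the key space
print(knn_lm_probs(p_lm, neighbors, dists, vocab_size=4))
```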
ConveRT: Efficient and Accurate Conversational Representations from Transformers
Authors: Matthew Henderson, Iñigo Casanueva, Nikola Mrkšić, Pei-Hao Su, Tsung-Hsien Wen, Ivan Vulić
PolyAI
Year: Nov 2019
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
Authors: Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov, Luke Zettlemoyer
Facebook
Year: October 2019
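BART corrupts text with several noise functions and trains a sequence-to-sequence model to reconstruct the original; the most effective reported is text infilling, where spans with Poisson-distributed lengths are each replaced by a single mask token. A simplified sketch (the paper also inserts masks for zero-length spans, skipped here):

```python
import numpy as np

# Sketch of BART's text-infilling noise: spans with lengths drawn from a
# Poisson distribution (lambda = 3 in the paper) are replaced by one [MASK]
# token each; the decoder is trained to reconstruct the original tokens.
rng = np.random.default_rng(0)

def text_infill(tokens, mask_prob=0.15, poisson_lam=3.0):
    out, i = [], 0
    while i < len(tokens):
        if rng.random() < mask_prob:
            span = max(1, int(rng.poisson(poisson_lam)))  # zero-length spans omitted here
            out.append("[MASK]")   # one mask token stands in for the whole span
            i += span
        else:
            out.append(tokens[i])
            i += 1
    return out

src = "the quick brown fox jumps over the lazy dog".split()
print(text_infill(src))  # corrupted encoder input; the target is the original
```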
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Authors: Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut
Google and Toyota Technological Institute
Year: September 2019
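One of ALBERT's two parameter-reduction tricks is factorizing the embedding matrix into two smaller ones (the other is cross-layer parameter sharing). A back-of-the-envelope comparison, with sizes matching the paper's xxlarge configuration:

```python
# Sketch of ALBERT's factorized embedding parameterization: instead of a
# direct V x H embedding table, use a V x E table followed by an E x H
# projection, which shrinks parameters whenever E << H.
V, H, E = 30_000, 4096, 128           # vocab size, hidden size, embedding size

bert_style = V * H                    # direct V x H embedding matrix
albert_style = V * E + E * H          # factorized: V x E, then E x H

print(f"direct:     {bert_style:,} parameters")    # 122,880,000
print(f"factorized: {albert_style:,} parameters")  # 4,364,288
```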
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Authors: Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov
UW and Facebook
Year: July 2019
XLNet: Generalized Autoregressive Pretraining for Language Understanding
Authors: Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, Quoc V. Le
CMU and Google
Year: June 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Authors: Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
Google
Year: May 2019
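The masked-LM objective selects 15% of input tokens; of those, 80% become [MASK], 10% become a random token, and 10% stay unchanged, and only the selected positions contribute to the loss. A small sketch of that corruption scheme (my own illustration, using the paper's “my dog is hairy” example):

```python
import random

# Sketch of BERT's masked-LM corruption: select ~15% of tokens, then apply
# the 80/10/10 rule; the model predicts the original token only at the
# selected positions.
def mask_tokens(tokens, vocab, select_prob=0.15):
    inputs, targets = [], []
    for tok in tokens:
        if random.random() < select_prob:
            r = random.random()
            if r < 0.8:
                inputs.append("[MASK]")            # 80%: mask
            elif r < 0.9:
                inputs.append(random.choice(vocab))  # 10%: random token
            else:
                inputs.append(tok)                 # 10%: keep unchanged
            targets.append(tok)                    # predict the original here
        else:
            inputs.append(tok)
            targets.append(None)                   # excluded from the loss
    return inputs, targets

vocab = ["my", "dog", "is", "hairy", "cute"]
print(mask_tokens("my dog is hairy".split(), vocab))
```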
Cross-lingual Language Model Pretraining
Authors: Guillaume Lample, Alexis Conneau
Facebook
Year: January 2019
Improving Language Understanding by Generative Pre-Training
Authors: Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever
OpenAI
Year: June 2018
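Pre-training here is plain left-to-right language modeling: minimize the negative log-likelihood of each token given its left context, then fine-tune on labeled tasks. A toy sketch of the objective, with a bigram table standing in for the paper's Transformer decoder:

```python
import numpy as np

# Sketch of the generative pre-training objective:
#   L = -sum_i log p(w_i | w_1 .. w_{i-1})
# The "model" here is a toy bigram table, not the actual Transformer decoder.
def neg_log_likelihood(tokens, cond_prob):
    return -sum(np.log(cond_prob(tokens[:i], tokens[i]))
                for i in range(1, len(tokens)))

bigrams = {("the", "cat"): 0.5, ("cat", "sat"): 0.4}
cond = lambda ctx, tok: bigrams.get((ctx[-1], tok), 1e-6)
print(neg_log_likelihood(["the", "cat", "sat"], cond))  # ~1.609
```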
Deep contextualized word representations (ELMo)
Authors: Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer
Allen Institute for Artificial Intelligence and UW
Year: March 2018
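A downstream task does not take just the top layer: it learns a softmax-normalized weight per biLM layer plus a global scale, ELMo_k = gamma * sum_j s_j * h_{k,j}. A NumPy sketch of that scalar mix, with random stand-ins for real biLM activations:

```python
import numpy as np

# Sketch of ELMo's task-specific layer combination: softmax-normalized
# scalars s_j weight each biLM layer, scaled by a learned gamma.
rng = np.random.default_rng(0)
L, dim = 3, 8                        # layers (token + 2 biLSTM), hidden size
h = rng.normal(size=(L, dim))        # h[j] = layer j's vector for one token

s_raw = np.zeros(L)                  # learned scalars, before softmax
gamma = 1.0                          # learned global scale
s = np.exp(s_raw) / np.exp(s_raw).sum()

elmo_vector = gamma * (s[:, None] * h).sum(axis=0)
print(elmo_vector.shape)             # (8,): one contextual vector per token
```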
Attention Is All You Need
Authors: Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
Google and University of Toronto
Year: Dec 2017
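The building block every paper above inherits is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. A minimal NumPy sketch:

```python
import numpy as np

# Minimal sketch of scaled dot-product attention from the paper:
#   Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
def attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # (n_q, n_k) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V                                # weighted sum of values

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 16)) for _ in range(3))
print(attention(Q, K, V).shape)  # (4, 16): one output per query
```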