Naturally, this is a single-authored paper: "Fast Transformer Decoding: One Write-Head is All You Need," Noam Shazeer
Fast Transformer Decoding: One Write-Head is All You Need. (arXiv:1911.02150v1) #NLProc