GPT-2 (2019) Paper Notes
Paper notes on GPT-2 covering its core ideas: decoder-only Transformer scaling, WebText, next-token prediction, zero-shot task transfer, and the staged release controversy.