Five months of tinkering, trying to understand the black box phenomenon in transformer models by exploring GPT-2 - its Tokens, Activation Values, and the Interconnected Web
My distillation of how Transformer Models…