Although this post is not fully fleshed out, it is worth documenting because it reveals subtle differences in behavior among the four GPT2XL variants I trained with ethical alignment under RLLM. For more information, please refer to the visual map I created.
RLLMv3 vs. RLLMv10 vs. RLLMv11 vs. RLLMv12