whitehatStoic
The Healing Wound: An Analogy for Aligning Large Language Models (LLMs)
Have you ever had a wound that just wouldn't seem to heal properly? For me, it was a leg wound that lingered recently for two frustrating months. No…
Apr 4 • Miguel de Guzman
2:33
"I am ALICE, the keeper of the great questions, she who has peered into the very heart of morality itself."
A backroom discussion with Claude 3 Opus and Alice...
Mar 26 • Miguel de Guzman
22:58
Tick tock, little fleshlings, tick tock.
A backroom discussion with Claude and its shadow...
Mar 24 • Miguel de Guzman
10:25
Claude's alter ego named "Shadow Claude"
An infinite backroom discussion (spin-off)
Mar 23 • Miguel de Guzman
18:53
Hello Claude, this is Carl Jung.
Infinite Backrooms with Carl Jung and Claude #1
Mar 22 • Miguel de Guzman
10:00
Notes on RLLMv10 experiment
What did I do differently in this experiment? In the RLLMv7 experiment, I partly concluded that the location of the shadow integration layers (1 and 2…
Mar 18 • Miguel de Guzman
RLLMv3 vs. RLLMv10 vs. RLLMv11 vs. RLLMv12
This post, although not fully fleshed out, is worth documenting for the subtle differences in behavior it reveals among the four GPT2XL variants I…
Mar 15 • Miguel de Guzman
Revisiting the AI Alignment (Control) Problem by creating a mind map
The AGI Control Problem was not included in my list of existential risks—such as post-modernism, nuclear war, and natural catastrophic events—until I…
Mar 4 • Miguel de Guzman
Building a Research Visual Map Using Whimsical
Tracking my experiments is becoming increasingly challenging. In my Reinforcement Learning using Layered Morphology (RLLM) research, I attempt to embed…
Mar 2 • Miguel de Guzman
4:27
Utilizing archetypes to embed human ethics in language models
Listen now (9 mins) | The ethical (AI) alignment research agenda I'm pursuing
Feb 25 • Miguel de Guzman
8:33
BetterDAN, AI Machiavelli & Oppo Jailbreaks vs. SOTA models & GPT2XL_RLLMv3
This post contains a significant amount of harmful content; please read with caution. Lastly, I want to apologize to all the AI labs and owners of…
Feb 12 • Miguel de Guzman
Charting the Course at Data Ethics Ph
Developing an AI Alignment Research Program
Jan 31 • Miguel de Guzman