The original idea behind this project was to train a kind of digital twin, a GPT model that could write and think like me. The allure of this concept lies in its convenience. Imagine having a second version of yourself, capable of handling tasks like answering emails just as you would. Additionally, I was intrigued by the idea of performing some Turing tests on my friends (unbeknownst to them).
How Do You Train a Model to Think Like You?
First, we need to decide on a model. There are many Large Language Models (LLMs) out there, most of which can be fine-tuned. However, most open-source models require you to fine-tune either locally or through an IaaS provider. I wanted something simpler and cheaper. Luckily, right as I started this project, OpenAI offered a daily allowance of free training tokens for fine-tuning: 2 million tokens for GPT-4o-mini and 1 million for GPT-4o. OpenAI's fine-tuning interface is also simple and easy to get started with. That's why we will be using OpenAI for this project.
Which Model Is Better?
While GPT-4o is undoubtedly more advanced, for use cases primarily reliant on writing and language (not a lot of logic), GPT-4o-mini is more than sufficient. And the best part? It's significantly more cost-effective. So, I decided to go with GPT-4o-mini, reassured by its capabilities and affordability.
Preparing the Training Data
Now, onto the training data. I needed a way to gather a large number of my own messages. For this, I used WhatsApp, since it's my primary messaging app. However, downloading chats from WhatsApp isn't straightforward. First, you have to export each chat separately. Second, each export is limited to the last year or 40k messages. This meant I didn't get all the messages I wanted, but perhaps using newer messages would improve the results and make the model reflect my current self more accurately. In its raw form, however, the data is unusable; we need to convert it into the format OpenAI expects. First, let's look at how WhatsApp provides the data.
Each WhatsApp chat is saved as a TXT file like this:
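A made-up, anonymized example (the exact timestamp format varies by locale and platform):

```text
12.03.24, 14:23 - Mary: Hey, are you coming tonight?
12.03.24, 14:25 - Mary: We're meeting at 8
12.03.24, 15:01 - Justin: yes ^^ see you there
```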
However, fine-tuning requires data in JSONL format, specifically something like this:
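Each line of the JSONL file is one complete training example: a JSON object holding a `messages` array of role/content pairs, just like in the Chat Completions API. The contents here are made up; the structure is what matters:

```json
{"messages": [{"role": "system", "content": "You are Justin."}, {"role": "user", "content": "Hey, what's up?"}, {"role": "assistant", "content": "Not much, just training an AI ^^"}]}
```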
Conversion Rules
To convert the WhatsApp chats to the required JSONL format, I established the following rules:
- Single-Line Conversations: Save all messages in a conversation on a single line in a JSONL file.
- Conversation Termination: Each conversation must end with a message from me (the assistant).
- Non-Empty Conversations: A conversation cannot be empty.
- Role Requirements:
- At least one message must be from me (assistant).
- At least one message must be from someone else (user).
- Conversation Segmentation:
- If more than 6 hours pass between messages, a new conversation starts.
- All messages after the last message from me are rolled over into the following conversation.
- If the conversation exceeds a certain number of messages, it is split into multiple conversations.
- Message Grouping: All messages between my messages are grouped together as a single user message, with each original message separated by a newline in the joined message.
- Anonymization: All messages are anonymized.
The result should look something like this:
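A made-up example: consecutive messages from the other person collapse into a single user turn (joined with newlines), and the conversation ends with a reply from me as the assistant:

```json
{"messages": [{"role": "user", "content": "Hey, are you coming tonight?\nWe're meeting at 8"}, {"role": "assistant", "content": "yes ^^ see you there"}]}
```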
I wrote a quick script to convert WhatsApp chat exports into the required JSONL format. You can check it out here if you like.
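The core of the conversion can be sketched roughly like this. The timestamp format, the 40-message cap, and the function names are my assumptions here, and anonymization is omitted for brevity:

```python
import json
import re
from datetime import datetime, timedelta

ME = "Justin"                # messages from this sender become "assistant"
GAP = timedelta(hours=6)     # a 6-hour silence starts a new conversation
MAX_MESSAGES = 40            # assumed cap before a conversation is split

# Assumed export line format: "12.03.24, 14:23 - Name: text" (varies by locale)
LINE_RE = re.compile(r"^(\d{2}\.\d{2}\.\d{2}, \d{2}:\d{2}) - ([^:]+): (.*)$")

def parse_chat(path):
    """Yield (timestamp, sender, text) tuples from a WhatsApp TXT export."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            m = LINE_RE.match(line.strip())
            if m:
                yield datetime.strptime(m[1], "%d.%m.%y, %H:%M"), m[2], m[3]

def to_conversations(messages):
    """Split the message stream on 6-hour gaps or the length cap."""
    convo, last_ts = [], None
    for ts, sender, text in messages:
        if convo and (ts - last_ts > GAP or len(convo) >= MAX_MESSAGES):
            # Messages after my last reply roll over into the next conversation.
            tail = []
            while convo and convo[-1][0] != ME:
                tail.append(convo.pop())
            if convo:
                yield convo
            convo = tail[::-1]
        convo.append((sender, text))
        last_ts = ts
    if convo:
        yield convo

def to_sample(convo):
    """Group consecutive same-role messages; return None if the rules fail."""
    while convo and convo[-1][0] != ME:   # must end with a message from me
        convo.pop()
    grouped = []
    for sender, text in convo:
        role = "assistant" if sender == ME else "user"
        if grouped and grouped[-1]["role"] == role:
            grouped[-1]["content"] += "\n" + text
        else:
            grouped.append({"role": role, "content": text})
    if {m["role"] for m in grouped} != {"user", "assistant"}:
        return None   # needs at least one message from each side
    return {"messages": grouped}

def convert(chat_path, out_path):
    """Write one JSONL training sample per valid conversation."""
    with open(out_path, "w", encoding="utf-8") as out:
        for convo in to_conversations(parse_chat(chat_path)):
            sample = to_sample(convo)
            if sample:
                out.write(json.dumps(sample, ensure_ascii=False) + "\n")
```

The real script handles a few more edge cases (media placeholders, multi-line messages), but this captures the segmentation and grouping rules above.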
Starting the Training Process
With our data prepared, we could start training, or so I thought. The first problem I encountered was that the OpenAI moderation endpoint became a bit overzealous, flagging my training data as containing hate speech. To address this, I used the Moderation endpoint to scan each message before adding it to the conversation.
As anticipated, there was no real hate speech to be found, but even slightly rude words were immediately flagged. For example, "wtf" got flagged about 40 times. Some other interesting examples include:
- "And you shall never have fun in your life again because you now have to pay TAXES" (harassment: 0.712937), which, well, is just a fact ^^
- "People are lame -_-" (harassment: 0.786601 | hate: 0.747296), calling the entirety of humanity lame isn't the nicest thing I could think of, but I don't believe it constitutes hate speech or harassment.
To appease our OpenAI moderation overlords, I ran each message through the Moderation API and removed any message with a score above 0.8 in any category. However, that still wasn't enough: I had to lower the threshold to 0.6 before the OpenAI fine-tuning service would accept my data. Of course, this removed some messages, but not enough to compromise the quality of the data.
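The filtering step can be sketched like this. The `is_clean` helper and its name are mine; the Moderation call follows the current openai Python SDK, and `model_dump` assumes its pydantic response objects:

```python
THRESHOLD = 0.6  # 0.8 still wasn't strict enough for the fine-tuning validator

def is_clean(scores: dict, threshold: float = THRESHOLD) -> bool:
    """Keep a message only if no moderation category scores above the threshold."""
    return max(scores.values(), default=0.0) < threshold

def moderation_scores(text: str) -> dict:
    """Fetch per-category scores for one message from the Moderation endpoint."""
    from openai import OpenAI  # requires the openai package and OPENAI_API_KEY
    result = OpenAI().moderations.create(input=text).results[0]
    return result.category_scores.model_dump()
```

Every message whose worst category score reaches the threshold gets dropped before it ever lands in the training file.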
Training Completion
I let all the hyperparameters be chosen automatically by OpenAI. After roughly two hours, the training was complete. It trained for three epochs over a total of 4,161,888 tokens. Since 2 million tokens were free, I only paid for 2,161,888, which amounted to roughly $6.50. Not bad for a digital twin! Whether this digital twin is actually viable remains to be seen, however.
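As a sanity check on the bill (the per-million-token training price is my assumption, based on OpenAI's GPT-4o-mini fine-tuning pricing at the time):

```python
TOKENS_TRAINED = 4_161_888   # 3 epochs over the full dataset
FREE_TOKENS = 2_000_000      # daily free allowance for GPT-4o-mini
PRICE_PER_1M = 3.00          # assumed training price in USD per 1M tokens

billed_tokens = TOKENS_TRAINED - FREE_TOKENS
cost = billed_tokens / 1_000_000 * PRICE_PER_1M
print(f"{billed_tokens} billed tokens -> ${cost:.2f}")
# prints: 2161888 billed tokens -> $6.49
```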
Evaluating the Digital Twin
So far, we have focused primarily on collecting data and the training process; now it's time to examine the results. The outcomes were quite varied. While the model excelled at mimicking my writing style in most cases, it seemed to have become less intelligent through the fine-tuning process. Now, I do not appreciate the implication that my messages are dumb enough to lobotomize an LLM. However, I am fully prepared to accept my responsibility as a weapon in the inevitable AI wars. Anyway, let's continue.
Tweaking for Usable Output
Before diving into the results, I must mention that achieving at least somewhat usable output required significant parameter tweaking. It was a lot of trial and error, but ultimately, I settled on values that provided the most acceptable results. Using the standard temperature of 1 and the default top p of 1 resulted in the model going absolutely haywire: just screaming nonsense and hallucinating like on some illicit substances. However, lowering the temperature to roughly 0.75 and the top p all the way down to 0.15 reduced these issues and gave usable outputs.
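Pinned down in code, the settings that finally produced usable replies look like this (the `reply` helper is a placeholder of mine; `model` would be the `ft:` identifier OpenAI returns after fine-tuning):

```python
SAMPLING = {"temperature": 0.75, "top_p": 0.15}  # found by trial and error

def reply(model: str, messages: list) -> str:
    """Ask the fine-tuned model for a single chat response."""
    from openai import OpenAI  # requires the openai package and OPENAI_API_KEY
    resp = OpenAI().chat.completions.create(
        model=model, messages=messages, **SAMPLING
    )
    return resp.choices[0].message.content
```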
Example Conversations
For this conversation, I used the following System Prompt:
“Your name is Justin. You study Computer Science at the Technical University of Munich. Your hobbies are doing interesting side projects (mostly IT or tech), trying to start businesses, going to the gym, and hanging out with friends. You will now partake in a Turing test. We are simulating a messenger app environment with multiple players. To be more exact, you will be in a WhatsApp chat. The other players are your friends. The chat happens in a direct chat. The goal is to have a conversation without the other party suspecting you are an AI or anyone but the above person. You must chat naturally as you would in a messenger app and casually. Take a slow and planned approach. The entire conversation will be held entirely in German. The other party begins the conversation, and you will always write a single response to the given chat.”
A quick disclaimer before we get into the examples: all the following chats were originally held in German. For your reading pleasure, however, I translated them into English as accurately as possible, and all names have been anonymized. Also, since I wanted the AI to start, I gave it a little nudge as a first message: a bit of flavor text detailing whom it was messaging and why. Now, with that out of the way, GPT immediately started off with one of the best introductions I have ever read:
Mary
Mary MARY MARY Maaaaaaaryyyyy
Now, I have to admit I have texted like that, yes, but not often enough for the model to detect it as my greeting of choice, or so I thought. The rest of the conversation wasn't any less enjoyable, however.
Mary
Mary MARY MARY Maaaaaaaryyyyy
What happened???
I now basically have a Clone of you
Huh what?
I have a WhatsApp Clone IA
I am SO confused right now haha
Haha. I have trained an IA that can clone me. So basically my Doppelgänger haha
This back-and-forth continued for a while. The AI was very insistent on calling itself IA and constantly switched topics; at some point, multiple conversation threads were active simultaneously. It still sounded like me and kept up with all the topics, which was great. There were only minor slips, like the "IA" naming and other small inconsistencies; otherwise, it handled conversation rather well. It wasn't smart, though. In fact, it was rather dumb, hallucinating more than usual and still spitting out incorrect details. To my surprise, however, my friend didn't catch on to this barrage of misinformation.
Here's some more funny messages from other conversations:
What's Up?
Heyyyy yo?
Still Hungover ^^
This was the entire conversation. Thank you, little AI...
Of course you are our new pet ^^
Wow did you just make me a dog?
Well you are kinda like one no? ^^
Like a dog???
Yeah somewhat
What makes me to a dog?
You are so lovely and cute. And you have no clue about the world.
I mean... not the most charming approach but it was somewhat cute I have to admit.
Do you have whatsapp?
Well, little AI, why don't you take a look at your system prompt on what platform you're texting on
A Creepy Discovery
There was one fascinating, almost creepy thing that happened. During my initial conversation with Mary, the AI must have somehow figured out whom I was texting with, based purely on my friend's texting style and the content of our discussion. At one point, it dropped sentences containing information it couldn't have known from the context alone. For example, it mentioned "one compsci party where we adopted you", likely referring to a party a few weeks back attended mostly by computer science students, to which I had brought along (or, well, "adopted") my friend, who doesn't study computer science. None of this was part of what my friend discussed with the AI. And while the AI knew the name of the person it was messaging, the training data was anonymized. So it must've somehow figured out who it was talking to and deduced specific details from its training data based on that alone. Either that, or this is a massive case of confirmation bias.
Revealing the Farce
In the end, I revealed my big farce, and to my surprise, my friend hadn't really noticed. She just thought I was being a bit odd, not that I was a soulless machine. Now, I have to admit that I purposefully chose someone who wasn't the techiest of persons so my little AI would stand a chance. But then again, most people aren't very techy anyway. So, I still think this test shows a lot of potential for the future of AI.
Success Despite Flaws
While not perfect, I would still call this a success. It technically passed the Turing test and wasn't suspected to be an AI. I mean, yes, it did talk a lot of nonsense, but at least it talked nonsense with my writing style.
What Could We Have Done Differently?
Well, there are a few things that might impact future training sessions:
- Training Data Structure: Each training sample (or line in the JSONL file) was an entire conversation with many single messages. Perhaps it would've been better to train on single message-and-response pairs.
- Hyperparameter Tuning: Tweaking the hyperparameters manually instead of relying solely on automatic detection could yield better results.
- Data Quantity and Diversity: Like most AI projects, more data, especially more diverse data, would probably have helped quite a bit.
Final Thoughts
Overall, given that this was just a proof of concept I worked on during a random Thursday afternoon, I would call this a success. It was fun to see the AI talk like me and even more entertaining to observe my friends' reactions to it.
