What I Learned From Tuning GPT-3 to Write Edtech News Like Me
And what does it mean to write “like me”?
GPT-3 can write Shakespearean sonnets and imitate just about any notable writer through history. But can GPT-3 write edtech articles like me, a former journalist with some 700 bylines on EdSurge? What would it take to train it to do so?
Furthermore, what do the results reveal about the capabilities and limitations of GPT-3? What does it mean to “write like me”?
I enlisted the help of Brady Fukumoto to scrape all 746 articles from my EdSurge bio page and format the data to be suitable for training. A GPT-3 training sample requires two things: a prompt and a completion. A prompt can be something like: “Your favorite subject in college.” The completion is the response: “I liked studying late-imperial Chinese history because...” The longer your sample, the more precise the AI will be.
For simplicity’s sake, we labeled the article headlines as the prompts, and the body text as the completion. This is not the ideal training case (headlines and stories don’t always have a strong logical relationship) but we could do this somewhat expediently for a bunch of articles.
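As a rough illustration of the headline-to-body mapping described above, here is a minimal Python sketch that turns scraped articles into prompt/completion pairs, one JSON object per line. The field names `headline` and `body` and the function names are hypothetical; this is not the script Brady wrote, and real training data may need extra formatting depending on the fine-tuning tool you use.

```python
import json


def to_training_sample(headline: str, body: str) -> dict:
    """Map one article to a prompt/completion pair (headline -> body text)."""
    return {"prompt": headline.strip(), "completion": body.strip()}


def to_jsonl(articles: list[dict]) -> str:
    """Serialize articles as JSON Lines, skipping any with a missing field.

    Each article is assumed (hypothetically) to be a dict with
    'headline' and 'body' keys, as a scraper might produce.
    """
    lines = []
    for article in articles:
        headline = article.get("headline", "").strip()
        body = article.get("body", "").strip()
        if not headline or not body:
            continue  # an incomplete sample is not usable for training
        lines.append(json.dumps(to_training_sample(headline, body)))
    return "\n".join(lines)
```

Calling `to_jsonl(scraped_articles)` and writing the result to a `.jsonl` file would give one training sample per article. Dropping incomplete articles is a design choice for this sketch; you could just as well log them for manual cleanup.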
This was the longest step in the process and, as we’ll see, a limitation to the scope of our experiment. Whether you’re scraping from the web, copying and pasting personal documents, or manually inputting your own training samples, this can be time-consuming. But once you have clean data, the actual AI training takes only 15-20 minutes following OpenAI’s documentation.
You can try out GPT-3 TonyBot and also see the samples it was trained on (in the JSON file). Beware: Results are not factual and do not reflect my actual views or opinions!
Meet GPT-3 Tony
GPT-3 Tony produces weird and unpredictable thoughts, not unlike human Tony. It can deliver good, delightful responses on the first try. Or complete garbage. But you can ask it again and again to instantly generate different results until you get what you want.
Below are three prompts on topics I used to write regularly about. I have made no changes to the responses. (But since they are riddled with factual inaccuracies, I am sharing them as screenshots so that they are not indexed on the web.)
Why is edtech a difficult industry? (1,430 words — long article)
How will AI change education? (616 words — standard web article length)
It is funny, uncanny, flattering and a little unsettling to see GPT-3 do its best imitation of me. I can hear my “voice” as I read its paragraphs. I recognize patterns in how I set up arguments, support them with quotes and stats, then transition into caveats and opposing viewpoints. How the stories weave together claims and counterclaims, with metaphors and idioms sprinkled in. I see familiar sentence structures and beginnings and ends.
More incredibly, it is something to see my personal writing habits, tics and idiosyncrasies, honed (often subconsciously) over the course of 10 years, learned by GPT-3 in a matter of minutes. As much as we like to think that our prose has a distinct voice and flow, the reality is that it falls into a predictable set of conventions.
Writing is a personal means of expression. It is a large part of one’s voice and identity. Even though I’ve edited myself many times before, it is a different, almost out-of-body experience to edit something else writing as me. Am I this predictable? Am I this easy to mimic? Is this what a “unique” voice looks like? You be the judge.
Self Assessment
Setting aside my existential quandaries for a moment, I found a journalism rubric to self-assess the three sample articles generated by GPT-3 Tony. A few observations:
Lede: GPT-3 Tony did a good job of following a golden rule of journalism: capturing the gist of the story within the first couple paragraphs. It even mimicked colorful details I sometimes used to get readers past the first sentence. (From the first sample: “... as he takes a swig of bitter ginseng tea.”)
Structure and Flow: Since I trained GPT-3 Tony on articles about edtech businesses and market trends, everything it generated was in this style. And because these stories involve names, quotes, facts, figures, reports and events, the AI often makes up and inserts them throughout the pieces to support arguments and transitions. This becomes a problem, of course…
Information: GPT-3 Tony dishes out fake facts and fake news, and on this rubric it completely fails. But that’s not really fair since it was trained and asked to do exactly this. (And the made-up facts do support the arguments!) Some of the proper nouns are of actual people and organizations in the edtech industry, but they are never accurately associated.
That said, some of the “information” generated by GPT-3 Tony is profound, for instance:
The desire to know how edtech products are used in practice may prove more useful and valuable than an endless stream of weekly user statistics, she believes. “If you can show the value of how your tool is used, that’s more useful than the number of districts you’re signed up with.”
Language: GPT-3 Tony’s tone is pretty spot on. There is a mix of long sentences with multiple clauses interspersed with short, snappy ones. It also picked up on my tendency to start sentences with “and” and “but” (despite being told in school that I shouldn’t).
Fairness: The samples weave in arguments and counterarguments, alternating between optimistic points and measured ones. From a sourcing standpoint, GPT-3 Tony included lots of “comments” from entrepreneurs but not from educators (which I’m starting to feel guilty about).
The Future of (My) Writing?
This experiment was largely a personal test to see how well GPT-3 can replicate my writing. It was trained on limited data (article headlines and text) to imitate a specific kind of output (factual journalism) that it is not optimized for.
With light editing, GPT-3 can do a competent job at mimicking anyone, fact or fiction. It is easy to imagine all the possibilities and concerns this raises, and it is also hard to ignore the headlines about use cases. The media is catching on to high school and college students using GPT-3 to do their homework and essays and submit them to unwitting teachers. (Sorry, kids!)
How useful have I found it, personally? For this piece, not at all. I copied and pasted sentences into Lex and Sudowrite to see what they would generate, but the results are more entertaining and distracting than useful. At times, this prompt-based approach to generating ideas feels like a different form of web search, except the results are in prose instead of being scattered over a dozen tabs.
Maybe this piece isn’t the best use case for GPT-3. Or maybe I need to write better prompts.
Other than learning how the training is done, the most interesting part of this exercise has been trying to dissect and decipher how a machine is attempting to sound like me. I can see potentially interesting applications for peer review, or working with an editor or teacher. Or learning about the styles of other writers or literature from different time periods.
Some writers develop a masochistic relationship to the craft; the struggle, and overcoming it, feels just as valuable as the final published piece. I am one of them. Writing is more than communicating. It is thinking and feeling, clarity and confusion, inspiration and frustration (often more the latter). Every writer will have to test and discover where, or whether, GPT-3 fits into their process.
Not every writing task warrants this experience or effort, of course, and GPT-3 can help in many everyday instances where we just need to get the job done. But the process of writing is also a process of crafting our voice and identity. As a bottomless well of ideas, GPT-3 can show us all the exits, detours, interchanges and merges we can take in the journey. But I don’t think it will ever fully take the steering wheel in the creative process.
Writing is a bit like making spaghetti. You can definitely use the stuff in a box. But pulling out the flour and egg and mixing it yourself? Sublime.