As published on Medium on 4/1/2023: https://medium.com/@jaredmolton/chatgpt-tweets-and-its-painful-89e70efc7e5e
OpenAI’s ChatGPT engine has taken the world by storm and become a cultural phenomenon. As of February 2023, ChatGPT had reached 100 million monthly users in two months — an accomplishment that took TikTok nine months and Instagram 2.5 years. South Park even produced a ChatGPT episode that extols the power of the tool at improving relationships by crafting great text message responses. It’s hilarious. ChatGPT is also making its way into the workplace vernacular. It’s coming up more often in product ideation brainstorms, and companies like Instacart are integrating it into their product offerings. A similar trend occurred when blockchain gained popularity in 2016. Product managers love to think about how to incorporate technology trends into their product, even when it may not be applicable. I mean, we are tech nerds after all.
While Twitter CEO Elon Musk warns about the dangers of AI, in March I ran an experiment to understand just how good ChatGPT is at tweeting. I asked ChatGPT for a tweet recommendation every day over 15 days. The conclusion: AI can tweet, but the content is boring and doesn’t have the potential to go viral. Said another way, I wouldn’t follow ChatGPT on Twitter.
Methodology
I’ll warn you that the experiment did not follow the scientific method. Every day, from 3/7/23 to 3/18/23, I would open ChatGPT and ask, “What would you tweet now?” I experimented with other prompts like “What should I tweet,” but instead of giving me actual tweets, ChatGPT would respond with tips for how to tweet, which wasn’t the point.
However, when I asked ChatGPT what it would tweet, it would provide a tweet in its response. I’d copy the tweet from the engine, bring it into Twitter, and then often give it a grade. While my grading rubric was mostly subjective, there were specific acceptance criteria I considered, including 1) accuracy; 2) timeliness; and 3) potential to go viral (get favorited, replied to, retweeted, etc.).
If the tweet wasn’t accurate, it got an F. If it was accurate and timely, it got a C or above. I left a lot of room for judgment on the viral factor.
Results
From a task-completion perspective, ChatGPT was always able to provide a tweet that fit Twitter’s character limit. There were some cases where I added a second tweet to continue the thread, but only when I wanted to provide additional commentary. The experiment revealed that ChatGPT 1) skews toward tweets that celebrate a certain kind of day; 2) is not aware of local time zones; 3) loves public service announcements and the motivational phrases one might find on a classroom poster; and 4) is not interested in stirring the pot. It has a vanilla Twitter personality.
The lack of awareness of local time zones was a big miss for ChatGPT, as it resulted in tweets that were usually one day early. For example, it celebrated International Women’s Day, Pi Day, and St. Patrick’s Day one day early. My hunch was that ChatGPT lives in Greenwich Mean Time (GMT). So while it was a day early for me in Seattle, it would have been accurate if I were tweeting from London. However, it also tweeted a #MotivationalMonday hashtag on a Saturday, which makes me dubious that my hypothesis on the cause is correct. Still, this miss accounted for all of my F grades.
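The GMT hunch is easy to illustrate in a few lines of Python (the specific timestamp below is a hypothetical chosen for illustration, not something I observed):

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+

# Hypothetical moment: a clock running on GMT/UTC reads just past
# midnight on St. Patrick's Day.
moment = datetime(2023, 3, 17, 2, 0, tzinfo=ZoneInfo("UTC"))

# On a GMT clock it is already March 17 ...
print(moment.date())  # 2023-03-17

# ... but in Seattle (America/Los_Angeles, UTC-7 in March) it is still
# the evening of March 16, so a "Happy St. Patrick's Day!" tweet lands
# a day early for a local reader.
print(moment.astimezone(ZoneInfo("America/Los_Angeles")).date())  # 2023-03-16
```

Any tweet pegged to a calendar date would show this off-by-one behavior for a West Coast reader during the hours after GMT rolls over to the next day but the Pacific time zone hasn’t.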
Considering many of the tweets were in the spirit of a PSA or motivational message, I was impressed that ChatGPT didn’t provide any duplicate, verbatim tweets. However, there were occasions when ChatGPT’s tweet recommendations were similar and used the same language patterns. For example, on Day 6 ChatGPT tweeted, “Remember to take breaks and prioritize your mental health. Taking care of yourself is just as important as achieving your goals,” then on Day 10 it tweeted, “Remember to take breaks and prioritize your self-care. Your mental and physical health are important too!” These responses are awfully close and share the same first five words. I was encouraged that the tweets weren’t duplicative, but this suggests that ChatGPT does not prioritize unique answers in its output and does not use a requestor’s earlier queries to contextualize its results.
The calculus of what makes a tweet go viral is complicated. If it weren’t, any Joe Schmo could blow up Twitter regularly. A viral tweet can be newsworthy, witty, thoughtful, or contentious. Jonah Berger, Wharton professor and author of Contagious, has a framework for understanding and predicting viral content called STEPPS. According to Berger, the content must have some combination of 1) Social currency; 2) a Trigger; 3) Emotion; 4) Public accessibility; 5) Practical value; and 6) Storytelling. ChatGPT’s tweets were public, but that’s about it. ChatGPT was unable to achieve any of the other STEPPS characteristics with its tweets. Instead, it played it safe. Even when prompted to share a contentious tweet, ChatGPT responded with advice on debating those who disagree with your perspective (Day 14). That missed the mark on the prompt and demonstrated ChatGPT’s inability to offer a point of view or perspective. That’s probably a good thing, because once AI has a perspective and control of our technology systems, we’re probably heading toward the world of Skynet that Elon and others fear.