Software development can be frustrating at times, whether because you cannot get a requested feature to work or because you are simply not feeling inspired. After many hours of implementing the functionality, you are completely fed up with everything, so you decide to push the changes to the repository and, in that mood, you write a beautiful "Fuck" as the commit message.
Although it may sound amusing, quite a few developers actually do this. There are numerous reasons for using good commit messages, but sometimes the level of frustration is such that you simply write what you are thinking, rather than what you should (after all, it's your work, you know?).
Some time ago, GitHub released a public dataset of its repository data so that it could be queried in BigQuery. It turns out that someone came up with the idea of writing a query to find out how many of the 183 million commits include the word "fuck".
1 in Every 6–7 Thousand Commits Includes the Word "Fuck"
The query used to extract this data is the following:
    SELECT next_word, COUNT(next_word) AS n
    FROM (
      SELECT
        commit,
        LOWER(REGEXP_EXTRACT(FIRST(message), r'(?:\w*fuck\S*\s)(\w+)')) AS next_word
      FROM [bigquery-public-data:github_repos.commits]
      WHERE REGEXP_MATCH(message, r'\w*fuck\S*\s\w+')
      GROUP BY commit
    )
    GROUP BY next_word
    ORDER BY n DESC
    LIMIT 100;
According to the results of this query, approximately 33,000 of the 183 million commit messages contain the word "fuck". In other words, roughly 1 commit in every 6 or 7 thousand contains that word.
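As a quick sanity check on that ratio, here is a minimal Python sketch that uses only the two rounded figures quoted above. The constant and function names are my own, chosen purely for illustration.

```python
# Rounded figures quoted in the article; the exact counts are not given.
TOTAL_COMMITS = 183_000_000
MATCHING_COMMITS = 33_000


def commits_per_match(total: int, matching: int) -> float:
    """Return how many commits there are, on average, per matching commit."""
    return total / matching


ratio = commits_per_match(TOTAL_COMMITS, MATCHING_COMMITS)
print(f"about 1 in every {ratio:,.0f} commits")  # about 1 in every 5,545 commits
```

With these rounded inputs the ratio comes out nearer 1 in 5,500, so the "6 or 7 thousand" figure is best read as an order of magnitude rather than an exact rate.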
Although it is admittedly difficult to know the reason for such messages (a private toy repository is very different from an open-source project with multiple contributors), it is a fairly curious piece of data that can be extracted with a bit of SQL.
In fact, the query also looks at the word that follows "fuck", since "fuck up" is not the same as "fuck you": in theory, the sentiment behind each commit is different. In the image below you can see the 100 most common words that follow "fuck".
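To see what the "next word" extraction does, here is a rough Python sketch that applies the same regular expression to a few of the commit messages quoted in this article. It is only an approximation: BigQuery uses a different regex engine than Python's `re` module, and the helper names (`next_word`, `top_next_words`) and the `ignore_case` switch are my own additions, not part of the original query.

```python
import re
from collections import Counter

# Same pattern as in the query: a word containing "fuck", one whitespace
# character, then the following word (captured).
PATTERN = re.compile(r"(?:\w*fuck\S*\s)(\w+)")


def next_word(message, ignore_case=False):
    """Return the lower-cased word that follows the match, or None."""
    # ignore_case is an assumption of this sketch; the pattern itself is
    # case-sensitive, so "Fuck" with a capital F does not match by default.
    text = message.lower() if ignore_case else message
    match = PATTERN.search(text)
    return match.group(1).lower() if match else None


def top_next_words(messages, limit=100, ignore_case=False):
    """Count the next words across messages, most common first."""
    words = (next_word(m, ignore_case) for m in messages)
    return Counter(w for w in words if w is not None).most_common(limit)


# Messages quoted in the article, used here only as sample input.
SAMPLE_MESSAGES = [
    "Fuck you PHP",
    "Arrays changed to vectors. Because fuck you arrays",
    "Fixed that bug. Fuck yeah!",
    "It fucking works. Inneficient as santa on drugs, but it works",
]

print(top_next_words(SAMPLE_MESSAGES))
print(top_next_words(SAMPLE_MESSAGES, ignore_case=True))
```

One thing the sketch makes visible: as written, the pattern only matches a lowercase "fuck", so in this sketch a message such as "Fuck you PHP" is skipped unless the text is lower-cased first. If the query behaves the same way in BigQuery, the real number of such commits may be somewhat higher than reported.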

The Most Common? "Fuck up"
Think "I screwed up," "What a screw-up," or "Fixing this mess." The most common case seems to be that someone has implemented a feature incorrectly, left a bug behind, or something is simply not working as it should, and the developer in charge of cleaning up the mess pushes the fix.
We can also find the famous "fuck you," which may be directed at the technology in question ("Fuck you PHP"), a data structure ("Arrays changed to vectors. Because fuck you arrays"), or a way of blaming someone without naming them ("Fuck you, fixed bugged CSS").
But not everything is negative. We can also find commits celebrating something with a cry of "Fixed that bug. Fuck yeah!" or simply telling colleagues that it works, even without quite knowing how ("It fucking works. Inneficient as santa on drugs, but it works").
Much more could be learned from this data if desired. A few questions come to mind:
- Which programming languages frustrate developers the most?
- Are there more commits of this type in repositories with many contributors or in repositories with few contributors?
- Do the most popular repositories like Bootstrap or jQuery also suffer from this type of problem, or is it more of an issue with less popular or smaller repositories?
- What is the relationship between these commits and developer experience? Do the most experienced developers also fall into this type of practice, or is it more common among less experienced developers?
- Would it be possible to know if the commits were caused by IDE configuration failures, human errors, or poorly specified requirements?
I think stopping at the raw number misses an opportunity to better understand the habits of software developers. And if the answers helped us improve our tools, processes, and techniques, well, what can I say: "fuck yeah!"
