Last night, I changed a line of documentation in the Python source code. It is already reflected on the Python website, and will be included in the future whenever people download Python. This short essay explains why and how.
1 Back story
It was around seven months ago. I had stopped programming for a while, in despair over the attention economy and the impermanence of digital content.
Coming from a humanities background, I was first attracted to programming by its allure of certainty and universality. If one got good at it and made clever contributions, they could be shared across the world.
Poems, philosophy and history are subjective and rightly so. But the attention economy fragmented everyone’s interests and hit them particularly hard. People either engage in subcultures that they love or not at all. The idea of the Canon, which everyone should study and respond to, seems more and more artificial.
In the inter-war Oxford of Daisy Dunn’s Not Far from Brideshead, everyone studied and reacted to Homer and the classical authors. With the mastery of these texts, one could rise to the top of Oxford from Australia, Ireland, or Germany. Classicists were important in politics too, both through their association with the ruling class and through their direct contribution to debates around war and peace. It was an intellectual game: engaging, important, and open to all.
What game like this can one play now? I asked a few analytic philosophers about Amia Srinivasan, who occupies an Oxford Philosophy Chair and writes frequently for the LRB. They either hadn’t read her or hadn’t heard of her at all.
In the early 2000s, online forums (e.g. Reddit) opened up new possibilities. A well-written post/song on Golden (a Hong Kong forum) sometimes captured the mood and became part of the city’s psyche. The authors were anonymous, but they played a nice game with lasting impact.
But the attention economy wore that away too. In the beginning, one could post online just for the sake of it, and we would read it just for the sake of it. Now authors can’t help but wonder how many likes they will get, and readers can’t help but wonder whether they are being manipulated into seeing something.
At first, I thought programming stood aside from this. Surely programming isn’t affected by passing fads. What worked once works once and for all.
How wrong I was. Web development frameworks and LLM API wrappers came and went in a matter of weeks. Professional programming channels live stream on YouTube every day. It was worse than what I had experienced before.
For me, the final straw was vibe programming, where the idea is that one should rush to create an app or a website in minimal time with large language models. But why? Does the world really need another app or service? When we are “writing” code this way, are we really exercising our agency, or just playing a cog in a faceless and scary system? No one seems to pause to ask. The implied aim seems to be fame and fortune, or its substitute, attention. I lost interest.
2 Hacker News
It was in this mood that I discovered Hacker News.
In many ways, Hacker News is at the centre of the attention economy revolution. It was (and is) the discussion forum for Y Combinator, an influential seed funder for startups, where talented young people dream of the next Facebook or Airbnb.
The content on Hacker News was mostly technical, but by 2024 had widened to include anything “of intellectual interest”. It is telling that Hacker News deliberately kept a simple 90s format and resisted purely algorithmic ranking. The techies may be keen to sell everyone’s attention, but they don’t want it done to themselves.
Hacker News opened up a new world to me. Most content is by independent authors, hosted on their own websites, with quirky takes on quirky topics. But more and more I saw a common theme. People were constantly re-creating familiar things in new and surprising ways. A chess engine in regular expressions. Tetris in PDF. OS in 1,000 lines. In every case, there is a desire to dig in and understand at a deep level how computers work: not for any specific purpose, but just because it is inherently interesting.
This is what I wanted: the 21st century equivalent of early 20th century Classics or 13th century Aristotelian theology.
To read into this, I skimmed Xv6 and Crafting Compilers and played with the repositories. The next question became: how do I learn how to build a complex system like these?
There is no shortage of materials online on “how to build X in Y minutes”. But that is not what I wanted. I needed something less commercial, less impatient, less transient. A project whose life is measured in decades, not months. Something I could mull over without worrying it would disappear.
After a brief comparison with the alternatives, I chose Python. It has been around for 30 years. It is of limited scope (compared with Signal or Linux). It has an active community. It matters.
3 Reading into Python
Python was originally written in C, and the most active and influential implementation remains CPython. So that is what I chose to learn.
Given its popularity and influence, there is no shortage of good introductions. I skimmed, in order, CPython Internals, Think Python and Inside the Python Virtual Machine.
I also downloaded the first release, version 0.9.1, of the Python source code, and read through it with the help of ChatGPT.
In some ways, the experience was like reading a modern novel. It wasn’t too difficult to identify the starting point (e.g. the main loop in pythonmain.c at the beginning or the ceval main loop at the end of ceval). But the logic soon diverged in so many directions that I didn’t quite know which functions to follow first.
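For readers who have not met an evaluation loop before, here is a minimal sketch of the general idea in Python. It is not CPython's code: the `run` function, the opcode names, and the `(opcode, argument)` instruction format are all invented for illustration. The point is only the shape: fetch an instruction, dispatch on it, update a stack, repeat.

```python
def run(code, consts):
    """Execute a list of (opcode, argument) pairs on a small value stack.

    Toy illustration only; the opcodes here are hypothetical and do not
    correspond to any real Python release.
    """
    stack = []
    pc = 0  # index of the next instruction to execute
    while pc < len(code):
        op, arg = code[pc]
        pc += 1
        if op == "LOAD_CONST":
            # Push a constant from the constants table.
            stack.append(consts[arg])
        elif op == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif op == "JUMP_IF_FALSE":
            # Conditional jump: this is where the logic starts to branch.
            if not stack.pop():
                pc = arg
        elif op == "RETURN":
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode: {op}")
    return None


# Computes (2 + 3) * 4 with the constants table [2, 3, 4].
program = [
    ("LOAD_CONST", 0),
    ("LOAD_CONST", 1),
    ("ADD", None),
    ("LOAD_CONST", 2),
    ("MUL", None),
    ("RETURN", None),
]
result = run(program, consts=[2, 3, 4])
print(result)
```

Even in this toy, every branch of the `if` chain is a thread one could follow, which is roughly why a real interpreter's loop fans out so quickly.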
But as with modern novels (e.g. the Natural History Trilogy), I simply started somewhere and read through, sometimes merely scanning the functions at first with minimal understanding. The text won’t go away, and when I come back I will appreciate its proper significance. ChatGPT proved to be a useful guide too, once I had read enough not to be easily misled.
4 The first PR
In around three weeks, I felt I understood enough to dare to look at the open issues reported on the GitHub repository.
I had viewed Gao Tian’s YouTube video on his first contribution to Python source code: how he had to wait a month, and how he was over the moon when his first commit was accepted. Gao was a Tsinghua computer science graduate who was working with Microsoft at the time. If it took a month for him, it would likely take much longer for me.
I started by looking at the simplest issues to fix, e.g. spelling or broken links (I saw a lot of merged changes among the closed issues). But I soon wondered: why wasn’t this automated? There was indeed an automatic link-checking mechanism. But (apparently) a broken link may reflect a substantial change that cannot be handled automatically. For example, the system may report a broken link because of a flaky mail server. See e.g. https://github.com/python/cpython/pull/93853/files/d34731f85e969fc8e4e4135c81e4e559f171809b#r2105027455, where it took effort to investigate whether a mailing list was inactive or not.
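To illustrate why a link checker's report still needs a human, here is a rough sketch in Python. It is not the mechanism the CPython project uses. The function `classify_link`, the `fetch` callable, and the three labels are my own assumptions: `fetch` stands in for any function that returns an HTTP status code or raises `OSError` on a network failure.

```python
def classify_link(url, fetch, attempts=3):
    """Return 'ok', 'broken', or 'flaky' for one URL.

    `fetch` is a hypothetical callable: it takes a URL and either returns
    an HTTP status code or raises OSError on a network failure.
    """
    outcomes = []
    for _ in range(attempts):
        try:
            status = fetch(url)
        except OSError:
            outcomes.append("error")
            continue
        outcomes.append("ok" if status < 400 else "bad")

    if all(outcome == "ok" for outcome in outcomes):
        return "ok"
    if "ok" in outcomes:
        # Mixed results: the server answered at least once, so the link
        # may be fine and the failures may be transient.
        return "flaky"
    return "broken"


def fake_fetch(url):
    """Stand-in for a real HTTP request, with made-up example URLs."""
    responses = {
        "https://example.org/docs": 200,
        "https://example.org/gone": 404,
    }
    return responses[url]


print(classify_link("https://example.org/docs", fake_fetch))
print(classify_link("https://example.org/gone", fake_fetch))
```

A tool like this can say that a link failed, but not whether the page moved, the project was renamed, or the mailing list behind it quietly died. That judgment is the part left to contributors.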
By the end of the day, I had identified a problem in the documentation for the experimental free-threading version of Python. A hyperlink was broken because an external tool referred to by the documentation, cibuildwheel, had changed.
I submitted an issue and a PR, which was accepted overnight.
5 Next
What next? It was exciting to be one of the around 3,000 GitHub contributors to CPython. A contribution to the actual code would be the natural next step. Free-threading seems to be a particularly topical issue, as its development seems connected with AI tools.
The history behind the Python project is something I want to dig into as well. It started out as a hobby project, but now has significant indirect corporate involvement.
To take two examples from the core developers (of whom there are around 60): Brett Cannon works on the VS Code extension for Python, and Sam Gross, who is responsible for the free-threading version of Python, works for Facebook on a related project.
How did this change take place? What does it mean for the governance of Python in particular or open-source projects in general? Is there still room for a complete outsider/volunteer to play a large role in open source? Is open-source software development really an intellectual game open to all, or is it now mainly a part of the digital infrastructure developed by big tech? Or is there room for both?