I’m in the top 2% of users on StackOverflow. My content there has been viewed by over 1.7M people. And it’s unlikely I’ll ever write anything there again. Which may be a much bigger problem than it seems. Because it may be the canary in the coal mine of our collective knowledge. A canary that signals a change in the airflow of knowledge: from human-to-human via machine, to human-to-machine only.
Don’t pass human, don’t collect 200 virtual internet points along the way. StackOverflow is *the* repository for programming Q&A. It has 100M users & saves man-years of time & wig factories’ worth of grey hair every single day. It is driven by people like me who ask questions that other developers answer.
Or vice-versa. Over 10 years I’ve asked 217 questions & answered 77. Those questions have been read by millions of developers & had tens of millions of views. But since GPT4 it looks less & less likely any of that will happen; at least for me. Which will be bad for StackOverflow. But if I’m representative of other knowledge-workers then it presents a larger & more alarming problem for us as humans.
What happens when we stop pooling our knowledge with each other & instead pour it straight into The Machine? Where will our libraries be? How can we avoid total dependency on The Machine? What content do we even feed the next version of The Machine to train on? When it comes time to train GPTx it risks drinking from a dry riverbed. Because programmers won’t be asking many questions on StackOverflow. GPT4 will have answered them in private.
So while GPT4 was trained on all of the questions asked before 2021, what will GPT6 train on? This raises a more profound question. If this pattern replicates elsewhere & the direction of our collective knowledge alters from outward to humanity to inward into the machine, then we are dependent on it in a way that supersedes all of our prior machine-dependencies. Whether or not it “wants” to take over, the change in the nature of where information goes will mean that it takes over by default.
Like a fast-growing Covid variant, AI will become the dominant source of knowledge simply by virtue of growth. If we take the example of StackOverflow, that pool of human knowledge that used to belong to us may be reduced to a mere weighting inside the transformer. Or, perhaps even more alarmingly, if the current GPT really doesn’t learn from its inputs, it may be lost altogether. Because if it doesn’t remember what we talk about & we don’t share it, then where does the knowledge even go?
We already have an irreversible dependency on machines to store our knowledge. But at least we control it. We can extract it, duplicate it, go & store it in a vault in the Arctic (as Github has done). So what happens next? I don’t know, I only have questions. None of which you’ll find on StackOverflow.