Nov 28, 2023

cognitive stopwords

Day 27 of 30. Back in my undergrad days at the University of Waterloo, I spent 4 cold, dark months at the Language Technologies Research Centre in Gatineau, Québec, working on statistical machine translation: the translation of human languages by computer.

This was back in a time of transition from Good Old-Fashioned AI to more data-intensive methods. Deep learning was still a few years away from taking over. Systems designed to encode expert domain knowledge were on their way out, in favour of statistical methods. Still, you could find vestiges of the good old approaches: stop words, for instance.

Many natural language processing (NLP) systems of that pre-deep-learning era included a list of stop words: very common words like "the", "of", and "and", which were filtered out before further processing. The idea was that these words didn't provide enough signal for the task - whether that was translation, or information retrieval, or some other NLP task. They had no meaning that could be discerned by the machine.
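To make that concrete, here's a minimal sketch of the kind of filtering step I mean. The tiny stop list and the whitespace tokeniser are assumptions for illustration only; real systems used much longer, task-specific lists and proper tokenisation.

```python
# Illustrative stop list (an assumption, not taken from any real system).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "by"}


def remove_stop_words(text: str) -> list[str]:
    """Lowercase the text, split on whitespace, and drop stop words."""
    tokens = text.lower().split()
    return [token for token in tokens if token not in STOP_WORDS]


print(remove_stop_words("The translation of human languages by computer"))
# ['translation', 'human', 'languages', 'computer']
```

Everything downstream only ever sees the words that survive the filter.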

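People have stop words too. I call them cognitive stopwords: words that sound meaningful but carry no information, so the listener just filters them out. There's a passage from one of Feynman's autobiographical writings that I love, and that I think beautifully captures the essence of a cognitive stopword: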

... For example, there was a book that started out with four pictures: first there was a wind-up toy; then there was an automobile; then there was a boy riding a bicycle; then there was something else. And underneath each picture, it said "What makes it go?"...

...I turned the page. The answer was, for the wind-up toy, "Energy makes it go." And for the boy on the bicycle, "Energy makes it go." For everything "Energy makes it go."

Now that doesn't mean anything. Suppose it's "Wakalixes." That's the general principle: "Wakalixes makes it go." There is no knowledge coming in. The child doesn't learn anything; it's just a word.

Indeed. What does energy mean here? Does it convey any information? Or are you just supposed to sit, and nod, and pretend you know what it means?

Imagine you're in a meeting room. Someone Important has convened a whole lot of people to deliver an Important Message: we're all going to be agile from now on.

What does agile mean here? Are we all going to do Scrum or SAFe, or take a more small-a agile approach in the spirit of the manifesto? Are we all following the same process, or are teams free to choose their own process? How will we know whether we're successful in being agile?

Chances are you've heard some variant of this before. Big Data. Do we have big enough data to be worth the extra complexity? Cloud-native. So should we go all-in on AWS proprietary tools and platforms, or seek to avoid cloud vendor lock-in? Generative AI. Are we building our own models, or just using existing generative AI services? Are we investing in data infrastructure? Should we be? And to what end?

Acronyms are also great cognitive stopwords. CMS: content management system. Sounds nice. Doesn't mean anything, not by itself. What content? What's the destination of that content? Who needs to be able to edit it? What are their user needs? Does your content need localisation? Does your CMS need an API? What sort of API? And so on. It's like a version of Five Whys, but for words instead of problems or systems.

One decent litmus test for cognitive stopwords: if you ask n people what the word means, will you get O(n) different interpretations? If so, it's not conveying information; people are just reading whatever they want into it. Might as well say we're doing Wakalix development, or being Borogove-native, or whatever other syllable mash strikes your fancy.
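If you wanted to be tongue-in-cheek rigorous about that test, a rough sketch might look like this. The function name is made up, and comparing answers by exact text after trimming and lowercasing is a crude stand-in for "same interpretation"; a human would have to judge that in practice.

```python
def interpretation_spread(interpretations: list[str]) -> float:
    """Return the fraction of answers that are distinct.

    Near 1.0 means everyone read the word differently;
    near 1/n means everyone agreed.
    """
    if not interpretations:
        raise ValueError("need at least one answer")
    # Crude assumption: two answers match only if their text is identical
    # after trimming whitespace and lowercasing.
    distinct = {answer.strip().lower() for answer in interpretations}
    return len(distinct) / len(interpretations)


answers = ["we do Scrum", "we do SAFe", "we ship faster", "fewer meetings"]
print(interpretation_spread(answers))  # 1.0
```

Four people, four interpretations: that's a cognitive stopword.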

Sometimes the word really is meaningless, but usually it just needs more context.

Team Frob has had good success with a Scrum-based approach, and has been sharing their experiences with other teams. Based on feedback, we'd like teams in our area to adopt a similar approach moving forward. Sprint length is up to you, but 2-4 weeks is recommended. You need to have planning sessions, standups, and retros, but the format of each is up to you. Team Frob has a guide to setting up their workflow, plus some tips on adjusting it to your team's needs. You can reach out to...

There. Much better. I know how to get started with this. Whether I agree or not is beside the point. At the very least, I have a clear idea of what it means.

If you're delivering a message: choose words that have meaning. If you're receiving that message: challenge words that don't have meaning. Don't let cognitive stopwords slide. It's worth the time and effort to ensure communication removes confusion instead of adding to it.