Previously,

I wrote about speeding up your learning loop with artificial recall – specifically, the Thread Helper browser extension for Twitter. Some of you are trying it now; keep me posted on your experience. I’m not shilling for this product; I’m just curious how it changes your cognition.

One way I find myself using this extension is as an autocomplete for search. In this mode I don’t post anything: I type some vague tip-of-my-tongue notion into the tweet box instead of the search box. With every keypress Thread Helper instantly shows me a list of related tweets. Eventually I find the thing I was looking for, and I can quickly click through to it, abandoning my draft. This is much easier than using Twitter’s limited search engine. It’s as big a difference as navigating an automated phone menu versus talking to an operator. It brings the information to me.
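If you’re curious what “related tweets on every keypress” might look like under the hood, here is a minimal sketch in Python. To be clear, this is not how Thread Helper actually works (I don’t know its internals); the function names, the sample tweets and the word-overlap scoring are all my own assumptions, chosen to show the idea of incremental search: re-rank on each keystroke, and treat the word you’re still typing as a prefix.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def related(draft, tweets, limit=3):
    """Rank tweets against a half-typed draft (hypothetical scoring).

    Finished words must match a tweet word exactly; the last word is
    treated as a prefix, because the user may still be typing it.
    """
    words = tokenize(draft)
    if not words:
        return []
    *complete, partial = words
    scored = []
    for tweet in tweets:
        tokens = set(tokenize(tweet))
        score = sum(1 for word in complete if word in tokens)
        if any(token.startswith(partial) for token in tokens):
            score += 1
        if score:
            scored.append((score, tweet))
    # Stable sort: ties keep their original timeline order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tweet for _, tweet in scored[:limit]]


def simulate_typing(draft, tweets):
    """Yield (prefix, results) as if the draft were typed key by key."""
    for end in range(1, len(draft) + 1):
        prefix = draft[:end]
        yield prefix, related(prefix, tweets)


# Made-up sample data, standing in for a real timeline.
TWEETS = [
    "Emacs had autocompletion decades before phones did",
    "Twitter search is like an automated phone menu",
    "Transformers are why autocomplete finally works",
]

if __name__ == "__main__":
    for prefix, results in simulate_typing("phone me", TWEETS):
        print(repr(prefix), "->", results)
```

A real tool would presumably use something smarter than counting shared words, but even this toy version shows why the experience feels different from a search box: the results narrow while you are still fumbling for the phrase.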

📱 Autocomplete? Isn’t that just for phone keyboards?

Hardly. From spellcheck to search suggestions to predictive text to swyping, computers are constantly second-guessing humans and finishing our sentences. Pretty much the only place where you can’t expect the computer to help you out is the official Twitter search!

Yet the normative attitude is still “ugh, autocorrect” or, at best, “type ‘this funny thing’ into your phone and autocomplete the rest, LOL!” (Here I conflate autocorrect, autofill, autocomplete and predictive text, because to most users they are the same: a genie that lives in your computer and guesses what you mean.)

Emacs, the wizened editor/OS/religion, has survived for decades despite its pre-WIMP interface, partly due to its early adoption of autocompletion. It’s also the flag-bearer for the “Do What I Mean” philosophy: many Emacs functions try to parse your intent, rather than your exact syntax.

Code editors and IDEs have followed this philosophy ever since, and autocomplete for code is incredibly advanced and powerful. A modern editor really does feel as fun as the original Atom promo.

Why can’t regular people have tools like this? I’m always surprised when I see someone using a word processor that doesn’t autocomplete parentheses or quotation marks, for instance, yet I understand this is the way most software works. For now, at least.

Autocompletion and Do What I Mean are part of the Collaborative Human-Agent Dialogue interface (CHAD). The future will have autocomplete everywhere.

🤨 Autocomplete everywhere? Do people want to be constantly corrected by machines?

It won’t feel like correction if it’s done well. It will feel like collaboration.

All the major operating systems are pivoting toward CHAD interfaces, with varying success. Microsoft, Apple and Google all have their own Assistants, faithfully tracking your behavior and sending updates to their masters in the Cloud. Linux distros don’t have the same capabilities by default, but they pioneered desktop search before Windows and macOS figured it out, and they have a lot of the pieces, so it wouldn’t surprise me if they caught up quickly and with better privacy standards.

There was a recent blog post about Tracker, the backend for desktop search on GNOME.

Tracker 3.0: Where do we go from here? (believe it or not, I found this in my browser history by typing “t” into the address bar and it popped up instantly).

The author is one of the Tracker maintainers, and he gives a great overview of the history and future of search. He concludes by hinting at a future of CHAD on Linux:

A more likely goal would be a new “GNOME Assistant” app that could respond to natural language voice and text queries. When stable, this could integrate closely with the Shell in order to spawn apps and control settings. Mycroft AI already integrates with KDE. Why not GNOME?

The place where autocomplete has really taken off is on the smartphone, where the touch keyboard has far more friction than a physical one. A typical touchscreen phone might take input from any of: virtual keypresses, swyping, voice-to-text, autocorrect, autocomplete, incremental search, autofill (of passwords, etc.), or even automatic live translation of a foreign language. This menagerie accommodates the diverse needs and abilities of users, but until recently it was a messy hodgepodge of systems, none of which were very good.

But the landscape has changed since 2018, when so-called “Transformer” language models like BERT and GPT hit the scene.

🤖 Transformers? Like the cartoon robots?

Not exactly. They’re called that because of their architecture (also because it sounds awesome). The transformer architecture is a particular type of neural network that’s far superior to previous methods for natural language processing. Transformers are the reason that smartphones are finally smart.

😲 Wait, I’ve heard of this! It’s GPT-3!

Fraid not, actually. GPT-3 is the most cutting-edge language model right now, to be sure. Unfortunately it’s so huge and compute-heavy that it requires Microsoft-level hardware capacity. So even though it has an incredible ability to generalize, it’s unlikely that it will be put directly into production apps. (The pricing for OpenAI’s API is prohibitive as well.) More likely, small transformers will get their knowledge of the world from larger models (through transfer learning and model distillation) and then be fine-tuned on their specific tasks.
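For the curious, here is roughly what “distillation” means in practice, as a minimal sketch in plain Python. The usual recipe is to soften both models’ predictions with a temperature and then penalize the small “student” model for disagreeing with the big “teacher”. The function names, the temperature of 2.0 and the toy three-word vocabulary are my own assumptions for illustration; real systems do this over huge vocabularies inside a deep learning framework.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; higher temperature = softer."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened guess to the student's.

    Zero when the student reproduces the teacher's distribution,
    and larger the more the two disagree.
    """
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return sum(
        p * math.log(p / q) for p, q in zip(teacher, student) if p > 0
    )


if __name__ == "__main__":
    # Hypothetical scores for the next word: "the", "cat", "sat".
    teacher = [4.0, 1.0, 0.5]
    good_student = [3.8, 1.1, 0.4]
    bad_student = [0.5, 4.0, 1.0]
    print(distillation_loss(teacher, good_student))
    print(distillation_loss(teacher, bad_student))
```

Training the student means nudging its weights to shrink this number across millions of examples, which is how a model small enough for your phone can inherit some of what the giant one knows.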

Transformers make it easy enough to work with messy human language that, unlike previous waves of automation, language models can automate “knowledge work”: translation, data entry, programming, customer service. The main limit on their use is imagination: how people choose to apply them. Unfortunately, the political economy of the world doesn’t bode well for that.

Robert Anton Wilson, a philosopher-entertainer of the 20th century, said “the border between the Real and the Unreal is not fixed, but just marks the last place where rival gangs of shamans fought each other to a standstill.” This is more true now than ever, because of the incredible power that intelligence augmentation will unleash.

Platforms will raise their language models with different values and those values will feed back into their decision-making apparatus. The worldviews encoded in the models will diverge, and our thoughts will be nudged by them. The tense border of Reality will erupt into a multifront war.

Perhaps this war has already begun. Hit Reply and type “the border between the Real and the Unreal” and autocomplete the rest. I’ll go first.

the border between the Real and Unreal and the state is a job and a good day at the same place to be a good fit and the other one will work for you to come over but if i can help in the morning I’ll let me that i can help with that

🤔

Thanks for reading,

– Max


Robot Face is a weeklyish essay about the interface between human and computer. If you like it, think of someone you know who would also like it, and forward this to them! Or post about it on your newsletter/blog/website/social media account.

And as always, just reply to this email if you want to get in touch with me. 🤙