It’s fall in America. The leaves are crunching underfoot, the smell of wood smoke is in the air. My houseplants are dying from cold in my unheatable cliff-cavern of a home. It’s cozy.

Fall is a time for reflection (ask Icarus, or Lucifer). In spring we sow, in fall we reap. I’ve been reviewing my projects lately, changing my priorities as I see which have fruited and which have withered. If you’re reading this, you may be happy to know that SCIOPS gets to live again next year. Despite the fact that I don’t know what it’s about or what effect it has, people keep reading, so I’ll keep writing.

I wonder what other people see when they read these letters. I know if I look back on what I wrote in years past, it looks like a hot mess. But I do see consistent themes running through it, the product of my obsessive thoughts and the oppressive environment.

And, of course, your impressive attention. So if you can stand it, I’d like to do a little navel-gazing here, practically in public.

I’m going to use some tools that I acquired for free from the internet. You could do the same. Automated intelligence is less scary the more you know about it. These tools are not that hard to use, and they’re getting easier all the time. Open your mind and come with me, as we explore thoughts with code.

I’ll explain the code as we go along, but do try to read it. It describes itself, in a sort of short-hand way. Like a to-do list. You may be surprised how much you understand. After all, my longest career has been “dig hole”. If I can grasp this, so can you.

First, I have to summon the vast Library called Textacy, which knows a whole bunch of natural language processing stuff that I only kind of understand.

In [1]:


!pip install textacy
import textacy

And rummage around in the computer to find my files…

In [0]:


import os
dir_str = './'
directory = os.fsencode(dir_str)

Next I call upon the power of Textacy to load its understanding of the English language, which it inherits from a library called spaCy. Textacy is a “high-level wrapper” for spaCy, which means it automates a lot of spaCy commands into easier chunks.

I then tell it to open a new Corpus, which is a collection of documents. I will fill this with SCIOPSes.

In [3]:


en = textacy.load_spacy_lang("en_core_web_sm")
corpus = textacy.Corpus(lang=en)
print(corpus)


Corpus(0 docs, 0 tokens)

Now to gather all the SCIOPSes from this year. Each one is called something like sciops-03-38.md.

I’m in my writing folder, so some of the files have scraps of notes or are unedited. That might produce some weird effects, but they’ll be interesting to see.

In [0]:


for file in os.listdir(directory):
    filename = os.fsdecode(file)
    loc = os.fsencode(directory + file)
    if 'sciops' in filename:
        if '.md' in filename:
            with open(loc, 'r') as f:
                text = f.read()
                # grab issue number from the filename
                metadata = {
                    "issue": filename[-5:-3],
                    }
                doc = textacy.make_spacy_doc((text, metadata), lang=en)
                corpus.add(doc)
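
Each file gets run through spaCy’s language pipeline on the way into the corpus. If you want to peek at what that actually stores, you can pull one letter back out and poke at its tokens. I haven’t run this cell, so take it as a sketch, but these are standard spaCy attributes:

for token in corpus[0][:8]:
    # each token knows its raw text, its dictionary form, and its part of speech
    print(token.text, token.lemma_, token.pos_)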

Now each document has been analyzed, split into its component parts, its features transcribed. Let’s see what we can learn from an individual doc:

In [5]:


for doc in corpus[:3]:
    bot = doc._.to_bag_of_terms(ngrams=1, weighting="count", as_strings=True)
    sort = sorted(bot.items(), key=lambda x: x[1])
    print(doc._.preview)
    print(sort[-10:])


Doc(1364 tokens: "<!-- ok boomer article, dumb tweet, backlash na...")
[('ok', 4), ('class', 4), ('world', 4), ('Clinton', 4), ('boomer', 5), ('people', 5), ('tweet', 6), ('kill', 7), ('Epstein', 7), ('meme', 18)]
Doc(820 tokens: "People don't like to talk about it, but Osama w...")
[('real', 3), ('state', 3), ('trust', 3), ('win', 4), ('cult', 4), ('people', 4), ('war', 5), ('attack', 5), ('reality', 5), ('narrative', 5)]
Doc(1698 tokens: "the other problem with anarchy is that you have...")
[('invisible', 6), ('act', 6), ('world', 6), ('magic', 7), ('way', 7), ('wizard', 7), ('state', 8), ('like', 8), ('power', 10), ('people', 18)]

Above, I’ve asked for all the words in each document by frequency.

I limited the list to the top 10 terms, but it goes all the way down to a bunch of 1-use words:

In [6]:


print(doc._.preview)
print(sort[:30])


Doc(1698 tokens: "the other problem with anarchy is that you have...")
[('anarchy', 1), ('go', 1), ('to', 1), ('lead', 1), ('involve', 1), ('magician', 1), ('convincnle', 1), ('begat', 1), ('small', 1), ('scale', 1), ('aside', 1), ('corporate', 1), ('eclipse', 1), ('legitimize', 1), ('shrewd', 1), ('use', 1), ('accelerate', 1), ('domination', 1), ('economic', 1), ('monopoly', 1), ('production', 1), ('feudal', 1), ('wrap', 1), ('explosion', 1), ('ball', 1), ('gutte', 1), ('feel', 1), ('serfs', 1), ('King', 1), ('Zuckermusk', 1)]

Now for the fancy stuff. We’re going into the matrix.

Not the bullet-time Matrix. A matrix, in this case, is a giant multidimensional grid. Wait, don’t run away!

Instead think of an old-time theater set, one that simulates an apartment block. The side wall is missing so the audience can see. The separate apartments form a grid, and within each apartment the rooms form a sub-grid. Multidimensional. Simple, right?

First, we’ll build that set. Then we’re going to transform it, across dimensions.

In [7]:


import textacy.vsm
tokenized_docs = (doc._.to_terms_list(ngrams=1, entities=True, as_strings=True) for doc in corpus)
vectorizer = textacy.vsm.Vectorizer(tf_type="linear", apply_idf=True, idf_type="smooth", norm="l2", min_df=2, max_df=0.95)
doc_term_matrix = vectorizer.fit_transform(tokenized_docs)
doc_term_matrix
print(doc_term_matrix.shape)


(33, 1990)

What we have now is a grid with 33 rows and 1,990 columns. The rows are the 33 SCIOPSes that we found in this folder.

The two thousand columns represent the words found across all the documents. Each cell holds a number that represents how important that term is to that document. The number is calculated using a method called TF-IDF: term frequency times inverse document frequency. That just means it counts how often the term appears in the document, then scales that down if the word shows up in lots of the other documents too.

If each document is an apartment, and each term is an object like a kettle or a toothbrush or a dead houseplant, then this document-term matrix is a census of which items are in which homes and how common it is to have each one. If every home has one to four toothbrushes, then toothbrushes don’t say much about the differences between the people who live there. But if one place has a dozen dead houseplants, and dead houseplants are uncommon otherwise because I guess people throw them in the trash or something, well, that says a lot about everyone, doesn’t it?
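
Here’s the gist in toy form. This isn’t exactly what the Vectorizer computes, since textacy smooths and normalizes the numbers, but the shape of the idea is the same:

import math

# toy TF-IDF: the raw count, scaled down when lots of documents contain the term
def tfidf(count_in_doc, docs_with_term, total_docs):
    return count_in_doc * math.log(total_docs / docs_with_term)

print(tfidf(12, 2, 33))   # a dozen dead houseplants, rare elsewhere: high score
print(tfidf(5, 30, 33))   # a few toothbrushes, found everywhere: low score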

Now we can look at the most unique items in each apartment, and see how they compare. But we don’t want to look at all two thousand items in every apartment. We’re going to use a trick called Latent Semantic Analysis to group them into “topics”. If you picture all those thousands of types of object floating in space, LSA sees which objects are close together and along which dimensions. As we would expect, some things are often near each other. The kettle is more closely related to the frying pan than to the bed. There would be natural clusters in the data: different rooms or functions that stand out by their location and usage.
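
Under the hood, LSA is a matrix factorization. If I understand the library right, textacy hands the work to scikit-learn, which squashes the two thousand term-columns down into a handful of combined “topic” dimensions. A rough sketch of the same move, done directly:

from sklearn.decomposition import TruncatedSVD

# collapse ~2,000 term dimensions into 6 "topic" dimensions
svd = TruncatedSVD(n_components=6)
doc_topic = svd.fit_transform(doc_term_matrix)
print(doc_topic.shape)   # (33, 6): one row per letter, one column per topic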

Let’s see what kind of topics these SCIOPSes have been about. We can find the top, say, 6 topics.

In [0]:


def slice_topics(number_topics):
    import warnings
    warnings.simplefilter(action='ignore', category=FutureWarning)
    import textacy.tm
    import pandas as pd

    # fit an LSA topic model to the document-term matrix
    model = textacy.tm.TopicModel('lsa', n_topics=number_topics)
    model.fit(doc_term_matrix)
    doc_topic_matrix = model.transform(doc_term_matrix)

    # This function will list some examples of the data in the model
    def list_terms(model, matr, vec):
        print("Topic weights overall:")
        for i, val in enumerate(model.topic_weights(matr)):
            print(i, val)
        print("Top terms by topic:")
        for topic_idx, top_terms in model.top_topic_terms(vec.id_to_term, top_n=10):
            print("topic", topic_idx, ":", " ".join(top_terms))
        print("Top docs by topic:")
        for topic_idx, top_docs in model.top_topic_docs(doc_topic_matrix, topics=-1, top_n=5):
            print("topic", topic_idx)
            for j in top_docs:
                print(corpus[j]._.meta, corpus[j]._.preview)

    # This function will visualize the model as a termite plot
    def plot_termite(model, matr):
        topic_weight_serie = pd.Series(model.topic_weights(matr))
        number_top_topics = 6  # max 6
        top_topic_list = list(topic_weight_serie.nlargest(n=number_top_topics).index)
        print(top_topic_list)
        model.termite_plot(
            doc_term_matrix, vectorizer.id_to_term, n_terms=100,
            highlight_topics=top_topic_list, rank_terms_by="topic_weight",
            sort_topics_by='index',
            # save=out_dir_str + "/" + now + "termite.png"
            )

    list_terms(model, doc_topic_matrix, vectorizer)
    plot_termite(model, doc_topic_matrix)

slice_topics(6)

[termite plot: the weight of each term in each topic, shown by circle size]


 

And there we have it. Six topics. The first, numbered 0, is sort of a catch-all that doesn’t map strongly to any ideas. But this makes sense: the machine starts with the category “everything else” and extracts smaller categories from that.

It’s easiest to see in the graph, which is called a “termite plot”: it shows the weight of each term in each topic by the size of the circles.

Topic 1 is generally about “people”, it seems, but catch the “cult” further down the list. That’s definitely a theme present in some of my letters and not others.

Topic 2 is centered around the character “>”. That seems strange, until I see “class”, “elite”, “money” and “Mr” in there as well. I only use the honorific Mr when I’m dunking on someone. And when I write the letters, I use the markdown “>” to represent a quote! So this topic is when I riff on idiot elites and their corporate media bagmen.

Topic 3 is about memes, memetics and propaganda. It also has the word “Epstein”, because last week’s letter connected those words so many times, and that name is mentioned infrequently in other letters.

Topic 4 is about power: magic, war, technology and narratives. This looks like a central topic, because its weights are distributed more evenly across a greater number of words. The topics that have a huge bubble and a bunch of smaller dots are more peripheral or unusual.

Finally, topic 5 is about human hives and climate change. This is the underlying urge that pushes me to write, the reason for the season, so to speak. I may be writing about propaganda or magic or class, but ultimately I’m trying to speak to the century of continual crisis stretched out before us.

===

If you made it this far, congratulations! You’re a rare specimen of humanity. You both found the letter, and read the letter. As your prize, you can help pick the future of SCIOPS itself!

Send a reply to this letter and tell me what you want more of in the next season of SCIOPS. Pick two:

  • Cults
  • Class
  • Memetics
  • Magic
  • Climate
  • Everything else (houseplants)

Thanks for reading,

– Max


###### SCIOPS is a weekly letter that is sometimes full of badly formatted code. Feel free to forward it, or share it, or steer its very future. You can find a web version of the latest letter here, or view the archive here.

If you have thoughts, questions, or criticism, just respond to this email. Or, contact me securely at permafuture@protonmail.com

If you’re seeing this for the first time, make sure to sign up for more cyberpunk weirdness in your inbox every week.

If you want your regular life back again, you can unsubscribe from this newsletter. I can’t guarantee that will help. But you can try it.