Communication, Machine Learning, Philosophy, Random Thoughts

The Metaphor of a Large Memory Model (LMM)

It’s 4:39 AM, and I’ve been awake for nearly an hour. I’m sitting in the dim living room of my Astoria, NY apartment with a bowl of leftover Spanish-style rice from dinner last night, ten and a half weeks pregnant with my first child, thinking about whether a better framing for building artificial representations of intelligence is through memory models, rather than language models. I find myself today in a situation that, in some ways, I’ve been in before, but that in other ways is unique to this moment in space and time. Memory is keeping me tied to all of the versions of me who have experienced this, past and present.

I can understand the appeal of language models. Language – the act and structure of communicating the cognitive processes I undergo on a day to day basis – is observable, whereas memory is not. When I woke from my dreams this morning, early and abruptly, I found my mind swimming in various currents of what had come before and what might come to pass in the future. Pregnancy – a time when the cognitive and physical transformation is equated to puberty, albeit on a far more compressed timeline – is a unique liminal period in which my body is literally splitting into two entities.

I woke up dreaming of Blacksburg. I studied computer science at Virginia Tech, and the unique nature of a high-tech campus nestled in the valleys of the Appalachian Mountains became one of the anchoring concepts of my early adulthood and independence. It has been ten years since I last visited – I haven’t returned since graduation – and I found it calling to me today, begging to be remembered.

Graduation day, May 2013. The housing that they had us in for graduation was the same dorm that I spent my freshman and sophomore years living in.

My husband, an avid photographer, captures moments the way that he wants to remember them. He has the ability, through these photos, to recall and tell stories of specific points in time. Recently, he has begun writing stories to go alongside the photos, so his memories become concrete, with digital artifacts enabling his recall.

I find my memories to be far more abstract. In hindsight, I feel as though they are tinged or perhaps blurred by complex trauma. Experiences that at the time felt like “standard issue” college rites of passage became, with maturity, moments where I could acknowledge the danger I was in; the fear that I had pushed through for the act of fitting in. It took nearly a decade to untangle those experiences and find myself within the mess.

Over the past several months, on the glass whiteboard in my office, I’ve been working through the development of an architecture that may someday allow me to digitize my memory in a more complete way. There are limits, of course – no amount of computational advancement will allow me to retroactively capture the state of my brain as I struggled through my first autistic meltdown in front of a group of strangers, hours away from the room I called “home”, in the middle of the woods – but I can use language to approximate that state for recall. I can connect the concept of that memory and store it, and theoretically develop a personalized mechanism for embedding those artifacts.

💭 Please forgive the gross simplification that lies ahead, it’s 5:02am.

My brief 4:59am research tells me that the human brain – on average – now processes 74GB of data per day. Of course, retention and absorption of that information is more difficult to measure (and I’m skeptical of the number), but for a thought exercise, let’s take that to be true. If I processed 74GB of information per day and was able to artificially digitize those memories over a period of 20 years, I would have roughly 540 TB of data saved.

This is about 675 times the amount of data used to build GPT-3.5 (800GB stored).
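Writing the arithmetic out as a quick sanity check (the 74 GB/day and 800 GB figures are taken as given from above, skepticism and all):

```python
# Back-of-the-envelope: 20 years of digitized "memory" at the quoted rate.
GB_PER_DAY = 74            # estimate quoted above; treat with skepticism
DAYS_PER_YEAR = 365
YEARS = 20

total_gb = GB_PER_DAY * DAYS_PER_YEAR * YEARS
total_tb = total_gb / 1000           # decimal terabytes

GPT35_DATA_GB = 800                  # figure used above for GPT-3.5
ratio = total_gb / GPT35_DATA_GB

print(f"{total_tb:.1f} TB")                  # 540.2 TB
print(f"{ratio:.2f}x GPT-3.5's 800 GB")      # 675.25x
```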

What makes a memory?

The greatest unknown in artificial intelligence (okay, there are a lot of unknowns, but bear with me here for a second) is how machines are able to make the associations that they do in order to answer our queries with anything other than nonsense. A large language model application programmed as a chat bot may give different answers when you ask it the same question many times – partly by design, since sampling parameters like temperature deliberately introduce variation into these models – but it is still hard to understand how generative models “think”, because it is still hard to understand how we do it ourselves.

There are countless concepts that can spin out from thinking about Virginia Tech, and the thoughts that I “generate” change based on what’s happening. I would hypothesize that memory recall and thought generation are more closely intertwined than we realize, but modern “AI” systems filter this through so many layers of prompts and manipulation that we aren’t actually replicating “thought” or “cognition” the way that some think we are.

In this case, coming out of a dream state and deep sleep presents an interesting n=1 case study, where much of the generative motivation was subconscious to a point.

I couldn’t tell you what the dream was. Not because I don’t want to, but because the form that the dream took was so vague and abstract that I woke up with my thoughts flooded with the concepts of “Virginia Tech” and “Blacksburg”. I remember feeling a deep sense of sadness and remorse that I hadn’t yet taken my husband to visit the campus, to see how his artful imagination would capture a place that played such a monumentally transformative role in my cognitive development.

Enter the generative part of the process of memory recall: My mind shifted to PhD programs. I’m nearing the end of my MBA at Columbia, and I’ve been considering continuing my academic career by pursuing a PhD that allows me to further study the digital representations of cognition. I thought about my sister’s childhood friend who got her PhD at Virginia Tech, and now has two babies – is Blacksburg a good place to raise a family?

I thought about professors. Who would I ask to write me letters of recommendation? I’ve always dreaded that part of the process of applying to graduate schools. My rejection-sensitive dysphoria is a bully, but names drifted in and out of my psyche. One thread of thought started generating the text that will someday make up the body of the email that I send to ask. Lane. Craven. Ivory.

Ivory! Perhaps the appropriate department to pursue for such research is communications. Despite a deep love for creative computing, I’ve always found that computer science has been a companion impossible to please. And of course, I cannot mentally separate Dr. James Ivory and Philip Rosedale, because it was in Ivory’s class that I first became aware of Second Life. I was, at the time, decidedly unaware that that course would be so monumental in shaping the trajectory of my career.

Flashes of mountains and pictures that aren’t quite memories, but are artistic renditions of existing memories.

Google. Does Blacksburg have gigabit internet?

Redfin. What does the real estate market look like in Blacksburg? Flashes of streets, mentally placing names to proximity to the places I hung out in college. The Cellar is still there – Zach and I just googled it the other day because I was missing their curly fries.

Me, at The Cellar in Blacksburg, VA, in 2013. I didn’t drink the whole pitcher of cider myself. Not pictured: aforementioned curly fries.

Parents. Blacksburg is near Roanoke, but it would be far away from both sets of parents for our family. Flying in and out of Roanoke is nice, but it’s still an hour’s drive to campus. Is there enough non-student community? We’d be much closer to Jess. What would life look like there? Kroger. Opportunities to renovate basements. Cookout late at night in Sierra’s car.

So, what makes a memory?

The architecture that I’ve been working on uses a card-based approach to storing memories. Each unique concept that I can think of might have a card, and like a wiki page, it can be linked to related cards. So far, I’ve determined that memories have associations with places, people, activities, emotions, and judgements. In thinking through this morning’s exercise in memory, and the close role that generation plays in the process of my memory recall, I think that I’d add a category of “opportunity” to that list. The part that I’m still not sure how to predict – but feels increasingly important – is why it came up in my memory today, of all times, the way that it did.
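A minimal sketch of what one of these cards might look like in code – the names and fields here are illustrative, not a final schema, but they capture the wiki-like linking and the categories listed above:

```python
from dataclasses import dataclass, field

# Categories a memory can be associated with, per the list above
# (including the newly added "opportunity").
CATEGORIES = {"place", "person", "activity", "emotion", "judgement", "opportunity"}

@dataclass
class MemoryCard:
    """A single concept, stored as a card that links to related cards."""
    name: str
    category: str                                  # one of CATEGORIES
    body: str = ""                                 # free-text description
    links: set[str] = field(default_factory=set)   # names of related cards

    def link(self, other: "MemoryCard") -> None:
        # Like a wiki page, links are bidirectional.
        self.links.add(other.name)
        other.links.add(self.name)

# Illustrative cards drawn from this morning's exercise in memory.
blacksburg = MemoryCard("Blacksburg", "place")
graduation = MemoryCard("Graduation, May 2013", "activity")
phd = MemoryCard("PhD programs", "opportunity")
graduation.link(blacksburg)
phd.link(blacksburg)
```

Recall, in this framing, becomes graph traversal: start at “Blacksburg” and follow the links outward, to graduation, to PhD programs, and onward.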

This blog post, then, is me making a memory about a memory. The metacognitive portion of those memories involves figuring out which threads are relevant to the present and future, and which portions can stay in the past. I use my memories to seed ideas for the future as I’m trying to find stability in an uncertain time.

Large Memory Models

I’ve been revisiting my experimental use of privateGPT to ingest files that I’ve read and written, using on-device machine learning for insights into my thoughts. My instance of privateGPT has ingested many things, but the dominant source of data at the moment is three years of Notion journaling. Much of that content is unstructured. I’m using gpt4all-groovy as a base model. It’s an interesting experiment, and it is able, from what it has ingested, to come up with some degree of unique inference about concepts that I know are heavily represented in my past work. But it’s a fraction of the material I’ve actually consumed and generated, and the lack of structure to my memories makes it difficult to poke, prod, and otherwise consume in a meaningful way.
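Under the hood, a privateGPT-style setup is retrieval-augmented: the journal text is split into chunks, each chunk is embedded, and the chunks nearest to a question are retrieved to ground the model’s answer. A toy sketch of that retrieval step – using bag-of-words cosine similarity as a stand-in for a real embedding model, purely for illustration:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: bag-of-words counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank journal chunks by similarity to the query; return the top k.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

# Hypothetical journal chunks, for illustration only.
journal = [
    "graduation day in Blacksburg walking back to the dorm",
    "curly fries at The Cellar with friends",
    "a long meeting about quarterly budgets",
]
print(retrieve("Blacksburg graduation", journal, k=1))
```

The unstructured-content problem shows up exactly here: retrieval can only surface what the chunking and embedding happen to capture, which is part of why structured cards feel necessary.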

I also do not believe that written recall alone is a complete representation of cognition and memory; figuring out how to represent and create meaningful artifacts like the one below is another area of exploration that I would like to research.

The above image is one wall in a virtual art gallery that I created over the course of several months while going through one of the most difficult and painful periods in my life. Now, with plenty of time between myself and this memory, I can communicate a clearer, more nuanced perspective of the situation. The way that I access the memory, and the language that I use to communicate the memory, has shifted, but it is in some ways less authentic and complete, as I’ve had to distance myself from the painful parts that made it hard to get through the day. How do we represent those memories, digitally?

In college, I went through an especially painful breakup as my mental health spiraled and my then-boyfriend told me he didn’t believe that the depression and anxiety I was experiencing was real. We had been together for over a year, and he was the last major tie to my childhood home town – my parents were in the process of moving out of state. In my pain, I deleted every photo that he was in, un-tagged every photo of the time that we had spent together. I threw away anything he had given me, I deleted him and his friends from my social media, and removed any trace of him that I could find from my own wall of status updates from the year we had been together. At the time, it was a deeply cathartic release of pain, expressed in the only way I knew to relieve it. Now, I struggle to remember that entire year of my life, and the only artifacts of memory that I have present a narrow view of what I actually experienced.

By figuring out how to build myself a large memory model, I want to be able to keep painful memories stored in a way where I can access the insights and influences of those experiences without having to relive the raw moments more intensely. Perhaps that is a naive way to think about memory, and its purpose in helping us learn. Perhaps that is why I occasionally feel prone to making the same mistakes over and over again, and like I am constantly drifting from one place to another seeking something I have yet to find.