Robot Self-Awareness

What, exactly, stays the same when a robot learns to do something new?

Humans do not relearn their bodies from scratch every time the goal changes. We may learn a new skill, adapt to a new environment, or recover from a mistake, but some internal sense of the body carries forward. This project asks whether something similar can emerge inside a learned robot controller.

In this work, we trained a single robot across multiple behaviors and then looked inside the neural policy to test whether a stable, reusable internal subnetwork emerged—something closer to a persistent sense of the robot’s own body than a one-off task controller.

Visualization of the persistent self across learned robot behaviors

1 · Why This Matters / The Problem

Most robot learning systems are judged by what they can do: walk, turn, jump, manipulate, recover. But there is a deeper question underneath that performance: when a robot learns several behaviors over time, does it only store task-specific control tricks, or does it also build a more persistent internal understanding of its own body?

That distinction matters for both science and engineering. Scientifically, it gives a concrete way to study whether something self-like can emerge inside a learning system without explicitly programming it in. From an engineering perspective, it points toward controllers that reuse stable body knowledge while only rewriting the parts that need to change for a new objective.

In other words: to learn continuously, a robot should not have to rewrite what it is every time it changes what it is doing.

2 · Main Results

We trained a simulated quadruped in a continual-learning curriculum across three distinct behaviors: walk, wiggle, and bob. We then compared its internal neural structure against a control policy trained on a constant task.

The core result was that continual learning produced a stable, self-like subnetwork: a subset of the controller that remained markedly more persistent across behavior changes, while the surrounding regions reorganized substantially to implement the current behavior.

Across seeds, the continual-learning agent showed a mean self-versus-task separation of about 16.9 percentage points, while the constant-task baseline showed much weaker separation and far smaller stable subnetworks.
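
To make the "percentage points" framing concrete, here is a minimal sketch of how a separation score of this kind could be computed from per-seed stability scores. The function name and exact definition are illustrative assumptions, not the project's actual metric.

```python
import numpy as np

def self_task_separation(self_stability, task_stability):
    """Illustrative separation metric (hypothetical, not the paper's exact one).

    self_stability: per-seed stability of the candidate self-like region, in [0, 1].
    task_stability: per-seed stability of the task-specific remainder, in [0, 1].
    Returns the mean gap in percentage points.
    """
    gap = np.asarray(self_stability) - np.asarray(task_stability)
    return 100.0 * gap.mean()
```

A large positive value means the candidate self region is systematically more stable than the rest of the network across seeds.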

Just as importantly, this was not an artifact of one lucky run or a single visualization. The same pattern held across cycles, across seeds, and across architectural variations, with the strongest self-like structure emerging early in the network and then stabilizing over training.

Quantitative comparison showing persistent self-like subnetwork under continual learning

3 · How We Approached It

Train one policy across multiple behaviors. Rather than building separate controllers, we trained a single robot policy through a repeating curriculum of distinct behaviors. That setup pressures the network to retain whatever body-related structure is useful across tasks.
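
The curriculum structure can be sketched as a simple loop. All names here (`train_phase`, `n_cycles`) are illustrative assumptions; the point is that one policy object persists across every behavior phase.

```python
# Minimal sketch of a repeating behavior curriculum (hypothetical API).
BEHAVIORS = ["walk", "wiggle", "bob"]

def train_curriculum(policy, train_phase, n_cycles=3):
    """Train a single policy through repeated cycles of distinct behaviors.

    Because the same policy is reused in every phase, the network is under
    pressure to retain whatever body-related structure transfers across tasks.
    """
    history = []
    for cycle in range(n_cycles):
        for behavior in BEHAVIORS:
            reward = train_phase(policy, behavior)  # one RL training phase
            history.append((cycle, behavior, reward))
    return history
```
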

Probe the policy internally, not just behaviorally. Reward curves can tell you whether the robot improves. They do not tell you what the network is actually representing. We therefore analyzed hidden-layer activations across shared reference states to see which parts of the policy stayed stable and which parts reorganized.
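
A standard way to do this kind of internal probing in PyTorch is a forward hook that records hidden activations on a fixed batch of reference states; the sketch below shows the pattern (the specific layer selection is an assumption, not the project's exact instrumentation).

```python
import torch
import torch.nn as nn

def capture_activations(policy, reference_states):
    """Record hidden-layer activations on a fixed batch of reference states.

    Reusing the same reference states for every checkpoint and behavior makes
    the resulting activation patterns directly comparable across phases.
    """
    activations = {}
    hooks = []

    def make_hook(name):
        def hook(module, inputs, output):
            activations[name] = output.detach()
        return hook

    # Hook every Linear layer (illustrative choice of which layers to probe).
    for name, module in policy.named_modules():
        if isinstance(module, nn.Linear):
            hooks.append(module.register_forward_hook(make_hook(name)))

    with torch.no_grad():
        policy(reference_states)

    for h in hooks:
        h.remove()  # always detach hooks so later forwards are unaffected
    return activations
```

Comparing these activation snapshots across behaviors, rather than reward curves, is what distinguishes which parts of the policy stay stable and which reorganize.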

Search for persistent internal structure. We grouped neurons by co-activation structure, matched them across behaviors, and measured persistence over time. That let us separate the candidate self-like region from the more task-sensitive remainder of the network.
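
The steps above can be sketched as follows, using plain k-means on neuron co-activation rows and a pair-agreement persistence score as stand-ins for the actual clustering and matching procedures (all details here are illustrative assumptions).

```python
import numpy as np

def coactivation_clusters(acts, n_clusters=4, seed=0):
    """Group neurons by correlation of their activations across states.

    acts: (n_states, n_neurons) activation matrix for one behavior phase.
    Returns one cluster label per neuron. Plain k-means on correlation rows
    is a stand-in for whatever clustering the real pipeline uses.
    """
    corr = np.corrcoef(acts.T)  # neuron-by-neuron co-activation matrix
    rng = np.random.default_rng(seed)
    centers = corr[rng.choice(len(corr), n_clusters, replace=False)]
    for _ in range(20):  # fixed number of k-means iterations
        dists = ((corr[:, None, :] - centers[None]) ** 2).sum(-1)
        labels = np.argmin(dists, axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = corr[labels == k].mean(0)
    return labels

def persistence(labels_a, labels_b):
    """Fraction of neuron pairs whose 'same cluster or not' status agrees
    between two phases: a crude Rand-index-style stability score that is
    invariant to how clusters happen to be numbered."""
    same_a = labels_a[:, None] == labels_a[None, :]
    same_b = labels_b[:, None] == labels_b[None, :]
    return (same_a == same_b).mean()
```

Neurons whose cluster membership persists across behaviors and cycles are the candidates for the self-like region; the rest form the task-sensitive remainder.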

Overview of the robot training and analysis pipeline

4 · Why We Think the Result Is Interesting

The point is not that the robot is “self-aware” in the human sense. The interesting result is narrower and, to us, more useful: under continual learning, a standard deep RL policy can spontaneously organize into a stable internal core alongside more behavior-specific components.

That suggests there may be a practical way to identify reusable body-related structure inside black-box controllers—without hand-designing a separate self-model. If that holds more broadly, it could matter for continual adaptation, transfer, debugging, and monitoring whether fine-tuning is preserving the right internal structure versus overwriting it.

More broadly, we like this project because it sits at an intersection we care a lot about: reinforcement learning, robotics, and interpretability. It is not just about making the robot move. It is about understanding what the controller has actually learned.

Robustness and persistence of self-like subnetworks across cycles

5 · Under the Hood

We built a broad stack around this project: RL training design, reproducible experiment infrastructure, checkpointing and resume tools, rollout/video tooling, and a full analysis pipeline for comparing internal network structure across runs, phases, and behaviors.

A big accelerator was Isaac-based vectorized simulation, which made it possible to iterate quickly and run large batches of experiments with far better reproducibility than a slower single-environment loop would allow.

On the engineering side, the stack is Python, PyTorch, Isaac simulation tooling, experiment configs, Bash/tmux, Git, and a substantial amount of analysis infrastructure for turning raw checkpoints into interpretable results and publication-ready figures.

6 · Acknowledgements

This research was conducted in the Creative Machines Lab, Columbia University, with guidance from Professor Hod Lipson and Judah Goldfeder.

Questions or ideas? aj3337@columbia.edu