PyFu

Introduction to Streamlit Security Testing

Streamlit is a Python-based open-source framework used to build interactive data applications rapidly. Its simplicity and tight integration with Python make it popular in data science and ML communities.

However, like any web application, a Streamlit app can be vulnerable if it is not properly secured. Security testing one involves understanding how Streamlit works, identifying its trust boundaries, and analyzing how input data is handled.

Understanding Streamlit Architecture

Streamlit applications are typically single-page apps that serve dynamic content generated directly by Python code. Unlike traditional web apps, Streamlit handles both the backend logic and the frontend interface through Python scripts.

The app runs a web server (Tornado) that listens for client events and state updates over a persistent WebSocket connection. Every widget interaction (a button click, a slider move, a text input change) sends a message to the server, which re-runs the entire script from top to bottom and streams the new UI back to the browser.

This rerun model is the single most important thing to understand when attacking a Streamlit app. There is no request/response routing layer and no controller separation: the whole script is the request handler. Any value a user types into a widget is just a Python variable on the next rerun, and whatever the script does with that variable (evaluating it, passing it to a shell, feeding it to a deserializer) runs on the server with the privileges of the Streamlit process.
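
The rerun model can be pictured with a toy loop in plain Python. This is a sketch of the idea only, not Streamlit's implementation: script, Session, and handle_event are hypothetical names, and the real framework receives events over the WebSocket rather than through method calls.

```python
# Toy model of the rerun loop (illustrative only, not Streamlit code).
from typing import Callable, Dict, List


def script(widgets: Dict[str, str]) -> List[str]:
    """Stand-in for the app script: runs top to bottom on every event."""
    output = []
    name = widgets.get("name", "")      # plays the role of st.text_input
    output.append(f"Hello, {name}")
    expr = widgets.get("expr", "")      # a second free-text widget
    if expr:
        # Whatever the script does with the value happens server-side.
        output.append(f"would evaluate: {expr!r}")
    return output


class Session:
    """Holds the latest widget values for one connected browser tab."""

    def __init__(self, app: Callable[[Dict[str, str]], List[str]]) -> None:
        self.app = app
        self.widgets: Dict[str, str] = {}
        self.runs = 0

    def handle_event(self, widget_id: str, value: str) -> List[str]:
        # Only one widget changed, yet the whole script executes again.
        self.widgets[widget_id] = value
        self.runs += 1
        return self.app(dict(self.widgets))


session = Session(script)
session.handle_event("name", "alice")
ui = session.handle_event("expr", "__import__('os')")
```

Two events produce two full executions of script, and the second run sees both widget values as ordinary Python strings. That is the property an attacker relies on: there is no layer between the widget and the code that consumes it.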

Trust Boundaries

The trust boundary in a Streamlit app is the widget. Every value that originates from a widget or from st.query_params is attacker-controlled in exactly the same way that an HTTP parameter is in a Flask app:

import streamlit as st

# All of these are untrusted input on every rerun:
name = st.text_input("Your name")
uploaded = st.file_uploader("Upload a model")
city = st.query_params.get("city")

Out of the box, Streamlit does not authenticate or authorize users and does not sanitize widget input; its XSRF setting (server.enableXsrfProtection) is not an access control. A bare Streamlit deployment is therefore unauthenticated by default, which means that if the app is reachable, every code path behind every widget is reachable too. Internal “dashboards” and “data tools” are frequently deployed straight to a public host with no gateway in front of them, which is why these apps are such high-value targets.

Common Vulnerability Classes in Streamlit Apps

Because the script body runs with full Python power on every interaction, the dangerous sinks are the same ones covered elsewhere in PyFu; a Streamlit app simply reaches them through widgets instead of routes.

Dynamic code evaluation. Data apps love to offer a “formula” or “expression” box that is wired straight into eval() or pd.DataFrame.query(). This is the most common Streamlit RCE and is the focus of the lab app for this section.

expr = st.text_input("Filter expression")
result = df.query(expr)   # see Pandas Library Arbitrary Command Execution

Command injection. Apps that wrap a CLI tool (ffmpeg, nmap, a report generator) routinely build a command string from a widget and run it with shell=True. The mechanics are identical to Python Command Injection.
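
A minimal sketch of the safer shape, assuming a hypothetical app that wraps a CLI tool. Here the "tool" is the Python interpreter echoing its first argument, so the example runs anywhere; TOOL and run_tool are illustrative names, not taken from the lab app.

```python
import subprocess
import sys

# Stand-in for the wrapped CLI tool: prints the single argument it receives.
TOOL = [sys.executable, "-c", "import sys; print(sys.argv[1])"]


def run_tool(user_value: str) -> str:
    """Pass the widget value as one argv element, with no shell involved."""
    # Vulnerable pattern to look for instead:
    #   subprocess.run(f"tool {user_value}", shell=True)
    result = subprocess.run(
        TOOL + [user_value],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")


# Shell metacharacters arrive at the tool as literal text.
echoed = run_tool("report.txt; echo INJECTED")
```

An argument list removes shell interpretation, but it does not stop option injection: a value beginning with a dash may still be read as a flag by the real tool, so the value usually needs validating as well.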

Insecure deserialization. ML apps accept uploaded .pkl, .pt, or joblib model files and load them with pickle.load() / torch.load(). A malicious uploaded model is straight RCE; see Insecure Deserialization - Python Pickle. This is arguably the most under-appreciated Streamlit attack surface because uploading a “model” feels legitimate.
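
The sketch below shows why loading an uploaded pickle amounts to calling attacker-chosen code, and one defensive pattern from the standard library: overriding Unpickler.find_class to refuse global lookups. MaliciousModel, NoGlobalsUnpickler, and restricted_loads are hypothetical names, and the payload calls the harmless os.getcwd where a real one would name something like os.system.

```python
import io
import os
import pickle


class MaliciousModel:
    """What a crafted "model" file can carry: a callable plus its arguments."""

    def __reduce__(self):
        # Harmless stand-in for a command-execution callable.
        return (os.getcwd, ())


class NoGlobalsUnpickler(pickle.Unpickler):
    """Refuses every global lookup, so only plain containers and scalars load."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(data: bytes):
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


payload = pickle.dumps(MaliciousModel())

# pickle.loads(payload) calls os.getcwd() during loading; no method on the
# "model" ever has to be invoked by the app.
executed_result = pickle.loads(payload)
```

A blanket refusal like this only suits pickles that hold plain data. Real model files need globals to reconstruct their classes, so for uploaded models the more realistic direction is a non-pickle weight format or loading only files the operator produced.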

Path traversal. st.file_uploader lets the user choose the filename, and apps often write or read using that name without sanitization, leading to the traversal patterns in Insecure File Access and Path Traversal in Python.
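
A minimal sketch of confining an upload name to one directory, assuming the app passes the client-supplied name (the name attribute of the uploaded file object) into a helper before touching the filesystem. safe_upload_path is a hypothetical helper, not part of Streamlit.

```python
import os
from pathlib import Path


def safe_upload_path(base_dir: str, client_name: str) -> Path:
    """Map a client-supplied filename to a path inside base_dir, or raise."""
    # Treat both separators as path separators regardless of the server OS,
    # then keep only the final component.
    leaf = os.path.basename(client_name.replace("\\", "/"))
    if leaf in ("", ".", ".."):
        raise ValueError(f"unusable filename: {client_name!r}")
    base = Path(base_dir).resolve()
    target = (base / leaf).resolve()
    # Defence in depth: the resolved path must still sit under base_dir,
    # which also catches a symlink planted inside the upload directory.
    if base not in target.parents:
        raise ValueError(f"path escapes upload directory: {client_name!r}")
    return target
```

From the testing side, the same logic tells you what to look for: any open() or write call that receives the uploaded name without a step like this is worth trying with ../ sequences.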

SSRF. “Load data from URL” features pass a widget value to requests.get(), reproducing Server Side Request Forgery (SSRF) in Flask Applications against cloud metadata and internal services.
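
A sketch of the check a "load from URL" feature typically lacks: resolve the host first and refuse anything that is not a public address. resolve_public_url is a hypothetical helper, and this is only the validation half of a defence; it uses the standard library's ipaddress, socket, and urllib.parse modules.

```python
import ipaddress
import socket
from typing import List
from urllib.parse import urlsplit


def resolve_public_url(url: str) -> List[str]:
    """Return the resolved IPs for url, or raise ValueError if any is internal."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError("only http(s) URLs with a host are accepted")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    infos = socket.getaddrinfo(parts.hostname, port, type=socket.SOCK_STREAM)
    addresses = []
    for info in infos:
        ip = ipaddress.ip_address(info[4][0])
        # is_global is False for private, loopback, and link-local ranges,
        # including the 169.254.169.254 metadata address.
        if not ip.is_global:
            raise ValueError(f"{parts.hostname} resolves to non-public address {ip}")
        addresses.append(str(ip))
    return addresses
```

Checking the resolved address rather than the hostname string matters because alternative encodings and attacker-controlled DNS names can both point at internal hosts. The check is still incomplete on its own: the later request has to connect to the address that was validated, and redirects need the same treatment, otherwise DNS rebinding or a 302 to an internal URL walks around it.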

Secrets exposure. Streamlit reads credentials from st.secrets (backed by .streamlit/secrets.toml). Apps that print configuration for “debugging”, or that ship the secrets file inside the container image or repo, leak API keys and database passwords directly.

HTML/JS injection. st.markdown(user_input, unsafe_allow_html=True) and st.components.v1.html() render raw markup, so user-controlled strings become XSS against anyone viewing the app; the XSS is stored when the app persists the string and renders it for other users.
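
When an app genuinely needs raw HTML, the usual pattern is to keep the template trusted and escape every interpolated value. A minimal sketch using the standard library's html.escape; render_comment is a hypothetical helper whose return value would be what the app hands to st.markdown(..., unsafe_allow_html=True).

```python
import html


def render_comment(author: str, body: str) -> str:
    """Build markup where only the template is trusted, never the values."""
    return (
        f"<div class='comment'><b>{html.escape(author)}</b>: "
        f"{html.escape(body)}</div>"
    )


markup = render_comment("bob", "<script>alert(1)</script>")
```

Dropping unsafe_allow_html altogether is simpler where the layout allows it. When testing, a widget value that comes back unescaped inside markup is the signal to look for.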

Why Streamlit matters from an offensive security perspective

When I look at a Streamlit app, I treat the whole script as one giant request handler with no auth in front of it, because that is what it is. The rerun model means every widget value is a live Python variable on the server on the next interaction, so the gap between user input and a dangerous sink is usually a single line. These are the tells I grep for first:

  • No gateway in front of it. A bare streamlit run is unauthenticated by default, so if I can reach the host, every code path behind every widget is reachable too. Internal “dashboards” shipped straight to a public host are the high-value version of this.
  • Free-text widgets wired to evaluators. st.text_input feeding eval(), df.query(), or df.eval() is the flagship Streamlit RCE. I probe these with plain arithmetic such as 7*7 (a Python evaluator needs no template braces), then expand to __import__.
  • File uploaders into deserializers. st.file_uploader followed by pickle.load, torch.load, or joblib.load is straight RCE from a crafted “model” file, and it feels legitimate enough that nobody flags it.
  • Attacker-chosen filenames and URLs. Upload names reaching open() give traversal; “load from URL” widgets reaching requests.get() give SSRF against metadata and internal services.
  • Secrets and raw HTML. Apps that print st.secrets for debugging leak keys, and unsafe_allow_html=True turns widget strings into XSS (stored XSS if the app persists them).
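
The tells above can be approximated with a small static pass over the script. This is a rough sketch built on the standard library's ast module, not a real analyzer: SUSPECT_CALLS is an illustrative and incomplete list, matching df.query assumes the DataFrame variable is literally named df, and nothing here traces whether a widget value actually reaches the sink.

```python
import ast
from typing import List, Tuple

# Dotted call names worth a closer look (illustrative, incomplete).
SUSPECT_CALLS = {
    "eval", "exec", "pickle.load", "pickle.loads", "torch.load",
    "joblib.load", "os.system", "df.query", "df.eval", "requests.get",
}


def dotted_name(node: ast.AST) -> str:
    """Render Name/Attribute chains such as st.markdown; '' for anything else."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        prefix = dotted_name(node.value)
        return f"{prefix}.{node.attr}" if prefix else ""
    return ""


def find_tells(source: str) -> List[Tuple[int, str]]:
    """Return (line, description) pairs for calls that deserve manual review."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        name = dotted_name(node.func)
        if name in SUSPECT_CALLS:
            findings.append((node.lineno, name))
        for kw in node.keywords:
            # shell=True and unsafe_allow_html=True are flagged on any call.
            if (
                kw.arg in ("shell", "unsafe_allow_html")
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
            ):
                findings.append((node.lineno, f"{name or '<call>'}({kw.arg}=True)"))
    return sorted(findings)


SAMPLE = '''
import streamlit as st
expr = st.text_input("Filter")
result = df.query(expr)
model = pickle.load(st.file_uploader("Model"))
subprocess.run("nmap " + host, shell=True)
st.markdown(expr, unsafe_allow_html=True)
'''

tells = find_tells(SAMPLE)
```

The script is parsed, never executed, so the pass is safe to run on untrusted source. Each hit is only a starting point for tracing a widget value to the sink by hand.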

The defender takeaway: a Streamlit widget is not a form field; it is a direct line into server-side eval, subprocess, and pickle. Put authentication and a reverse proxy in front of anything that touches untrusted users.

How to Test a Streamlit App

Start by reading the script, since the script is the application. Trace every widget and query_params value to its sink and look for the classes above. From the outside:

  • Confirm whether the app is authenticated at all; most are not.
  • Probe every free-text widget for eval/query evaluation (plain arithmetic such as 7*7, then expand to __import__).
  • Test file uploaders for both deserialization (upload a crafted pickle) and traversal (filenames containing ../).
  • Test any URL/host widget for SSRF against 169.254.169.254 and internal ranges.
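  • Inspect WebSocket traffic in the browser dev tools; widget values travel to the server as WebSocket messages (binary protobuf frames rather than plain JSON in current Streamlit releases) and can be replayed or fuzzed directly once decoded.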

What this section covers

Streamlit has no dedicated framework subpages in this handbook because it has almost no framework surface: the script is the app. Its attack surface is the set of Python sinks the script reaches through widgets, so the relevant techniques are the cross-cutting ones documented elsewhere (the pages referenced under Common Vulnerability Classes in Streamlit Apps), applied through the widget trust boundary.

The takeaway: Streamlit collapses the frontend and backend into one continuously re-executed Python script, so a widget is not a form field; it is a direct line into server-side eval, subprocess, and pickle. Treat every Streamlit deployment as an unauthenticated remote code execution surface until proven otherwise, and put authentication and a reverse proxy in front of anything that touches untrusted users. The companion lab app streamlit-fu/streamlit-security demonstrates the dynamic-evaluation path end to end.

Running it in the lab

The vulnerable sink is reached through a Streamlit text widget rather than an HTTP parameter, so it is driven from the app UI (PyFuLabs/streamlit-fu/streamlit-security) rather than with curl. The filter box feeds its raw string to df.query(expr, engine="python"), the identical pandas evaluator demonstrated in Python Pandas Library Arbitrary Command Execution. Entering a payload in the style of @__builtins__.__import__("os").system("id") executes the command on the server, and the output is visible in the container stdout.

Mitigation

The fix is to never pass an attacker-controlled string to df.query() or df.eval(), both of which expose a Python expression evaluator. Instead, parse the user’s filter into a constrained, server-built predicate over an allowlist of columns, operators, and literal values, and apply it with boolean indexing. More broadly, treat Streamlit widgets as untrusted input exactly like request parameters, since anyone who can reach the app drives them.
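
One way to build such a predicate is sketched below. It assumes a single "column operator value" filter and a hypothetical set of column names; parse_filter, apply_filter, ALLOWED_COLUMNS, and the accepted grammar are all illustrative choices rather than the lab app's actual fix.

```python
import operator
import re
from typing import Any, Callable, Tuple

ALLOWED_COLUMNS = {"age", "city", "score"}   # hypothetical column names
ALLOWED_OPS = {
    "==": operator.eq, "!=": operator.ne,
    ">=": operator.ge, "<=": operator.le,
    ">": operator.gt, "<": operator.lt,
}
# A column, an operator, then a number or a quoted string of safe characters.
FILTER_RE = re.compile(
    r"""^\s*(\w+)\s*(==|!=|>=|<=|>|<)\s*(-?\d+(?:\.\d+)?|"[\w .-]*")\s*$"""
)


def parse_filter(text: str) -> Tuple[str, Callable[[Any, Any], Any], Any]:
    """Turn widget text into (column, comparison function, literal) or raise."""
    match = FILTER_RE.match(text)
    if not match:
        raise ValueError("filter must look like: column <op> value")
    column, op, raw = match.groups()
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unknown column: {column}")
    if raw.startswith('"'):
        value: Any = raw[1:-1]
    elif "." in raw:
        value = float(raw)
    else:
        value = int(raw)
    return column, ALLOWED_OPS[op], value


def apply_filter(df, text: str):
    """Boolean indexing on a pandas DataFrame; no expression evaluator involved."""
    column, op, value = parse_filter(text)
    return df[op(df[column], value)]
```

The widget string is only ever matched against a fixed grammar and is never evaluated, so the @__builtins__ payload from the lab fails at the regular expression. Supporting compound filters would mean extending the grammar deliberately, which is the point: the server decides what a filter can express.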