Chasing Down Phantom Swap Usage: When Gunicorn Hides in Swap

A long-running app server was sitting on nearly 2 GB of swap with a swappiness of only 30. Nothing felt slow, but the number looked wrong. This is a walkthrough of how I traced it from “the swap looks full” to a concrete one-line fix in the service definition — and a reminder that resident memory only tells you half the story.

The symptom

The box (app-server-01) had been up for about five days and was reporting roughly 1.98 GB of a 4 GB swap partition in use. Swappiness was already turned down to 30, so the instinct was: “why is it swapping at all?”

First clarification worth making: swappiness tunes how readily the kernel swaps; it is not a cap on how much swap gets used. Even at a low value, Linux will happily move idle pages out to swap so it can keep RAM free for active processes and page cache. So swap being “full” is not automatically a problem. The real question is whether the system is actively swapping.
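To put both numbers side by side, here is a minimal Python sketch that reads the current swappiness and works out swap in use from /proc/meminfo. It assumes a Linux host. The helper names (parse_meminfo, swap_used_kb, report) are illustrative, not part of any library.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field: value}; most values are in kB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and rest.split():
            fields[name.strip()] = int(rest.split()[0])
    return fields


def swap_used_kb(meminfo: dict[str, int]) -> int:
    """Swap in use is SwapTotal minus SwapFree."""
    return meminfo["SwapTotal"] - meminfo["SwapFree"]


def report() -> str:
    """Read the live values (Linux only) and format them on one line."""
    swappiness = int(Path("/proc/sys/vm/swappiness").read_text())
    info = parse_meminfo(Path("/proc/meminfo").read_text())
    return f"swappiness={swappiness} swap_used={swap_used_kb(info) / 1024:.0f} MiB"
```

Calling print(report()) on the server shows the tendency setting next to the actual usage, which makes it obvious that the first does not bound the second.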

Step 1 — Is it actually swapping, or just holding cold pages?

The tool for this is vmstat. The columns that matter are si (swap-in) and so (swap-out):

vmstat 1 5

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0 2106388 1406348   2172 581680    9   11    87    31   45   49  1  1 98  0  0
 0  0 2106388 1405044   2172 581732   64    0    64     0 2283 3167  2  4 94  1  0
 0  0 2106388 1405112   2172 581744    0    0     0     0  347  582  0  0 99  0  0
 0  0 2106388 1405112   2172 581744    0    0     0     0  273  499  0  0 100  0  0
 0  0 2106388 1405112   2172 581744    0    0     0     0  324  559  0  0 100  0  0

A key gotcha: the first line is the since-boot average — ignore it. The live samples below it are what count, and here they show essentially zero swap activity. One sample pulled a tiny 64 KB back in; everything else is 0/0. Idle sits at 94–100%, I/O wait is basically nil, and there’s ~1.4 GB of free RAM.
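The same reading can be automated. The sketch below parses vmstat text, drops the since-boot first sample, and totals si/so over the live samples. The function name live_swap_activity is hypothetical, and it assumes a short run where vmstat prints its header only once.

```python
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0 2106388 1406348   2172 581680    9   11    87    31   45   49  1  1 98  0  0
 0  0 2106388 1405044   2172 581732   64    0    64     0 2283 3167  2  4 94  1  0
 0  0 2106388 1405112   2172 581744    0    0     0     0  347  582  0  0 99  0  0
 0  0 2106388 1405112   2172 581744    0    0     0     0  273  499  0  0 100  0  0
 0  0 2106388 1405112   2172 581744    0    0     0     0  324  559  0  0 100  0  0
"""


def live_swap_activity(vmstat_output: str) -> tuple[int, int]:
    """Return (total si, total so) over every sample except the since-boot row."""
    rows = [line.split() for line in vmstat_output.strip().splitlines()]
    # The column-name row is the one that contains both "si" and "so".
    header_idx = next(i for i, cols in enumerate(rows) if "si" in cols and "so" in cols)
    header = rows[header_idx]
    si_col, so_col = header.index("si"), header.index("so")
    live = rows[header_idx + 2:]  # skip the header and the since-boot average
    return (sum(int(r[si_col]) for r in live), sum(int(r[so_col]) for r in live))


print(live_swap_activity(SAMPLE))  # (64, 0)
```

To run it against a live box, feed it the output of subprocess.run(["vmstat", "1", "5"], capture_output=True, text=True).stdout. Totals near zero mean the swap is parked, not churning.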

Verdict at this point: the system is not under memory pressure. The swap is just parking cold pages. Benign, but one question remained: whose cold pages?

Step 2 — A false lead on Gunicorn’s memory

The app runs under Gunicorn, so worker count was the obvious thing to inspect — each sync worker is a full copy of the app in memory.

ps -ef | grep gunicorn
ps -o rss,cmd -C gunicorn

Three workers, and their resident memory (RSS) totalled only about 112 MB across all processes. That looked tiny — easy to conclude Gunicorn wasn’t the culprit and move on.

That conclusion was wrong, and the reason it was wrong is the whole point of this post: RSS only shows what’s resident in RAM right now. It says nothing about what a process has pushed out to swap. The bloat had already left RAM. RSS was the tip; swap was the iceberg.

Step 3 — Ranking processes by actual swap usage

To see who really occupies swap, you have to read VmSwap out of each process’s /proc/<pid>/status. This one-liner collects it for every process and sorts by swap used:

for f in /proc/[0-9]*/status; do awk '/^Name:/{name=$2} /^VmSwap:/{print name, $2, $3}' "$f" 2>/dev/null; done | sort -k2 -n -r | head

gunicorn 854904 kB
gunicorn 853604 kB
...
gunicorn 12188 kB

There it is. Two Gunicorn workers were holding ~855 MB each in swap — about 1.7 GB of the ~2 GB total. Everything else (the EDR sensors, the vulnerability scanner, the RMM agent, firewalld) was rounding error. The process whose RSS looked harmless was, in fact, the entire story.
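For a view that puts RSS and swap next to each other, a rough Python equivalent of the one-liner might look like this. It is a sketch assuming a Linux /proc layout. The function names (parse_status, kb, swap_ranking) are made up for illustration.

```python
from pathlib import Path


def parse_status(text: str) -> dict[str, str]:
    """Turn /proc/<pid>/status text into {field: raw value string}."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value.strip()
    return fields


def kb(fields: dict[str, str], key: str) -> int:
    """Return an 'NNN kB' field as an int; kernel threads lack Vm* fields, so default to 0."""
    return int(fields.get(key, "0 kB").split()[0])


def swap_ranking(proc_root: str = "/proc") -> list[tuple[int, int, str, str]]:
    """List (swap_kb, rss_kb, pid, name) for every process, largest swap first."""
    rows = []
    for status in Path(proc_root).glob("[0-9]*/status"):
        try:
            fields = parse_status(status.read_text())
        except OSError:
            continue  # the process exited between listing and reading
        rows.append((kb(fields, "VmSwap"), kb(fields, "VmRSS"),
                     status.parent.name, fields.get("Name", "?")))
    return sorted(rows, reverse=True)


if __name__ == "__main__":
    for swap_kb, rss_kb, pid, name in swap_ranking()[:10]:
        print(f"{name:<20} pid={pid:<8} swap={swap_kb:>9} kB  rss={rss_kb:>9} kB")
```

Printing both columns makes the mismatch from Step 2 visible at a glance: a small rss value next to a large swap value is exactly the pattern that ps alone hides.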

Root cause

The pattern is a classic one for long-running Python web apps:

1. A worker handles a heavy request and grows to ~900 MB.
2. The request finishes, but CPython rarely returns freed memory to the OS, so the process stays large.
3. Those now-idle pages go cold, and over days of uptime the kernel quietly swaps them out.
4. RSS shrinks (the pages left RAM), so a quick ps check makes the worker look lean while ~855 MB sits in swap.

A generous request timeout (this service used --timeout 300) makes the spikes more likely, since workers are permitted to do long, heavy work before being reaped.

The fix: recycle workers with max-requests

Rather than fight CPython’s memory behavior, the standard remedy is to let Gunicorn restart each worker after a set number of requests. A fresh worker starts lean, and the accumulated bloat is genuinely released:

--max-requests 1000 --max-requests-jitter 50

The jitter staggers the restarts so all workers don’t recycle at the same instant and cause a momentary stall. Tune 1000 to your traffic — busy workers can sit higher; low-traffic ones can go lower so they recycle roughly daily.
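To make the tuning concrete, here is a small sketch of the arithmetic. It assumes the behavior described in Gunicorn's settings documentation, where each worker adds a random offset between 0 and max_requests_jitter to max_requests. The helper names and the 1500 requests/day figure are hypothetical.

```python
import random


def effective_max_requests(max_requests: int, jitter: int, rng: random.Random) -> int:
    """Per-worker restart threshold: the base limit plus a random 0..jitter offset."""
    return max_requests + rng.randint(0, jitter)


def max_requests_for_interval(requests_per_day: int, workers: int, days: float = 1.0) -> int:
    """Rough max_requests so each worker recycles about every `days` days,
    assuming requests are spread evenly across workers."""
    return max(1, round(requests_per_day * days / workers))


rng = random.Random(0)
limits = [effective_max_requests(1000, 50, rng) for _ in range(3)]
print(limits)  # three thresholds, each between 1000 and 1050

# A quiet service: 1500 requests/day over 3 workers, recycling about daily.
print(max_requests_for_interval(requests_per_day=1500, workers=3))  # 500
```

Because each worker draws its own offset, the three thresholds usually differ, which is what keeps the restarts from lining up.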

Applying it via systemd

The service was defined as a systemd unit. Inspect it first:

systemctl cat reporting-app

# /etc/systemd/system/reporting-app.service
[Unit]
Description=Internal Reporting Service
After=network.target
[Service]
User=root
WorkingDirectory=/opt/reporting-app
ExecStart=
ExecStart=/usr/local/bin/gunicorn app:app -b 0.0.0.0:5000 --workers 3 --timeout 300
Restart=always
RestartSec=10
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=reporting-app
Environment=PYTHONUNBUFFERED=1
[Install]
WantedBy=multi-user.target

Edit the unit and append the two flags to the ExecStart line. (The empty ExecStart= above it resets any previously defined command before the real one is set. The idiom matters mostly in drop-in overrides and is harmless here, so leave it.)

systemctl edit --full reporting-app

The updated line:

ExecStart=/usr/local/bin/gunicorn app:app -b 0.0.0.0:5000 --workers 3 --timeout 300 --max-requests 1000 --max-requests-jitter 50

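Then reload and restart (systemctl edit already reloads unit files when you save, so the daemon-reload here is a harmless safety net that only matters if the file was edited by hand):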

systemctl daemon-reload
systemctl restart reporting-app

Note: command-line flags use dashes (--max-requests); if you instead use a gunicorn.conf.py file, the same settings use underscores (max_requests = 1000). The restart also reclaims the swap immediately, since the bloated workers are replaced with fresh ones.
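For reference, a config-file version of the same settings might look like the sketch below. The file is hypothetical (this service passes everything on the command line), but each name is a standard Gunicorn setting.

```python
# gunicorn.conf.py: hypothetical equivalent of the ExecStart flags above
bind = "0.0.0.0:5000"
workers = 3
timeout = 300

# Recycle each worker after roughly this many requests...
max_requests = 1000
# ...plus a random 0-50 extra, so the three workers do not restart together.
max_requests_jitter = 50
```

With this file in place, the command would shrink to something like gunicorn -c gunicorn.conf.py app:app, and future tuning becomes a config change instead of a unit-file edit.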

Resolution

Re-running the swap-ranking one-liner after the restart confirmed it:

for f in /proc/[0-9]*/status; do awk '/^Name:/{name=$2} /^VmSwap:/{print name, $2, $3}' "$f" 2>/dev/null; done | sort -k2 -n -r | head

Gunicorn is gone from the list entirely. The ~1.7 GB it was holding has been released, and the remaining swap occupants are the normal background agents sitting on modest, expected amounts. With max-requests in place, the workers now recycle themselves before the bloat can re-accumulate.

Takeaways

Swappiness sets tendency, not a ceiling. Low swappiness still allows idle pages to be swapped out.
Full swap is not the same as memory pressure. Check vmstat’s si/so on the live samples (not the since-boot first line) to see if you’re actively swapping.
RSS lies about the full picture. A process can look tiny in RAM while holding hundreds of MB in swap. Rank by VmSwap in /proc to find the real occupants.
Long-running Python workers accrete memory. --max-requests with jitter is the simple, durable fix.

One unrelated note worth filing away: this service runs as root. Nothing to do with swap, but a network-bound app rarely needs root — running it under a dedicated low-privilege user is a worthwhile hardening step for another day.