Troubleshooting

When something breaks on a real server, you need a plan — not panic. These are playbooks for the problems Linux admins hit most often. Each follows the same shape so you can move fast:

Symptom → Likely causes → Diagnose → Fix → Prevent

Start with the method, then jump to the matching playbook.

The method

How to Troubleshoot Any Linux Problem — the repeatable workflow and the first-look commands that apply to every incident.

Common problems

Disk Full: No space left on device — space vs inodes, deleted-but-open files, freeing space, growing the volume
A Service Won't Start — journalctl -xeu, config errors, port conflicts, SELinux, start limits
Server Is Slow or Load Is High — CPU vs memory vs I/O, the OOM killer, finding the culprit
Can't SSH In or Locked Out — refused vs timeout vs publickey, fail2ban bans, recovering access
Network or DNS Isn't Working — layered diagnosis from link to name resolution
Permission Denied Errors — file perms, ACLs, and the SELinux denials people miss
Web Server Errors (502, 503, 403, 404) — mapping each HTTP error to its real cause
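As a taste of the "space vs inodes" distinction from the Disk Full playbook, here is a minimal sketch that reports both figures for one path. The function name fs_usage is hypothetical, and the column positions assume the POSIX output format of df (the -P flag); some filesystems report "-" for inode usage.

```shell
# Hypothetical helper: show block usage and inode usage for one path.
# A filesystem can be "full" on either number, so check both.
fs_usage() {
    path="${1:-/}"
    # -P forces one line per filesystem; column 5 is the use percentage.
    space=$(df -P "$path" | awk 'NR==2 {print $5}')
    # -i switches the same report from blocks to inodes.
    inodes=$(df -Pi "$path" | awk 'NR==2 {print $5}')
    echo "space=${space} inodes=${inodes}"
}

fs_usage /
```

If space is low but inodes are near 100%, the likely cause is a very large number of small files rather than a few big ones.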

Golden rule

Before you change anything, read the actual error message and ask: what changed recently? Most outages trace back to a recent edit, deploy, or update, so check dnf history and the logs first.
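One way to answer "what changed recently?" is to list files modified in the last few minutes or hours. The sketch below is an illustration, not part of the playbooks: changed_since is a hypothetical helper name, and /etc with a 24-hour window are assumed defaults.

```shell
# Hypothetical helper: list regular files under a directory that were
# modified within the last N minutes (defaults: /etc, 1440 minutes).
changed_since() {
    dir="${1:-/etc}"
    minutes="${2:-1440}"
    # -mmin -N matches files modified less than N minutes ago.
    find "$dir" -type f -mmin "-$minutes" 2>/dev/null
}

# Example: config files touched in the last day.
changed_since /etc 1440

# Pair this with the package and log history on the host, for example:
#   dnf history
#   journalctl -p err --since "1 hour ago"
```

A recently edited config file that lines up with the first error in the journal is usually the best place to start.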