Troubleshooting
When something breaks on a real server, you need a plan — not panic. These are playbooks for the problems Linux admins hit most often. Each follows the same shape so you can move fast:
Symptom → Likely causes → Diagnose → Fix → Prevent
Start with the method, then jump to the matching playbook.
The method
- How to Troubleshoot Any Linux Problem — the repeatable workflow and the first-look commands that apply to every incident.
Common problems
- Disk Full: No space left on device — space vs inodes, deleted-but-open files, freeing space, growing the volume
- A Service Won't Start — journalctl -xeu, config errors, port conflicts, SELinux, start limits
- Server Is Slow or Load Is High — CPU vs memory vs I/O, the OOM killer, finding the culprit
- Can't SSH In or Locked Out — refused vs timeout vs publickey, fail2ban bans, recovering access
- Network or DNS Isn't Working — layered diagnosis from link to name resolution
- Permission Denied Errors — file perms, ACLs, and the SELinux denials people miss
- Web Server Errors (502, 503, 403, 404) — mapping each HTTP error to its real cause
Golden rule
Before you change anything, read the actual error message and ask: what changed recently? Most outages trace back to a recent edit, deploy, or update; see dnf history and the logs.
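As a rough illustration of that first look, the sketch below gathers three read-only views of recent change: package transactions, recently logged errors, and recently edited files under /etc. The helper name recent_changes, the one-hour and one-day windows, and the choice of /etc are assumptions made for this example, not part of any playbook; adjust them to your system.

```shell
#!/bin/sh
# First look: what changed recently? Read-only; nothing here modifies the system.
# recent_changes is a hypothetical helper name; the time windows are arbitrary.

recent_changes() {
    echo "== Package transactions =="
    if command -v dnf >/dev/null 2>&1; then
        # Most recent installs, updates, and removals (may need root to read).
        dnf history list 2>/dev/null | head -n 10
    else
        echo "dnf not found; use your distribution's package log instead"
    fi

    echo "== Errors logged in the last hour =="
    if command -v journalctl >/dev/null 2>&1; then
        # Priority err and worse, from all units.
        journalctl -p err --since "1 hour ago" --no-pager 2>/dev/null | tail -n 20
    else
        echo "journalctl not found; check the files under /var/log instead"
    fi

    echo "== Files under /etc modified in the last day =="
    find /etc -type f -mtime -1 2>/dev/null | head -n 20
}

recent_changes
```

Run it before touching anything. If a config file or package transaction lines up with the time the problem started, that is usually the first thing to examine.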