A Service Won't Start
A service refuses to come up: systemctl start returns an error, the unit sits in failed or activating, and the application is unreachable. This playbook walks the fastest path from "it won't start" to root cause and fix.
Tested on
AlmaLinux 9 / RHEL 9 with systemd. The systemctl and journalctl commands are identical on Debian/Ubuntu; only package and validation tool names differ (noted inline).
Symptom
- sudo systemctl start nginx returns: Job for nginx.service failed. See "systemctl status nginx.service" and "journalctl -xeu nginx.service" for details.
- systemctl status <svc> shows Active: failed (Result: exit-code), or the unit is stuck in activating (auto-restart).
- The application's port is closed and clients get connection refused.
$ sudo systemctl status nginx
× nginx.service - The nginx HTTP and reverse proxy server
Loaded: loaded (/usr/lib/systemd/system/nginx.service; enabled)
Active: failed (Result: exit-code) since Sun 2026-06-07 10:14:02 IST
Process: 4821 ExecStartPre=/usr/sbin/nginx -t (code=exited, status=1/FAILURE)
Likely causes
| Cause | Tell-tale sign |
|---|---|
| Config syntax error | ExecStartPre test fails; logs name a file/line |
| Port already in use | bind() to 0.0.0.0:80 failed (98: Address already in use) |
| Missing file or dependency | No such file or directory; failed After=/Requires= unit |
| Permission denied | Permission denied opening a path, socket, or PID file |
| SELinux denial | Works with SELinux permissive; avc: denied in audit log |
| Crash loop hitting StartLimit | start request repeated too quickly / Start limit hit |
Diagnose
Start with the two commands that explain most failures. journalctl -xeu is the single most useful one: it shows the unit's own log lines plus systemd's explanatory hints.
systemctl status nginx -l # -l = don't truncate long lines
sudo journalctl -xeu nginx.service # -x adds hints, -e jumps to end, -u filters the unit
See exactly what systemd will run, including drop-ins and overrides:
systemctl cat nginx # unit file followed by any /etc/systemd/system/...d/ overrides
systemctl show nginx -p ExecStart -p User -p WorkingDirectory
Validate the config before anything else — many daemons ship a test mode:
sudo nginx -t # nginx
sudo sshd -t # OpenSSH (use -T to dump the effective config)
sudo apachectl configtest # Apache httpd (Debian/Ubuntu: apache2ctl configtest)
sudo named-checkconf # BIND
sudo postfix check # Postfix
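When the same check runs against several services, the validators above can be folded into a small lookup. This is a minimal sketch: validator_for is a hypothetical helper name, and the unit names it matches are assumed to be the ones your distribution uses.

```shell
# validator_for UNIT: print the config-test command for a known daemon.
# Hypothetical helper; returns 1 for units it does not know.
validator_for() {
    case "${1%.service}" in          # accept "nginx" or "nginx.service"
        nginx)   echo "nginx -t" ;;
        sshd)    echo "sshd -t" ;;
        httpd)   echo "apachectl configtest" ;;
        apache2) echo "apache2ctl configtest" ;;
        named)   echo "named-checkconf" ;;
        postfix) echo "postfix check" ;;
        *)       return 1 ;;
    esac
}

# Example: look up the validator and run it as root.
#   cmd=$(validator_for nginx) && sudo $cmd
```

The lookup only prints the command, so the caller decides whether to run it under sudo and what to do when a unit has no known validator.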
Check whether the port is already taken by another process:
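A minimal sketch of that check, built on the ss -tulpn invocation this playbook refers to in the Fix section. port_holders is a hypothetical filter name; it assumes the default ss column layout, where the fifth column is the local address and port.

```shell
# port_holders PORT: read `ss -tulpn` output on stdin and print only the
# sockets whose local address ends in ":PORT" (so 80 does not match 8080).
port_holders() {
    awk -v p=":$1" '$5 ~ (p "$") { print }'
}

# Typical use on a live host (root is needed to see other users' process names):
#   sudo ss -tulpn | port_holders 80
```

The last column of each matching line names the process and PID holding the port.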
Check for SELinux denials (RHEL-family default; Debian/Ubuntu use AppArmor instead):
sudo ausearch -m avc -ts recent # AVC denials in the recent window
sudo journalctl -t setroubleshoot # human-readable denial summaries, if installed
getenforce # Enforcing vs Permissive
Quick SELinux confirmation
If the service starts cleanly after sudo setenforce 0 (Permissive) but fails under Enforcing, the problem is an SELinux policy mismatch. Fix the label or boolean, then switch back with sudo setenforce 1. Never leave SELinux permissive or disabled as a "fix".
Check file ownership and permissions on the paths the unit touches (config, data dir, PID file, sockets):
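One way to run that check is to print the owner and mode of the target and of every directory above it, since a parent directory without search permission produces the same Permission denied. A minimal sketch, assuming GNU stat; path_chain is a hypothetical helper, and namei -l from util-linux prints a similar chain if it is installed.

```shell
# path_chain PATH: print "owner:group mode name" for PATH and each parent directory.
path_chain() {
    p="$1"
    while :; do
        stat -c '%U:%G %a %n' "$p"
        case "$p" in
            /|.) break ;;            # reached the top of an absolute or relative path
        esac
        p=$(dirname "$p")
    done
}

# Example (the PID file path is an assumption; use the one from your unit or log):
#   path_chain /run/nginx.pid
```

Compare the output against the unit's User= and Group=: the service account needs execute (search) permission on every directory in the chain and the appropriate read or write bit on the final path.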
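Catch unit-file mistakes systemd would otherwise ignore with systemd-analyze verify <unit-file>, which warns about unknown directives, invalid values, and missing executables.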
Fix
Apply the fix that matches the cause, then start the unit and re-check status.
For a config syntax error, edit the offending file (the validator names it), re-test, then start (for nginx: sudo nginx -t && sudo systemctl start nginx).
For a port conflict, identify the holder from ss -tulpn, then either stop that process (sudo systemctl stop <holder>) or change your service's listen port.
For a missing file or dependency, create or correct the missing path, or fix the Requires=/After= ordering, then reload and start. See systemd Service Management for unit dependencies.
For a permission error, set correct ownership on the paths the service's User= needs:
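A minimal sketch of the ownership fix. The user, group, and path in the usage comment are assumptions for illustration; read the real account from systemctl show <svc> -p User and take the path from the error in the journal. own_for_service is a hypothetical helper name.

```shell
# own_for_service OWNER GROUP PATH...: hand each PATH (recursively) to the
# service account and make sure the owner can read, write, and traverse it.
own_for_service() {
    owner="$1"; group="$2"; shift 2
    chown -R "${owner}:${group}" "$@" &&
    chmod -R u+rwX "$@"              # X adds execute only to directories and already-executable files
}

# Example, run as root (names are assumed, not taken from a real unit):
#   own_for_service nginx nginx /var/lib/nginx
```

Scope the paths narrowly: a recursive chown on a shared directory can break other services that rely on the existing ownership.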
For an SELinux denial, restore the default label or flip the right boolean (don't disable SELinux):
sudo restorecon -Rv /var/www/html # relabel files to policy defaults
sudo setsebool -P httpd_can_network_connect on # example boolean, -P = persist
sudo semanage port -a -t http_port_t -p tcp 8080 # allow a non-standard port
See SELinux for generating policy with audit2allow.
Always daemon-reload after editing a unit
If you edited a .service file or added a drop-in, systemd keeps using the cached version until you run sudo systemctl daemon-reload.
About the start rate limit
systemd will refuse to restart a unit that fails too often. Defaults: more than StartLimitBurst (5) starts within StartLimitIntervalSec (10s) triggers the limit. Tune them in the [Unit] section when a service legitimately needs more headroom:
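A minimal sketch of such an override, written as a drop-in so the packaged unit file stays untouched. The values 10 and 60, the file name start-limit.conf, and the helper name write_start_limit are assumptions for illustration, not recommendations.

```shell
# write_start_limit DROPIN_DIR BURST INTERVAL_SEC
# Writes a drop-in that sets both start-limit options in the [Unit] section.
write_start_limit() {
    mkdir -p "$1" &&
    cat > "$1/start-limit.conf" <<EOF
[Unit]
StartLimitBurst=$2
StartLimitIntervalSec=$3
EOF
}

# Example, run as root: allow up to 10 starts within 60 seconds for nginx.
#   write_start_limit /etc/systemd/system/nginx.service.d 10 60
```

Raising the limit only buys headroom; a unit that keeps failing still needs its underlying cause fixed.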
After editing, sudo systemctl daemon-reload && sudo systemctl reset-failed <svc>.
Prevent
- Validate before you reload. Make nginx -t / apachectl configtest / sshd -t a reflex, and keep it as an ExecStartPre= so a bad config fails loudly instead of silently.
- Run daemon-reload immediately after any unit edit: stale units cause confusing "I changed it but nothing happened" failures.
- Monitor unit state. Alert on systemctl is-failed or scrape systemctl list-units --state=failed so a crash loop pages you before users notice. See Logs & journald for centralizing the evidence.
- Don't disable SELinux to "make it work"; fix the label or boolean so the protection stays on.
- Keep configs in version control so you can diff against the last known-good version.
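The monitoring bullet can be sketched as a small check suitable for cron or a monitoring agent. failed_units is a hypothetical helper; it assumes the --no-legend --plain output layout of systemctl list-units, where the unit name is the first column.

```shell
# failed_units: read `systemctl list-units --state=failed --no-legend --plain`
# output on stdin and print one failed unit name per line.
failed_units() {
    awk 'NF { print $1 }'            # skip blank lines, keep the UNIT column
}

# Example wiring: exit non-zero so the scheduler or agent can alert.
#   failed=$(systemctl list-units --state=failed --no-legend --plain | failed_units)
#   [ -z "$failed" ] || { echo "failed units: $failed" >&2; exit 1; }
```

For a single known service, systemctl is-failed <svc> gives the same signal through its exit status.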
For deeper unit-file and dependency work see systemd Service Management; for inspecting the running process see Process Management.