
Analyzing Log Files

When a site is slow, an SSH login fails, or a service won't start, the answer is almost always sitting in a log file. This page is a hands-on field guide to the logs that matter most — web (nginx/PHP-FPM), system (messages/syslog), and auth (secure/auth.log) — and the small toolkit you need to read them fast.

Tested on

AlmaLinux 9 / RHEL 9 (systemd, journald, rsyslog). Debian/Ubuntu paths and package names are noted inline where they differ. All commands assume you read logs as root via sudo — most log files are mode 0600/0640 and not world-readable.

Where the logs live

Distros disagree on filenames. RHEL-family (AlmaLinux, Rocky, CentOS Stream) keeps the classic /var/log/messages and /var/log/secure; Debian/Ubuntu use /var/log/syslog and /var/log/auth.log. Know both.

Log	Path (RHEL 9)	Path (Debian/Ubuntu)	What's in it
General system	`/var/log/messages`	`/var/log/syslog`	Almost everything: services, networking, hardware, cron summaries
Authentication	`/var/log/secure`	`/var/log/auth.log`	SSH logins, `sudo`, `su`, PAM, account changes
Cron jobs	`/var/log/cron`	(in `/var/log/syslog`)	When cron jobs ran and which command they started (job output is mailed or redirected, not logged here)
nginx access	`/var/log/nginx/access.log`	`/var/log/nginx/access.log`	One line per HTTP request (combined format)
nginx errors	`/var/log/nginx/error.log`	`/var/log/nginx/error.log`	Upstream failures, `502`/`504`, config + PHP errors
Apache	`/var/log/httpd/access_log`, `error_log`	`/var/log/apache2/access.log`, `error.log`	Same idea as nginx, different default paths
PHP-FPM	`/var/log/php-fpm/` + per-pool `error_log`/`slowlog`	`/var/log/php*-fpm.log` + per-pool	PHP fatals, pool warnings, slow-request stack traces
Mail	`/var/log/maillog`	`/var/log/mail.log`	Postfix/Dovecot delivery and auth
Kernel ring buffer	`dmesg` (not a file)	`dmesg`	Boot, drivers, OOM killer, disk/network errors

systemd services log to the journal

On modern systems, many daemons write to the systemd journal rather than (or in addition to) a flat file. If a file looks empty or stale, check the journal:

sudo journalctl -u nginx --no-pager        # one service
sudo journalctl -u php-fpm -p err -b        # errors only, this boot
sudo journalctl -f                          # follow everything live
sudo journalctl --since "10 min ago"        # time window

See Logs & journald for the full journal workflow.

Core reading toolkit

You can get 90% of the way with five tools: tail, less, grep, awk, and the sort | uniq -c | sort -rn ranking idiom.

# Watch a log live as new lines arrive (Ctrl-C to stop)
sudo tail -f /var/log/nginx/access.log

# Last 100 lines
sudo tail -n 100 /var/log/secure

# less +F starts in follow mode like tail -f; Ctrl-C drops to normal paging, Shift+F resumes following, q quits
sudo less +F /var/log/messages

grep is the workhorse. The flags you'll reach for constantly:

sudo grep -i "error" /var/log/messages         # -i: case-insensitive
sudo grep -E "502|504" /var/log/nginx/error.log # -E: extended regex (alternation)
sudo grep -v "healthcheck" access.log           # -v: invert (exclude noise)
sudo grep -c "Failed password" /var/log/secure  # -c: count matching lines
sudo grep -A3 -B1 "segfault" /var/log/messages  # context: 3 After, 1 Before (-C for both)

Rotated logs are usually gzip-compressed (.gz), although the most recent rotation is often still plain text. Use the z variants so you don't have to decompress first:

sudo zgrep "Failed password" /var/log/secure-20260601.gz   # grep inside .gz
sudo zcat /var/log/nginx/access.log.2.gz | head            # cat a .gz
sudo zless /var/log/nginx/error.log.1                       # page a (possibly .gz) file

awk pulls out columns by whitespace ($1 is the first field, $NF the last), and the ranking idiom counts + sorts anything:

# Print the first field of every line
sudo awk '{print $1}' /var/log/nginx/access.log

# THE idiom: count occurrences, sort most-frequent first
... | sort | uniq -c | sort -rn | head

# cut by delimiter; wc -l to count lines
sudo cut -d' ' -f1 access.log
sudo grep -c "" access.log     # total line count (or wc -l)

Read the idiom left to right

sort | uniq -c | sort -rn means: sort so identical lines are adjacent → uniq -c collapses them with a count → sort -rn sorts by that count, numerically, descending. It's how you turn a million log lines into a top-10 list.
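As a quick sanity check of the idiom, here is a minimal sketch that wraps it in a helper and feeds it six made-up status codes. The function name rank and the sample values are illustrative assumptions, not part of any tool.

```shell
# rank: read lines on stdin, print "count value", most frequent first.
# (Hypothetical helper name; it is just the idiom wrapped in a function.)
rank() {
  sort | uniq -c | sort -rn
}

# Made-up sample: status codes as they might come out of awk '{print $9}'.
printf '%s\n' 200 404 200 500 200 404 | rank
```

uniq -c only merges adjacent duplicates, which is why the first sort is not optional.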

nginx access.log

Nginx's default combined format puts one request per line. A real line:

203.0.113.42 - - [07/Jun/2026:11:32:08 +0530] "GET /wp-login.php HTTP/1.1" 200 1543 "https://example.com/" "Mozilla/5.0 (X11; Linux x86_64)"

Field by field (whitespace-split, so awk indexes line up):

Field	`awk` index	Meaning
`203.0.113.42`	`$1`	Client IP
`[07/Jun/2026:11:32:08 +0530]`	`$4 $5`	Timestamp (note the `[` and `]`)
`"GET /wp-login.php HTTP/1.1"`	`$6 $7 $8`	Method, URL path, protocol
`200`	`$9`	HTTP status code
`1543`	`$10`	Bytes sent
`"https://example.com/"`	`$11`	Referer
`"Mozilla/5.0 ..."`	`$12...`	User-Agent
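Because the User-Agent contains spaces, whitespace splitting scatters it across $12 and beyond. One way to get the quoted fields whole is to split on double quotes instead. The sketch below assumes the combined format shown above; the helper name quoted_fields is hypothetical.

```shell
# Sample line copied from the text above.
line='203.0.113.42 - - [07/Jun/2026:11:32:08 +0530] "GET /wp-login.php HTTP/1.1" 200 1543 "https://example.com/" "Mozilla/5.0 (X11; Linux x86_64)"'

# quoted_fields: split on double quotes instead of whitespace.
# With -F'"' the request is $2, the referer is $4 and the User-Agent is $6.
quoted_fields() {
  awk -F'"' '{ printf "request=%s\nreferer=%s\nagent=%s\n", $2, $4, $6 }'
}

printf '%s\n' "$line" | quoted_fields
```

To rank User-Agents on a real log, print only $6 and pipe the result into sort | uniq -c | sort -rn | head. A custom log_format with extra quoted fields would shift these indexes.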

Top client IPs (who is hitting you hardest):

sudo awk '{print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head

  48213 203.0.113.42
   9120 198.51.100.7
   3004 192.0.2.55

Top requested URLs:

sudo awk '{print $7}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head

Count requests by status code (a sudden spike in 4xx/5xx is your signal):

sudo awk '{print $9}' /var/log/nginx/access.log | sort | uniq -c | sort -rn

Find all 5xx errors (server-side failures — these are on you, not the client):

sudo awk '$9 ~ /^5/ {print}' /var/log/nginx/access.log
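A raw list of 5xx lines does not say whether they are a blip or a trend, so it can help to express them as a share of all requests. The following is a rough sketch, assuming the combined format with the status in field 9; error_rate is a hypothetical helper name and the four log lines are made up.

```shell
# error_rate: print the share of 5xx responses among all requests on stdin.
# Assumes the combined format, where the status code is field 9.
error_rate() {
  awk '
    { total++ }
    $9 ~ /^5[0-9][0-9]$/ { errors++ }
    END {
      if (total == 0) { print "no requests"; exit }
      printf "%d of %d requests were 5xx (%.1f%%)\n", errors, total, 100 * errors / total
    }'
}

# Made-up sample: four requests, one of them a 502.
error_rate <<'EOF'
203.0.113.42 - - [07/Jun/2026:11:32:08 +0530] "GET / HTTP/1.1" 200 1543 "-" "curl/8.0"
203.0.113.42 - - [07/Jun/2026:11:32:09 +0530] "GET /a HTTP/1.1" 404 153 "-" "curl/8.0"
198.51.100.7 - - [07/Jun/2026:11:32:10 +0530] "GET /index.php HTTP/1.1" 502 157 "-" "curl/8.0"
198.51.100.7 - - [07/Jun/2026:11:32:11 +0530] "GET /b HTTP/1.1" 200 912 "-" "curl/8.0"
EOF
```

On a server you would feed it the real file instead, for example sudo cat /var/log/nginx/access.log | error_rate.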

Requests in the current minute so far: match by timestamp prefix. Build the pattern from the current time without the seconds (for the previous full minute, add -d '1 minute ago' to the date call):

sudo grep -c "$(date '+%d/%b/%Y:%H:%M')" /var/log/nginx/access.log
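To see the request rate over time rather than for a single minute, the same timestamp prefix can be used as a bucket key. This is a minimal sketch, assuming the combined format and a log whose lines are already in time order; per_minute is a hypothetical helper name and the sample lines are made up.

```shell
# per_minute: count requests per clock minute.
# $4 looks like "[07/Jun/2026:11:32:08"; substr($4, 2, 17) keeps "07/Jun/2026:11:32".
# No sort is needed before uniq -c as long as the input is chronological.
per_minute() {
  awk '{ print substr($4, 2, 17) }' | uniq -c
}

# Made-up sample: two requests at 11:32, one at 11:33.
per_minute <<'EOF'
203.0.113.42 - - [07/Jun/2026:11:32:08 +0530] "GET / HTTP/1.1" 200 1543 "-" "curl/8.0"
203.0.113.42 - - [07/Jun/2026:11:32:41 +0530] "GET /a HTTP/1.1" 200 153 "-" "curl/8.0"
198.51.100.7 - - [07/Jun/2026:11:33:02 +0530] "GET /b HTTP/1.1" 200 912 "-" "curl/8.0"
EOF
```

Appending | sort -rn | head turns the timeline into a list of the busiest minutes.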

Don't trust $1 blindly behind a proxy

If nginx sits behind Cloudflare or a load balancer, $1 is the proxy's IP, not the visitor's. Log and grep X-Forwarded-For / $http_x_forwarded_for instead (configure it in your log_format). See Web Server Errors.

nginx error.log

The error log tells you why a request failed. A classic bad-gateway entry pointing at a dead PHP-FPM backend:

2026/06/07 11:40:13 [error] 1287#1287: *5521 connect() to unix:/run/php-fpm/www.sock failed (111: Connection refused) while connecting to upstream, client: 203.0.113.42, server: example.com, request: "GET /index.php HTTP/1.1", upstream: "fastcgi://unix:/run/php-fpm/www.sock", host: "example.com"

# Upstream / gateway problems (502 root cause)
sudo grep -E "upstream|connect\(\) to|502|504" /var/log/nginx/error.log

# PHP-level errors surfaced via FastCGI
sudo grep -i "FastCGI sent in stderr" /var/log/nginx/error.log

# Just the most recent failures, live
sudo tail -f /var/log/nginx/error.log

Correlate with access.log. When you see a 502 in the access log at 11:40:13, grep both logs for that second to see the request and its failure together:

sudo grep "07/Jun/2026:11:40:13" /var/log/nginx/access.log
sudo grep "2026/06/07 11:40:13"  /var/log/nginx/error.log

The access log uses 07/Jun/2026:11:40:13; the error log uses 2026/06/07 11:40:13. Same instant, two formats — match on the time, not the whole string.
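Retyping the timestamp in the other format by hand is error-prone, so a small converter can help. The sketch below is one possible way to do it in awk; the function name access_to_error_ts is hypothetical, and it assumes both logs are written in the same local time zone.

```shell
# access_to_error_ts: convert an access-log timestamp (07/Jun/2026:11:40:13)
# into the error-log format (2026/06/07 11:40:13).
access_to_error_ts() {
  printf '%s\n' "$1" | awk -F'[/:]' '
    BEGIN {
      # Map English month abbreviations to two-digit numbers.
      split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", m, " ")
      for (i = 1; i <= 12; i++) num[m[i]] = sprintf("%02d", i)
    }
    { printf "%s/%s/%s %s:%s:%s\n", $3, num[$2], $1, $4, $5, $6 }'
}

access_to_error_ts "07/Jun/2026:11:40:13"
```

The result can go straight into the second grep: sudo grep "$(access_to_error_ts '07/Jun/2026:11:40:13')" /var/log/nginx/error.log.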

PHP-FPM

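PHP-FPM has three logs worth knowing. The master log (error_log in /etc/php-fpm.conf, typically /var/log/php-fpm/error.log on RHEL) records pool and worker events such as crashed children. The pool's PHP error log (php_admin_value[error_log] in /etc/php-fpm.d/www.conf, often /var/log/php-fpm/www-error.log) captures PHP fatals, parse errors, and warnings. The slowlog captures a stack trace for any request that runs longer than request_slowlog_timeout.

PHP errors and crashed workers (-h suppresses the filename prefix when searching two files):

sudo grep -hiE "PHP (Fatal|Parse|Warning)|exited on signal" /var/log/php-fpm/error.log /var/log/php-fpm/www-error.log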

[07-Jun-2026 11:40:13] WARNING: [pool www] child 4412 exited on signal 11 (SIGSEGV) after 0.7 seconds from start
[07-Jun-2026 11:41:02 UTC] PHP Fatal error:  Uncaught Error: Call to undefined function wp_get_environment_type() in /var/www/example/wp-config.php:12

Slow requests — enable in the pool config, then read the slowlog. The trace tells you exactly which line was running when the timeout hit:

; /etc/php-fpm.d/www.conf
request_slowlog_timeout = 5s
slowlog = /var/log/php-fpm/www-slow.log

sudo systemctl reload php-fpm        # apply config (php8.2-fpm on Debian)
sudo tail -n 40 /var/log/php-fpm/www-slow.log

[07-Jun-2026 11:45:33]  [pool www] pid 4501
script_filename = /var/www/example/wp-cron.php
[0x00007f...] curl_exec() /var/www/example/wp-content/plugins/slow-plugin/api.php:88
[0x00007f...] fetch_remote() /var/www/example/wp-content/plugins/slow-plugin/api.php:40

That top frame (curl_exec() at api.php:88) is your bottleneck — a blocking outbound HTTP call.
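Once the slowlog has collected many traces, it can be useful to rank the top frames to see which call stalls most often. The following is a rough sketch that assumes the layout shown above (a script_filename line followed by frame lines starting with [0x); the helper name slow_top_frames is hypothetical and the second trace is made up.

```shell
# slow_top_frames: for each trace, take the first frame after script_filename
# (the top of the stack) and rank those frames by how often they occur.
slow_top_frames() {
  awk '
    /^script_filename = / { want = 1; next }
    want && /^\[0x/       { print $2, $3; want = 0 }
  ' | sort | uniq -c | sort -rn
}

# Sample: the trace from the text above plus one made-up repeat.
slow_top_frames <<'EOF'
[07-Jun-2026 11:45:33]  [pool www] pid 4501
script_filename = /var/www/example/wp-cron.php
[0x00007f...] curl_exec() /var/www/example/wp-content/plugins/slow-plugin/api.php:88
[0x00007f...] fetch_remote() /var/www/example/wp-content/plugins/slow-plugin/api.php:40

[07-Jun-2026 11:46:02]  [pool www] pid 4502
script_filename = /var/www/example/wp-cron.php
[0x00007f...] curl_exec() /var/www/example/wp-content/plugins/slow-plugin/api.php:88
EOF
```

A frame that dominates this list is usually a better optimization target than whichever trace happened to be logged last.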

/var/log/secure (auth)

This is the single most important security log. On Debian/Ubuntu it's /var/log/auth.log.

Failed SSH logins — the heartbeat of every internet-facing box:

sudo grep "Failed password" /var/log/secure

Jun  7 11:50:14 web01 sshd[20413]: Failed password for root from 203.0.113.99 port 51022 ssh2
Jun  7 11:50:17 web01 sshd[20415]: Failed password for invalid user admin from 198.51.100.23 port 40122 ssh2

Who is brute-forcing you — rank the source IPs. In that line the IP sits 3 fields before the end (...from <IP> port <n> ssh2), so $(NF-3):

sudo grep "Failed password" /var/log/secure \
  | awk '{print $(NF-3)}' | sort | uniq -c | sort -rn | head

  3187 203.0.113.99
   942 198.51.100.23
    61 192.0.2.7
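The same counting-from-the-end trick also shows which account names are being guessed: the username is the field just before "from", which is $(NF-5) in both line shapes shown above. A minimal sketch follows; targeted_users is a hypothetical helper name and the third sample line is made up.

```shell
# targeted_users: rank the usernames that failed password attempts were aimed at.
# Works for "for root from ..." and "for invalid user admin from ..." alike,
# because the name is always five fields before the end of the line.
targeted_users() {
  grep "Failed password" | awk '{ print $(NF-5) }' | sort | uniq -c | sort -rn
}

targeted_users <<'EOF'
Jun  7 11:50:14 web01 sshd[20413]: Failed password for root from 203.0.113.99 port 51022 ssh2
Jun  7 11:50:17 web01 sshd[20415]: Failed password for invalid user admin from 198.51.100.23 port 40122 ssh2
Jun  7 11:50:19 web01 sshd[20417]: Failed password for root from 203.0.113.99 port 51030 ssh2
EOF
```

Lines in a different shape, for example syslog "message repeated" wrappers, would need a different field index.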

Successful logins — confirm who actually got in (especially after seeing failures):

sudo grep "Accepted" /var/log/secure

Jun  7 09:14:02 web01 sshd[18002]: Accepted publickey for deploy from 192.0.2.10 port 49888 ssh2: ED25519 SHA256:abc123...

Sudo usage — what privileged commands were run, and by whom:

sudo grep "sudo:" /var/log/secure | grep "COMMAND"

Jun  7 10:02:11 web01 sudo:   deploy : TTY=pts/0 ; PWD=/home/deploy ; USER=root ; COMMAND=/usr/bin/systemctl restart nginx

Failed logins are normal; successful ones from new IPs are the alarm

Thousands of Failed password lines are just internet background noise. What matters: an Accepted line from an IP or account you don't recognize, right after a burst of failures. To turn brute-force IPs into automatic bans, pair this with fail2ban. If you're locked out yourself, see Can't SSH In.
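The "accepted after a burst of failures" pattern can be checked mechanically. The sketch below is one possible approach, not a replacement for fail2ban: it remembers how many failures each IP has produced and prints any Accepted line from an IP that already failed at least once. The helper name accepted_after_failures is hypothetical and the sample lines are made up.

```shell
# accepted_after_failures: flag successful logins from IPs that failed earlier
# in the same input. Input must be in time order, as a log file normally is.
accepted_after_failures() {
  awk '
    /Failed password/ { fails[$(NF-3)]++ }
    / Accepted / {
      ip = ""
      # The IP is the field right after the word "from".
      for (i = 1; i < NF; i++) if ($i == "from") ip = $(i + 1)
      if (ip in fails) print ip, "accepted after", fails[ip], "failures"
    }'
}

accepted_after_failures <<'EOF'
Jun  7 09:14:02 web01 sshd[18002]: Accepted publickey for deploy from 192.0.2.10 port 49888 ssh2: ED25519 SHA256:abc123...
Jun  7 11:50:14 web01 sshd[20413]: Failed password for root from 203.0.113.99 port 51022 ssh2
Jun  7 11:50:17 web01 sshd[20414]: Failed password for root from 203.0.113.99 port 51040 ssh2
Jun  7 11:50:21 web01 sshd[20415]: Accepted password for root from 203.0.113.99 port 51066 ssh2
EOF
```

A hit is not proof of compromise (people mistype passwords too), but each one is worth a look.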

/var/log/messages or syslog

The general log catches service crashes, kernel events, and the dreaded out-of-memory killer.

Service failures — grep the unit name and failed:

sudo grep -iE "failed|fatal|error" /var/log/messages | tail
sudo grep "nginx" /var/log/messages

The OOM killer — when the kernel kills a process to reclaim RAM. If a service "randomly" dies, check here first:

sudo grep -i "out of memory\|oom-killer\|Killed process" /var/log/messages

Jun  7 03:22:41 web01 kernel: mysqld invoked oom-killer: gfp_mask=0x..., order=0, oom_score_adj=0
Jun  7 03:22:41 web01 kernel: Out of memory: Killed process 8842 (mysqld) total-vm:2310044kB, anon-rss:1980112kB

The "Killed process" line names the victim (here mysqld) and roughly how much RAM it was holding, a strong hint you need more memory or a tuned config. The "invoked oom-killer" line names the process whose allocation triggered the kill, which is not always the victim.
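The kB figures are hard to read at a glance, so a short awk pass can pull out each victim and its resident memory in MiB. This is a sketch under the assumption that the line looks like the sample above and that the process name contains no spaces; oom_victims is a hypothetical helper name.

```shell
# oom_victims: print "name N MiB" for every "Killed process" line on stdin.
oom_victims() {
  awk '/Killed process/ {
    name = ""; rss = ""
    for (i = 1; i <= NF; i++) {
      # "(mysqld)" -> mysqld
      if ($i ~ /^\(.*\)$/)   name = substr($i, 2, length($i) - 2)
      # "anon-rss:1980112kB" -> 1980112
      if ($i ~ /^anon-rss:/) { rss = $i; gsub(/[^0-9]/, "", rss) }
    }
    if (name != "" && rss != "") printf "%s %d MiB\n", name, rss / 1024
  }'
}

# Sample line copied from the text above.
oom_victims <<'EOF'
Jun  7 03:22:41 web01 kernel: Out of memory: Killed process 8842 (mysqld) total-vm:2310044kB, anon-rss:1980112kB
EOF
```

Piping the names into sort | uniq -c | sort -rn shows whether the same service keeps getting killed.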

Kernel messages — the ring buffer via dmesg (use -T for human timestamps):

sudo dmesg -T | grep -iE "error|fail|i/o|segfault" | tail

[Sun Jun  7 03:22:41 2026] EXT4-fs error (device sda1): ext4_find_entry:1612: inode #131073: comm nginx: reading directory lblock 0

An EXT4 or I/O error here usually points to a failing disk or a corrupted filesystem, so check the hardware and SMART status before blaming the app.

Web-log analytics with GoAccess

For a readable dashboard of your access log without putting any JavaScript tracker on the site, GoAccess parses the log file directly.

# RHEL 9 (needs EPEL)
sudo dnf install -y epel-release
sudo dnf install -y goaccess

# Debian / Ubuntu
sudo apt update && sudo apt install -y goaccess

Interactive terminal mode — a live, navigable TUI right in your SSH session:

sudo goaccess /var/log/nginx/access.log --log-format=COMBINED

Static HTML report — generate a self-contained file you can scp off and open in a browser:

sudo goaccess /var/log/nginx/access.log --log-format=COMBINED -o /tmp/report.html

Include rotated logs to cover a longer window by piping them in:

sudo zcat -f /var/log/nginx/access.log* | goaccess --log-format=COMBINED -o /tmp/report.html -

GoAccess gives you, at a glance: unique visitors, top requested files, static vs dynamic, 404s, status-code breakdown, top referrers, operating systems/browsers, geolocation, and bandwidth consumed — all derived from the same combined-format log you've been grepping by hand.

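Rotated logs

logrotate (config in /etc/logrotate.conf and /etc/logrotate.d/) renames access.log → access.log.1, then gzips older generations to access.log.2.gz, access.log.3.gz, and so on. The current log has no suffix. Where the dateext option is enabled (as in RHEL's default /etc/logrotate.conf), the suffix is a date instead, as in secure-20260601.gz; the access.log* globs below match either scheme.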

sudo ls -lh /var/log/nginx/

-rw-r----- 1 nginx adm  1.2M Jun  7 11:55 access.log
-rw-r----- 1 nginx adm  8.0M Jun  6 23:59 access.log.1
-rw-r----- 1 nginx adm  410K Jun  5 23:59 access.log.2.gz

To search across everything — live file plus all rotated, compressed or not — zgrep and zcat -f transparently handle both:

# Search current + all rotated for a string
sudo zgrep "wp-login.php" /var/log/nginx/access.log*

# Stitch them into one stream (glob order, not chronological; fine for counting)
sudo zcat -f /var/log/nginx/access.log* \
  | awk '{print $1}' | sort | uniq -c | sort -rn | head

zcat -f is forgiving

The -f flag tells zcat to pass through plain (non-gzip) files unchanged, so access.log access.log.1 access.log.2.gz all flow through one pipe without errors.
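The pass-through behaviour is easy to verify on throwaway files before relying on it in a pipeline. The sketch below builds one plain and one compressed file in a temporary directory and streams both through a single command. It uses gzip -cdf, which is the command GNU zcat -f runs under the hood; the helper name cat_logs and the file contents are made up.

```shell
# Build a throwaway directory with one plain and one gzip-compressed "log".
dir=$(mktemp -d)
printf '%s\n' 'current line 1' 'current line 2' > "$dir/access.log"
printf '%s\n' 'old line 1' | gzip -c > "$dir/access.log.2.gz"

# cat_logs: stream every file given, decompressing only the gzip ones.
cat_logs() {
  gzip -cdf "$@"
}

# Three lines in total: two from the plain file, one from the .gz.
cat_logs "$dir"/access.log* | wc -l

rm -rf "$dir"
```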

Centralizing logs

Grepping one box is fine. Grepping fifty boxes by SSH-ing into each is not. For fleets, forward logs to one place:

rsyslog — the simplest hop: configure clients to forward to a central rsyslog server (*.* @@logserver:514 in /etc/rsyslog.conf for TCP). Everything lands in one set of files you grep normally.
Aggregators — ship to a searchable store: Grafana Loki (label-based, lightweight), OpenObserve (Elasticsearch-compatible, low-footprint), or the ELK/Elastic stack. Agents like Promtail, Vector, or Fluent Bit tail these same files and forward them.

Then you query one dashboard instead of ssh-ing into every server — the same status:5xx or Failed password search now spans the whole fleet.

Verify your work

Run these on a live server and confirm you get sensible output:

# 1. Confirm you can read the auth log and see SSH activity
sudo grep -c "sshd" /var/log/secure        # or /var/log/auth.log on Debian

# 2. Rank failed-login source IPs (should print counts + IPs, or nothing if clean)
sudo grep "Failed password" /var/log/secure | awk '{print $(NF-3)}' \
  | sort | uniq -c | sort -rn | head

# 3. Get the HTTP status-code distribution from nginx
sudo awk '{print $9}' /var/log/nginx/access.log | sort | uniq -c | sort -rn

# 4. Confirm rotated logs are searchable
sudo zgrep -c "GET" /var/log/nginx/access.log*.gz

# 5. Check the journal for the web stack this boot
sudo journalctl -u nginx -u php-fpm -p warning -b --no-pager | tail

If command 2 returns a long ranked list, you now know your top attackers; if command 3 shows a wall of 5xx, head to the error log and PHP-FPM logs next.

Summary

Know your distro's paths: RHEL uses /var/log/messages + /var/log/secure; Debian uses /var/log/syslog + /var/log/auth.log. systemd services also log to the journal (journalctl -u).
Five tools cover most needs: tail -f, less +F, grep (-i -E -v -c -A/-B/-C), awk for fields, and the sort | uniq -c | sort -rn ranking idiom.
Per log: access.log → rank IPs/URLs/status; error.log → upstream/502 causes, correlate by timestamp; PHP-FPM → fatals + slowlog traces; secure → failed/accepted logins and sudo; messages → service failures, OOM killer, kernel/disk errors via dmesg.
Rotated logs are .1/.gz — use zgrep/zcat -f to search current + archived together.
GoAccess turns a combined-format access log into a full dashboard, no site-side JS.
At scale, forward with rsyslog or ship to Loki/OpenObserve/ELK so you search one place.