Log Normalization

Log normalization is the process of converting raw logs from different systems into a standard, uniform structure so the SIEM can parse, search, and correlate them and detect threats across the entire environment. Without normalization, every log source speaks a different “language,” and a detection rule written for one source would not match the same activity reported by another. A SOC depends heavily on normalization for accurate analysis, correlation, and triage.

This chapter explains normalization in full depth, with real raw logs, normalized outputs, field mapping, practical examples, and how SIEMs actually perform the transformation.


Why Log Normalization Is Necessary

Every device creates logs in its own format.

Windows logs look like this:

EventCode: 4624
Account Name: mayur
Source Network Address: 10.0.0.20
Logon Type: 10

Linux logs look like this:

sshd[2345]: Accepted password for mayur from 10.0.0.20 port 50022 ssh2

Firewall logs look like this:

Deny TCP src=185.33.41.22 dst=10.0.0.30 dpt=22

Cloud logs look like this:

"eventName": "CreateUser",
"userIdentity": {...},
"sourceIPAddress": "203.11.44.8"

If the SIEM stored them “as-is,” a detection rule would have to be rewritten for every source, because each log type names and formats the same information differently.

Normalization creates one format like:

src_ip: x.x.x.x
dest_ip: x.x.x.x
username: <value>
action: <value>
event_type: <value>
timestamp: <value>

The SIEM can now understand everything consistently.


How Normalization Works (Step-by-Step)

Normalization happens in the SIEM pipeline through parsing rules, codecs, grok patterns, or log parsers.

Step 1: Raw Log Arrives

Example Windows raw log:

EventCode: 4624
An account was successfully logged on.
Account Name: mayur
Logon Type: 10
Source Network Address: 185.33.91.10

Step 2: Parsing Extracts Fields

The SIEM extracts data using parsing rules:

event_code = 4624
username = mayur
src_ip = 185.33.91.10
logon_type = 10
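
The extraction in Step 2 can be sketched in Python as follows. This is a minimal illustration, not how any particular SIEM implements its parsers: `LABEL_TO_FIELD` and `parse_windows_event` are hypothetical names, and the sample assumes the event code arrives as an `EventCode:` line alongside the message text.

```python
import re

# Hypothetical raw event text in the "Label: value" layout used in this chapter.
# Assumption: the event code is delivered as an "EventCode:" line.
RAW_EVENT = """EventCode: 4624
An account was successfully logged on.
Account Name: mayur
Logon Type: 10
Source Network Address: 185.33.91.10"""

# Illustrative table: vendor label -> extracted field name.
LABEL_TO_FIELD = {
    "EventCode": "event_code",
    "Account Name": "username",
    "Logon Type": "logon_type",
    "Source Address": "src_ip",
    "Source Network Address": "src_ip",
}

# One "Label: value" pair per line; lines without a colon are skipped.
LINE_PATTERN = re.compile(r"^\s*(?P<label>[^:]+?)\s*:\s*(?P<value>.+?)\s*$")


def parse_windows_event(raw: str) -> dict:
    """Extract the known labels from a multi-line Windows event message."""
    fields = {}
    for line in raw.splitlines():
        match = LINE_PATTERN.match(line)
        if match is None:
            continue
        field = LABEL_TO_FIELD.get(match.group("label"))
        if field is not None:
            fields[field] = match.group("value")
    return fields


parsed = parse_windows_event(RAW_EVENT)
print(parsed)
```

Values are kept as strings here; converting types (for example, the logon type to an integer) would be a separate step.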

Step 3: Mapping to Normalized Schema

The SIEM maps the extracted fields to standard names:

user.name = "mayur"
source.ip = "185.33.91.10"
event.action = "logon_success"
event.type = "authentication"
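
A mapping step like this is often just a lookup table. The sketch below assumes the parsed fields from Step 2 are available as a dictionary; `FIELD_MAP`, `EVENT_CODE_MAP`, and `map_to_schema` are hypothetical names chosen for illustration, not part of any SIEM product.

```python
# Illustrative rename table: extracted field -> normalized field.
FIELD_MAP = {
    "username": "user.name",
    "src_ip": "source.ip",
}

# Illustrative enrichment: what each Windows event code means in the schema.
EVENT_CODE_MAP = {
    "4624": {"event.action": "logon_success", "event.type": "authentication"},
    "4625": {"event.action": "logon_failure", "event.type": "authentication"},
}


def map_to_schema(parsed: dict) -> dict:
    """Rename known fields and add the event semantics for the event code."""
    # Fields without an entry in FIELD_MAP are dropped in this sketch.
    normalized = {
        FIELD_MAP[name]: value for name, value in parsed.items() if name in FIELD_MAP
    }
    normalized.update(EVENT_CODE_MAP.get(parsed.get("event_code"), {}))
    return normalized


normalized = map_to_schema(
    {
        "event_code": "4624",
        "username": "mayur",
        "src_ip": "185.33.91.10",
        "logon_type": "10",
    }
)
print(normalized)
```

Keeping the renames and the event-code semantics in separate tables means a new source only needs its own rename table, while the meaning of each event stays in one place.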

Step 4: Normalized Log Stored in SIEM

Any detection rule written against the normalized field names can now match this log, regardless of which system produced it.


Practical Normalization Examples

1. Windows Log → Normalized

Raw

4625 - An account failed to log on
Account Name: admin
Failure Reason: Unknown user name or bad password
Source Network Address: 10.0.0.55

Normalized

event_code: 4625
event_type: failed_logon
username: admin
src_ip: 10.0.0.55
result: failure
category: authentication

2. Linux Log → Normalized

Raw

sshd[22899]: Failed password for admin from 185.33.22.10 port 51123 ssh2

Normalized

event_type: failed_logon
username: admin
src_ip: 185.33.22.10
auth_method: ssh
result: failure
category: authentication
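
For single-line syslog messages such as this one, a regular expression is usually enough. The sketch below is one possible pattern for the sshd lines shown in this chapter; `SSHD_PATTERN` and `normalize_sshd` are hypothetical names, the "successful_logon" label is an assumption, and real sshd output has more variants than this pattern covers.

```python
import re
from typing import Optional

# Matches lines like:
#   sshd[22899]: Failed password for admin from 185.33.22.10 port 51123 ssh2
# The optional "invalid user " prefix appears when the account does not exist.
SSHD_PATTERN = re.compile(
    r"sshd\[\d+\]: (?P<status>Accepted|Failed) \S+ "
    r"for (?:invalid user )?(?P<user>\S+) from (?P<ip>\S+) port \d+"
)


def normalize_sshd(line: str) -> Optional[dict]:
    """Return the normalized event, or None if the line is not a login attempt."""
    match = SSHD_PATTERN.search(line)
    if match is None:
        return None
    failed = match.group("status") == "Failed"
    return {
        "event_type": "failed_logon" if failed else "successful_logon",
        "username": match.group("user"),
        "src_ip": match.group("ip"),
        "auth_method": "ssh",
        "result": "failure" if failed else "success",
        "category": "authentication",
    }


event = normalize_sshd(
    "sshd[22899]: Failed password for admin from 185.33.22.10 port 51123 ssh2"
)
print(event)
```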

3. Firewall Log → Normalized

Raw

Deny TCP src=91.44.33.10 dst=10.0.0.80 dpt=3389

Normalized

src_ip: 91.44.33.10
dest_ip: 10.0.0.80
dest_port: 3389
action: deny
protocol: tcp
event_type: network_traffic
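
Key=value firewall lines can be normalized without a regular expression. The sketch below assumes the exact layout shown above (action, protocol, then `key=value` pairs); `normalize_firewall` is a hypothetical helper, and a line with fewer than two leading tokens would raise a `ValueError`.

```python
def normalize_firewall(line: str) -> dict:
    """Normalize a line shaped like 'Deny TCP src=... dst=... dpt=...'."""
    action, protocol, *pairs = line.split()
    # Split each remaining token on the first "=" into a key and a value.
    kv = dict(pair.split("=", 1) for pair in pairs if "=" in pair)
    port = kv.get("dpt")
    return {
        "src_ip": kv.get("src"),
        "dest_ip": kv.get("dst"),
        "dest_port": int(port) if port is not None else None,
        "action": action.lower(),
        "protocol": protocol.lower(),
        "event_type": "network_traffic",
    }


fw_event = normalize_firewall("Deny TCP src=91.44.33.10 dst=10.0.0.80 dpt=3389")
print(fw_event)
```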

4. Cloud Log → Normalized

Raw (AWS CloudTrail)

"eventName": "ConsoleLogin",
"sourceIPAddress": "203.55.11.44",
"userIdentity": {"userName": "mayur"}

Normalized

event_type: cloud_login
username: mayur
src_ip: 203.55.11.44
cloud_provider: AWS
action: login
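
Cloud logs usually arrive as JSON, so normalization is mostly a matter of picking values out of nested objects. The sketch below wraps the fragment above in braces to make it valid JSON; `EVENT_NAME_MAP`, `normalize_cloudtrail`, and the "cloud_activity" fallback are assumptions made for illustration.

```python
import json

# The CloudTrail fragment from above, wrapped in braces so it parses as JSON.
RAW_RECORD = """{
  "eventName": "ConsoleLogin",
  "sourceIPAddress": "203.55.11.44",
  "userIdentity": {"userName": "mayur"}
}"""

# Illustrative table: CloudTrail event name -> normalized semantics.
EVENT_NAME_MAP = {
    "ConsoleLogin": {"event_type": "cloud_login", "action": "login"},
}


def normalize_cloudtrail(raw: str) -> dict:
    """Normalize one CloudTrail record given as a JSON string."""
    record = json.loads(raw)
    event_name = record.get("eventName")
    normalized = {
        "username": record.get("userIdentity", {}).get("userName"),
        "src_ip": record.get("sourceIPAddress"),
        "cloud_provider": "AWS",
    }
    # Unknown event names fall back to a generic type (an assumption here).
    fallback = {"event_type": "cloud_activity", "action": event_name}
    normalized.update(EVENT_NAME_MAP.get(event_name, fallback))
    return normalized


cloud_event = normalize_cloudtrail(RAW_RECORD)
print(cloud_event)
```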

How SIEMs Store Normalized Data

A normalized SIEM event may look like:

{
  "timestamp": "2025-04-10T10:33:21Z",
  "event.type": "authentication",
  "event.action": "logon_failure",
  "user.name": "admin",
  "source.ip": "185.44.66.19",
  "host.name": "WIN-SERVER01"
}

Normalization is what makes this consistent across:

  • Windows

  • Linux

  • Firewalls

  • Cloud

  • EDR

  • Applications


Why Normalization Improves Detection Accuracy

1. Rules Work Across All Devices

Example generic rule:

if count of events with event.action = "logon_failure" from the same source.ip > 10
→ brute force alert

Works for:

  • Windows

  • Linux

  • Cloud

  • Applications
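
A rule of this kind can be sketched as a count per source address. The example assumes events are already normalized dictionaries with `event.action` and `source.ip`; `brute_force_sources` and the host name `linux-01` are hypothetical, and a production rule would also restrict the count to a time window, which is omitted here for brevity.

```python
from collections import Counter


def brute_force_sources(events, threshold=10):
    """Return source IPs with more than `threshold` failed logons."""
    failures = Counter(
        event["source.ip"]
        for event in events
        if event.get("event.action") == "logon_failure" and "source.ip" in event
    )
    return sorted(ip for ip, count in failures.items() if count > threshold)


# Failures from the same IP against two different platforms still add up,
# because both use the same normalized field names.
events = (
    [{"event.action": "logon_failure", "source.ip": "185.33.22.10",
      "host.name": "WIN-SERVER01"}] * 8
    + [{"event.action": "logon_failure", "source.ip": "185.33.22.10",
        "host.name": "linux-01"}] * 3
    + [{"event.action": "logon_failure", "source.ip": "10.0.0.55"}] * 2
    + [{"event.action": "logon_success", "source.ip": "10.0.0.55"}]
)

alerts = brute_force_sources(events)
print(alerts)
```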

2. Correlation Becomes Possible

Correlation needs common fields, not vendor-specific fields.

Example:

  • Windows event: Source Network Address = 185.77.10.33

  • Firewall event: src = 185.77.10.33

  • Cloud event: sourceIPAddress = 185.77.10.33

Normalization makes them identical:

source.ip = 185.77.10.33
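
One way to picture this step is an alias table that renames every vendor-specific key to the common one. The sketch below is illustrative only: `SOURCE_IP_ALIASES`, `unify_source_ip`, and the `platform` key are hypothetical names.

```python
# Vendor-specific names that all mean "the source IP address".
SOURCE_IP_ALIASES = ("Source Network Address", "src_ip", "src", "sourceIPAddress")


def unify_source_ip(event: dict) -> dict:
    """Return a copy of the event with any alias renamed to 'source.ip'."""
    unified = dict(event)
    for alias in SOURCE_IP_ALIASES:
        if alias in unified:
            value = unified.pop(alias)
            # Keep an existing source.ip if the event already has one.
            unified.setdefault("source.ip", value)
    return unified


events = [
    {"platform": "windows", "Source Network Address": "185.77.10.33"},
    {"platform": "firewall", "src": "185.77.10.33"},
    {"platform": "cloud", "sourceIPAddress": "185.77.10.33"},
]

unified = [unify_source_ip(event) for event in events]
shared_ips = {event["source.ip"] for event in unified}
print(shared_ips)
```

Once every event carries the same key, a join or group-by on `source.ip` works across all three sources.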

3. Investigations Become Faster

Analysts don’t waste time decoding vendor formats.

4. Reduced False Positives

Rules match on well-defined fields such as event.action and source.ip instead of free-text patterns, so they are less likely to fire on unrelated log lines.


Practical Attack Example (Normalization in Action)

Raw logs:

Windows: 4624 - Successful Logon
Linux: sshd: Accepted password
Firewall: Allow TCP 185.33.44.22 -> 10.0.0.50
Cloud: IAM Login from 185.33.44.22

After normalization:

event_type: authentication_success
src_ip: 185.33.44.22
username: admin

The SIEM correlates:

  • Same IP

  • Multiple platforms

  • Admin access

Creates:

ALERT: Cross-platform compromised account activity

Normalization exposed the attack path.
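
The correlation above can be sketched as a group-by on the normalized source IP. This is a simplified illustration under stated assumptions: the `platform` key, the `correlate_by_source` function, and its thresholds are hypothetical, and the sample events are hand-written stand-ins for the normalized logs.

```python
from collections import defaultdict


def correlate_by_source(events, min_platforms=3, watched_user="admin"):
    """Alert on IPs seen on several platforms with activity by a watched user."""
    platforms = defaultdict(set)
    users = defaultdict(set)
    for event in events:
        ip = event.get("src_ip")
        if ip is None:
            continue
        platforms[ip].add(event.get("platform", "unknown"))
        if "username" in event:
            users[ip].add(event["username"])
    return [
        f"ALERT: Cross-platform compromised account activity from {ip}"
        for ip in sorted(platforms)
        if len(platforms[ip]) >= min_platforms and watched_user in users[ip]
    ]


normalized_events = [
    {"platform": "windows", "event_type": "authentication_success",
     "src_ip": "185.33.44.22", "username": "admin"},
    {"platform": "linux", "event_type": "authentication_success",
     "src_ip": "185.33.44.22", "username": "admin"},
    {"platform": "firewall", "event_type": "network_traffic",
     "src_ip": "185.33.44.22"},
    {"platform": "cloud", "event_type": "authentication_success",
     "src_ip": "185.33.44.22", "username": "admin"},
    {"platform": "windows", "event_type": "authentication_success",
     "src_ip": "10.0.0.20", "username": "mayur"},
]

alerts = correlate_by_source(normalized_events)
print(alerts)
```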


Field Mapping Standards Used in Normalization

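Most SIEMs ship with a standard schema, log format, or parser framework:

  • Elastic ECS (Elastic Common Schema)

  • Splunk CIM (Common Information Model)

  • Microsoft ASIM (Advanced Security Information Model, used by Microsoft Sentinel)

  • ArcSight CEF (Common Event Format, a log message format rather than a full schema)

  • QRadar DSM (Device Support Modules, the parsers that map each log source into QRadar's normalized event fields)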

They ensure logs follow consistent naming. In ECS, for example, common fields include:

  • source.ip

  • destination.port

  • user.name

  • event.outcome

  • process.command_line

Analysts must know the schema their SIEM uses in order to search and write detection logic, because field names differ from one schema to another.


What Happens Without Normalization

If normalization is not implemented:

  • Rules break

  • Correlation fails

  • Alerts miss key fields

  • Investigations slow down

  • SOC visibility drops

  • Attackers hide inside messy logs

Normalization is not optional; it is foundational.


Intel Dump

  • Log normalization converts raw logs into a standardized schema.

  • Every platform logs differently; normalization removes inconsistencies.

  • The SIEM normalizes via parsing, field extraction, and mapping.

  • Normalized logs unify fields like src_ip, username, event_type, etc.

  • Normalization enables accurate detections, correlation, and investigations.

  • Standard schemas (ECS, CIM, ASIM) ensure uniform field names.

  • Without normalization, SIEM detection and SOC workflows collapse.
