# Log Collection
Logs are collected in one of two ways:
- Log Agents
- Agentless
## Log Agents
Log agent software must be installed on the source. Agents typically provide parsing, log rotation, buffering, log integrity checking, encryption, and format conversion.
Pros:
- Developed and tested by the vendor
- Automatic parsing, encryption, and log integrity verification
Cons:
- Resource consumption increases
- Cost increases
### Syslog
Syslog is a very popular protocol for log transfers. It works over UDP and TCP, and can be encrypted with TLS.

Format:
- Timestamp
- Source Device
- Facility
- Severity
- Message Number
- Message Text
Maximum packet size:
- UDP: 1024 bytes
- TCP: 4096 bytes
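
The Facility and Severity fields listed above are carried on the wire as a single priority number at the start of the message, `<PRI>`, where PRI = facility * 8 + severity. The following is a minimal sketch of decoding that value in Python; the function name `decode_priority` and the sample message are illustrative, and the rest of the message is deliberately left unparsed.

```python
import re

# Severity names indexed by their numeric code (0-7), as used by syslog.
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

# A leading "<PRI>" (1-3 digits) followed by the remainder of the message.
PRI_PATTERN = re.compile(r"^<(\d{1,3})>(.*)$", re.DOTALL)


def decode_priority(line: str) -> dict:
    """Split the <PRI> prefix of a syslog line into facility and severity."""
    match = PRI_PATTERN.match(line)
    if match is None:
        raise ValueError("no <PRI> prefix found")
    pri = int(match.group(1))
    if pri > 191:  # 23 * 8 + 7 is the largest valid priority
        raise ValueError(f"priority out of range: {pri}")
    # PRI = facility * 8 + severity, so divmod recovers both parts.
    facility, severity = divmod(pri, 8)
    return {
        "facility": facility,
        "severity": severity,
        "severity_name": SEVERITIES[severity],
        "rest": match.group(2),
    }


# Illustrative message, not taken from a real system.
sample = "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8"
decoded = decode_priority(sample)
print(decoded["facility"], decoded["severity_name"])
```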
### Third Party Agents
- Splunk: Universal Forwarder
- ArcSight: ArcSight Connectors
### Open Source Agents
- Beats
- NXLog
## Agentless
- No installation or update cost
- Usually, logs are collected by connecting to the target over SSH or WMI, so a username and password for the log server are required, and those credentials can be compromised
- Easier to prepare and manage than agent method
- limited capabilities
### Manual Collection
Sometimes a log source is novel enough that you just have to write your own collection script.
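
As a rough idea of what such a script might look like, here is a minimal Python sketch that reads only the lines appended to a log file since the last run. The function name `read_new_lines` and the idea of tracking a byte offset are assumptions made for illustration; a real script would also need to persist the offset, handle log rotation, and ship the lines somewhere.

```python
def read_new_lines(path: str, offset: int = 0) -> tuple[list[str], int]:
    """Return complete lines written after `offset`, plus the new offset.

    The file is read in binary mode so the offset is an exact byte position.
    A trailing partial line (no newline yet) is left for the next call.
    """
    with open(path, "rb") as handle:
        handle.seek(offset)
        data = handle.read()
    # Index just past the last newline; 0 if no complete line was read.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end
```

Calling the function repeatedly with the offset it returned last time gives a simple polling collector: each call yields only the lines that were not seen before.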
# Log Aggregation and Parsing
Logs can be edited by the aggregator before being sent to the destination. The aggregator parses each log so that only the requested parts are sent to the target.
## Aggregator EPS
EPS = events per second, the rate at which logs arrive at the aggregator.
The aggregator needs to be scaled appropriately to keep up with the EPS.
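
To get a feel for sizing, EPS can be estimated from event timestamps. The sketch below is one possible approach, assuming timestamps are given as epoch seconds: it buckets events per whole second and reports both the average and the peak, since the peak is usually what the aggregator has to survive. The function name `eps_stats` is hypothetical.

```python
from collections import Counter


def eps_stats(timestamps: list[float]) -> tuple[float, int]:
    """Return (average EPS, peak EPS) for a list of epoch-second timestamps."""
    if not timestamps:
        return 0.0, 0
    # Count how many events fall into each whole second.
    per_second = Counter(int(ts) for ts in timestamps)
    # Number of seconds covered, including seconds with no events.
    span = max(per_second) - min(per_second) + 1
    average = len(timestamps) / span
    peak = max(per_second.values())
    return average, peak


# Six events spread over four seconds: three of them land in the first second.
print(eps_stats([0.1, 0.5, 0.9, 1.2, 3.0, 3.4]))  # (1.5, 3)
```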
Log Modification:
- conversion of date/time format
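
A date/time conversion of this kind could look like the following Python sketch. It assumes the incoming logs use a web-server style timestamp (`11/Oct/2023:22:14:15 +0200`) and that the destination wants ISO 8601 in UTC; both choices, and the function name `to_iso8601_utc`, are assumptions for illustration.

```python
from datetime import datetime, timezone


def to_iso8601_utc(raw: str, source_format: str = "%d/%b/%Y:%H:%M:%S %z") -> str:
    """Parse a timestamp in `source_format` and re-emit it as ISO 8601 in UTC.

    Note: %b matches month abbreviations in the current locale, so this
    sketch assumes the default (English) locale.
    """
    parsed = datetime.strptime(raw, source_format)
    return parsed.astimezone(timezone.utc).isoformat()


print(to_iso8601_utc("11/Oct/2023:22:14:15 +0200"))  # 2023-10-11T20:14:15+00:00
```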
Log Enrichment is the process of adding data to a log
- Makes logs more useful, which saves time during analysis
- Examples: geolocation for an IP address, DNS lookups, adding or removing fields
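
As an illustration of enrichment, the sketch below adds a geolocation field to an event based on its source IP. The lookup table `GEO_TABLE`, the field names `src_ip` and `src_geo`, and the location labels are all made up for the example; a real aggregator would typically query a geolocation database instead.

```python
import ipaddress

# Hypothetical lookup table mapping networks to a location label.
# The networks are documentation ranges, not real assignments.
GEO_TABLE = {
    ipaddress.ip_network("203.0.113.0/24"): "Country A",
    ipaddress.ip_network("198.51.100.0/24"): "Country B",
}


def enrich_with_geo(event: dict, table: dict = GEO_TABLE) -> dict:
    """Return a copy of `event` with a `src_geo` field added."""
    enriched = dict(event)  # do not modify the caller's event
    address = ipaddress.ip_address(event["src_ip"])
    enriched["src_geo"] = next(
        (label for network, label in table.items() if address in network),
        "unknown",
    )
    return enriched


print(enrich_with_geo({"src_ip": "203.0.113.7", "msg": "login"}))
```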
# Log Storage
High-capacity storage is important, but access speed is equally important, if not more so.
WORM (Write Once, Read Many): storage where data cannot be modified once it has been written.
# Alerting
Timely alerts depend on how quickly stored logs can be searched. Alerts can be created in two ways:
- Search the stored data, then create the alert
- Create the alert while the log is being ingested

To create quality alerts, we must understand the data we have.
## Blacklist
- If a process on the blacklist appears in the logs, create an alert
- Easy to manage and implement, but also easy to bypass
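
A blacklist check can be sketched in a few lines of Python. The process names in `PROCESS_BLACKLIST` and the `process` field name are illustrative assumptions. The last event in the example shows why this approach is easy to bypass: a renamed binary no longer matches.

```python
# Illustrative blacklist; entries are lowercase for case-insensitive matching.
PROCESS_BLACKLIST = {"mimikatz.exe", "psexec.exe"}


def blacklist_alerts(events: list[dict], blacklist: set = PROCESS_BLACKLIST) -> list[dict]:
    """Return the events whose process name is on the blacklist."""
    return [
        event for event in events
        if event.get("process", "").lower() in blacklist
    ]


events = [
    {"host": "ws01", "process": "chrome.exe"},
    {"host": "ws02", "process": "Mimikatz.exe"},
    {"host": "ws03", "process": "m1mikatz.exe"},  # renamed, so it slips through
]
print(blacklist_alerts(events))
```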
## Whitelist
- Lists what is expected, for example the range of IP addresses used for normal communication; anything outside the list creates an alert
- Highly effective
- Difficult to manage
- Requires constant updates
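
The whitelist idea can be sketched the same way, with the logic inverted: alert on anything that is not on the list. The networks in `ALLOWED_NETWORKS` and the `dest_ip` field name are assumptions for the example.

```python
import ipaddress

# Illustrative whitelist of networks that hosts are expected to talk to.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.1.0/24"),
]


def whitelist_alerts(events: list[dict], allowed: list = ALLOWED_NETWORKS) -> list[dict]:
    """Return the events whose destination is outside every allowed network."""
    alerts = []
    for event in events:
        destination = ipaddress.ip_address(event["dest_ip"])
        if not any(destination in network for network in allowed):
            alerts.append(event)
    return alerts


events = [
    {"host": "ws01", "dest_ip": "10.20.30.40"},
    {"host": "ws02", "dest_ip": "203.0.113.50"},  # not in any allowed range
]
print(whitelist_alerts(events))
```

Every new legitimate destination has to be added to `ALLOWED_NETWORKS` before it stops alerting, which is where the management burden comes from.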
## Long Tail Log Analysis
- Operates on the assumption that behaviors that occur constantly are normal, so the least frequent events are the ones to treat with suspicion
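
One simple way to surface the long tail is to count how often each event type occurs and list the rarest ones first. The sketch below assumes events are identified by an event ID and uses an arbitrary `max_count` threshold; both the function name `long_tail` and the sample data are illustrative.

```python
from collections import Counter


def long_tail(event_ids: list, max_count: int = 1) -> list[tuple]:
    """Return (event_id, count) pairs seen at most `max_count` times, rarest first."""
    counts = Counter(event_ids)
    rare = [(event_id, n) for event_id, n in counts.items() if n <= max_count]
    # Sort by count, then by ID so the output order is stable.
    return sorted(rare, key=lambda pair: (pair[1], str(pair[0])))


# Made-up sample: one very common event ID and a few rare ones.
observed = [4624] * 50 + [4625] * 3 + [4720, 1102]
print(long_tail(observed, max_count=3))  # [(1102, 1), (4720, 1), (4625, 3)]
```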