# Log Collection

All logs are collected in one of two ways:

- Log Agents
- Agentless

## Log Agents

Log agent software is required. Agents typically have parsing, log rotation, buffering, log integrity, encryption, and conversion features.

Pros:

- Tested by their developers and known to work
- Automatic parsing, encryption capabilities, log integrity verification

Cons:

- Increased resource consumption
- Increased cost

### Syslog

A very popular protocol for log transfer.

- Works over UDP and TCP
- Can be encrypted with TLS

Format:

- Timestamp
- Source Device
- Facility
- Severity
- Message Number
- Message Text

Maximum packet size:

- UDP: 1024 bytes
- TCP: 4096 bytes

### Third Party Agents

- Splunk: Universal Forwarder
- ArcSight: ArcSight Connectors

### Open Source Agents

- Beats
- NXLog

## Agentless

- No installation or update cost
- Usually, logs are sent by connecting to the target with SSH or WMI
  - Log server username and password are therefore required
  - These credentials could be compromised
- Easier to prepare and manage than the agent method
- Limited capabilities

### Manual Collection

Sometimes logs are novel enough that you just gotta write your own script.

# Log Aggregation and Parsing

Logs can be edited by the aggregator before being sent to the destination. The aggregator parses each log and sends only the requested part to the target.

## Aggregator EPS

EPS = events per second

The aggregator needs to be scaled appropriately to keep up with EPS.

Log Modification:

- Conversion of date/time format

Log Enrichment is the process of adding data to a log.

- Increases the efficiency of logs to save time
- Examples
  - Geolocation/IP Address
  - DNS
  - Add/Remove

# Log Storage

Large storage capacity is important, but access speed is equally important, if not more so.

WORM: Write Once Read Many

# Alerting

Timely alerts depend on storage search speed. Alerts can be created in two ways:

- Search stored data and then create an alert
- Create an alert while taking in the log

To create quality alerts, we must understand the data we have.
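
The two alerting approaches can be sketched roughly as below. This is only an illustration: the function names `alert_on_search` and `alert_on_ingest`, the dict-shaped log records, and the `rule` callable are all assumptions made for the example, not part of any particular SIEM.

```python
from typing import Callable, Iterable, Iterator

# Hypothetical log record: a plain dict of field name -> value.
Log = dict
Rule = Callable[[Log], bool]


def alert_on_search(stored_logs: Iterable[Log], rule: Rule) -> list:
    """Approach 1: search data that is already stored, then create alerts.

    Alert latency depends on how fast the stored data can be searched.
    """
    matches = [log for log in stored_logs if rule(log)]
    return [{"alert": "rule matched", "log": log} for log in matches]


def alert_on_ingest(incoming_logs: Iterable[Log], rule: Rule) -> Iterator[dict]:
    """Approach 2: check each log as it is taken in, before or while storing it."""
    for log in incoming_logs:
        if rule(log):
            yield {"alert": "rule matched", "log": log}


# Hypothetical sample data and rule, for illustration only.
sample_logs = [
    {"host": "ws01", "process": "explorer.exe"},
    {"host": "ws02", "process": "mimikatz.exe"},
]


def suspicious_process(log: Log) -> bool:
    return log.get("process") == "mimikatz.exe"


stored_alerts = alert_on_search(sample_logs, suspicious_process)
stream_alerts = list(alert_on_ingest(iter(sample_logs), suspicious_process))
```

Both functions apply the same rule. The difference is when it runs: after the logs are stored, or as each one arrives.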
## Blacklist

- If a process in the blacklist appears in the logs, create an alert
- Easy to manage and implement, but also easy to bypass

## Whitelist

- Range of IP addresses that are used for communication
- Highly effective
- Difficult to manage
- Requires constant updates

## Long Tail Log Analysis

- Operates on the assumption that behaviors that occur constantly are normal
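
A minimal sketch of the long tail idea: if constant behaviors are treated as normal, then the least frequent events are the ones worth a look. The function name `long_tail`, the 5% threshold, and the sample event names are assumptions chosen for illustration. A real analysis would pick the field and threshold to suit the data.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def long_tail(events: Iterable[str], max_share: float = 0.05) -> List[Tuple[str, int]]:
    """Return (event, count) pairs whose share of all events is at most max_share.

    Results are sorted rarest first, so the end of the "tail" comes first.
    """
    counts = Counter(events)
    total = sum(counts.values())
    rare = [(name, n) for name, n in counts.items() if n / total <= max_share]
    return sorted(rare, key=lambda item: item[1])


# Hypothetical sample: one very common event and two rare ones.
sample_events = (
    ["login_success"] * 95
    + ["login_failure"] * 4
    + ["audit_log_cleared"] * 1
)

rare_events = long_tail(sample_events)
```

With this sample, `login_success` makes up 95% of events and is treated as normal, while the two rare event types are surfaced for review.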