Aquilon DLP Documentation

Welcome to the Aquilon DLP documentation.

🍎 macOS | 🐧 Linux | 🏢 Enterprise

Note: Features marked with 🏢 are available in Enterprise edition only.

About This Documentation

This documentation covers:

  • Getting Started: Quick overview and setup guides
  • Installation: Platform-specific installation instructions
  • User Guide: Day-to-day configuration and usage
  • Deployment: Production deployment strategies
  • Administration: Operations, backup, and disaster recovery
  • Technical Reference: Architecture and API integration
  • Compliance: Regulatory framework implementations
  • Support: Troubleshooting and changelog

Editions

Aquilon DLP is available in two editions:

  • Basic Edition (🐧 Linux only): Up to 5 servers, GDPR/CCPA policies
  • Enterprise Edition (🍎 macOS + 🐧 Linux): Unlimited servers, full compliance suite (HIPAA, PCI, SOX, ISO 27001) plus government/defense frameworks (CUI, CMMC, FedRAMP, FISMA)

Choose the appropriate guide for your edition throughout this documentation.

Overview

Aquilon DLP is a production-grade data leak prevention solution built in Rust.

Key Features

  • Real-time Monitoring: Detect sensitive data as files are created or modified
  • Deep Content Analysis: Parse archives, Office documents, and PDFs
  • Pattern Detection: 35 scanner plugins for PII, secrets, and compliance patterns
  • OSQuery Integration: Query findings through standard osquery tables

Use Cases

Compliance Monitoring

Monitor endpoints for sensitive data that violates compliance requirements:

  • Healthcare (HIPAA): Detect protected health information (PHI) including medical records, insurance IDs, and patient data
  • Financial Services (PCI DSS, SOX): Find credit card numbers, CVVs, and financial records
  • Privacy Regulations (GDPR, CCPA): Identify personal data including names, addresses, and government IDs

Data Breach Prevention

Prevent data leaks before they become incidents:

  • Real-time Detection: Alert immediately when sensitive data appears in monitored directories
  • Removable Media Scanning: Automatically scan USB drives when mounted to detect exfiltration attempts
  • File Sharing Oversight: Monitor shared folders and collaboration directories

Security Auditing

Discover where sensitive data resides across your infrastructure:

  • Data Discovery: Scan endpoints to map sensitive data locations
  • Risk Assessment: Identify files with multiple policy violations
  • Coverage Verification: Ensure all endpoints are protected and reporting

Incident Response

Rapidly assess affected systems during security incidents:

  • Targeted Scanning: Query specific directories or file types
  • Historical Analysis: Review past alerts for patterns
  • Triage Workflow: Acknowledge, investigate, and resolve findings with audit trail

Quick Start

Get up and running with Aquilon DLP in 5 minutes.

Prerequisites

Before installing Aquilon DLP, ensure you have:

  • OSQuery: Version 5.0.1 or later (download from the osquery GitHub releases page)
  • Operating System:
    • 🍎 macOS 11.0 (Big Sur) or later
    • 🐧 Linux (Ubuntu 22.04+, RHEL 9+, Debian 11+, CentOS Stream 9+, Fedora 38+)
  • Privileges: Administrator (macOS) or root/sudo (Linux)
  • Resources: 2GB RAM minimum, 500MB disk space
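The osquery version requirement can be checked from a script before installing. The following is a minimal sketch rather than anything shipped with Aquilon DLP: `version_ge` and `check_osquery_version` are hypothetical helper names, the comparison relies on a `sort` that supports `-V` (GNU coreutils and recent macOS releases), and it assumes `osqueryd --version` prints the version number as the last word of its output.

```shell
#!/bin/sh
# Hypothetical pre-install check; not part of the Aquilon DLP packages.

# version_ge A B: succeeds when dotted version A is greater than or equal to B.
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# check_osquery_version MIN: compares the installed osqueryd against MIN.
check_osquery_version() {
    min=$1
    if ! command -v osqueryd >/dev/null 2>&1; then
        echo "osqueryd not found in PATH" >&2
        return 1
    fi
    # Assumes output of the form "osqueryd version 5.10.2".
    installed=$(osqueryd --version | awk '{print $NF}')
    if version_ge "$installed" "$min"; then
        echo "osquery $installed satisfies >= $min"
    else
        echo "osquery $installed is older than $min" >&2
        return 1
    fi
}
```

Calling `check_osquery_version 5.0.1` at the top of a provisioning script stops early on hosts that still need an osquery upgrade.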

Choose Your Edition

Aquilon DLP is available in two editions:

  • 🐧 Basic Edition (Linux only): GDPR and CCPA policies, up to 5 servers
  • 🏢 Enterprise Edition (macOS + Linux): All compliance frameworks (HIPAA, PCI DSS, SOX, ISO 27001, GDPR, CCPA)

Select your quick start path below:


🏢 macOS Enterprise Quick Start

Time: ~5 minutes

1. Install OSQuery

# Using Homebrew (recommended)
brew install --cask osquery

# Or download PKG from https://github.com/osquery/osquery/releases

2. Install Aquilon DLP Enterprise

Download the Enterprise Edition PKG installer from your organization’s portal and install:

# Install using PKG installer
sudo installer -pkg aquilon-dlp-enterprise-VERSION.pkg -target /

# Verify installation
aquilon-dlp --version

3. Configure

# Configuration is installed by PKG at /etc/aquilon/config.toml
# Edit as needed for your environment

# Grant Full Disk Access
# Open System Settings → Privacy & Security → Full Disk Access
# Click + and add the Aquilon DLP application

4. Start Monitoring

Aquilon DLP runs as an osquery extension. Start osquery to begin monitoring:

# Start osquery (Aquilon DLP extension loads automatically)
sudo osqueryd

5. Verify

# In a new terminal, query OSQuery
osqueryi --connect /var/osquery/osquery.sock  'SELECT * FROM aquilon_dlp_alerts LIMIT 5;'


🐧 Linux Basic Edition Quick Start

Time: ~5 minutes

1. Install OSQuery

# Ubuntu/Debian
export OSQUERY_KEY=1484120AC4E9F8A1A577AEEE97A80C63C9D8B80B
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys $OSQUERY_KEY
sudo add-apt-repository 'deb [arch=amd64] https://pkg.osquery.io/deb deb main'
sudo apt-get update
sudo apt-get install osquery

# RHEL/CentOS
curl -L https://pkg.osquery.io/rpm/GPG | sudo tee /etc/pki/rpm-gpg/RPM-GPG-KEY-osquery
sudo yum-config-manager --add-repo https://pkg.osquery.io/rpm/osquery-s3-rpm.repo
sudo yum-config-manager --enable osquery-s3-rpm
sudo yum install osquery

2. Install Aquilon DLP Basic

Download the Basic Edition package from your organization’s portal:

# Ubuntu/Debian
sudo apt install ./aquilon-dlp-basic_VERSION_amd64.deb

# RHEL/CentOS
sudo dnf install ./aquilon-dlp-basic-VERSION.x86_64.rpm

# Verify
aquilon-dlp --version

3. Configure

# Configuration is installed at /etc/aquilon/config.toml
# Edit as needed for your environment

# Validate configuration
aquilon-dlp --validate-config /etc/aquilon/config.toml

4. Start Monitoring

Aquilon DLP runs as an osquery extension. Start osquery to begin monitoring:

# Start osquery (Aquilon DLP extension loads automatically)
sudo systemctl start osqueryd

5. Verify

# In a new terminal, query OSQuery
osqueryi --connect /var/osquery/extensions.sock 'SELECT * FROM aquilon_dlp_alerts LIMIT 5;'


🏢 Linux Enterprise Quick Start

Time: ~5 minutes

1. Install OSQuery

# Ubuntu/Debian
curl -L https://pkg.osquery.io/deb/osquery_5.x_1.0.0_amd64.deb -o osquery.deb
sudo dpkg -i osquery.deb

# RHEL/CentOS
sudo yum install https://pkg.osquery.io/rpm/osquery-5.x-1.0.0.x86_64.rpm

2. Install Aquilon DLP Enterprise

Download the Enterprise Edition package from your organization’s portal:

# Ubuntu/Debian
sudo apt install ./aquilon-dlp-enterprise_VERSION_amd64.deb

# RHEL/CentOS
sudo dnf install ./aquilon-dlp-enterprise-VERSION.x86_64.rpm

# Verify
aquilon-dlp --version

3. Configure

# Configuration is installed at /etc/aquilon/config.toml
# Edit as needed for your environment

# Validate configuration
aquilon-dlp --validate-config /etc/aquilon/config.toml

4. Start Monitoring

Aquilon DLP runs as an osquery extension. Start osquery to begin monitoring:

# Start osquery (Aquilon DLP extension loads automatically)
sudo systemctl start osqueryd

5. Verify

# In a new terminal, query OSQuery for HIPAA violations
# (SQL string literals use single quotes, so the shell quoting is double)
osqueryi --connect /var/osquery/extensions.sock "SELECT * FROM aquilon_dlp_alerts WHERE policy = 'HIPAA' LIMIT 5;"

# Query PCI DSS findings
osqueryi --connect /var/osquery/extensions.sock "SELECT * FROM aquilon_dlp_alerts WHERE policy = 'PCI_DSS' LIMIT 5;"


What’s Next?

After completing the quick start:

  1. Production Setup: Configure systemd service (Linux) or LaunchDaemon (macOS) for automatic startup
  2. Customize Policies: Edit /etc/aquilon/config.toml to add watch paths and exclude directories
  3. Monitor Alerts: Integrate with your SIEM or set up OSQuery scheduled queries
  4. Review Architecture: Understand the system architecture and plugin system
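For step 3, an osquery scheduled query is declared under the `schedule` key of the osquery configuration. The sketch below prints such a fragment. The query name `aquilon_dlp_recent_alerts` and the 300-second default interval are illustrative assumptions, not Aquilon DLP defaults, and the column list is taken from the example queries in this documentation.

```shell
#!/bin/sh
# Prints an osquery configuration fragment with one scheduled query.
# print_alert_schedule is a hypothetical helper; the query name and the
# default interval are illustrative choices only.
print_alert_schedule() {
    interval=${1:-300}
    cat <<EOF
{
  "schedule": {
    "aquilon_dlp_recent_alerts": {
      "query": "SELECT path, policy, scanner, severity, timestamp FROM aquilon_dlp_alerts;",
      "interval": $interval
    }
  }
}
EOF
}
```

Run `print_alert_schedule 600` and merge the output into your existing osquery configuration file (commonly `/etc/osquery/osquery.conf` on Linux) rather than overwriting it.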

Troubleshooting

OSQuery extension not loading?

  • Verify OSQuery is running: pgrep -l osqueryd (ps aux | grep osquery also works, but it lists the grep process itself)
  • Check socket path matches configuration
  • Review OSQuery logs for extension errors

Permission errors (macOS)?

  • Ensure Full Disk Access is granted in System Settings
  • Restart the osquery LaunchDaemon after granting permissions

Policy not available (Basic Edition)?

  • Basic Edition only includes GDPR and CCPA
  • Remove enterprise policies (HIPAA, PCI DSS, SOX, ISO 27001) from configuration
  • Upgrade to Enterprise Edition for full policy support
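To find the offending entries quickly, a configuration file can be scanned for Enterprise-only policy names. This is a rough line-based sketch, not a TOML parser: `find_enterprise_policies` is a hypothetical helper, and it assumes `enabled_policies` is written on a single line using the lowercase identifiers shown in the Configuration section (`hipaa`, `pci_dss`, `sox`, `iso27001`).

```shell
#!/bin/sh
# find_enterprise_policies CONFIG: prints Enterprise-only policy names found
# in the enabled_policies line; exits non-zero when none are present.
# Line-based sketch; assumes a single-line enabled_policies array.
find_enterprise_policies() {
    config=$1
    grep -E '^[[:space:]]*enabled_policies[[:space:]]*=' "$config" \
        | grep -Eo '"[a-z0-9_]+"' \
        | tr -d '"' \
        | grep -Ex 'hipaa|pci_dss|sox|iso27001'
}
```

For example, `find_enterprise_policies /etc/aquilon/config.toml` lists the names to remove on a Basic Edition host and prints nothing when the configuration only enables GDPR and CCPA.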

High CPU usage?

  • Add exclusions for cache directories and system paths
  • Reduce num_workers in configuration
  • See Troubleshooting Guide for performance tuning

Installation

Aquilon DLP is available in two editions to meet different organizational needs. This section covers installation for all platforms and editions.

Edition Comparison

Feature | Basic Edition | Enterprise Edition
Platforms | Linux only | macOS + Linux
GDPR | Yes | Yes
CCPA | Yes | Yes
HIPAA | No | Yes
PCI DSS | No | Yes
SOX | No | Yes
ISO 27001 | No | Yes
Custom TOML Policies | Yes | Yes
Support | Community | Enterprise SLA
macOS Endpoint Security | No | Yes
MDM Deployment | No | Yes

Quick Installation Reference

macOS (Enterprise Only)

Note: macOS support requires the Enterprise Edition.

# Download PKG installer, then:
sudo installer -pkg aquilon-dlp-enterprise-VERSION.pkg -target /

See macOS Installation Guide for complete instructions including Full Disk Access setup.

Linux (Basic or Enterprise)

Ubuntu/Debian:

sudo apt install ./aquilon-dlp-{edition}_VERSION_amd64.deb

CentOS/RHEL:

sudo dnf install ./aquilon-dlp-{edition}-VERSION.x86_64.rpm

See Linux Basic Edition or Linux Enterprise Edition for complete instructions.

Prerequisites

All installations require:

  • osquery 5.0.1 or later - Download from GitHub releases
  • Administrator privileges - Installation requires root/sudo access

Platform-specific requirements:

Platform | Additional Requirements
macOS | macOS 11.0 (Big Sur) or later, Full Disk Access permission
Ubuntu/Debian | Ubuntu 22.04+ or Debian 11+
CentOS/RHEL | CentOS Stream 9+, RHEL 9+, or Fedora 38+

Choosing Your Edition

Basic Edition

Perfect for:

  • Small teams and startups
  • Organizations needing GDPR/CCPA compliance only
  • Evaluation and testing purposes

Install Linux Basic Edition

Enterprise Edition

Required for:

  • macOS deployment
  • HIPAA, PCI DSS, SOX, or ISO 27001 compliance
  • MDM-based deployment (Jamf, Intune, Kandji)

Install macOS Enterprise Edition | Install Linux Enterprise Edition

macOS Installation

Enterprise Edition Only: macOS support requires the Enterprise Edition of Aquilon DLP.

This guide covers installing Aquilon DLP on macOS using the PKG installer, including the required Full Disk Access configuration.

Prerequisites

Before installing Aquilon DLP, ensure you have:

  • macOS 11.0 (Big Sur) or later
  • osquery 5.0.1 or later - Download from GitHub releases
  • Administrator privileges

Install osquery

Download and install osquery from the official releases:

# Download the PKG installer from osquery.io
# Then install:
sudo installer -pkg osquery-5.10.2.pkg -target /

Verify the installation:

osqueryd --version
# Expected: osqueryd version 5.10.2 (or later)

Installation

Step 1: Download the Installer

Download the signed PKG installer for macOS from the Aquilon Security portal:

  • File: aquilon-dlp-enterprise-VERSION.pkg

Step 2: Install via GUI or Command Line

GUI Installation: Double-click the PKG file and follow the installation wizard.

Command Line Installation:

sudo installer -pkg aquilon-dlp-enterprise-VERSION.pkg -target /

Step 3: Verify Installation

Check that all components were installed correctly:

# Verify app bundle
ls -la /opt/aquilon/aquilon-dlp.app

# Verify configuration directory
ls -la /etc/aquilon/

# Verify data directory
ls -la /var/aquilon/dlp/

# Verify extension registered with osquery
cat /var/osquery/extensions.load

What Gets Installed:

Component | Location
App bundle | /opt/aquilon/aquilon-dlp.app
Configuration | /etc/aquilon/
Database | /var/db/aquilon/
Logs | /var/log/aquilon/
osquery extension | Registered in /var/osquery/extensions.load
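
The installed locations can also be checked in one pass, which is convenient when verifying many machines. A minimal sketch, where `report_missing` is a hypothetical helper that takes the paths to check as arguments, so nothing about the layout is hard-coded:

```shell
#!/bin/sh
# report_missing PATH...: prints every argument that does not exist and
# returns non-zero if at least one path is missing.
report_missing() {
    missing=0
    for p in "$@"; do
        if [ ! -e "$p" ]; then
            echo "missing: $p"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `report_missing /opt/aquilon/aquilon-dlp.app /etc/aquilon /var/db/aquilon /var/log/aquilon /var/osquery/extensions.load` checks the locations listed above and prints only the ones that are absent.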

Endpoint Security Setup

Aquilon DLP uses Apple’s Endpoint Security framework for real-time file monitoring. This requires granting Full Disk Access permission.

Grant Full Disk Access

  1. Open System Settings (or System Preferences on older macOS)
  2. Navigate to Privacy & Security > Full Disk Access
  3. Click the lock icon and authenticate
  4. Click + to add an application
  5. Navigate to /opt/aquilon/aquilon-dlp.app and add it
  6. Ensure the toggle is enabled

Verify Endpoint Security

After granting Full Disk Access, verify the extension loads correctly:

# Check osquery sees the extension
osqueryi --connect /var/osquery/osquery.sock 'SELECT * FROM osquery_extensions;'

# Query DLP tables
osqueryi --connect /var/osquery/osquery.sock 'SELECT * FROM aquilon_dlp_alerts LIMIT 5;'

MDM Deployment (Enterprise)

For enterprise environments, automate Full Disk Access grants via MDM using PPPC (Privacy Preferences Policy Control) profiles:

Supported MDM Platforms:

  • Jamf Pro
  • Microsoft Intune
  • Kandji
  • SimpleMDM, FileWave, Mosyle

Quick Setup:

  1. Upload the PPPC profile from deployment/mdm/ to your MDM
  2. Deploy profile to target devices
  3. Deploy the Aquilon DLP PKG

See the Deployment Guide for platform-specific MDM instructions.

Post-Installation

Initial Configuration

The installer creates a default configuration at /etc/aquilon/config.toml. Edit this file to customize:

sudo nano /etc/aquilon/config.toml

Key configuration options:

  • Watch paths: Directories to monitor for sensitive data
  • Enabled policies: HIPAA, PCI DSS, SOX, ISO 27001, GDPR, CCPA
  • Removable media scanning: Auto-scan USB drives on mount

See the Configuration Guide for complete options.

Verify DLP is Working

Test that Aquilon DLP is detecting files:

# Create a test file with sensitive data
echo "SSN: 223-41-1189" > /tmp/test-sensitive.txt

# Wait a moment for scanning, then query alerts
osqueryi --connect /var/osquery/osquery.sock 'SELECT * FROM aquilon_dlp_alerts;'

Upgrading

To upgrade to a new version:

# Download new PKG installer
# Install over existing installation
sudo installer -pkg aquilon-dlp-enterprise-NEW_VERSION.pkg -target /

Your configuration in /etc/aquilon/config.toml is preserved during upgrades.

Uninstalling

To completely remove Aquilon DLP:

# Remove the application
sudo rm -rf /opt/aquilon

# Remove configuration (optional; skip this step to keep your settings)
sudo rm -rf /etc/aquilon

# Remove data and logs (optional)
sudo rm -rf /var/aquilon /var/log/aquilon

# Remove from osquery extensions
sudo sed -i '' '/aquilon/d' /var/osquery/extensions.load

Troubleshooting

Common Issues

“Unsupported macOS version”

Aquilon DLP requires macOS 11.0 or later. Check your version:

sw_vers -productVersion

“Unsupported osquery version”

Aquilon DLP requires osquery 5.0.1 or later. Upgrade from osquery releases.

“Signature verification failed”

The PKG may be corrupted. Re-download it from the official source and verify the installer signature:

spctl -a -v --type install aquilon-dlp-enterprise-VERSION.pkg

Extension not loading in osquery

  1. Verify Full Disk Access is granted (see Endpoint Security Setup)

  2. Restart osqueryd:

    sudo launchctl unload /Library/LaunchDaemons/io.osquery.agent.plist
    sudo launchctl load /Library/LaunchDaemons/io.osquery.agent.plist
    

    Note: OSQuery 5.0.1+ uses io.osquery.agent.plist. Older versions use com.facebook.osqueryd.plist.

  3. Check logs:

    tail -f /var/log/aquilon/aquilon-dlp.log
    

“Installation already in progress”

Another installation is running. If a previous installation crashed, automatic stale-lock detection should remove the lock. If it does not, remove the lock manually:

sudo rm -rf /var/run/aquilon-install.lock

Linux Installation (Basic Edition)

Basic Edition Features: GDPR, CCPA, and custom TOML policies. Community support.

This guide covers installing Aquilon DLP Basic Edition on Linux using DEB or RPM packages.

Prerequisites

Before installing Aquilon DLP, ensure you have:

  • Supported Linux Distribution:
    • Ubuntu 22.04 LTS or later
    • Debian 11 or later
    • CentOS Stream 9 or later
    • RHEL 9 or later
    • Fedora 38 or later
  • osquery 5.0.1 or later
  • Administrator (root) privileges

Install osquery

Ubuntu/Debian:

# Download osquery DEB package
wget https://pkg.osquery.io/deb/osquery_5.10.2-1.linux_amd64.deb

# Install osquery
sudo apt install ./osquery_5.10.2-1.linux_amd64.deb

CentOS/RHEL:

# Download osquery RPM package
wget https://pkg.osquery.io/rpm/osquery-5.10.2-1.linux.x86_64.rpm

# Install osquery
sudo dnf install ./osquery-5.10.2-1.linux.x86_64.rpm

Verify the installation:

osqueryd --version
# Expected: osqueryd version 5.10.2 (or later)

Installation

Ubuntu/Debian

Step 1: Download the Package

Download the Basic Edition DEB package from the Aquilon Security portal:

  • File: aquilon-dlp-basic_VERSION_amd64.deb

Step 2: Install

sudo apt install ./aquilon-dlp-basic_VERSION_amd64.deb

Expected output:

Reading package lists... Done
Building dependency tree... Done
[INFO] Validating osquery installation...
[INFO] osquery validation passed
[INFO] Creating application directories...
[INFO] Extension binary permissions set: /usr/lib/osquery/extensions/aquilon-dlp-basic.ext
[INFO] Added extension to /etc/osquery/extensions.load
[INFO] Installation completed successfully

Step 3: Verify Installation

# Check binary location
ls -lh /usr/lib/osquery/extensions/aquilon-dlp-basic.ext
# Expected: -rwxr-xr-x 1 root root 9.3M ... aquilon-dlp-basic.ext

# Check osquery configuration
cat /etc/osquery/extensions.load
# Expected: /usr/lib/osquery/extensions/aquilon-dlp-basic.ext

# Restart osqueryd
sudo systemctl restart osqueryd
sudo systemctl status osqueryd
# Expected: active (running)

# Verify extension loaded
osqueryi --json "SELECT * FROM aquilon_dlp_alerts LIMIT 1;"
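
The registration check above can be scripted, for example as a post-install test in configuration management. This is a sketch under stated assumptions: `extension_registered` is a hypothetical helper, and it expects the extension path to appear on its own line in the load file, as in the expected output shown above.

```shell
#!/bin/sh
# extension_registered EXT_PATH [LOAD_FILE]: succeeds when EXT_PATH appears
# as an exact line in the osquery extensions load file.
extension_registered() {
    ext=$1
    load_file=${2:-/etc/osquery/extensions.load}
    [ -r "$load_file" ] && grep -Fxq "$ext" "$load_file"
}
```

Usage: `extension_registered /usr/lib/osquery/extensions/aquilon-dlp-basic.ext || echo "extension is not registered"`.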

CentOS/RHEL

Step 1: Download the Package

Download the Basic Edition RPM package from the Aquilon Security portal:

  • File: aquilon-dlp-basic-VERSION.x86_64.rpm

Step 2: Install

sudo dnf install ./aquilon-dlp-basic-VERSION.x86_64.rpm

Expected output:

Last metadata expiration check: ...
Dependencies resolved.
Installing:
 aquilon-dlp-basic        x86_64        VERSION        @commandline        9.3 M
[INFO] Validating osquery installation...
[INFO] osquery validation passed
[INFO] Creating application directories...
[INFO] Extension binary permissions set: /usr/lib/osquery/extensions/aquilon-dlp-basic.ext
[INFO] Added extension to /etc/osquery/extensions.load
[INFO] Installation completed successfully

Step 3: Verify Installation

# Check binary location
ls -lh /usr/lib/osquery/extensions/aquilon-dlp-basic.ext

# Check osquery configuration
cat /etc/osquery/extensions.load

# Restart osqueryd
sudo systemctl restart osqueryd
sudo systemctl status osqueryd

# Verify extension loaded
osqueryi --json "SELECT * FROM aquilon_dlp_alerts LIMIT 1;"

SELinux Considerations (RHEL/CentOS)

On systems with SELinux enabled, the installation script automatically restores security contexts. If issues occur:

# Check SELinux status
getenforce

# Manually restore contexts if needed
sudo restorecon -Rv /usr/lib/osquery/extensions/
sudo restorecon -Rv /etc/aquilon/

Post-Installation

Configuration

Copy the default configuration and customize:

sudo cp /etc/aquilon/config.toml.default /etc/aquilon/config.toml
sudo nano /etc/aquilon/config.toml

Basic Edition Policies:

The Basic Edition includes these compliance policies:

  • GDPR - EU General Data Protection Regulation
  • CCPA - California Consumer Privacy Act
  • Custom TOML Policies - Define your own detection rules

Example configuration:

watch_paths = ["/home/%%", "/var/data/%%", "/srv/%%"]

[policies]
enabled_policies = ["gdpr", "ccpa"]

See the Configuration Guide for complete options.

Verify DLP is Working

Test that Aquilon DLP is detecting files:

# Create a test file with sensitive data
echo "SSN: 223-41-6711" > /tmp/test-sensitive.txt

# Wait a moment for scanning, then query alerts
osqueryi --connect /var/osquery/osquery.sock 'SELECT * FROM aquilon_dlp_alerts;'
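
Because scanning is asynchronous, a scripted check may need to poll until the alert appears rather than query once. The sketch below retries once per second. `wait_for_alert` is a hypothetical helper, the `path` column is taken from the example queries in the User Guide, and the socket path is passed in as an argument because it depends on your osquery setup.

```shell
#!/bin/sh
# wait_for_alert SOCKET FILE_NAME [ATTEMPTS]: polls aquilon_dlp_alerts until
# an alert whose path ends in FILE_NAME shows up, or ATTEMPTS run out.
wait_for_alert() {
    socket=$1
    name=$2
    attempts=${3:-10}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if osqueryi --connect "$socket" --json \
            "SELECT path FROM aquilon_dlp_alerts WHERE path LIKE '%$name';" \
            | grep -Fq "$name"; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}
```

Usage: `wait_for_alert /var/osquery/osquery.sock test-sensitive.txt 30 && echo "alert recorded"`. The file name is interpolated into the SQL unescaped, so this is only suitable for trusted input such as a fixed test file name.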

Upgrading

Ubuntu/Debian:

# Stop osqueryd (optional)
sudo systemctl stop osqueryd

# Install new package
sudo apt install ./aquilon-dlp-basic_NEW_VERSION_amd64.deb

# Start osqueryd
sudo systemctl start osqueryd

CentOS/RHEL:

# Stop osqueryd (optional)
sudo systemctl stop osqueryd

# Upgrade package
sudo dnf upgrade ./aquilon-dlp-basic-NEW_VERSION.x86_64.rpm

# Start osqueryd
sudo systemctl start osqueryd

Your configuration in /etc/aquilon/config.toml is preserved during upgrades.

Uninstalling

Ubuntu/Debian:

# Remove package
sudo apt remove aquilon-dlp-basic

# Clean up configuration (optional)
sudo rm -rf /etc/aquilon /var/lib/aquilon /var/log/aquilon

CentOS/RHEL:

# Remove package
sudo dnf remove aquilon-dlp-basic

# Clean up configuration (optional)
sudo rm -rf /etc/aquilon /var/lib/aquilon /var/log/aquilon

Upgrading to Enterprise Edition

Need HIPAA, PCI DSS, SOX, or ISO 27001 compliance? Upgrade to the Enterprise Edition:

  1. Uninstall the Basic Edition (skip the optional cleanup step so that /etc/aquilon is kept)
  2. Install the Enterprise Edition
  3. Your configuration is preserved
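
Since the optional cleanup step under Uninstalling deletes /etc/aquilon, it can be worth copying the configuration aside before switching editions. A minimal sketch, in which `backup_config` and the backup directory are hypothetical names chosen for illustration:

```shell
#!/bin/sh
# backup_config SRC_DIR DEST_ROOT: copies SRC_DIR into a timestamped folder
# under DEST_ROOT and prints the location of the copy.
backup_config() {
    src=$1
    dest_root=$2
    stamp=$(date +%Y%m%d-%H%M%S)
    dest="$dest_root/aquilon-config-$stamp"
    mkdir -p "$dest_root" && cp -Rp "$src" "$dest" && echo "$dest"
}
```

Run as root, for example `backup_config /etc/aquilon /root/aquilon-backups`, before removing the Basic Edition package.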

Contact sales@aquilonsecurity.com for Enterprise Edition access.

Troubleshooting

Common Issues

“osquery not found” during installation

Install osquery before installing Aquilon DLP:

# Ubuntu/Debian
sudo apt install ./osquery_5.10.2-1.linux_amd64.deb

# CentOS/RHEL
sudo dnf install ./osquery-5.10.2-1.linux.x86_64.rpm

Extension not loading

  1. Check extension is registered:

    cat /etc/osquery/extensions.load
    
  2. Restart osqueryd:

    sudo systemctl restart osqueryd
    
  3. Check logs:

    journalctl -u osqueryd -f
    

Permission denied errors

Verify the extension has correct permissions:

ls -la /usr/lib/osquery/extensions/aquilon-dlp-basic.ext
# Should be: -rwxr-xr-x root root

Linux Installation (Enterprise Edition)

Enterprise Edition Features: All compliance policies (HIPAA, PCI DSS, SOX, ISO 27001, GDPR, CCPA), unlimited servers, enterprise SLA support.

This guide covers installing Aquilon DLP Enterprise Edition on Linux using DEB or RPM packages.

Prerequisites

Before installing Aquilon DLP, ensure you have:

  • Supported Linux Distribution:
    • Ubuntu 22.04 LTS or later
    • Debian 11 or later
    • CentOS Stream 9 or later
    • RHEL 9 or later
    • Fedora 38 or later
  • osquery 5.0.1 or later
  • Administrator (root) privileges

Install osquery

Ubuntu/Debian:

# Download osquery DEB package
wget https://pkg.osquery.io/deb/osquery_5.10.2-1.linux_amd64.deb

# Install osquery
sudo apt install ./osquery_5.10.2-1.linux_amd64.deb

CentOS/RHEL:

# Download osquery RPM package
wget https://pkg.osquery.io/rpm/osquery-5.10.2-1.linux.x86_64.rpm

# Install osquery
sudo dnf install ./osquery-5.10.2-1.linux.x86_64.rpm

Verify the installation:

osqueryd --version
# Expected: osqueryd version 5.10.2 (or later)

Installation

Ubuntu/Debian

Step 1: Download the Package

Download the Enterprise Edition DEB package from the Aquilon Security portal:

  • File: aquilon-dlp-enterprise_VERSION_amd64.deb

Step 2: Install

sudo apt install ./aquilon-dlp-enterprise_VERSION_amd64.deb

Expected output:

Reading package lists... Done
Building dependency tree... Done
[INFO] Validating osquery installation...
[INFO] osquery validation passed
[INFO] Creating application directories...
[INFO] Extension binary permissions set: /usr/lib/osquery/extensions/aquilon-dlp-enterprise.ext
[INFO] Added extension to /etc/osquery/extensions.load
[INFO] Installation completed successfully

Step 3: Verify Installation

# Check binary location
ls -lh /usr/lib/osquery/extensions/aquilon-dlp-enterprise.ext
# Expected: -rwxr-xr-x 1 root root 9.3M ... aquilon-dlp-enterprise.ext

# Check osquery configuration
cat /etc/osquery/extensions.load
# Expected: /usr/lib/osquery/extensions/aquilon-dlp-enterprise.ext

# Restart osqueryd
sudo systemctl restart osqueryd
sudo systemctl status osqueryd
# Expected: active (running)

# Verify extension loaded
osqueryi --json "SELECT * FROM aquilon_dlp_alerts LIMIT 1;"

CentOS/RHEL

Step 1: Download the Package

Download the Enterprise Edition RPM package from the Aquilon Security portal:

  • File: aquilon-dlp-enterprise-VERSION.x86_64.rpm

Step 2: Install

sudo dnf install ./aquilon-dlp-enterprise-VERSION.x86_64.rpm

Expected output:

Last metadata expiration check: ...
Dependencies resolved.
Installing:
 aquilon-dlp-enterprise        x86_64        VERSION        @commandline        9.3 M
[INFO] Validating osquery installation...
[INFO] osquery validation passed
[INFO] Creating application directories...
[INFO] Extension binary permissions set: /usr/lib/osquery/extensions/aquilon-dlp-enterprise.ext
[INFO] Added extension to /etc/osquery/extensions.load
[INFO] Installation completed successfully

Step 3: Verify Installation

# Check binary location
ls -lh /usr/lib/osquery/extensions/aquilon-dlp-enterprise.ext

# Check osquery configuration
cat /etc/osquery/extensions.load

# Restart osqueryd
sudo systemctl restart osqueryd
sudo systemctl status osqueryd

# Verify extension loaded
osqueryi --json "SELECT * FROM aquilon_dlp_alerts LIMIT 1;"

SELinux Considerations (RHEL/CentOS)

On systems with SELinux enabled, the installation script automatically restores security contexts. If issues occur:

# Check SELinux status
getenforce

# Verify extension details
ls -Z /usr/lib/osquery/extensions/aquilon-dlp-enterprise.ext

# Manually restore contexts if needed
sudo restorecon -Rv /usr/lib/osquery/extensions/
sudo restorecon -Rv /etc/aquilon/

Post-Installation

Configuration

Copy the default configuration and customize:

sudo cp /etc/aquilon/config.toml.default /etc/aquilon/config.toml
sudo nano /etc/aquilon/config.toml

Enterprise Edition Policies:

The Enterprise Edition includes all compliance policies:

  • GDPR - EU General Data Protection Regulation
  • CCPA - California Consumer Privacy Act
  • HIPAA - Health Insurance Portability and Accountability Act
  • PCI DSS - Payment Card Industry Data Security Standard
  • SOX - Sarbanes-Oxley Act
  • ISO 27001 - Information Security Management
  • Custom TOML Policies - Define your own detection rules

Example configuration for a healthcare organization:

watch_paths = ["/home/%%", "/var/data/%%", "/srv/%%", "/mnt/medical-records/%%"]

[policies]
enabled_policies = ["hipaa", "gdpr", "pci_dss"]

[policies.policy_configs.hipaa]
enabled = true
settings = { confidence_threshold = "0.8" }

Example configuration for financial services:

watch_paths = ["/home/%%", "/var/data/%%", "/srv/transactions/%%"]

[policies]
enabled_policies = ["pci_dss", "sox", "gdpr", "ccpa"]

[policies.policy_configs.pci_dss]
enabled = true
settings = { alert_on_test_data = "false" }

See the Configuration Guide for complete options and the Compliance Documentation for policy context.

Verify DLP is Working

Test that Aquilon DLP is detecting files:

# Create a test file with sensitive data
echo "SSN: 223-41-6729" > /tmp/test-sensitive.txt

# Wait a moment for scanning, then query alerts
osqueryi --connect /var/osquery/osquery.sock 'SELECT * FROM aquilon_dlp_alerts;'

Enterprise Features

Unlimited Server Deployment

The Enterprise Edition supports unlimited servers. For large-scale deployments:

  1. Use configuration management (Ansible, Puppet, Chef) for consistent deployment
  2. Consider centralized logging aggregation
  3. Use osquery fleet management tools like Fleet or Kolide

Enterprise Support

Enterprise customers receive:

  • Priority support with SLA guarantees
  • Direct access to engineering team
  • Custom policy development assistance
  • Deployment and integration consulting

Contact your account representative for support.

Upgrading

Ubuntu/Debian:

# Stop osqueryd (optional)
sudo systemctl stop osqueryd

# Install new package
sudo apt install ./aquilon-dlp-enterprise_NEW_VERSION_amd64.deb

# Start osqueryd
sudo systemctl start osqueryd

CentOS/RHEL:

# Stop osqueryd (optional)
sudo systemctl stop osqueryd

# Upgrade package
sudo dnf upgrade ./aquilon-dlp-enterprise-NEW_VERSION.x86_64.rpm

# Start osqueryd
sudo systemctl start osqueryd

Your configuration in /etc/aquilon/config.toml is preserved during upgrades. The RPM package uses %config(noreplace) to ensure this.

Uninstalling

Ubuntu/Debian:

# Remove package
sudo apt remove aquilon-dlp-enterprise

# Clean up configuration (optional)
sudo rm -rf /etc/aquilon /var/lib/aquilon /var/log/aquilon

CentOS/RHEL:

# Remove package
sudo dnf remove aquilon-dlp-enterprise

# Clean up configuration (optional)
sudo rm -rf /etc/aquilon /var/lib/aquilon /var/log/aquilon

Troubleshooting

Common Issues

“osquery not found” during installation

Install osquery before installing Aquilon DLP:

# Ubuntu/Debian
sudo apt install ./osquery_5.10.2-1.linux_amd64.deb

# CentOS/RHEL
sudo dnf install ./osquery-5.10.2-1.linux.x86_64.rpm

Extension not loading

  1. Check extension is registered:

    cat /etc/osquery/extensions.load
    
  2. Restart osqueryd:

    sudo systemctl restart osqueryd
    
  3. Check logs:

    journalctl -u osqueryd -f
    

SELinux blocking access

On RHEL/CentOS with SELinux enforcing:

# Check for denials
sudo ausearch -m avc -ts recent

# Restore contexts
sudo restorecon -Rv /usr/lib/osquery/extensions/
sudo restorecon -Rv /etc/aquilon/

Permission denied errors

Verify the extension has correct permissions:

ls -la /usr/lib/osquery/extensions/aquilon-dlp-enterprise.ext
# Should be: -rwxr-xr-x root root

User Guide

This guide covers the day-to-day configuration and usage of Aquilon DLP. Whether you’re setting up initial monitoring, configuring compliance policies, or analyzing alerts, you’ll find the information you need here.

Sections

Configuration

Learn how to configure Aquilon DLP for your environment:

  • Configuration file location and format
  • Watch paths and file monitoring
  • Caching and performance settings
  • Removable media auto-scanning
  • Performance tuning options

Policy Frameworks

Understand and configure compliance policies:

  • Built-in compliance frameworks (GDPR, CCPA, HIPAA, PCI DSS, SOX, ISO 27001)
  • Edition-specific policy availability
  • Policy configuration options
  • Creating custom TOML policies and scanners
  • Rule types and composition

Monitoring

Monitor Aquilon DLP operation and analyze findings:

  • Querying the osquery tables
  • Interpreting alert data
  • Cache status and performance metrics
  • Log analysis and troubleshooting
  • Integration with SIEM systems

Getting Started

After installing Aquilon DLP (see Installation), follow these steps:

  1. Configure watch paths - Define which directories to monitor for sensitive data
  2. Enable policies - Select compliance frameworks appropriate for your organization
  3. Verify operation - Create test files and query alerts to confirm detection
  4. Monitor continuously - Review alerts, tune confidence thresholds, and add exclusions

Quick Reference

Configuration File Location

Platform | Location
macOS | /etc/aquilon/config.toml
Linux | /etc/aquilon/config.toml

Common osquery Queries

-- View recent alerts
SELECT * FROM aquilon_dlp_alerts
ORDER BY timestamp DESC LIMIT 10;

-- Count alerts by policy
SELECT policy, COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY policy;

-- View alert details
SELECT path, scanner, severity, timestamp
FROM aquilon_dlp_alerts LIMIT 10;

Edition Policy Availability

Policy | Basic Edition | Enterprise Edition
GDPR | Yes | Yes
CCPA | Yes | Yes
HIPAA | No | Yes
PCI DSS | No | Yes
SOX | No | Yes
ISO 27001 | No | Yes
Custom TOML | Yes | Yes

Configuration

Aquilon DLP is configured through a TOML file that controls all aspects of operation including watch paths, policies, caching, and performance settings.

Configuration File

Location

Platform | Default Location
macOS | /etc/aquilon/config.toml
Linux | /etc/aquilon/config.toml

Initial Setup

After installation, copy the default configuration and customize:

sudo cp /etc/aquilon/config.toml.default /etc/aquilon/config.toml
sudo nano /etc/aquilon/config.toml

Core Configuration

Watch Paths

Define which directories Aquilon DLP monitors for sensitive data. Use %% to recursively watch all subdirectories:

watch_paths = [
    "/home/%%",
    "/var/data/%%",
    "/srv/%%",
    "/Users/%%"
]

Path Syntax:

  • /path/to/dir/%% - Watch directory and all subdirectories recursively
  • /path/to/dir - Watch only the directory itself (no recursion)

Best Practices:

  • Include directories where users store documents
  • Include shared drives and collaboration folders
  • System directories are excluded by default, so you do not need to list them
  • Exclude known-safe directories, such as source code repositories

Exclusions

Exclude specific paths from monitoring:

watch_paths = ["/home/%%", "/var/data/%%"]

# Exclude specific directories
exclude_paths = [
    "/home/*/.cache/%%",
    "/home/*/Downloads/%%",
    "/var/log/%%"
]
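The interaction between watch_paths and exclude_paths can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Aquilon DLP's actual matcher: it assumes a trailing /%% means "everything below this directory", a path without %% covers only files directly inside the directory, * stands for exactly one path component, and exclusions take priority over watch paths. The function names are hypothetical.

```python
import re

def pattern_to_regex(pattern):
    """Translate a watch/exclude pattern into a compiled regex (assumed semantics)."""
    recursive = pattern.endswith("/%%")
    base = pattern[:-3] if recursive else pattern.rstrip("/")
    # Assumption: '*' matches exactly one path component.
    parts = ["[^/]+" if part == "*" else re.escape(part) for part in base.split("/")]
    # Recursive patterns match any depth; plain directories match direct children only.
    tail = "/.+" if recursive else "/[^/]+"
    return re.compile("^" + "/".join(parts) + tail + "$")

def is_watched(path, watch_paths, exclude_paths):
    """Return True if path is covered by a watch pattern and not excluded."""
    watched = any(pattern_to_regex(p).match(path) for p in watch_paths)
    excluded = any(pattern_to_regex(p).match(path) for p in exclude_paths)
    return watched and not excluded

WATCH = ["/home/%%", "/var/data/%%"]
EXCLUDE = ["/home/*/.cache/%%", "/home/*/Downloads/%%", "/var/log/%%"]

print(is_watched("/home/alice/docs/report.txt", WATCH, EXCLUDE))
print(is_watched("/home/alice/.cache/thumb.png", WATCH, EXCLUDE))
```

Under these assumptions the first path is reported as watched and the second as excluded. Use --validate-config and a test scan to confirm how your installed version actually resolves overlapping patterns.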

Policy Configuration

Enable Policies

Select which compliance frameworks to enable:

[policies]
enabled_policies = ["gdpr", "ccpa", "hipaa", "pci_dss", "sox", "iso27001"]

Available Policies:

Policy | Description | Edition
gdpr | EU General Data Protection Regulation | All
ccpa | California Consumer Privacy Act | All
hipaa | Health Insurance Portability and Accountability Act | Enterprise
pci_dss | Payment Card Industry Data Security Standard | Enterprise
sox | Sarbanes-Oxley Act | Enterprise
iso27001 | Information Security Management | Enterprise
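Before deploying a configuration, it can help to check enabled_policies against the edition you are licensed for. The sketch below encodes the availability table as plain Python sets. The function name and the "basic"/"enterprise" edition labels are illustrative assumptions, and custom policies are not covered.

```python
# Built-in policy availability, taken from the table above.
BASIC = {"gdpr", "ccpa"}
ENTERPRISE = BASIC | {"hipaa", "pci_dss", "sox", "iso27001"}
AVAILABLE = {"basic": BASIC, "enterprise": ENTERPRISE}

def unavailable_policies(enabled_policies, edition):
    """Return the enabled policies that the given edition does not include."""
    return [p for p in enabled_policies if p not in AVAILABLE[edition]]

enabled = ["gdpr", "ccpa", "hipaa", "pci_dss", "sox", "iso27001"]
print(unavailable_policies(enabled, "basic"))
print(unavailable_policies(enabled, "enterprise"))
```

A non-empty result for your edition is a hint that the configuration may fail validation with an unknown policy name.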

Policy-Specific Settings

Configure individual policy behavior:

[policies.policy_configs.hipaa]
enabled = true
settings = { confidence_threshold = "0.8" }

[policies.policy_configs.pci_dss]
enabled = true
settings = { alert_on_test_data = "false" }

[policies.policy_configs.iso27001]
enabled = true
settings = { confidence_threshold = "0.7", enforce_data_masking = "true" }

Caching Configuration

Aquilon DLP uses a two-tier caching system (an in-memory cache backed by a database scan cache) to minimize redundant scanning:

[cache]
# Enable/disable caching (default: true)
enabled = true

# In-memory cache TTL in seconds (default: 0 = no expiry)
ttl_secs = 3600

# Database scan cache TTL in days (default: 7)
scan_cache_ttl_days = 7

Note: The database location is configured via the top-level database_path field:

# Linux: /var/lib/aquilon/aquilon.db
# macOS: /var/db/aquilon/aquilon.db
database_path = "/var/lib/aquilon/aquilon.db"

Platform Note: Default database paths differ by platform. The macOS PKG installer automatically configures the macOS path at /var/db/aquilon/aquilon.db.

Cache Performance

  • Cache hit on clean file: <5ms p99
  • Cache hit with alerts: <20ms p95
  • Cache hit vs. full scan: 10-100x faster

Cache Behavior

File State | Cache Behavior
Clean (no findings) | Fully cached, subsequent scans skipped
Has alerts | Alert details cached as JSON (up to 25 alerts)
Modified | Cache entry invalidated, full rescan
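The three cache states can be modeled with a small in-memory sketch. This is a simplified illustration, not the product's implementation: the ScanCache class, the (mtime, size) fingerprint, and the scan_with_cache helper are hypothetical names, and only the 25-alert cap comes from the table above.

```python
MAX_CACHED_ALERTS = 25  # cap taken from the cache behavior table

class ScanCache:
    """Toy single-tier cache keyed by path (hypothetical helper)."""

    def __init__(self):
        self._entries = {}  # path -> (fingerprint, cached alerts)

    def lookup(self, path, fingerprint):
        entry = self._entries.get(path)
        if entry is None or entry[0] != fingerprint:
            # Unknown or modified file: drop any stale entry and force a rescan.
            self._entries.pop(path, None)
            return None
        return entry[1]

    def store(self, path, fingerprint, alerts):
        self._entries[path] = (fingerprint, list(alerts[:MAX_CACHED_ALERTS]))

def scan_with_cache(cache, path, fingerprint, full_scan):
    """Return (alerts, cache_hit). full_scan is any callable taking a path."""
    cached = cache.lookup(path, fingerprint)
    if cached is not None:
        return cached, True          # clean files return an empty list here
    alerts = full_scan(path)
    cache.store(path, fingerprint, alerts)
    return alerts, False
```

Passing a fingerprint such as (mtime, size) is an assumption made for this sketch; the point is only that an unchanged fingerprint skips the scan, while any change invalidates the entry.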

Removable Media

Configure automatic scanning of USB drives and external media:

[removable_media]
# Automatically scan removable media when mounted (default: false)
auto_scan_on_mount = true

Platform-Specific Behavior:

Platform | Detection Method | Monitored Paths
macOS | Endpoint Security mount events | /Volumes/* (excluding system)
Linux | /proc/self/mounts polling | /media/*, /mnt/*, /run/media/*

Use Cases:

  • Data exfiltration detection
  • Compliance monitoring for removable media
  • Incident response device scanning

Performance Note: Scanning a large external drive (8TB+) that holds a lot of data can take a long time. Consider the CPU and disk I/O impact before enabling auto-scan.

Performance Tuning

Scan Settings

[scan]
# Maximum findings per scanner per file (default: 5)
max_findings_per_scanner = 5

# Maximum file size in MB to scan (default: 40)
max_scan_size_mb = 40

# Maximum recursion depth for nested archives (default: 5)
max_recursion_depth = 5

# Regex size limits
regex_size_limit_mb = 10
regex_dfa_size_limit_mb = 2

# File update cooldown in minutes (default: 30)
file_update_cooldown_mins = 30

# Event coalesce delay in seconds (default: 120)
event_coalesce_delay_secs = 120

Resource Limits

[resource_limits]
# Enable resource limiting (default: false)
enabled = true

# Maximum CPU usage percentage (default: 50.0)
max_cpu_percent = 50.0

# Maximum memory in MB (default: 512)
max_memory_mb = 512

# Maximum disk I/O in MB/s (default: 50.0)
max_disk_io_mbps = 50.0

# Process nice level (default: 10)
nice_level = 10

# Throttle delay between scans in ms (default: 10)
throttle_delay_ms = 10

Worker Configuration

[worker]
# Number of worker threads (default: 0 = auto-detect CPU cores)
num_workers = 4

# Timeout for receiving work items in ms (default: 1000)
recv_timeout_ms = 1000

[work_queue]
# Maximum queue size (default: 10000)
max_queue_size = 10000

# Submit timeout in seconds (default: 5)
submit_timeout_secs = 5

Context Configuration

[context]
# Context window size in bytes for surrounding text capture (default: 200)
# Larger values capture more surrounding text but slow down scanning
window_size = 200

# Enable specific context profiles
# Available profiles:
#   - healthcare: Medical terms (patient, diagnosis, HIPAA keywords)
#   - payment: Financial transaction terms (credit card, payment, PCI keywords)
#   - personal_data: PII identifiers (SSN, address, contact info)
#   - employment: HR/payroll terms (employee, salary, W-2)
#   - sox_financial: SOX compliance terms (revenue, earnings, 10-K, quarterly)
#   - gdpr_phone: Personal vs business phone context (mobile, cell, office)
enabled_profiles = ["healthcare", "payment", "personal_data", "employment", "sox_financial", "gdpr_phone"]

Context Trace

Enable debug tracing for context enrichment decisions. When enabled, detailed JSON logs are emitted showing how each finding’s confidence was adjusted based on surrounding context.

Note: This feature generates verbose output and should only be enabled when debugging enrichment behavior (e.g., investigating false positives or negatives).

[context_trace]
# Enable context enrichment debug tracing (default: false)
# When enabled, emits JSON logs showing enrichment decisions:
# - Original confidence scores
# - Context profiles matched
# - Confidence adjustments applied
# - Final enriched confidence
enabled = false

See Troubleshooting: Debugging Enrichment for usage guidance.

CPU Debugging

Enable detailed performance metrics for troubleshooting:

[cpu_debugging]
# Enable CPU debugging features (default: true)
enabled = true

# Histogram buckets for latency tracking in ms (must be ascending)
histogram_buckets = [10, 50, 100, 500, 1000, 5000, 10000, 30000]

# Threshold for slow file warnings in ms (default: 1000)
slow_file_threshold_ms = 1000

# Maximum slow files to track (default: 10)
max_slow_files = 10

# Enable worker thread status tracking (default: true)
worker_tracking_enabled = true

# Enable performance alerting (default: false)
alerting_enabled = false

# Scanner processing time alert threshold in ms (default: 5000)
scanner_alert_threshold_ms = 5000

# Work queue pending items alert threshold (default: 1000)
queue_alert_threshold = 1000

Database Maintenance

Aquilon DLP includes automatic database maintenance to manage disk usage and keep the local database cache healthy. The local database is designed as a cache—your SIEM should handle long-term retention.

⚠️ Compliance Warning

The default findings_max_age_days of 7 days is far shorter than the retention periods these frameworks require:

  • HIPAA: 6 years (2190 days)
  • SOX: 7 years (2555 days)
  • PCI DSS: 1 year (365 days)

Ensure your SIEM captures findings for long-term retention before enabling aggressive cleanup. The local database is intended as a cache, not permanent storage.
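The day counts above are simply years multiplied by 365. As a rough sketch, the gap between local retention and each requirement can be computed as follows. The function names are illustrative, and the figures are the ones quoted in this warning, not legal advice.

```python
# Retention periods quoted in the warning above, in years.
REQUIRED_YEARS = {"HIPAA": 6, "SOX": 7, "PCI DSS": 1}

def required_days(years):
    # Uses 365-day years, matching the figures in this document.
    return years * 365

def retention_shortfall(findings_max_age_days):
    """Days of retention the SIEM must cover beyond the local database."""
    return {
        name: max(0, required_days(years) - findings_max_age_days)
        for name, years in REQUIRED_YEARS.items()
    }

print(retention_shortfall(7))
```

With the default of 7 days, almost the entire retention period has to be covered by the SIEM.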

Basic Configuration

[maintenance]
# Enable background maintenance thread (default: true)
enabled = true

# Interval between maintenance runs in seconds (default: 3600 = 1 hour)
# Minimum: 60 seconds
interval_secs = 3600

Retention Settings

Configure how long data is retained before cleanup:

[maintenance.retention]
# Maximum age for findings before soft-delete (default: 7 days)
# Minimum: 1 day
findings_max_age_days = 7

# Maximum age for scan cache entries (default: 7 days)
# Minimum: 1 day
cache_max_age_days = 7

# Days to wait before hard-deleting soft-deleted findings (default: 1)
# Set to 0 for immediate hard delete (only once your SIEM has captured the data)
hard_delete_grace_days = 1

Vacuum Settings

Configure incremental vacuum to reclaim disk space:

[maintenance.vacuum]
# Pages to reclaim per incremental vacuum run (default: 1000)
# Each page is ~4KB, so 1000 pages = ~4MB per run
# Set to 0 to disable vacuum operations
incremental_pages = 1000

Manual Maintenance

Run maintenance immediately without starting the daemon:

# Run maintenance once and exit
aquilon-dlp --maintenance-now --config /etc/aquilon/config.toml

# Output is JSON with counts and duration:
# {
#   "soft_deleted": 42,
#   "hard_deleted": 15,
#   "cache_evicted": 128,
#   "pages_vacuumed": 1000,
#   "duration_ms": 234,
#   "errors": []
# }
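Because the maintenance run prints JSON, its result is easy to check from a script or cron job. The sketch below parses a copy of the sample output shown above. The summarize helper is hypothetical, and the megabyte estimate assumes the ~4KB page size mentioned under Vacuum Settings.

```python
import json

# Sample output copied from the example above.
SAMPLE_OUTPUT = """
{
  "soft_deleted": 42,
  "hard_deleted": 15,
  "cache_evicted": 128,
  "pages_vacuumed": 1000,
  "duration_ms": 234,
  "errors": []
}
"""

def summarize(raw):
    """Return (succeeded, approx_reclaimed_mb) for one maintenance report."""
    report = json.loads(raw)
    succeeded = not report["errors"]
    # Assumes ~4KB per page, as stated in the vacuum settings.
    reclaimed_mb = report["pages_vacuumed"] * 4 / 1024
    return succeeded, round(reclaimed_mb, 1)

print(summarize(SAMPLE_OUTPUT))
```

In practice you would feed the command's stdout into summarize instead of the embedded sample, and alert when the errors list is non-empty.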

See Operations for additional database management commands.

Logging Configuration

Logging is configured via the RUST_LOG environment variable:

# Set log level
export RUST_LOG=info

# Set per-module log levels
export RUST_LOG=aquilon_dlp=debug,warn

# Available levels: error, warn, info, debug, trace

The application uses structured logging via the tracing crate. Logs are written to stdout/stderr and can be redirected as needed by your init system.

OSQuery Integration

Aquilon DLP exposes alerts via an osquery virtual table. Configure its behavior:

[osquery]
# Maximum rows returned for alerts table without explicit LIMIT clause
# Prevents memory exhaustion from unbounded queries
# Default: 10000, set to 0 for unlimited (not recommended)
max_alert_rows = 10000

Note: When querying large alert sets, use WHERE clauses to filter results. Unbounded SELECT * FROM aquilon_dlp_alerts queries will be truncated at this limit.

Example Configurations

Healthcare Organization (HIPAA Focus)

watch_paths = ["/home/%%", "/var/data/%%", "/srv/%%", "/mnt/medical-records/%%"]
exclude_paths = ["/home/*/.cache/%%"]
database_path = "/var/lib/aquilon/aquilon.db"

[policies]
enabled_policies = ["hipaa", "gdpr", "pci_dss"]

[policies.policy_configs.hipaa]
enabled = true
settings = { confidence_threshold = "0.8" }

[removable_media]
auto_scan_on_mount = true

[cache]
enabled = true
ttl_secs = 3600
scan_cache_ttl_days = 7

[scan]
max_findings_per_scanner = 10
max_scan_size_mb = 100

[resource_limits]
enabled = true
max_cpu_percent = 75.0
max_memory_mb = 1024

Financial Services (PCI DSS/SOX Focus)

watch_paths = ["/home/%%", "/var/data/%%", "/srv/transactions/%%"]
exclude_paths = ["/home/*/Downloads/%%"]
# Linux: /var/lib/aquilon/aquilon.db
# macOS: /var/db/aquilon/aquilon.db
database_path = "/var/lib/aquilon/aquilon.db"

[policies]
enabled_policies = ["pci_dss", "sox", "gdpr", "ccpa"]

[policies.policy_configs.pci_dss]
enabled = true
settings = { alert_on_test_data = "false" }

[policies.policy_configs.sox]
enabled = true
settings = { confidence_threshold = "0.85" }

[removable_media]
auto_scan_on_mount = true

[cache]
enabled = true
ttl_secs = 7200
scan_cache_ttl_days = 14

[worker]
num_workers = 8

[resource_limits]
enabled = true
max_cpu_percent = 60.0
max_memory_mb = 1024

Small Business (Basic Edition)

watch_paths = ["/home/%%", "/var/data/%%"]
exclude_paths = ["/home/*/.cache/%%"]
# Linux: /var/lib/aquilon/aquilon.db
# macOS: /var/db/aquilon/aquilon.db
database_path = "/var/lib/aquilon/aquilon.db"

[policies]
enabled_policies = ["gdpr", "ccpa"]

[cache]
enabled = true
ttl_secs = 3600
scan_cache_ttl_days = 7

[scan]
max_findings_per_scanner = 5
max_scan_size_mb = 40

[worker]
num_workers = 2  # Conservative for small systems

[resource_limits]
enabled = true
max_cpu_percent = 30.0
max_memory_mb = 256

Complete Example Configurations

For complete, production-ready configuration examples, see:

  • Basic Edition: docs/config-examples/aquilon_dlp_config_basic.toml
  • Enterprise Edition: docs/config-examples/aquilon_dlp_config_enterprise.toml

Custom Scanners and Policies

Custom scanners and policies are defined directly in the main configuration file using [[scanners]] and [[custom_policies]] sections. See Policy Frameworks for creating custom policies.

Example custom scanner (add to your main config):

[[scanners]]
name = "employee_id"
description = "ACME Corp employee IDs (format: EMP-######)"
regex = "EMP-[0-9]{6}"
redaction_pattern = "EMP-XXXXXX"
base_confidence = 0.85

Validate your configuration:

sudo aquilon-dlp --config /etc/aquilon/config.toml --validate-config

Applying Configuration Changes

After modifying the configuration file, restart the osqueryd service:

macOS:

sudo launchctl unload /Library/LaunchDaemons/io.osquery.agent.plist
sudo launchctl load /Library/LaunchDaemons/io.osquery.agent.plist

Note: OSQuery 5.0.1+ uses io.osquery.agent.plist. Older versions use com.facebook.osqueryd.plist.

Linux:

sudo systemctl restart osqueryd

Validating Configuration

Check for configuration errors in the logs:

# macOS
tail -f /var/log/aquilon/aquilon-dlp.log | grep -i error

# Linux
journalctl -u osqueryd -f | grep -i aquilon

Common validation errors:

  • Invalid TOML syntax
  • Unknown policy names
  • Invalid regex patterns in custom scanners
  • Missing required fields

Environment Variables

Override configuration settings with environment variables using the AQUILON_DLP_ prefix:

# Worker configuration
export AQUILON_DLP_WORKER_NUM_WORKERS=8

# Resource limits
export AQUILON_DLP_RESOURCE_LIMITS_ENABLED=true
export AQUILON_DLP_RESOURCE_LIMITS_MAX_CPU_PERCENT=75.0

# Cache configuration
export AQUILON_DLP_CACHE_ENABLED=true
export AQUILON_DLP_CACHE_TTL_SECS=3600

# Watch paths (JSON array format)
export AQUILON_DLP_WATCH_PATHS='["/home/%%","/var/data/%%"]'

# Database path
export AQUILON_DLP_DATABASE_PATH=/var/lib/aquilon/aquilon.db

Environment variables override TOML configuration using underscore-separated paths. For example, [resource_limits] max_cpu_percent becomes AQUILON_DLP_RESOURCE_LIMITS_MAX_CPU_PERCENT.
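The naming rule can be expressed as a small helper, which is handy when generating environment files for fleet deployment. This is a sketch of the convention as described here, with hypothetical function names. The value encoding (lowercase booleans, compact JSON for arrays) is inferred from the examples above.

```python
import json

def env_var_name(section, key, prefix="AQUILON_DLP"):
    """Build the variable name for a TOML key; section is None for top-level keys."""
    parts = [prefix] + ([section] if section else []) + [key]
    return "_".join(part.upper() for part in parts)

def env_value(value):
    """Encode a Python value the way the examples above write it."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (list, tuple)):
        return json.dumps(list(value), separators=(",", ":"))
    return str(value)

print(env_var_name("resource_limits", "max_cpu_percent"), "=", env_value(75.0))
print(env_var_name(None, "watch_paths"), "=", env_value(["/home/%%", "/var/data/%%"]))
```

Note that the mapping is one-way: because section and key names themselves contain underscores, a variable name cannot always be split back into section and key unambiguously.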

Configuration Reference

For complete configuration reference and schema documentation, see the comments in the default configuration file:

cat /etc/aquilon/config.toml.default

Custom Scanners

Custom scanners let you define organization-specific detection patterns using regular expressions. Use them when built-in scanners don’t cover proprietary identifiers such as employee IDs, project codes, or internal account numbers.

Key features:

  • Regex-based pattern matching with bounded quantifiers
  • Confidence tuning via keyword proximity (boost/reduce)
  • Validation rules with checksums and invalid patterns
  • Multi-capture group redaction

Custom scanners integrate automatically with policies using the custom: prefix (e.g., custom:employee_id).

For integrating custom scanners with policies, SIEM systems, and fleet deployment, see Custom Scanner Integration.

Quick Start

Add a custom scanner to your configuration file:

[[scanners]]
name = "employee_id"
regex = "EMP-([0-9]{6})"
redaction_pattern = "EMP-XXXXXX"
base_confidence = 0.85
description = "ACME Corp employee IDs"

Test your scanner:

# Validate configuration
sudo aquilon-dlp --config /etc/aquilon/config.toml --validate-config

# Scan a test file
echo "Employee ID: EMP-123456" > /tmp/test.txt
sudo aquilon-dlp --config /etc/aquilon/config.toml --scan /tmp/test.txt

Discovering Built-in Scanners

Before creating a custom scanner, check if a built-in scanner already covers your use case. Aquilon DLP includes 30+ built-in scanners for common sensitive data types.

List Available Scanners

Use the CLI to see all available scanners (built-in and custom):

aquilon-dlp --list-scanners

Example output:

Built-in Scanners:
  ssn              - US Social Security Numbers
  credit_card      - Credit/debit card numbers (Visa, MC, Amex, etc.)
  email            - Email addresses
  phone            - Phone numbers (US, international)
  iban             - International Bank Account Numbers
  passport         - Passport numbers
  drivers_license  - Driver's license numbers
  ...

Custom Scanners:
  custom:employee_id   - ACME Corp employee IDs
  custom:project_code  - Internal project codes

Built-in Scanner Categories

Built-in scanners are organized by data type:

Category | Scanners | Use Case
PII | ssn, email, phone, address, date_of_birth | Personal data protection
Financial | credit_card, iban, bank_account, aba_routing | PCI DSS, financial compliance
Healthcare | medical_record_number, npi, health_plan_id | HIPAA compliance
Government | passport, drivers_license, ein | Identity documents
Technical | api_key, private_key, database_connection | Secret detection

For complete scanner-to-compliance mappings, see Policy Frameworks.

When to Create Custom Scanners

Create custom scanners when:

  • Organization-specific identifiers: Employee IDs, project codes, internal account numbers
  • Industry-specific formats: Your company’s unique document numbering scheme
  • Regional identifiers not built-in: Some EU national IDs require custom patterns

Configuration Reference

All fields for [[scanners]] entries:

Field | Required | Type | Description
name | Yes | String | Unique identifier (alphanumeric + underscore, max 64 chars). Referenced as custom:{name} in policies.
regex | Yes | String | Pattern to match. Must use bounded quantifiers (see Pattern Safety).
redaction_pattern | Yes | String | Template for redacting matches. X sequences map to capture group lengths.
base_confidence | Yes | Float | Base confidence score (0.0 - 1.0). Higher values = more confident the match is real.
description | No | String | Human-readable description for documentation.
context_signals | No | Array | Keywords attached to findings for classification (e.g., ["hr", "confidential"]).
confidence_boost | No | Object | Boost confidence when positive keywords found nearby. See Confidence Tuning.
confidence_reduce | No | Object | Reduce confidence when negative keywords found nearby. See Confidence Tuning.
validation | No | Object | Additional validation rules. See Validation Rules.

Pattern Safety

All regex patterns must be bounded to prevent performance issues. Unbounded patterns like \d+, .*, or [A-Z]+ will be rejected.

# SAFE - bounded patterns
[[scanners]]
name = "fixed_length"
regex = "EMP-([0-9]{6})"           # Fixed length: exactly 6 digits
redaction_pattern = "EMP-XXXXXX"
base_confidence = 0.85

[[scanners]]
name = "max_length"
regex = "ID-([0-9]{1,20})"         # Maximum 20 digits
redaction_pattern = "ID-XXXX"
base_confidence = 0.85

[[scanners]]
name = "range_length"
regex = "CODE-([A-Z]{3,6})"        # 3 to 6 uppercase letters
redaction_pattern = "CODE-XXXX"
base_confidence = 0.85

Unsafe patterns that will be rejected:

  • \d+ (unbounded digits)
  • .* (unbounded anything)
  • [A-Z]+ (unbounded letters)
  • (.*) (unbounded capture)
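To pre-check patterns before running --validate-config, a rough lint along these lines may help. It is a heuristic sketch, not the validator Aquilon DLP uses: it flags +, *, and open-ended {n,} quantifiers that appear outside character classes and are not escaped. The function name is hypothetical.

```python
import re

def find_unbounded(pattern):
    """Return the unbounded quantifiers found outside character classes."""
    problems = []
    in_class = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2              # skip the escape and the character it escapes
            continue
        if in_class:
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
        elif ch in "+*":
            problems.append(ch)
        elif ch == "{":
            open_ended = re.match(r"\{\d+,\}", pattern[i:])
            if open_ended:
                problems.append(open_ended.group(0))
        i += 1
    return problems

for candidate in [r"\d+", ".*", "[A-Z]+", "EMP-([0-9]{6})", "ID-([0-9]{1,20})"]:
    print(candidate, find_unbounded(candidate))
```

The first three candidates are flagged and the last two pass. The real validator may apply additional rules, so treat an empty result as a hint rather than a guarantee.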

Regex Escaping in TOML

TOML strings require backslash escaping. Use one of these approaches:

# Option 1: Escape backslashes (double them)
[[scanners]]
name = "escaped_digits"
regex = "ID-(\\d{6})"              # \d becomes \\d in double quotes
redaction_pattern = "ID-XXXXXX"
base_confidence = 0.85

# Option 2: Use literal strings (single quotes)
[[scanners]]
name = "literal_digits"
regex = 'ID-(\d{6})'               # No escaping needed in single quotes
redaction_pattern = "ID-XXXXXX"
base_confidence = 0.85

# Option 3: Use character classes (no escaping)
[[scanners]]
name = "char_class"
regex = "ID-([0-9]{6})"            # [0-9] instead of \d
redaction_pattern = "ID-XXXXXX"
base_confidence = 0.85

Confidence Tuning

Adjust confidence scores based on nearby keywords to reduce false positives and improve accuracy.

Boosting Confidence

Increase confidence when positive keywords appear near a match:

[[scanners]]
name = "employee_id"
regex = "EMP-([0-9]{6})"
redaction_pattern = "EMP-XXXXXX"
base_confidence = 0.70

[scanners.confidence_boost]
keywords = ["employee", "badge", "payroll", "personnel", "HR"]
boost_amount = 0.20
proximity = 200

Effect: When “employee” or “payroll” appears within 200 bytes, confidence increases from 0.70 to 0.90.

Reducing Confidence

Decrease confidence when negative keywords appear near a match:

[[scanners]]
name = "account_number"
regex = "ACC-([0-9]{8})"
redaction_pattern = "ACC-XXXXXXXX"
base_confidence = 0.80

[scanners.confidence_reduce]
keywords = ["example", "test", "fake", "sample", "demo"]
boost_amount = 0.50
proximity = 100

Effect: When “example” or “test” appears within 100 bytes, confidence decreases from 0.80 to 0.30.

Combining Boost and Reduce

Use both on the same scanner for nuanced confidence:

[[scanners]]
name = "project_code"
regex = "PROJ-([A-Z]{3})-([0-9]{4})"
redaction_pattern = "PROJ-XXX-XXXX"
base_confidence = 0.65

[scanners.confidence_boost]
keywords = ["confidential", "restricted", "internal"]
boost_amount = 0.25
proximity = 150

[scanners.confidence_reduce]
keywords = ["example", "documentation", "template"]
boost_amount = 0.35
proximity = 100

Confidence calculation:

Context | Calculation | Result
No keywords nearby | 0.65 (base) | 0.65
“confidential” nearby | 0.65 + 0.25 | 0.90
“template” nearby | 0.65 - 0.35 | 0.30
Both nearby | 0.65 + 0.25 - 0.35 | 0.55

Confidence Adjustment Fields

Field | Type | Description
keywords | Array | Words to search for in proximity to match
boost_amount | Float | Amount to add (boost) or subtract (reduce) from confidence (0.0 - 1.0)
proximity | Integer | Maximum distance in bytes to search for keywords (1 - 10000)
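The following sketch models how these fields could combine, which is useful for estimating scores before deploying a scanner. It rests on assumptions that the documentation does not spell out: keyword matching is case-insensitive, each rule applies at most once per match regardless of how many keywords hit, the proximity window extends on both sides of the match, and the result is clamped to 0.0 - 1.0. The function name is hypothetical.

```python
import re

def score_matches(text, regex, base, boost=None, reduce=None):
    """Return (matched_text, confidence) pairs under the assumptions stated above."""
    data = text.encode("utf-8")  # work in bytes because proximity is in bytes
    results = []
    for m in re.finditer(regex.encode("utf-8"), data):
        confidence = base
        for rule, sign in ((boost, 1), (reduce, -1)):
            if rule is None:
                continue
            start = max(0, m.start() - rule["proximity"])
            end = m.end() + rule["proximity"]
            # Text around the match, excluding the match itself.
            nearby = (data[start:m.start()] + b" " + data[m.end():end]).lower()
            if any(k.lower().encode("utf-8") in nearby for k in rule["keywords"]):
                confidence += sign * rule["boost_amount"]
        confidence = min(1.0, max(0.0, confidence))
        results.append((m.group(0).decode("utf-8"), round(confidence, 2)))
    return results

BOOST = {"keywords": ["confidential", "restricted", "internal"],
         "boost_amount": 0.25, "proximity": 150}
REDUCE = {"keywords": ["example", "documentation", "template"],
          "boost_amount": 0.35, "proximity": 100}
PATTERN = "PROJ-([A-Z]{3})-([0-9]{4})"

print(score_matches("Confidential template PROJ-ABC-1234", PATTERN, 0.65, BOOST, REDUCE))
```

With both a boost and a reduce keyword nearby, the sketch yields 0.65 + 0.25 - 0.35 = 0.55, in line with the additive behavior shown in the tables in this chapter.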

Validation Rules

Add validation rules to filter out false positives with checksums and pattern exclusions.

[[scanners]]
name = "company_account"
regex = "ACCT-([0-9]{10})"
redaction_pattern = "ACCT-XXXXXXXXXX"
base_confidence = 0.85

[scanners.validation]
min_confidence = 0.70
invalid_patterns = ["^ACCT-0{10}$", "^ACCT-1234567890$"]
validator = "luhn"

Validation Fields

Field | Type | Description
min_confidence | Float | Minimum confidence threshold. Matches below this are discarded.
invalid_patterns | Array | Regex patterns to reject (e.g., all zeros, test sequences).
validator | String | Checksum validator to apply: luhn, mod10, mod11, or iban.

Available Validators

Validator | Algorithm | Use Case
luhn | Luhn (mod 10) | Credit cards, IMEI numbers, some account numbers
mod10 | Modulo 10 | Various identifiers with check digits
mod11 | Modulo 11 | ISBN-10, some national IDs
iban | IBAN checksum | International Bank Account Numbers

Example: Filtering Test Data

[[scanners]]
name = "customer_id"
regex = "CUST-([0-9]{8})"
redaction_pattern = "CUST-XXXXXXXX"
base_confidence = 0.80

[scanners.validation]
# Reject common test patterns
invalid_patterns = [
    "^CUST-0{8}$",         # All zeros
    "^CUST-1{8}$",         # All ones
    "^CUST-12345678$",     # Sequential
    "^CUST-99999999$"      # All nines
]
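You can sanity-check an invalid_patterns list against sample values before deploying it. The sketch below assumes each pattern is tested against the matched text only (which the ^ and $ anchors suggest) and that any single hit rejects the match. The helper name is hypothetical.

```python
import re

INVALID_PATTERNS = [
    r"^CUST-0{8}$",       # all zeros
    r"^CUST-1{8}$",       # all ones
    r"^CUST-12345678$",   # sequential
    r"^CUST-99999999$",   # all nines
]

def is_test_data(match_text, invalid_patterns=INVALID_PATTERNS):
    """Return True if the matched text hits any rejection pattern."""
    return any(re.search(p, match_text) for p in invalid_patterns)

for sample in ["CUST-00000000", "CUST-12345678", "CUST-48291730"]:
    print(sample, "rejected" if is_test_data(sample) else "kept")
```

The first two samples are rejected and the third is kept.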

Example: Luhn Checksum Validation

[[scanners]]
name = "loyalty_card"
regex = "([0-9]{4})([0-9]{4})([0-9]{4})([0-9]{4})"
redaction_pattern = "XXXX-XXXX-XXXX-XXXX"
base_confidence = 0.80
description = "16-digit loyalty card numbers with Luhn check"

[scanners.validation]
validator = "luhn"
invalid_patterns = ["^0{16}$", "^1{16}$"]

This configuration:

  1. Matches any run of 16 consecutive digits, captured as four groups of 4
  2. Validates it passes the Luhn checksum
  3. Rejects all-zeros and all-ones patterns
  4. Reports only valid matches
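For reference, the Luhn check itself is short. The function below is a standard textbook implementation of the algorithm, offered as a way to generate or verify test values; it is not taken from the Aquilon DLP source.

```python
def luhn_valid(digits):
    """Return True if a string of decimal digits passes the Luhn checksum."""
    if not digits.isdigit():
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0

print(luhn_valid("4111111111111111"))
print(luhn_valid("1234567812345678"))
print(luhn_valid("0000000000000000"))
```

Note that an all-zeros number passes the checksum, which is why the configuration above still lists ^0{16}$ under invalid_patterns.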

Redaction Patterns

Redaction patterns control how matched text appears in alerts and logs. X sequences map to capture groups.

Single Capture Group

[[scanners]]
name = "employee_id"
regex = "EMP-([0-9]{6})"           # One capture group
redaction_pattern = "EMP-XXXXXX"   # 6 X's for the 6-digit capture
base_confidence = 0.85

Input | Redacted Output
EMP-123456 | EMP-XXXXXX
EMP-987654 | EMP-XXXXXX

Multiple Capture Groups

[[scanners]]
name = "project_code"
regex = "PROJ-([A-Z]{3})-([0-9]{4})"     # Two capture groups
redaction_pattern = "PROJ-XXX-XXXX"       # 3 X's, then 4 X's
base_confidence = 0.90

Input | Redacted Output
PROJ-ABC-1234 | PROJ-XXX-XXXX
PROJ-XYZ-9999 | PROJ-XXX-XXXX

Variable Length Captures

For variable-length captures, use a fixed number of X’s as a placeholder:

[[scanners]]
name = "order_number"
regex = "ORD-([0-9]{4,10})"        # 4 to 10 digits
redaction_pattern = "ORD-XXXX"     # Fixed placeholder
base_confidence = 0.85

Input | Redacted Output
ORD-1234 | ORD-XXXX
ORD-1234567890 | ORD-XXXX

Redaction Best Practices

  1. Match X count to expected capture length when possible
  2. Use fixed placeholders for variable-length captures
  3. Keep redaction patterns recognizable (preserve prefixes/formatting)
  4. Don’t include actual data in the pattern string
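The input/output tables in this section are all consistent with a simple model in which the whole match is replaced by the redaction template. The sketch below implements that model, plus a check that the number of capture groups equals the number of X runs. Both helpers are hypothetical illustrations, and the real redaction logic may differ.

```python
import re

def redact(text, regex, redaction_pattern):
    """Replace every match with the fixed redaction template."""
    # A function replacement inserts the template literally (no backslash processing).
    return re.sub(regex, lambda _match: redaction_pattern, text)

def groups_match_x_runs(regex, redaction_pattern):
    """Check that capture groups and X runs line up one-to-one."""
    x_runs = re.findall(r"X+", redaction_pattern)
    return re.compile(regex).groups == len(x_runs)

print(redact("Order ORD-1234567890 shipped", r"ORD-([0-9]{4,10})", "ORD-XXXX"))
print(groups_match_x_runs(r"PROJ-([A-Z]{3})-([0-9]{4})", "PROJ-XXX-XXXX"))
```

The X-run count is a crude heuristic: a literal prefix that itself contains the letter X would be counted as a run, so keep prefixes free of X when relying on a check like this.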

Real-World Examples

Complete, production-ready configurations for common use cases.

Healthcare: Patient ID Detection

Detect patient identifiers with healthcare context boosting:

[[scanners]]
name = "patient_id"
regex = "PAT-([0-9]{8})"
redaction_pattern = "PAT-XXXXXXXX"
base_confidence = 0.60
description = "Healthcare patient identifiers"
context_signals = ["healthcare", "phi", "hipaa"]

[scanners.confidence_boost]
keywords = ["patient", "medical", "diagnosis", "treatment", "healthcare", "hospital", "clinic"]
boost_amount = 0.30
proximity = 250

[scanners.confidence_reduce]
keywords = ["example", "test", "sample", "demo", "mock"]
boost_amount = 0.40
proximity = 100

Why this works:

  • Low base confidence (0.60) keeps matches on similar numeric patterns low-scoring unless healthcare context is present
  • Healthcare keywords boost confidence significantly when in medical context
  • Test/sample keywords reduce confidence to filter documentation examples
  • Context signals (phi, hipaa) integrate with SIEM for compliance workflows

Financial: Account Number with Validation

Detect account numbers using Luhn checksum validation:

[[scanners]]
name = "financial_account"
regex = "FA-([0-9]{12})"
redaction_pattern = "FA-XXXXXXXXXXXX"
base_confidence = 0.75
description = "Financial account numbers with check digit"
context_signals = ["financial", "pci", "account"]

[scanners.confidence_boost]
keywords = ["account", "balance", "transaction", "payment", "transfer", "deposit"]
boost_amount = 0.20
proximity = 200

[scanners.validation]
validator = "luhn"
min_confidence = 0.60
invalid_patterns = [
    "^FA-0{12}$",
    "^FA-123456789012$",
    "^FA-9{12}$"
]

Why this works:

  • Luhn validator rejects numbers that fail the checksum, which eliminates about 90% of random digit sequences
  • Invalid patterns filter known test data
  • Minimum confidence threshold adds another layer of filtering
  • Financial keywords boost real occurrences in transaction contexts

Engineering: Multi-Part Project Code

Detect complex identifiers with multiple capture groups:

[[scanners]]
name = "internal_project"
regex = "IPROJ-([A-Z]{2})-([0-9]{4})-([A-Z]{1})"
redaction_pattern = "IPROJ-XX-XXXX-X"
base_confidence = 0.80
description = "Internal project codes (region-number-phase)"
context_signals = ["internal", "project", "confidential"]

[scanners.confidence_boost]
keywords = ["confidential", "restricted", "internal", "proprietary"]
boost_amount = 0.15
proximity = 150

[scanners.confidence_reduce]
keywords = ["template", "example", "placeholder", "documentation"]
boost_amount = 0.30
proximity = 100

Pattern breakdown:

  • ([A-Z]{2}) - Two-letter region code (e.g., US, EU, AP)
  • ([0-9]{4}) - Four-digit project number
  • ([A-Z]{1}) - Single-letter phase indicator (A-Z)

Redaction mapping:

Input | Output
IPROJ-US-1234-A | IPROJ-XX-XXXX-X
IPROJ-EU-9999-C | IPROJ-XX-XXXX-X

Comprehensive scanner combining all advanced features:

[[scanners]]
name = "legal_document"
regex = "DOC-([A-Z]{3})-([0-9]{6})"
redaction_pattern = "DOC-XXX-XXXXXX"
base_confidence = 0.55
description = "Legal document identifiers"
context_signals = ["legal", "confidential", "privileged"]

[scanners.confidence_boost]
keywords = ["attorney", "legal", "privileged", "confidential", "counsel", "litigation"]
boost_amount = 0.35
proximity = 300

[scanners.confidence_reduce]
keywords = ["example", "sample", "test", "template", "draft"]
boost_amount = 0.45
proximity = 150

[scanners.validation]
min_confidence = 0.50
invalid_patterns = [
    "^DOC-AAA-000000$",
    "^DOC-XXX-[0-9]{6}$",
    "^DOC-[A-Z]{3}-123456$"
]

Confidence scenarios:

Context | Base | Boost | Reduce | Final
No keywords | 0.55 | none | none | 0.55
“attorney-client” nearby | 0.55 | +0.35 | none | 0.90
“example document” nearby | 0.55 | none | -0.45 | 0.10 (rejected)
“confidential draft” | 0.55 | +0.35 | -0.45 | 0.45 (rejected)

The low base confidence (0.55) combined with aggressive reduce (-0.45) ensures that example/template documents are filtered even when “confidential” appears nearby.

Testing Custom Scanners

Validate your scanner configuration before deployment.

Validate Configuration

Check for syntax errors and unsafe patterns:

sudo aquilon-dlp --config /etc/aquilon/config.toml --validate-config

Successful validation:

Configuration valid
Loaded 3 custom scanners:
  - patient_id (bounded regex, confidence 0.60)
  - financial_account (bounded regex, confidence 0.75, validator: luhn)
  - internal_project (bounded regex, confidence 0.80)

Failed validation (unsafe pattern):

Configuration error: Scanner 'bad_scanner' has unsafe regex pattern
  Pattern: "ID-\d+"
  Error: Unbounded repetition detected
  Suggestion: Use bounded quantifiers like {1,20} instead of +

Scan Test Files

Test your scanner against sample data:

# Create test file
cat > /tmp/scanner_test.txt << 'EOF'
Record PAT-12345678 opened on 2024-01-15.
Reference FA-482917304651 updated.
Code IPROJ-US-1234-A assigned to phase A.
Filing DOC-ABC-482917 under attorney review.
EOF

# Run scan
sudo aquilon-dlp --config /etc/aquilon/config.toml --scan /tmp/scanner_test.txt

Expected output:

Scanning: /tmp/scanner_test.txt
Results:
  [patient_id] PAT-XXXXXXXX (confidence: 0.60, line 1)
    Context signals: healthcare, phi, hipaa
  [financial_account] FA-XXXXXXXXXXXX (confidence: 0.75, line 2)
    Context signals: financial, pci, account
    Validation: luhn passed
  [internal_project] IPROJ-XX-XXXX-X (confidence: 0.80, line 3)
    Context signals: internal, project, confidential
  [legal_document] DOC-XXX-XXXXXX (confidence: 0.90, line 4)
    Context signals: legal, confidential, privileged
    Confidence boosted by: "attorney"

Summary: 4 findings in 1 file

Testing Confidence Adjustments

Verify boost and reduce behavior:

# Test with boost keywords
cat > /tmp/boost_test.txt << 'EOF'
Patient medical record: PAT-12345678
EOF
sudo aquilon-dlp --config /etc/aquilon/config.toml --scan /tmp/boost_test.txt
# Expected: confidence 0.90 (0.60 base + 0.30 boost from "patient", "medical")

# Test with reduce keywords
cat > /tmp/reduce_test.txt << 'EOF'
Example ID: PAT-12345678 (test data)
EOF
sudo aquilon-dlp --config /etc/aquilon/config.toml --scan /tmp/reduce_test.txt
# Expected: confidence 0.20 (0.60 base - 0.40 reduce from "example", "test")

Testing Validation Rules

Verify checksum validation:

# Valid Luhn number (passes checksum)
echo "FA-482917304651" > /tmp/valid_luhn.txt
sudo aquilon-dlp --config /etc/aquilon/config.toml --scan /tmp/valid_luhn.txt
# Expected: Match found

# Invalid Luhn number (fails checksum)
echo "FA-123456789012" > /tmp/invalid_luhn.txt
sudo aquilon-dlp --config /etc/aquilon/config.toml --scan /tmp/invalid_luhn.txt
# Expected: No match (fails Luhn validation)

Using Policy Integration

Test custom scanners through policies:

# Policy referencing custom scanner
cat > /tmp/policy_test.toml << 'EOF'
watch_paths = ["/tmp"]
exclude_paths = []

[[scanners]]
name = "patient_id"
regex = "PAT-([0-9]{8})"
redaction_pattern = "PAT-XXXXXXXX"
base_confidence = 0.60

[policies]
enabled_policies = ["test_policy"]

[policies.policy_configs.test_policy]
enabled = true
scanners = ["custom:patient_id"]
min_confidence = 0.5

[work_queue]
max_queue_size = 10000
submit_timeout_secs = 5

[worker]
num_workers = 0

[resource_limits]
enabled = false

[metrics]
bind_address = "127.0.0.1"
port = 9000

[cache]
enabled = true
ttl_secs = 0

[scan]
max_scan_size_mb = 40
max_recursion_depth = 5
EOF

sudo aquilon-dlp --config /tmp/policy_test.toml --scan /tmp/scanner_test.txt

Note the custom: prefix when referencing custom scanners in policies.

Troubleshooting

Common issues and solutions when working with custom scanners.

Configuration Errors

| Error Message | Cause | Solution |
|---|---|---|
| Unsafe regex pattern: unbounded repetition | Pattern uses +, *, or unbounded {n,} | Use bounded quantifiers: {1,20} instead of +, {0,100} instead of * |
| Invalid regex syntax | Malformed regular expression | Check TOML escaping: use \\d or 'single quotes' or [0-9] |
| Mismatched capture groups | Regex capture count doesn’t match X sequences | Align capture groups with redaction X runs |
| Scanner name already exists | Duplicate name field | Each scanner needs a unique name |
| Invalid base_confidence | Value outside 0.0-1.0 range | Use values between 0.0 and 1.0 |

Pattern Not Matching

Symptom: Scanner configured but no matches found.

Diagnostic steps:

  1. Test regex separately:

    echo "EMP-123456" | grep -E "EMP-([0-9]{6})"
    
  2. Check TOML escaping:

    # These are all equivalent:
    regex = "\\d{6}"      # Double backslash in double quotes
    regex = '\d{6}'       # Single quotes (literal)
    regex = "[0-9]{6}"    # Character class (recommended)
    
  3. Verify file is being scanned:

    • Check watch_paths includes the file location
    • Check exclude_paths doesn’t exclude it
    • Verify file size is under max_scan_size_mb
  4. Check confidence threshold:

    • If using policies, verify min_confidence isn’t filtering matches
    • Check if confidence_reduce keywords are nearby

False Positives

Symptom: Scanner matches too many non-relevant patterns.

Solutions:

  1. Add validation rules:

    [scanners.validation]
    invalid_patterns = ["^ACCT-0{10}$", "^ACCT-12345"]
    min_confidence = 0.70
    
  2. Use confidence reduce:

    [scanners.confidence_reduce]
    keywords = ["example", "test", "sample", "demo"]
    boost_amount = 0.40
    proximity = 100
    
  3. Add checksum validation:

    [scanners.validation]
    validator = "luhn"  # or "mod10", "mod11", "iban"
    

Policy Integration Issues

| Error Message | Cause | Solution |
|---|---|---|
| Unknown scanner 'employee_id' | Missing custom: prefix | Use custom:employee_id in policy scanners list |
| Scanner 'custom:foo' not found | Scanner not defined | Add [[scanners]] entry with name = "foo" |
| Policy references disabled scanner | Scanner defined but not enabled | Check scanner configuration is complete |

Performance Issues

Symptom: Scanning is slow after adding custom scanners.

Solutions:

  1. Check pattern complexity:

    • Avoid nested alternations: (a|b|c) is fine, ((a|b)|(c|d)) is slow
    • Avoid overlapping patterns: [A-Za-z] + [a-z] creates backtracking
  2. Reduce proximity search:

    [scanners.confidence_boost]
    proximity = 100  # Smaller = faster (default is 200)
    
  3. Simplify validation:

    • invalid_patterns with simple patterns are fast
    • Complex regex in invalid_patterns can slow scanning

Redaction Issues

Symptom: Redacted output looks wrong.

| Issue | Cause | Solution |
|---|---|---|
| Partial redaction | Capture group mismatch | Ensure X count matches capture group length |
| XXX for variable data | Variable-length capture | Use fixed placeholder or document behavior |
| No prefix in output | Prefix not in pattern | Add prefix outside capture group: PREFIX-([0-9]{6}) |

Example fix:

# Wrong - captures everything including prefix
regex = "(EMP-[0-9]{6})"
redaction_pattern = "XXXXXXXXXX"  # Loses prefix

# Correct - captures only sensitive part
regex = "EMP-([0-9]{6})"
redaction_pattern = "EMP-XXXXXX"  # Preserves prefix
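
The difference between the two patterns is easier to see when the masking is applied by hand. The sketch below is a simplified model of capture-group redaction written for this guide: it masks every capture group with X characters and leaves uncaptured text (such as the prefix) visible. The redact function is hypothetical, assumes non-nested capture groups, and does not read redaction_pattern at all, so it only approximates what the scanner does.

```python
import re


def redact(pattern: str, text: str) -> str:
    """Mask each capture group with X characters, leaving the rest of the match visible."""

    def mask(match: re.Match) -> str:
        pieces = []
        cursor = match.start()
        for group in range(1, match.re.groups + 1):
            start, end = match.span(group)
            if start == -1:  # this group did not take part in the match
                continue
            pieces.append(match.string[cursor:start])  # uncaptured text, e.g. the prefix
            pieces.append("X" * (end - start))         # captured text, masked
            cursor = end
        pieces.append(match.string[cursor:match.end()])
        return "".join(pieces)

    return re.sub(pattern, mask, text)


print(redact(r"(EMP-[0-9]{6})", "badge EMP-123456"))  # badge XXXXXXXXXX
print(redact(r"EMP-([0-9]{6})", "badge EMP-123456"))  # badge EMP-XXXXXX
```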

Best Practices

Guidelines for building effective, maintainable custom scanners.

Pattern Design

  1. Always use bounded quantifiers

    • {6} for fixed length
    • {1,20} for variable length with maximum
    • Never use +, *, or {n,} (unbounded)
  2. Use character classes over escape sequences

    • [0-9] instead of \d (avoids TOML escaping issues)
    • [A-Za-z] instead of \w
    • [^a-z] for negation
  3. Capture only sensitive data

    # Good: prefix preserved, only digits captured
    regex = "EMP-([0-9]{6})"
    
    # Bad: entire match captured
    regex = "(EMP-[0-9]{6})"
    
  4. Test patterns before deployment

    echo "EMP-123456" | grep -E "EMP-([0-9]{6})"
    

Confidence Strategy

  1. Start with low base confidence (0.50-0.70)

    • Prevents over-alerting before context analysis
    • Allows boost/reduce to have meaningful effect
  2. Use boost for high-value context

    • Domain-specific keywords that indicate real data
    • Proximity 150-300 bytes for document context
  3. Use reduce aggressively for noise

    • Test, example, sample, demo, placeholder
    • Proximity 50-150 bytes for nearby indicators
  4. Document your confidence rationale

    description = "Patient IDs: low base (0.60) + medical boost (0.30) = 0.90 in healthcare docs"
    

Validation Rules

  1. Always add invalid_patterns for test data

    • Common sequences: all zeros, all ones, sequential (123456)
    • Known test values from documentation
  2. Use checksums when available

    • Financial accounts often have Luhn/mod10 digits
    • A random digit string passes a mod-10 check only about 1 time in 10, so a checksum filters out roughly 90% of coincidental matches
  3. Set appropriate min_confidence

    • 0.50-0.60 for high-recall (find everything)
    • 0.70-0.80 for balanced precision/recall
    • 0.85+ for high-precision (minimize false positives)

Organization and Maintenance

  1. Use descriptive names

    name = "patient_mrn"        # Good: specific
    name = "id"                 # Bad: too generic
    
  2. Always include description

    description = "Medical Record Numbers: MRN-XXXXXXXX format, HIPAA-regulated"
    
  3. Use context_signals for SIEM integration

    context_signals = ["healthcare", "phi", "hipaa"]
    

    These tags appear in alerts and enable filtering/routing in your SIEM.

  4. Group related scanners

    # Healthcare scanners
    [[scanners]]
    name = "patient_mrn"
    # ...
    
    [[scanners]]
    name = "patient_ssn"
    # ...
    
    # Financial scanners
    [[scanners]]
    name = "account_number"
    # ...
    

Performance Optimization

  1. Order patterns by specificity

    • Most specific patterns first (fewer false matches)
    • Generic patterns last
  2. Minimize proximity for boost/reduce

    • Start with 100-150 bytes
    • Increase only if needed for context
  3. Avoid complex alternations

    # Slow: nested alternations
    regex = "((EMP|STAFF)-(ID|NUM))-([0-9]{6})"
    
    # Fast: one scanner per format actually in use (these two do not match EMP-NUM or STAFF-ID)
    [[scanners]]
    name = "emp_id"
    regex = "EMP-ID-([0-9]{6})"
    
    [[scanners]]
    name = "staff_num"
    regex = "STAFF-NUM-([0-9]{6})"
    

Security Considerations

  1. Never log sensitive data in tests

    • Use obviously fake test data
    • Don’t use real examples in documentation
  2. Review patterns for over-matching

    • Simple patterns like [0-9]{9} match too broadly
    • Always include prefix/format markers
  3. Test with production-like data volume

    • Performance issues emerge at scale
    • Run against large sample files before deployment

Custom Scanner Integration

This guide covers integrating custom scanners with policies, SIEM systems, and fleet deployments. For creating custom scanners, see Custom Scanners.

Combining Built-in and Custom Scanners

Custom scanners work alongside built-in scanners in policies. The key difference is the naming convention:

  • Built-in scanners: Use the scanner name directly (e.g., ssn, email, iban)
  • Custom scanners: Use the custom: prefix (e.g., custom:employee_id)
[policies]
enabled_policies = ["data_protection"]

[policies.policy_configs.data_protection]
enabled = true
settings = { confidence_threshold = "0.7" }

For advanced policy composition (AND/OR rules, thresholds), see Policy Frameworks.

Example: GDPR with Custom Identifiers

A common scenario is extending GDPR compliance with organization-specific identifiers. This example combines built-in EU data scanners with custom project codes:

# Custom scanner for internal project codes
[[scanners]]
name = "project_code"
regex = "PROJ-([A-Z]{2})-([0-9]{4})"
redaction_pattern = "PROJ-XX-XXXX"
base_confidence = 0.80
context_signals = ["internal", "confidential"]

[scanners.confidence_boost]
keywords = ["confidential", "restricted", "internal"]
boost_amount = 0.15
proximity = 150

[scanners.confidence_reduce]
keywords = ["example", "template", "documentation"]
boost_amount = 0.30
proximity = 100

[policies]
enabled_policies = ["gdpr_extended"]

[policies.policy_configs.gdpr_extended]
enabled = true
settings = { confidence_threshold = "0.7" }

For complete GDPR scanner mappings and compliance guidance, see GDPR Compliance.

Reducing False Positives from Test Files

Development environments often contain test fixtures with fake sensitive data. Use these strategies to reduce false positives.

Path-Based Exclusions

Exclude entire directories from scanning using global exclude_paths:

watch_paths = ["/home/%%", "/var/data/%%"]

exclude_paths = [
    # Test directories
    "/home/*/projects/*/tests/%%",
    "/home/*/projects/*/test/%%",
    "/home/*/projects/*/__tests__/%%",

    # Test fixtures and mock data
    "/home/*/projects/*/fixtures/%%",
    "/home/*/projects/*/mock-data/%%",
    "/home/*/projects/*/testdata/%%",

    # Build artifacts
    "/home/*/projects/*/node_modules/%%",
    "/home/*/projects/*/target/%%",
    "/home/*/projects/*/.git/%%"
]

Keyword-Based Confidence Reduction

For files that can’t be excluded by path, use confidence_reduce to lower confidence when test-related keywords appear nearby:

[[scanners]]
name = "customer_id"
regex = "CUST-([0-9]{8})"
redaction_pattern = "CUST-XXXXXXXX"
base_confidence = 0.80

[scanners.confidence_reduce]
keywords = [
    # Test indicators
    "test", "spec", "mock", "fake", "dummy",
    # Documentation indicators
    "example", "sample", "demo", "placeholder",
    # Development indicators
    "fixture", "seed", "factory"
]
boost_amount = 0.50
proximity = 100

With base_confidence = 0.80 and boost_amount = 0.50, matches near test keywords drop to 0.30 confidence, which typically falls below policy thresholds.

For detailed confidence tuning patterns, see Custom Scanners - Confidence Tuning. For global configuration options, see Configuration.

SIEM Integration

Custom scanner findings flow to your SIEM through the OSQuery aquilon_dlp_alerts table.

How context_signals Flow to Alerts

The context_signals you define on custom scanners appear in alert metadata:

[[scanners]]
name = "patient_id"
regex = "PAT-([0-9]{8})"
redaction_pattern = "PAT-XXXXXXXX"
base_confidence = 0.85
context_signals = ["healthcare", "phi", "hipaa"]  # These flow to alerts

Key Alert Fields for Custom Scanners

Query custom scanner alerts via OSQuery:

SELECT
    timestamp,
    path,
    scanner,
    confidence,
    policy,
    severity,
    context
FROM aquilon_dlp_alerts
WHERE scanner LIKE 'custom:%'
ORDER BY timestamp DESC
LIMIT 100;

The context JSON field contains context_signals for SIEM filtering and routing.
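
If alerts are exported as JSON rows (for example by a scheduled query), the same filtering can be done in a small script before the data reaches the SIEM. The sketch below assumes the context column is a JSON object with a context_signals array. That shape is an assumption made for illustration, as are the sample rows and the helper name alerts_with_signal, so check the alert schema before relying on it.

```python
import json


def alerts_with_signal(rows, signal):
    """Keep alert rows whose context JSON lists the given context signal."""
    selected = []
    for row in rows:
        try:
            context = json.loads(row.get("context") or "{}")
        except json.JSONDecodeError:
            continue  # skip rows whose context is not valid JSON
        if isinstance(context, dict) and signal in context.get("context_signals", []):
            selected.append(row)
    return selected


# Hypothetical rows shaped like the columns selected in the query above.
rows = [
    {"path": "/var/data/intake.txt", "scanner": "custom:patient_id",
     "context": '{"context_signals": ["healthcare", "phi", "hipaa"]}'},
    {"path": "/var/data/plan.txt", "scanner": "custom:project_code",
     "context": '{"context_signals": ["internal", "confidential"]}'},
]
print([row["scanner"] for row in alerts_with_signal(rows, "healthcare")])
```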

Splunk Integration Example

Schedule OSQuery to export alerts, then query in Splunk:

index=osquery sourcetype=osquery:results name=aquilon_dlp_alerts
| spath input=columns.context
| search context_signals="*healthcare*"
| stats count by scanner, severity, policy

For complete SIEM integration including Elastic Stack, see Monitoring - SIEM Integration. For the full alert schema, see API Integration.

Fleet Deployment

Deploy custom scanner configurations across your fleet using MDM or configuration management tools.

Centralized Configuration

  1. Create a base configuration with your custom scanners and policies
  2. Deploy via MDM (Jamf, Intune, Kandji) to managed devices
  3. Verify deployment using OSQuery fleet queries

Example verification query to confirm the agent is running on each host (to check that a specific custom scanner loaded, run --list-scanners on the host):

SELECT
    name,
    version,
    status
FROM aquilon_dlp_status
WHERE status = 'running';

Performance Considerations

Custom scanners add minimal overhead, but keep these guidelines in mind for large fleets:

Scanner Count

  • Up to 20 custom scanners: Negligible performance impact
  • 21-50 custom scanners: Monitor scan latency metrics
  • More than 50 custom scanners: Consider splitting into multiple policies by use case

Proximity Search Tuning

Large proximity values in confidence_boost/confidence_reduce increase memory usage per scan:

[scanners.confidence_boost]
keywords = ["confidential"]
boost_amount = 0.20
proximity = 100    # Recommended: 100-200 bytes
# proximity = 1000  # Avoid: increases memory per match

Monitoring Performance

Track scanner performance via Prometheus metrics:

  • aquilon_scan_duration_seconds - Per-file scan time
  • aquilon_scanner_matches_total - Matches by scanner name
  • aquilon_queue_depth - Work queue backlog

For metrics setup, see Monitoring.

CLI Reference

Aquilon DLP provides command-line tools for testing, validating, and debugging your DLP configuration before deploying to production.

Quick Reference

| Command | Purpose |
|---|---|
| --validate-config | Validate configuration file syntax and references |
| --list-scanners | List all available scanners (built-in + custom) |
| --list-policies | List all available policies (built-in + custom) |
| --test-scanner | Test a specific scanner against a file |
| --test-policy | Test a specific policy against a file |
| --dry-run | Scan a file without database persistence |
| --maintenance-now | Run database maintenance immediately |

Configuration Validation

--validate-config

Validates your configuration file for syntax errors, invalid regex patterns, and missing scanner references.

Syntax:

aquilon-dlp --validate-config <config-file>

Example:

aquilon-dlp --validate-config /etc/aquilon/config.toml

Output: Returns exit code 0 if valid, non-zero with error details if invalid.
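
Because the result is reported through the exit code, the check is easy to wrap in a deployment script. The sketch below is one hedged way to do that in Python. The function name config_is_valid and the injectable run parameter are illustrative choices, and the assumption that error details arrive on stdout or stderr is not confirmed by this guide.

```python
import subprocess
import sys


def config_is_valid(config_path, binary="aquilon-dlp", run=subprocess.run):
    """Run --validate-config and report whether the exit code was 0."""
    result = run([binary, "--validate-config", config_path],
                 capture_output=True, text=True)
    if result.returncode != 0:
        # Assumption: error details are written to stdout or stderr, so show both.
        print(result.stdout, result.stderr, file=sys.stderr)
    return result.returncode == 0
```

A deployment step could call config_is_valid("/etc/aquilon/config.toml") and stop the rollout when it returns False. The run parameter exists only so the wrapper can be exercised without the binary installed.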

Use when:

  • After editing configuration files
  • Before deploying configuration changes
  • Validating custom scanner regex patterns

Discovery Commands

--list-scanners

Lists all available scanners including built-in scanners and any custom scanners defined in your configuration.

Syntax:

aquilon-dlp --list-scanners [--config <config-file>]

Example:

aquilon-dlp --list-scanners

Sample output:

ssn
credit_card
email
phone
iban
ip_address
aws_key
custom:employee_id

Built-in scanners include:

  • ssn - US Social Security Numbers
  • credit_card - Credit/debit card numbers (Visa, MC, Amex, etc.)
  • email - Email addresses
  • phone - Phone numbers (US and international)
  • iban - International Bank Account Numbers
  • ip_address - IPv4 and IPv6 addresses
  • aws_key - AWS access keys and secrets
  • And more…

--list-policies

Lists all available policies including built-in compliance frameworks and any custom policies defined in your configuration.

Syntax:

aquilon-dlp --list-policies [--config <config-file>]

Example:

aquilon-dlp --list-policies

Sample output:

hipaa
gdpr
pci_dss
sox
ccpa
iso27001
custom:internal_data

Built-in policies include:

  • hipaa - Health Insurance Portability and Accountability Act
  • gdpr - General Data Protection Regulation
  • pci_dss - Payment Card Industry Data Security Standard
  • sox - Sarbanes-Oxley Act
  • ccpa - California Consumer Privacy Act
  • iso27001 - ISO/IEC 27001 Information Security

Testing Commands

--test-scanner

Tests a specific scanner against a file and outputs JSON results with any findings.

Syntax:

aquilon-dlp --test-scanner <scanner-name> --test-file <file-path>

Example:

aquilon-dlp --test-scanner ssn --test-file /var/test-data/sample-data.csv

Output: JSON with findings array:

{
  "scanner": "ssn",
  "file": "/var/test-data/sample-data.csv",
  "findings": [
    {
      "matched_text": "123-45-6789",
      "position": 10,
      "confidence": 0.85,
      "redacted_text": "XXX-XX-6789"
    }
  ],
  "duration_ms": 5
}

Use when:

  • Developing custom scanners
  • Debugging detection issues
  • Verifying scanner behavior

--test-policy

Tests a specific policy against a file and outputs JSON results with any violations.

Syntax:

aquilon-dlp --test-policy <policy-name> --test-file <file-path>

Example:

aquilon-dlp --test-policy gdpr --test-file /var/test-data/sample-data.csv

Output: JSON with policy evaluation results:

{
   "policy": "gdpr",
   "file": "/var/test-data/sample-data.csv",
   "matched": true,
   "violations": [
      {
         "rule_id": "Article-4",
         "description": "Unprotected personal data detected - email address violates GDPR requirements",
         "severity": "Medium",
         "evidence_count": 1
      },
      {
         "rule_id": "Article-32",
         "description": "Unprotected financial personal data detected - violates GDPR security requirements",
         "severity": "High",
         "evidence_count": 1
      }
   ],
   "total_findings": 2,
   "scan_duration_ms": 19
}

Use when:

  • Developing custom policies
  • Testing policy rules
  • Verifying compliance detection

--dry-run

Scans a file using all configured scanners and policies without persisting to the database. Outputs JSON results to stdout.

Syntax:

aquilon-dlp --dry-run <file-path> [--config <config-file>]

Example:

aquilon-dlp --dry-run /var/test-data/sample-data.csv

Output: JSON with complete scan results:

{
   "file": "/var/test-data/sample-data.csv",
   "mime_type": "application/octet-stream",
   "file_size_bytes": 2929,
   "scan_duration_ms": 1648,
   "findings": [
      {
         "scanner": "PCI-DSS_policy",
         "matched_text": "xxxx-xxxx-xxxx-5516, xxxx-xxxx-xxxx-3020, xxxx-xxxx-xxxx-6147",
         "position": 0,
         "confidence": 1.0,
         "pattern_type": "cc",
         "redacted": "xxxx-xxxx-xxxx-5516, xxxx-xxxx-xxxx-3020, xxxx-xxxx-xxxx-6147"
      },
      {
         "scanner": "GDPR_policy",
         "matched_text": "xxxx-xxxx-xxxx-5516, xxxx-xxxx-xxxx-3020, xxxx-xxxx-xxxx-6147",
         "position": 0,
         "confidence": 1.0,
         "pattern_type": "cc",
         "redacted": "xxxx-xxxx-xxxx-5516, xxxx-xxxx-xxxx-3020, xxxx-xxxx-xxxx-6147"
      }
   ],
   "policies_matched": [
      "GDPR",
      "PCI-DSS"
   ],
   "total_findings": 2
}

Use when:

  • Testing files before enabling monitoring
  • Debugging why files are (or aren’t) flagged
  • One-off scans without affecting database
  • CI/CD pipeline integration
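
For the CI/CD case, the JSON written to stdout can be turned into a pass/fail gate. The following is a minimal sketch that reads a report shaped like the sample output above and collects findings at or above a confidence threshold. The threshold value of 0.8 and the helper name findings_at_or_above are placeholders chosen for illustration, and the embedded report is a trimmed copy of the sample, not real scan output.

```python
import json


def findings_at_or_above(report, threshold):
    """Return findings whose confidence is at or above the threshold."""
    return [f for f in report.get("findings", [])
            if f.get("confidence", 0.0) >= threshold]


# Trimmed copy of the sample output; in CI this would be read from the command's stdout.
report = json.loads("""
{
  "findings": [
    {"scanner": "PCI-DSS_policy", "confidence": 1.0, "pattern_type": "cc"},
    {"scanner": "GDPR_policy", "confidence": 1.0, "pattern_type": "cc"}
  ],
  "policies_matched": ["GDPR", "PCI-DSS"],
  "total_findings": 2
}
""")

blocking = findings_at_or_above(report, 0.8)
print(len(blocking), sorted(report["policies_matched"]))
```

A pipeline step would typically exit with a non-zero status when blocking is non-empty, so that the build fails before the file is published.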

Maintenance Commands

--maintenance-now

Runs database maintenance tasks immediately and exits. This includes cleanup of old findings, cache eviction, and vacuum operations.

Syntax:

aquilon-dlp --maintenance-now [--config <config-file>]

Example:

aquilon-dlp --maintenance-now --config /etc/aquilon/config.toml

Output: JSON with maintenance results:

{
  "soft_deleted": 42,
  "hard_deleted": 15,
  "cache_evicted": 128,
  "pages_vacuumed": 1000,
  "duration_ms": 234,
  "errors": []
}

Use when:

  • Before database backups
  • After bulk data imports
  • To reclaim disk space immediately
  • Troubleshooting database issues

Testing Workflow

When developing custom scanners or policies, use this recommended workflow:

  1. Validate configuration after any changes:

    aquilon-dlp --validate-config /etc/aquilon/config.toml
    
  2. List available scanners to verify custom scanners loaded:

    aquilon-dlp --list-scanners --config /etc/aquilon/config.toml
    
  3. Test individual scanner against sample files:

    aquilon-dlp --test-scanner my_custom_scanner --test-file sample.txt
    
  4. Test policy to verify detection rules:

    aquilon-dlp --test-policy my_policy --test-file sample.txt
    
  5. Dry-run scan to see full results:

    aquilon-dlp --dry-run sample.txt --config /etc/aquilon/config.toml
    

Platform Notes

The binary name varies by platform and edition:

| Platform | Edition | Binary Name |
|---|---|---|
| Linux | Basic | aquilon-dlp-basic |
| Linux | Enterprise | aquilon-dlp-enterprise |
| macOS | Enterprise | aquilon-dlp (in app bundle) |

Examples in this documentation use aquilon-dlp for simplicity. Substitute with your platform-specific binary name.

Policy Frameworks

Aquilon DLP includes built-in compliance policy frameworks that automatically classify findings and generate violations according to regulatory requirements. You can also create custom policies using TOML configuration.

Built-in Compliance Frameworks

Overview

| Framework | Standard | Key Controls | Edition |
|---|---|---|---|
| GDPR | EU General Data Protection Regulation | Articles 5, 32, 33 | All |
| CCPA | California Consumer Privacy Act | Sections 1798.100-199 | All |
| HIPAA | Health Insurance Portability and Accountability Act | Sections 164.306, 164.312 | Enterprise |
| PCI DSS | Payment Card Industry Data Security Standard | Requirements 3, 4, 12 | Enterprise |
| SOX | Sarbanes-Oxley Act | Sections 302, 404, 409 | Enterprise |
| ISO 27001 | Information Security Management | Controls A.8.12, A.5.12, A.8.11 | Enterprise |
| CUI | Controlled Unclassified Information | NIST SP 800-171 | Enterprise |
| CMMC | Cybersecurity Maturity Model Certification | DFARS 252.204-7012 | Enterprise |
| FedRAMP | Federal Risk and Authorization Management Program | NIST SP 800-53 | Enterprise |
| FISMA | Federal Information Security Modernization Act | FIPS 199, NIST SP 800-53 | Enterprise |

GDPR (General Data Protection Regulation)

The GDPR policy detects EU personal data subject to data protection regulations.

Detected Data Types:

  • Personal identifiers (names, addresses, phone numbers)
  • Email addresses
  • National identification numbers
  • Financial account data
  • Health information

Configuration:

[policies]
enabled_policies = ["gdpr"]

[policies.policy_configs.gdpr]
enabled = true
settings = { confidence_threshold = "0.7", requires_cc_context = "true" }

Context-Aware Credit Card Detection:

By default, the GDPR policy requires payment context keywords before it reports credit card numbers. This reduces false positives from Luhn-valid numbers that appear in non-payment contexts (JSON logs, test files, etc.).

| Setting | Default | Effect |
|---|---|---|
| requires_cc_context | "true" | CC findings require payment context keywords |

Payment context keywords: payment, card, merchant, transaction, billing, invoice

To restore legacy behavior (alert on all Luhn-valid credit cards regardless of context):

settings = { requires_cc_context = "false" }

CCPA (California Consumer Privacy Act)

The CCPA policy detects California consumer personal information.

Detected Data Types:

  • Personal identifiers
  • Social Security numbers
  • Driver’s license numbers
  • Financial information
  • Geolocation data
  • Biometric information

Configuration:

[policies]
enabled_policies = ["ccpa"]

[policies.policy_configs.ccpa]
enabled = true
settings = { confidence_threshold = "0.7" }

HIPAA (Health Insurance Portability and Accountability Act)

Enterprise Edition Only

The HIPAA policy detects Protected Health Information (PHI).

Detected Data Types:

  • Medical record numbers
  • Health plan beneficiary numbers
  • Social Security numbers
  • Names with medical details
  • Dates of service
  • Provider information

Configuration:

[policies]
enabled_policies = ["hipaa"]

[policies.policy_configs.hipaa]
enabled = true
settings = { confidence_threshold = "0.8" }

PCI DSS (Payment Card Industry Data Security Standard)

Enterprise Edition Only

The PCI DSS policy detects payment card data.

Detected Data Types:

  • Credit card numbers (validated with Luhn algorithm)
  • Card security codes (CVV/CVC)
  • Cardholder names
  • Expiration dates
  • Magnetic stripe data

Configuration:

[policies]
enabled_policies = ["pci_dss"]

[policies.policy_configs.pci_dss]
enabled = true
settings = { alert_on_test_data = "false", requires_cc_context = "true" }

Context-Aware Credit Card Detection:

By default, the PCI DSS policy requires payment context keywords before it reports credit card numbers. This reduces false positives from Luhn-valid numbers that appear in non-payment contexts (JSON logs, test files, etc.).

| Setting | Default | Effect |
|---|---|---|
| requires_cc_context | "true" | CC findings require payment context keywords |

Payment context keywords: payment, card, merchant, transaction, billing, invoice

To restore legacy behavior (alert on all Luhn-valid credit cards regardless of context):

settings = { requires_cc_context = "false" }

SOX (Sarbanes-Oxley Act)

Enterprise Edition Only

The SOX policy detects financial data subject to internal controls.

Detected Data Types:

  • Financial statements
  • Account numbers
  • Transaction identifiers
  • Audit information
  • Executive communications

Configuration:

[policies]
enabled_policies = ["sox"]

[policies.policy_configs.sox]
enabled = true
settings = { confidence_threshold = "0.85" }

ISO 27001:2022

Enterprise Edition Only

The ISO 27001:2022 policy implements information security management controls, particularly Control A.8.12 (Data leakage prevention), which explicitly mandates DLP capabilities.

Features:

  • 4-level data classification: Restricted, Confidential, Internal, Public
  • Automatic classification of all 35 scanners by sensitivity
  • Configurable controls for data masking, encryption, access

Detected Data Types:

  • All categories classified by sensitivity level
  • Automatic assignment based on scanner type

Configuration:

[policies]
enabled_policies = ["iso27001"]

[policies.policy_configs.iso27001]
enabled = true
settings = { confidence_threshold = "0.7", enforce_data_masking = "true" }

Enabling Multiple Policies

You can enable multiple policies simultaneously:

[policies]
enabled_policies = ["gdpr", "hipaa", "pci_dss", "sox", "ccpa", "iso27001"]

Each policy evaluates scan findings independently and generates violations according to its regulatory framework. A single file might trigger alerts from multiple policies if it contains different types of sensitive data.

Custom Policies

Aquilon DLP supports custom policies and scanners to detect company-specific data patterns without writing code.

Creating Custom Scanners

Define scanners for proprietary identifiers:

[[scanners]]
name = "employee_id"
regex = "EMP-([0-9]{6})"
redaction_pattern = "EMP-XXXXXX"
base_confidence = 0.85
description = "ACME Corp employee IDs"
context_signals = ["hr", "confidential", "personnel"]

[scanners.confidence_boost]
keywords = ["employee", "personnel", "payroll", "badge"]
boost_amount = 0.10
proximity = 200

Scanner Fields:

| Field | Required | Description |
|---|---|---|
| name | Yes | Unique identifier (alphanumeric + underscore) |
| regex | Yes | Pattern to match (must be bounded) |
| redaction_pattern | Yes | Template for redacting matches |
| base_confidence | Yes | Base confidence score (0.0 - 1.0) |
| description | No | Human-readable description |
| context_signals | No | Keywords for classification |
| confidence_boost | No | Boost confidence when keywords found nearby |

Pattern Safety

All regex patterns must be bounded to prevent performance issues:

# SAFE - bounded patterns
[[scanners]]
name = "fixed_length"
regex = "EMP-([0-9]{6})"           # Fixed length

Unsafe patterns (unbounded) will be rejected: \d+, .*, [A-Z]+
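
A rough pre-check for unbounded quantifiers can catch these patterns before a configuration is deployed. The sketch below is a simplified lint written for this guide, not the validation Aquilon performs: it flags unescaped +, *, and {n,} outside character classes and ignores finer points of regex syntax. The function name find_unbounded is hypothetical.

```python
import re


def find_unbounded(pattern: str) -> list[str]:
    """Return the unbounded quantifiers (+, *, {n,}) found outside character classes."""
    hits = []
    in_class = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2  # skip the escaped character, e.g. \+ or \d
            continue
        if in_class:
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
        elif ch in "+*":
            hits.append(ch)
        elif ch == "{":
            open_ended = re.match(r"\{\d+,\}", pattern[i:])
            if open_ended:
                hits.append(open_ended.group(0))
        i += 1
    return hits


for candidate in [r"\d+", ".*", "[A-Z]+", "[0-9]{3,}", "EMP-([0-9]{6})", "[A-Z]{1,20}"]:
    print(candidate, find_unbounded(candidate))
```

The first four patterns are reported with the offending quantifier, while the two bounded patterns return an empty list.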

Dictionary Scanners

Dictionary scanners detect words and phrases from configurable inline lists using the Aho-Corasick algorithm for efficient O(n) multi-pattern matching.

When to Use Dictionary Scanners

  • Detect lists of keywords or terms (medical terms, project codes, product names)
  • Match multi-word phrases (e.g., “social security number”, “patient record”)
  • Domain-specific vocabulary that doesn’t follow a regex pattern

Basic Configuration

[[dictionary_scanners]]
name = "medical_terms"
words = [
    "diagnosis",
    "prescription",
    "patient record",
    "medical history"
]
case_sensitive = false
match_whole_words = true
base_confidence = 0.85

Configuration Fields

| Field | Type | Default | Description |
|---|---|---|---|
| name | String | Required | Unique scanner identifier (alphanumeric + underscore) |
| words | Array | Required | Words and phrases to detect |
| case_sensitive | Boolean | false | Case-sensitive matching |
| match_whole_words | Boolean | true | Match only at word boundaries |
| base_confidence | Float | 0.8 | Base confidence score (0.0-1.0) |
| min_matches | Integer | None | Minimum matches required to report |
| match_proximity | Integer | None | Maximum bytes between matches |
| description | String | None | Human-readable description |
| context_signals | Array | None | Keywords for classification |

Advanced: Match Constraints

Use min_matches and match_proximity to reduce false positives by requiring multiple terms to appear together:

[[dictionary_scanners]]
name = "hipaa_terms"
words = [
    "protected health information",
    "PHI",
    "patient",
    "medical record",
    "diagnosis",
    "treatment"
]
base_confidence = 0.75
min_matches = 2
match_proximity = 500

This configuration only reports findings when at least 2 terms appear within 500 bytes of each other.
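
The effect of the two constraints can be modelled in a few lines. The sketch below uses plain regular-expression search in place of Aho-Corasick, measures distance between match start offsets in characters (not bytes), and counts any two hits even if they are the same term. All three are simplifying assumptions, and the function names are made up for illustration.

```python
import re


def dictionary_hits(text, words, case_sensitive=False, match_whole_words=True):
    """Return sorted (start_offset, word) pairs for every dictionary hit in text."""
    flags = 0 if case_sensitive else re.IGNORECASE
    hits = []
    for word in words:
        pattern = re.escape(word)
        if match_whole_words:
            pattern = r"\b" + pattern + r"\b"
        hits.extend((m.start(), word) for m in re.finditer(pattern, text, flags))
    return sorted(hits)


def meets_constraints(hits, min_matches, match_proximity):
    """True if some run of min_matches hits starts within match_proximity characters."""
    starts = [start for start, _ in hits]
    for first in range(len(starts) - min_matches + 1):
        last = first + min_matches - 1
        if starts[last] - starts[first] <= match_proximity:
            return True
    return False


words = ["protected health information", "PHI", "patient",
         "medical record", "diagnosis", "treatment"]
close = "The patient was given a diagnosis on arrival."
far = "patient" + " filler" * 100 + " diagnosis"  # the two terms are about 700 characters apart

print(meets_constraints(dictionary_hits(close, words), 2, 500))  # True
print(meets_constraints(dictionary_hits(far, words), 2, 500))    # False
```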

Advanced: Confidence Adjustments

Boost or reduce confidence based on nearby keywords:

[[dictionary_scanners]]
name = "project_codenames"
words = ["Project Alpha", "Operation Gamma", "Initiative Delta"]
base_confidence = 0.70
boost_keywords = ["confidential", "restricted", "internal only"]
boost_amount = 0.20
reduce_keywords = ["example", "test", "demo", "sample"]
reduce_amount = 0.30

When “confidential” appears nearby, confidence increases from 0.70 to 0.90. When “test” appears nearby, confidence decreases from 0.70 to 0.40.

Referencing Dictionary Scanners in Policies

Dictionary scanners use the custom: prefix when referenced in policies:

[[custom_policies]]
name = "healthcare_data"
enabled = true
required_scanners = ["custom:medical_terms", "ssn", "email"]

[[custom_policies.rules]]
id = "phi_exposure"
severity = "high"

[custom_policies.rules.composition]
operator = "AND"
proximity = 500

[[custom_policies.rules.composition.conditions]]
scanner = "custom:medical_terms"
min_confidence = 0.70

[[custom_policies.rules.composition.conditions]]
scanner = "ssn"
min_confidence = 0.75

Built-in Validators

Validators provide checksum or format validation for regex matches, significantly reducing false positives by verifying that detected patterns are mathematically valid.

Available Validators

| Validator | Algorithm | Use Case |
|---|---|---|
| luhn | Luhn (mod 10) | Credit cards, IMEI numbers |
| mod10 | Modulo 10 | Various identifiers with check digits |
| mod11 | Modulo 11 | ISBN-10, some national IDs |
| iban | IBAN checksum | International Bank Account Numbers |
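
As an illustration of what a modulo 11 check involves, the sketch below validates ISBN-10 numbers, the use case listed for mod11. The weighting shown is the ISBN-10 scheme. Whether Aquilon's mod11 validator uses the same weights for other identifier types is not stated here, so treat this as one example of the family and not as the validator's definition.

```python
def isbn10_valid(code: str) -> bool:
    """Check an ISBN-10: weighted digit sum (weights 10 down to 1) must be divisible by 11."""
    code = code.replace("-", "")
    if len(code) != 10:
        return False
    total = 0
    for position, char in enumerate(code):
        if char == "X" and position == 9:
            value = 10  # X stands for 10 and is only allowed as the check character
        elif char.isascii() and char.isdigit():
            value = int(char)
        else:
            return False
        total += (10 - position) * value
    return total % 11 == 0


print(isbn10_valid("0-306-40615-2"))  # True
print(isbn10_valid("0-306-40615-3"))  # False
```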

Using Validators in Custom Scanners

Add a validator to filter out matches that fail checksum validation:

[[scanners]]
name = "company_account"
regex = "ACCT-([0-9]{10})"
redaction_pattern = "ACCT-XXXXXXXXXX"
base_confidence = 0.85

[scanners.validation]
validator = "luhn"
min_confidence = 0.70
invalid_patterns = ["^0{10}$", "1234567890$"]

Validation Configuration Fields

| Field | Type | Description |
|---|---|---|
| validator | String | Checksum validator: luhn, mod10, mod11, iban |
| min_confidence | Float | Minimum confidence threshold (0.0-1.0) |
| invalid_patterns | Array | Regex patterns to reject (e.g., all zeros) |

Example: Credit Card with Luhn Validation

The built-in credit card scanner already uses Luhn validation internally. For custom patterns that should use Luhn:

[[scanners]]
name = "loyalty_card"
regex = "([0-9]{4})([0-9]{4})([0-9]{4})([0-9]{4})"
redaction_pattern = "XXXX-XXXX-XXXX-XXXX"
base_confidence = 0.80
description = "16-digit loyalty card numbers with Luhn check"

[scanners.validation]
validator = "luhn"
invalid_patterns = ["^0{16}$", "^1{16}$"]

This configuration:

  1. Matches any 16-digit number
  2. Validates it passes the Luhn checksum
  3. Rejects all-zeros and all-ones patterns
  4. Reports only valid matches

Confidence Scoring

Aquilon DLP uses weighted confidence scoring to reduce false positives. Confidence can be boosted by nearby keywords or reduced by negative indicators.

How Confidence Works

Each scanner assigns a base_confidence score (0.0 to 1.0). This score can be adjusted based on:

  • Nearby positive keywords → Boost confidence (more likely a real match)
  • Nearby negative keywords → Reduce confidence (likely a false positive)
  • Validator success → Maintains or boosts confidence
  • Validator failure → Match is discarded

Boosting Confidence with Keywords

When specific keywords appear near a match, boost the confidence:

[[scanners]]
name = "employee_id"
regex = "EMP-([0-9]{6})"
redaction_pattern = "EMP-XXXXXX"
base_confidence = 0.75

[scanners.confidence_boost]
keywords = ["employee", "badge", "payroll", "personnel", "HR"]
boost_amount = 0.15
proximity = 200

If “employee” or “payroll” appears within 200 bytes, confidence increases from 0.75 to 0.90.

Reducing Confidence with Negative Keywords

When negative keywords appear near a match, reduce the confidence to suppress likely false positives:

[[scanners]]
name = "ssn_custom"
regex = "([0-9]{3})-([0-9]{2})-([0-9]{4})"
redaction_pattern = "XXX-XX-XXXX"
base_confidence = 0.80

[scanners.confidence_reduce]
keywords = ["example", "test", "fake", "sample", "xxx", "000-00-0000"]
boost_amount = 0.50
proximity = 100

If “example” or “test” appears within 100 bytes, confidence is reduced by 0.50 (from 0.80 to 0.30).

Combining Boost and Reduce

You can use both boost and reduce on the same scanner:

[[scanners]]
name = "account_number"
regex = "ACC-([0-9]{8})"
redaction_pattern = "ACC-XXXXXXXX"
base_confidence = 0.70

[scanners.confidence_boost]
keywords = ["account", "balance", "statement", "transaction"]
boost_amount = 0.20
proximity = 150

[scanners.confidence_reduce]
keywords = ["example", "test", "demo", "documentation"]
boost_amount = 0.40
proximity = 100

Confidence calculation:

  • Base: 0.70
  • With “account” nearby: 0.70 + 0.20 = 0.90
  • With “test” nearby: 0.70 - 0.40 = 0.30
  • With both: Boost and reduce are applied independently based on proximity
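
The arithmetic above can be reproduced with a small Python sketch. Several details are assumptions rather than documented behavior: that boost and reduce are simply added and subtracted when both keywords are nearby, that the result is clamped to the 0.0-1.0 range, that keyword matching is case-insensitive, and that proximity is measured in characters here rather than bytes. The names Adjustment and adjusted_confidence are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Adjustment:
    keywords: tuple   # keywords to look for near the match
    amount: float     # mirrors boost_amount in the config
    proximity: int    # window size on each side of the match


def keyword_nearby(text, start, end, adjustment):
    """Check whether any keyword appears within the proximity window."""
    window_start = max(0, start - adjustment.proximity)
    window = text[window_start:end + adjustment.proximity].lower()
    return any(keyword.lower() in window for keyword in adjustment.keywords)


def adjusted_confidence(base, text, start, end, boost=None, reduce=None):
    """Apply boost and reduce independently, then clamp (assumed behavior)."""
    score = base
    if boost and keyword_nearby(text, start, end, boost):
        score += boost.amount
    if reduce and keyword_nearby(text, start, end, reduce):
        score -= reduce.amount
    return round(min(1.0, max(0.0, score)), 2)


BOOST = Adjustment(("account", "balance", "statement", "transaction"), 0.20, 150)
REDUCE = Adjustment(("example", "test", "demo", "documentation"), 0.40, 100)
```

Under these assumptions, a match with both "account" and "test" nearby would score 0.70 + 0.20 - 0.40 = 0.50.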

Creating Custom Policies

Define policies to enforce business rules:

[[custom_policies]]
name = "employee_data_protection"
description = "Detects employee PII exposure"
enabled = true
required_scanners = ["ssn", "custom:employee_id", "email"]

[[custom_policies.rules]]
id = "employee_pii_leak"
severity = "high"
remediation = "Contact HR compliance - do not share file"

[custom_policies.rules.composition]
operator = "AND"
proximity = 500

[[custom_policies.rules.composition.conditions]]
scanner = "custom:employee_id"
min_confidence = 0.70

[[custom_policies.rules.composition.conditions]]
scanner = "ssn"
min_confidence = 0.75

[custom_policies.rules.exclusions]
file_patterns = ["*/hr/authorized/*", "*/payroll/approved/*"]

Rule Types

Composition Rules (AND/OR Logic)

Alert when multiple data types appear together:

[custom_policies.rules.composition]
operator = "AND"              # All conditions must match
proximity = 500               # Within 500 characters

[[custom_policies.rules.composition.conditions]]
scanner = "ssn"
min_confidence = 0.75

[[custom_policies.rules.composition.conditions]]
scanner = "email"
min_confidence = 0.70
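
As a rough mental model of an AND composition rule, the Python sketch below checks that every condition has a qualifying finding and that one finding per condition falls within the proximity window. How the engine actually measures distance between findings is not documented here, so the offset-based check and the names Finding and and_rule_matches are assumptions for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Finding:
    scanner: str       # scanner reference, e.g. "ssn"
    offset: int        # position of the match in the file (assumed unit)
    confidence: float  # final confidence after boosts and reductions


def and_rule_matches(findings, conditions, proximity):
    """Return True when all conditions match within the proximity window."""
    candidates = []
    for scanner, min_confidence in conditions:
        hits = [
            f for f in findings
            if f.scanner == scanner and f.confidence >= min_confidence
        ]
        if not hits:
            return False  # AND: every condition needs at least one hit
        candidates.append(hits)
    # Try one finding per condition and test the spread of their offsets.
    for combination in product(*candidates):
        offsets = [f.offset for f in combination]
        if max(offsets) - min(offsets) <= proximity:
            return True
    return False
```

For example, an ssn finding at offset 100 and an email finding at offset 400 would satisfy a proximity of 500, while the same pair 800 apart would not.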

Threshold Rules (Count-Based)

Alert when the number of matches reaches a threshold (for example, to detect bulk exports):

[custom_policies.rules.threshold]
scanner = "custom:employee_id"
operator = "greater_equal"
count = 10

Operators: >, >=, <, <=, ==

Context Rules (Exclusions)

Control when rules fire based on context signals and file paths:

[custom_policies.rules.context]
requires_any = ["external", "public", "shared"]

[custom_policies.rules.exclusions]
file_patterns = ["*/hr/authorized/*"]
requires_context_signals = ["approved", "authorized"]

Scanner References

When referencing scanners in policies:

  • Built-in scanners: Use direct name (ssn, email, cc)
  • Custom scanners: Use custom: prefix (custom:employee_id)

# Scanner references in required_scanners
required_scanners = [
    "ssn",                      # Built-in
    "email",                    # Built-in
    "custom:employee_id",       # Custom
    "custom:project_code"       # Custom
]
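
The naming rule can be illustrated with a short Python sketch. It is only a model of the convention described above: resolve_scanner is a hypothetical helper, and BUILT_IN_SCANNERS is a small subset of built-in names chosen for the example.

```python
# Subset of built-in scanner names, for illustration only.
BUILT_IN_SCANNERS = {"ssn", "email", "phone"}


def resolve_scanner(reference, custom_scanners):
    """Classify a scanner reference or raise if it cannot be resolved."""
    if reference.startswith("custom:"):
        name = reference[len("custom:"):]
        if name in custom_scanners:
            return ("custom", name)
    elif reference in BUILT_IN_SCANNERS:
        return ("built-in", reference)
    # Mirrors the "Unknown scanner" error: wrong name or missing prefix.
    raise ValueError(f"Unknown scanner: {reference}")
```

With custom_scanners = {"employee_id"}, the reference "custom:employee_id" resolves, while a bare "employee_id" is rejected because the prefix is missing.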

Adding Custom Policies

Custom scanners and policies are defined directly in your main configuration file using [[scanners]] and [[custom_policies]] sections:

# In aquilon_dlp_config.toml
[[scanners]]
name = "employee_id"
regex = "EMP-([0-9]{6})"
base_confidence = 0.85

[[custom_policies]]
name = "employee_data_protection"
enabled = true
required_scanners = ["custom:employee_id", "ssn"]

Validate your configuration:

sudo aquilon-dlp --config /etc/aquilon/aquilon_dlp_config.toml --validate-config

Built-in Scanners

Aquilon DLP includes 50+ built-in scanner plugins across multiple categories:

| Category | Scanner Count | Examples |
| --- | --- | --- |
| National IDs | 28 | EU, Americas, Asia-Pacific, Middle East national IDs |
| PII | 8 | SSN, email, phone, address, date of birth |
| Financial | 5 | Credit card, bank account, IBAN, CVV |
| Medical | 6 | MRN, NPI, MBI, medical device IDs |
| Government | 3 | Passport, driver’s license, vehicle identifier |
| Technical | 3 | API keys, database connections, crypto keys |
| Business | 5 | Executive communications, financial figures, audit docs |

All scanners integrate automatically with compliance policies.

National ID Scanners

Aquilon DLP includes comprehensive national ID detection with country-specific checksum validation:

Europe (14 scanners):

| Country | Scanner | Format | Validation |
| --- | --- | --- | --- |
| France | france_nir | 15 digits (NIR) | Mod 97 |
| Germany | germany_steurid | 11 digits (Steuer-ID) | Format rules |
| Italy | italy_cf | 16 chars (Codice Fiscale) | Mod 26 |
| Spain | spain_dni | 8-9 chars (DNI/NIE) | Mod 23 |
| Poland | poland_pesel | 11 digits (PESEL) | Weighted mod 10 |
| Netherlands | netherlands_bsn | 9 digits (BSN) | 11-proof |
| Belgium | belgium_nrn | 11 digits (NRN) | Mod 97 |
| UK | uk_nino | 9 chars (NINO) | Format rules |
| Sweden | sweden_personnummer | 10-12 digits | Luhn |
| Norway | norway_fodselsnummer | 11 digits | Dual mod-11 |
| Finland | finland_hetu | 11 chars (HETU) | Mod 31 |
| Portugal | portugal_nif | 9 digits (NIF) | Weighted mod 11 |
| Romania | romania_cnp | 13 digits (CNP) | Weighted mod 11 |
| Czech/Slovakia | czech_rodne_cislo | 9-10 digits | Mod 11 |

Americas (4 scanners):

| Country | Scanner | Format | Validation |
| --- | --- | --- | --- |
| Brazil | brazil_cpf | 11 digits (CPF) | Dual mod 11 |
| Canada | canada_sin | 9 digits (SIN) | Luhn |
| Chile | chile_rut | 8-9 chars (RUT) | Mod 11 |
| Argentina | argentina_cuit | 11 digits (CUIT/CUIL) | Weighted mod 11 |

Asia-Pacific (8 scanners):

| Country | Scanner | Format | Validation |
| --- | --- | --- | --- |
| Australia | australia_tfn | 9 digits (TFN) | Weighted mod 11 |
| India | india_aadhaar | 12 digits (Aadhaar) | Format rules |
| India | india_pan | 10 chars (PAN) | Format rules |
| South Korea | south_korea_rrn | 13 digits (RRN) | Weighted mod 11 |
| Japan | japan_my_number | 12 digits | Government checksum |
| China | china_resident_id | 18 chars | ISO 7064 MOD 11-2 |
| Taiwan | taiwan_national_id | 10 chars | Weighted mod 10 |
| New Zealand | new_zealand_ird | 8-9 digits (IRD) | Mod 11 |

Middle East & Africa (2 scanners):

| Country | Scanner | Format | Validation |
| --- | --- | --- | --- |
| Israel | israel_teudat_zehut | 9 digits | Luhn variant |
| Turkey | turkey_tc_kimlik | 11 digits (TC Kimlik) | Two-step checksum |

Other Scanners

PII: ssn, email, phone, address, date_of_birth, biometric, facial_photo, ip_address

Financial: credit_card, cvv, bank_account, iban, account_number

Medical: mrn, medical_id, npi, mbi, medical_device, certificate_license

Government: passport, drivers_license, vehicle_identifier

Technical: api_key, crypto, database_connection

Business: business_ip, audit_docs, executive_comms, financial_figures, material_info

Web: web_url

Policy Metadata

Add metadata for compliance tracking:

[[custom_policies]]
name = "employee_data_protection"
enabled = true
required_scanners = ["ssn", "custom:employee_id"]

[custom_policies.metadata]
compliance_framework = "ACME_DATA_PROTECTION_2024"
owner = "hr-compliance@acme.com"
review_date = "2025-01-15"

Severity Levels

Policy violations can have severity levels:

| Severity | Description | Example |
| --- | --- | --- |
| critical | Immediate action required | Bulk SSN export |
| high | Urgent investigation | PII with contact info |
| medium | Review required | Single finding in unexpected location |
| low | Informational | Context-appropriate finding |

Example: Complete Custom Configuration

# Custom scanner for employee IDs
[[scanners]]
name = "employee_id"
regex = "EMP-([0-9]{6})"
redaction_pattern = "EMP-XXXXXX"
base_confidence = 0.85
description = "ACME Corp employee IDs"
context_signals = ["hr", "personnel"]

[scanners.confidence_boost]
keywords = ["employee", "badge", "payroll"]
boost_amount = 0.10
proximity = 200

# Custom policy for employee protection
[[custom_policies]]
name = "employee_data_protection"
enabled = true
required_scanners = ["ssn", "custom:employee_id", "email"]

[custom_policies.metadata]
owner = "security@acme.com"
review_date = "2025-06-01"

# Rule 1: Employee ID with SSN
[[custom_policies.rules]]
id = "employee_pii_leak"
severity = "high"
remediation = "Contact HR compliance immediately"

[custom_policies.rules.composition]
operator = "AND"
proximity = 500

[[custom_policies.rules.composition.conditions]]
scanner = "custom:employee_id"
min_confidence = 0.70

[[custom_policies.rules.composition.conditions]]
scanner = "ssn"
min_confidence = 0.75

[custom_policies.rules.exclusions]
file_patterns = ["*/hr/authorized/*"]

# Rule 2: Bulk employee export
[[custom_policies.rules]]
id = "bulk_employee_export"
severity = "critical"
remediation = "Investigate potential data breach"

[custom_policies.rules.threshold]
scanner = "custom:employee_id"
operator = "greater_equal"
count = 50

Troubleshooting Policies

Common Errors

“Unsafe regex pattern” - Pattern is unbounded. Add length limits.

“Reserved policy name” - Cannot use HIPAA, PCI-DSS, GDPR, etc. as custom policy names.

“Unknown scanner” - Check scanner name and custom: prefix.

No Alerts Appearing

  1. Verify policy is enabled (enabled = true)
  2. Check confidence thresholds aren’t too high
  3. Verify rule conditions are met
  4. Check exclusions aren’t blocking alerts

See the Configuration guide for applying changes and checking logs.

Monitoring

Aquilon DLP exposes findings through osquery tables, enabling powerful querying, alerting, and integration with existing security infrastructure.

osquery Tables

Aquilon DLP provides the following table for monitoring:

| Table | Description |
| --- | --- |
| aquilon_dlp_alerts | Primary alert table with all findings and triage status |

For complete table schema and triage workflow, see OSQuery Integration.

Querying Alerts

Basic Queries

View recent alerts:

SELECT * FROM aquilon_dlp_alerts
ORDER BY timestamp DESC
LIMIT 10;

View alerts for specific file:

SELECT policy, severity, data_type, pattern
FROM aquilon_dlp_alerts
WHERE path LIKE '%specific-file-test%';

View alerts from last 24 hours:

SELECT * FROM aquilon_dlp_alerts
WHERE timestamp > (strftime('%s', 'now') - 86400);

Analyzing Patterns

Count alerts by policy:

SELECT policy, COUNT(*) as alert_count
FROM aquilon_dlp_alerts
GROUP BY policy
ORDER BY alert_count DESC;

Count alerts by severity:

SELECT severity, COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY severity
ORDER BY
  CASE severity
    WHEN 'critical' THEN 1
    WHEN 'high' THEN 2
    WHEN 'medium' THEN 3
    WHEN 'low' THEN 4
  END;

Find most affected directories:

SELECT
  rtrim(path, replace(path, '/', '')) as directory,
  COUNT(*) as alert_count
FROM aquilon_dlp_alerts
GROUP BY directory
ORDER BY alert_count DESC
LIMIT 10;
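
The rtrim/replace expression in the query above strips the file name from each path: replace(path, '/', '') yields every character of the path except the slashes, and rtrim then removes those characters from the right until it reaches the last slash. The Python sketch below runs the same expression against an in-memory SQLite table so the behavior can be inspected. The table here contains only a path column and the sample paths are invented for the demonstration.

```python
import sqlite3


def directory_counts(paths):
    """Group sample paths by parent directory using the same SQL expression."""
    conn = sqlite3.connect(":memory:")
    # Stand-in table with only the column the query needs.
    conn.execute("CREATE TABLE aquilon_dlp_alerts (path TEXT)")
    conn.executemany(
        "INSERT INTO aquilon_dlp_alerts (path) VALUES (?)",
        [(p,) for p in paths],
    )
    rows = conn.execute(
        """
        SELECT rtrim(path, replace(path, '/', '')) AS directory,
               COUNT(*) AS alert_count
        FROM aquilon_dlp_alerts
        GROUP BY directory
        ORDER BY alert_count DESC
        """
    ).fetchall()
    conn.close()
    return rows


sample = ["/home/alice/docs/a.txt", "/home/alice/docs/b.csv", "/home/bob/c.txt"]
print(directory_counts(sample))
```

The resulting directory values keep their trailing slash, for example /home/alice/docs/.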

View data types found:

SELECT data_type, COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY data_type
ORDER BY count DESC;

Investigation Queries

Find all alerts under a specific path (for example, one user's directory):

SELECT * FROM aquilon_dlp_alerts
WHERE path LIKE '/var/watch/%'
ORDER BY timestamp DESC;

Find files with multiple policy violations:

SELECT path, COUNT(DISTINCT policy) as policy_count
FROM aquilon_dlp_alerts
GROUP BY path
HAVING policy_count > 1
ORDER BY policy_count DESC;

Find the most recent critical and high-severity alerts:

SELECT path, policy, data_type, confidence
FROM aquilon_dlp_alerts
WHERE severity IN ('critical', 'high')
ORDER BY timestamp DESC
LIMIT 20;

Alert Fields

The aquilon_dlp_alerts table contains these fields:

| Field | Type | Description |
| --- | --- | --- |
| id | TEXT | UUID (finding_id) for row identification |
| timestamp | BIGINT | Unix timestamp of detection |
| path | TEXT | Full path to scanned file |
| scanner | TEXT | Scanner that detected the finding |
| severity | TEXT | Alert severity (critical/high/medium/low) |
| policy | TEXT | Policy that triggered the alert |
| data_type | TEXT | Type of sensitive data found |
| pattern | TEXT | Redacted pattern that matched |
| confidence | INTEGER | Confidence level (0-100) |
| match_count | INTEGER | Number of matches in file |
| frameworks | TEXT | Applicable compliance frameworks |
| triage_status | TEXT | Triage status (new/acknowledged/resolved/ignored) |
| triage | TEXT | JSON object with triage details (owner, comment, timestamp) |
| context | TEXT | JSON object with file metadata and context |

For details on the JSON column structure and querying, see OSQuery Integration.
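
When consuming rows from this table in a script, the JSON columns and the integer confidence usually need decoding first. The Python sketch below shows one way to do that. It assumes that the 0-100 confidence column maps linearly onto the 0.0-1.0 scale used in scanner configuration and that empty JSON columns should be treated as empty objects; normalize_alert is a hypothetical helper name.

```python
import json


def normalize_alert(row):
    """Decode one aquilon_dlp_alerts row (a dict of column values)."""
    alert = dict(row)
    # osquery commonly returns values as strings, so convert explicitly.
    alert["confidence"] = int(row["confidence"]) / 100.0
    for column in ("triage", "context"):
        raw = row.get(column) or "{}"  # assumption: empty means no details
        alert[column] = json.loads(raw)
    return alert


example_row = {
    "path": "/data/export.csv",
    "confidence": "85",
    "triage": '{"owner": "analyst@company.com"}',
    "context": "",
}
print(normalize_alert(example_row)["triage"].get("owner"))
```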

Alert Triage

Aquilon DLP supports updating alert triage status directly through OSQuery UPDATE statements. This allows security analysts to acknowledge, investigate, and resolve alerts.

Triage Status Values

| Status | Description |
| --- | --- |
| new | Just detected, needs review (default) |
| acknowledged | Analyst is investigating |
| resolved | Issue has been handled |
| ignored | Intentionally skipped (false positive) |

Example Triage Queries

View alerts needing review:

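SELECT path, scanner, severity, policy
FROM aquilon_dlp_alerts
WHERE triage_status = 'new'
-- severity is TEXT, so rank it explicitly instead of sorting alphabetically
ORDER BY
  CASE severity WHEN 'critical' THEN 1 WHEN 'high' THEN 2 WHEN 'medium' THEN 3 ELSE 4 END;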

Acknowledge an alert:

UPDATE aquilon_dlp_alerts
SET triage_status = 'acknowledged',
    triage = JSON_OBJECT('owner', 'analyst@company.com')
WHERE path LIKE '%ack-test%';

Resolve an alert:

UPDATE aquilon_dlp_alerts
SET triage_status = 'resolved',
    triage = JSON_OBJECT('comment', 'File removed from system')
WHERE path LIKE '%ack-test%';

For complete triage workflow documentation, see OSQuery Integration.

Log Analysis

Log Locations

| Platform | Log Location |
| --- | --- |
| macOS | /var/log/aquilon/aquilon-dlp.log |
| Linux | /var/log/aquilon/aquilon-dlp.log |

Viewing Logs

Real-time log monitoring:

# macOS/Linux
tail -f /var/log/aquilon/aquilon-dlp.log

Filter for errors:

grep -i error /var/log/aquilon/aquilon-dlp.log

Filter for specific file:

grep "document.pdf" /var/log/aquilon/aquilon-dlp.log

Log Levels

Configure log level using the RUST_LOG environment variable:

# Set log level
export RUST_LOG=info

# Available levels: error, warn, info, debug, trace

| Level | Description |
| --- | --- |
| error | Only critical errors |
| warn | Errors and warnings |
| info | General operational messages |
| debug | Detailed debugging information |
| trace | Extremely verbose tracing |

SIEM Integration

JSON Log Format

Configure JSON logs for SIEM ingestion using environment variables:

# Set log level and format
export RUST_LOG=info

# Logs are written to stdout in structured JSON format
# Redirect output as needed for your SIEM

The application uses the Rust tracing crate, which emits structured JSON fields that are easy to parse once your init system captures stdout.
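
Before wiring logs into a SIEM, it can help to filter the JSON stream locally. The Python sketch below keeps only events at or above a chosen level. The field name "level" is an assumption based on the common JSON layout of the tracing ecosystem; verify it against your actual log output. The function name events_at_least is hypothetical.

```python
import json

# Ordered from most verbose to most severe, matching the RUST_LOG levels.
LEVELS = ["trace", "debug", "info", "warn", "error"]


def events_at_least(lines, min_level="warn"):
    """Yield parsed JSON log events whose level is min_level or higher."""
    threshold = LEVELS.index(min_level)
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines such as startup banners
        if not isinstance(event, dict):
            continue
        level = str(event.get("level", "")).lower()
        if level in LEVELS and LEVELS.index(level) >= threshold:
            yield event
```

A typical use would be to pass sys.stdin to events_at_least and print or forward each returned event.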

osquery Fleet Management

Aquilon DLP integrates with osquery fleet management tools:

  • Fleet - kolide/fleet or fleetdm
  • Kolide - kolide.com
  • osquery directly - via distributed queries

Example distributed query:

SELECT * FROM aquilon_dlp_alerts
WHERE severity IN ('critical', 'high')
AND timestamp > (strftime('%s', 'now') - 3600);

Splunk Integration

Forward osquery results to Splunk:

  1. Configure osquery logger to file
  2. Use Splunk Universal Forwarder to ingest logs
  3. Create dashboards for DLP alerts

Example Splunk query:

index=osquery sourcetype=osquery:results name=aquilon_dlp_alerts
| stats count by policy, severity

Elastic Stack Integration

Forward to Elasticsearch:

  1. Configure osquery with Kafka or file logger
  2. Use Filebeat/Logstash to ingest
  3. Create Kibana dashboards

Alerting

osquery Scheduled Queries

Schedule regular alert checks:

{
  "schedule": {
    "dlp_critical_alerts": {
      "query": "SELECT * FROM aquilon_dlp_alerts WHERE severity = 'critical' AND triage_status = 'new'",
      "interval": 300
    }
  }
}

External Alerting

Integrate with external systems by:

  1. Using osquery scheduled queries to export results
  2. Configuring SIEM to forward alerts to PagerDuty/Slack
  3. Triggering SOAR playbooks on critical alerts
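
As a minimal sketch of the first step, the Python function below reads osquery result log lines and turns newly added rows from the dlp_critical_alerts scheduled query into short messages that a downstream tool could forward. It assumes osquery's event-format results log, where each line is a JSON object with name, action, and columns keys; the message layout and the function name critical_alert_messages are illustrative choices.

```python
import json


def critical_alert_messages(result_lines, query_name="dlp_critical_alerts"):
    """Build one message per newly added row of the scheduled query."""
    messages = []
    for line in result_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore partial or corrupt lines
        if not isinstance(record, dict):
            continue
        if record.get("name") != query_name or record.get("action") != "added":
            continue  # only rows that newly appeared in this query's results
        columns = record.get("columns", {})
        messages.append(
            "[{}] {}: {}".format(
                columns.get("severity", "?"),
                columns.get("policy", "?"),
                columns.get("path", "?"),
            )
        )
    return messages
```

Sending the messages to PagerDuty, Slack, or a SOAR platform is left to the integration of your choice.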

Performance Metrics

System Resource Usage

Monitor Aquilon DLP resource consumption:

  • Memory usage: Check process memory
  • CPU usage: Monitor during active scanning
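  • Disk I/O: Cache database writes

# macOS/Linux - find Aquilon DLP process
ps aux | grep aquilon

# Watch resource usage (Linux)
top -p "$(pgrep -d, aquilon)"

# Watch resource usage (macOS; top takes a single PID here)
top -pid "$(pgrep aquilon | head -n 1)"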

Operational Dashboards

Key Metrics to Track

| Metric | Query | Alert Threshold |
| --- | --- | --- |
| Critical alerts/hour | COUNT WHERE severity='critical' | >0 |
| High alerts/hour | COUNT WHERE severity='high' | >10 |
| New alerts needing triage | COUNT WHERE triage_status='new' | >50 |
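
The thresholds in the table can be evaluated with a few lines of Python once alert rows have been exported. This is a sketch only: the metric keys, the dashboard_counts and breached helpers, and the in-memory list of alert dictionaries are hypothetical, while the threshold values come from the table.

```python
import time

# Alert thresholds from the table; a metric breaches when count > limit.
THRESHOLDS = {"critical_per_hour": 0, "high_per_hour": 10, "new_untriaged": 50}


def dashboard_counts(alerts, now=None):
    """Compute the three tracked metrics from a list of alert dicts."""
    if now is None:
        now = int(time.time())
    cutoff = now - 3600  # same one-hour window as the distributed query
    recent = [a for a in alerts if a["timestamp"] > cutoff]
    return {
        "critical_per_hour": sum(a["severity"] == "critical" for a in recent),
        "high_per_hour": sum(a["severity"] == "high" for a in recent),
        "new_untriaged": sum(a["triage_status"] == "new" for a in alerts),
    }


def breached(counts):
    """Return the names of metrics that exceed their threshold."""
    return sorted(name for name, limit in THRESHOLDS.items() if counts[name] > limit)
```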

Example Dashboard Queries

Alerts by policy:

SELECT policy, COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY policy
ORDER BY count DESC
LIMIT 5;

Triage status summary:

SELECT triage_status, COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY triage_status;

Health Checks

Verify Extension Loading

SELECT * FROM osquery_extensions
WHERE name LIKE '%aquilon%';

Verify Table Availability

PRAGMA table_info(aquilon_dlp_alerts);

Test Query

osqueryi --connect /var/osquery/osquery.sock 'SELECT COUNT(*) as alert_count FROM aquilon_dlp_alerts;'

Troubleshooting Monitoring

No Data in Tables

  1. Verify extension is loaded:

    SELECT * FROM osquery_extensions;
    
  2. Check osquery logs:

    journalctl -u osqueryd -f
    
  3. Verify configuration is loaded:

    cat /etc/aquilon/config.toml
    

Stale Data

  1. Check if files are being monitored (watch paths configured)
  2. Verify cache isn’t returning old results
  3. Check system time is correct

Performance Issues

  1. Reduce log verbosity
  2. Increase cache TTL
  3. Adjust concurrent scan limits
  4. Review excluded paths

See Configuration for performance tuning options.

Deployment Guide

This section covers production deployment strategies for Aquilon DLP, from single workstation installations to enterprise-wide fleet deployments.

Deployment Options

Single Node

Manual installation on individual machines. Best for:

  • Personal use and evaluation
  • Small teams (< 10 machines)
  • Development and testing environments

MDM Deployment

Automated deployment via Mobile Device Management. Best for:

  • Enterprise macOS fleets
  • Automated compliance enforcement
  • Zero-touch provisioning

Covers: Jamf Pro, Microsoft Intune, Kandji, and generic MDM platforms.

Enterprise Deployment

Large-scale deployment planning and fleet management. Best for:

  • Organizations with 100+ endpoints
  • Multi-platform environments (macOS + Linux)
  • Centralized monitoring and compliance reporting

Edition Differences

| Feature | Basic | Enterprise |
| --- | --- | --- |
| Platforms | Linux only | macOS + Linux |
| Policy frameworks | GDPR, CCPA | All frameworks |
| Support | Community | Enterprise SLA |
| MDM deployment | N/A | Full support |

Planning Checklist

Before deployment, ensure you have:

  • Identified target endpoints and their platforms
  • Selected appropriate edition (Basic or Enterprise)
  • Planned deployment method (manual, MDM, or scripted)
  • Prepared configuration for your environment
  • Defined compliance policies to enable
  • Planned monitoring and alerting strategy

Deployment Prerequisites

All Platforms

  • OSQuery 5.x installed (for table integration)
  • Network access to download binaries
  • Administrative/root privileges for installation

macOS (Enterprise Edition)

  • macOS 11.0 (Big Sur) or later
  • Full Disk Access permission
  • MDM enrollment (for automated deployment)

Linux

  • Ubuntu 22.04+, RHEL 9+, Debian 11+, CentOS Stream 9+, or Fedora 38+
  • x86_64 architecture
  • systemd for service management

Next Steps

  1. Evaluation: Start with Single Node to test on one machine
  2. Pilot: Deploy to 10-50 devices to validate in your environment
  3. Production: Use MDM or Enterprise guides for full rollout

Single Node Deployment

Manual installation of Aquilon DLP on individual workstations. This guide covers both Linux (Basic Edition) and macOS (Enterprise Edition) deployments.

Overview

Single node deployment is ideal for:

  • Evaluating Aquilon DLP before enterprise rollout
  • Small teams with fewer than 10 machines
  • Development and testing environments
  • Personal data protection

Linux Deployment

Prerequisites

  • Operating System: Ubuntu 20.04+, RHEL 8+, Debian 11+
  • Architecture: x86_64
  • Memory: 2GB RAM minimum
  • Disk Space: 500MB for application and database
  • Permissions: Root or sudo access

Installation Steps

Step 1: Download

Download the Basic Edition package for your distribution from your organization’s portal:

  • Ubuntu/Debian: aquilon-dlp-basic_VERSION_amd64.deb
  • RHEL/CentOS: aquilon-dlp-basic-VERSION.x86_64.rpm

Step 2: Verify Checksum

# Verify checksum (SHA256 file provided with download)
sha256sum -c aquilon-dlp-basic-linux.sha256

Expected output: aquilon-dlp-basic-linux: OK
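
On systems where sha256sum is unavailable, the same check can be done with Python's standard library. This is a generic sketch rather than an Aquilon-provided tool: verify_sha256 is a hypothetical helper, and the expected digest must be copied from the checksum file supplied with your download.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sha256(path, expected_hex):
    """Return True when the file's digest matches the expected value."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, verify_sha256("aquilon-dlp-basic-linux", "<digest from the .sha256 file>") should return True for an intact download.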

Step 3: Install Binary

# Make executable
chmod +x aquilon-dlp-basic

# Move to system path
sudo mv aquilon-dlp-basic /usr/local/bin/

# Verify installation
aquilon-dlp-basic --version

Step 4: Create Configuration

# Create config directory
sudo mkdir -p /etc/aquilon-dlp

# Download sample configuration
sudo curl -o /etc/aquilon-dlp/aquilon_dlp_config.toml \
  https://raw.githubusercontent.com/aquilonsecurity/aquilon-dlp/main/docs/config-examples/aquilon_dlp_config_basic.toml

# Set permissions
sudo chmod 644 /etc/aquilon-dlp/aquilon_dlp_config.toml

Step 5: Configure Watch Paths

Edit /etc/aquilon-dlp/aquilon_dlp_config.toml:

# Monitor these directories
watch_paths = [
    "/home/%%",           # All user home directories
    "/var/www/%%",        # Web server files
    "/data/%%"            # Data directory
]

# Exclude unnecessary paths
exclude_paths = [
    "/home/*/.cache/%%",  # User caches
    "/home/*/.local/%%"   # Local application data
]

# Enable policies (Basic Edition: GDPR, CCPA only)
[policies]
enabled_policies = ["gdpr", "ccpa"]

[policies.policy_configs.gdpr]
enabled = true

[policies.policy_configs.ccpa]
enabled = true

Step 6: Validate Configuration

aquilon-dlp-basic --validate-config /etc/aquilon-dlp/aquilon_dlp_config.toml

Expected output: Configuration is valid.

Running as a Service

Create systemd service file /etc/systemd/system/aquilon-dlp.service:

[Unit]
Description=Aquilon DLP Basic Edition
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/aquilon-dlp-basic --config /etc/aquilon-dlp/aquilon_dlp_config.toml
Restart=on-failure
RestartSec=10s
User=root
StandardOutput=journal
StandardError=journal

[Install]
WantedBy=multi-user.target

Enable and start:

sudo systemctl daemon-reload
sudo systemctl enable aquilon-dlp
sudo systemctl start aquilon-dlp
sudo systemctl status aquilon-dlp

Verification

# Check service status
sudo systemctl status aquilon-dlp

# View logs
sudo journalctl -u aquilon-dlp -f

# Query OSQuery tables (if OSQuery installed)
osqueryi "SELECT * FROM aquilon_dlp_alerts LIMIT 10;"


macOS Deployment

Note: macOS requires Enterprise Edition for native Endpoint Security monitoring.

Prerequisites

  • Operating System: macOS 11.0 (Big Sur) or later
  • Architecture: x86_64 or Apple Silicon
  • Memory: 2GB RAM minimum, 4GB recommended
  • Disk Space: 1GB for application and database
  • Permissions: Full Disk Access, Administrator privileges

Installation Steps

Step 1: Download

Download the Enterprise Edition package for macOS from your organization’s portal:

  • macOS: aquilon-dlp-enterprise-VERSION.pkg

Step 2: Verify Code Signature

# Verify Apple Developer ID signature
codesign -dvv aquilon-dlp-enterprise

# Expected output should include:
# Authority=Developer ID Application: Aquilon Security, LLC

Step 3: Install Binary

# Make executable
chmod +x aquilon-dlp-enterprise

# Copy to system path
sudo cp aquilon-dlp-enterprise /usr/local/bin/

# Verify installation
aquilon-dlp-enterprise --version

Step 4: Grant Full Disk Access

  1. Open System Settings > Privacy & Security > Full Disk Access
  2. Click + to add /usr/local/bin/aquilon-dlp-enterprise
  3. Enable the checkbox for Aquilon DLP

Important: Full Disk Access is required for Endpoint Security file monitoring. Without it, the application cannot scan protected directories.

Step 5: Create Configuration

# Create config directory
sudo mkdir -p /etc/aquilon-dlp

# Download sample configuration
sudo curl -o /etc/aquilon-dlp/aquilon_dlp_config.toml \
  https://raw.githubusercontent.com/aquilonsecurity/aquilon-dlp/main/docs/config-examples/aquilon_dlp_config_enterprise.toml

# Set permissions
sudo chmod 644 /etc/aquilon-dlp/aquilon_dlp_config.toml

Step 6: Configure Watch Paths

Edit /etc/aquilon-dlp/aquilon_dlp_config.toml:

# Monitor these directories
watch_paths = [
    "/Users/%%",          # All user home directories
    "/Volumes/%%",        # External drives
    "/data/%%"            # Data directories
]

# Exclude unnecessary paths
exclude_paths = [
    "/Users/*/.cache/%%",     # User caches
    "/Users/*/Library/%%"     # Library (optional)
]

# Enable all Enterprise policy frameworks
[policies]
enabled_policies = ["gdpr", "ccpa", "hipaa", "pci_dss", "sox", "iso27001"]

[policies.policy_configs.gdpr]
enabled = true

[policies.policy_configs.ccpa]
enabled = true

[policies.policy_configs.hipaa]
enabled = true

[policies.policy_configs.pci_dss]
enabled = true

[policies.policy_configs.sox]
enabled = true

[policies.policy_configs.iso27001]
enabled = true

Running as a LaunchDaemon

Create /Library/LaunchDaemons/com.aquilonsecurity.dlp.plist:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.aquilonsecurity.dlp</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/local/bin/aquilon-dlp-enterprise</string>
        <string>--config</string>
        <string>/etc/aquilon-dlp/aquilon_dlp_config.toml</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
    <key>KeepAlive</key>
    <true/>
    <key>StandardOutPath</key>
    <string>/var/log/aquilon-dlp/stdout.log</string>
    <key>StandardErrorPath</key>
    <string>/var/log/aquilon-dlp/stderr.log</string>
</dict>
</plist>

Load and start:

# Create log directory
sudo mkdir -p /var/log/aquilon-dlp

# Load daemon
sudo launchctl load /Library/LaunchDaemons/com.aquilonsecurity.dlp.plist

# Check status
sudo launchctl list | grep aquilon

# View logs
tail -f /var/log/aquilon-dlp/stderr.log

Verification

# Check if running
sudo launchctl list | grep aquilon

# Expected log output (in /var/log/aquilon-dlp/stderr.log):
# Attempting to initialize Endpoint Security monitoring...
# Full Disk Access verified
# Endpoint Security client created successfully
# Endpoint Security monitoring active

# Query OSQuery tables (if OSQuery installed)
osqueryi "SELECT * FROM aquilon_dlp_alerts LIMIT 10;"


OSQuery Integration

Both editions integrate with OSQuery for monitoring and alerting.

Install OSQuery

Linux (Ubuntu/Debian):

curl -L https://pkg.osquery.io/deb/osquery_5.x_1.0.0_amd64.deb -o osquery.deb
sudo dpkg -i osquery.deb

Linux (RHEL/CentOS):

sudo yum install https://pkg.osquery.io/rpm/osquery-5.x-1.0.0.x86_64.rpm

macOS:

# Using Homebrew
brew install --cask osquery

# Or download PKG
curl -L https://pkg.osquery.io/darwin/osquery-5.x.pkg -o osquery.pkg
sudo installer -pkg osquery.pkg -target /

Configure Extension

Add to /etc/osquery/extensions.load:

/usr/local/bin/aquilon-dlp-basic --socket /var/osquery/osquery.em

(Replace aquilon-dlp-basic with aquilon-dlp-enterprise for macOS)

Query DLP Tables

-- Query alerts by severity
SELECT severity, COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY severity;

Troubleshooting

Linux: Service Won’t Start

Check logs:

sudo journalctl -u aquilon-dlp -n 50

Common causes:

  • Invalid configuration file (run --validate-config)
  • Missing permissions on watch directories
  • Database lock (only one instance can run)

macOS: Full Disk Access Not Working

Symptoms: “Operation not permitted” errors

Solutions:

  1. Verify FDA in System Settings > Privacy & Security > Full Disk Access

  2. Remove and re-add the binary

  3. Restart the LaunchDaemon:

    sudo launchctl unload /Library/LaunchDaemons/com.aquilonsecurity.dlp.plist
    sudo launchctl load /Library/LaunchDaemons/com.aquilonsecurity.dlp.plist
    

Policy Not Available (Basic Edition)

Symptom: “Unknown policy ‘hipaa’, skipping”

Cause: Basic Edition only includes GDPR and CCPA

Solution: Remove enterprise policies from configuration:

[policies]
enabled_policies = ["gdpr", "ccpa"]  # Only these available in Basic Edition

For enterprise policies (HIPAA, PCI DSS, SOX, ISO 27001), upgrade to Enterprise Edition.

High Resource Usage

Symptoms: High CPU or memory consumption

Solutions:

  1. Add exclusions for high-churn directories
  2. Exclude large binary files (.app, .dmg, .iso)
  3. Reduce num_workers in configuration
  4. Adjust max_scan_size_mb to skip large files

MDM Deployment

Note: MDM deployment requires macOS Enterprise Edition.

Automated deployment of Aquilon DLP via Mobile Device Management (MDM) for enterprise macOS fleets.

Overview

MDM deployment enables:

  • Zero-touch provisioning of Full Disk Access permissions
  • Automated app installation across hundreds/thousands of Macs
  • Centralized configuration and compliance enforcement
  • Silent deployment without user interaction

Why MDM?

Aquilon DLP uses macOS Endpoint Security framework, which requires Full Disk Access (FDA). In enterprise environments:

  • Manual FDA grants don’t scale
  • Users may skip or misconfigure permissions
  • Compliance requires consistent deployment

MDM solves this by deploying PPPC (Privacy Preferences Policy Control) profiles that automatically grant FDA before app installation.


Prerequisites

  • MDM Platform: Jamf Pro, Microsoft Intune, Kandji, SimpleMDM, or compatible
  • macOS Version: 11.0 (Big Sur) or later
  • Signed App Bundle: Code-signed with Endpoint Security entitlement
  • Admin Access: MDM console with profile deployment permissions
  • Enrolled Devices: Target Macs enrolled in your MDM

Before You Begin

  1. Verify your signed app bundle has correct code requirement:

    ./scripts/extract_code_requirement.sh target/debug/aquilon-dlp.app
    
  2. Create a pilot group (10-50 devices) for initial testing

  3. Document your rollback plan in case of issues


Deployment Process

The deployment follows three phases, always in this order:

  1. Deploy PPPC Profile - Grants Full Disk Access permission
  2. Wait for Confirmation - Verify profile installation
  3. Deploy App - Install after FDA is granted

Critical: Deploy profile BEFORE app. macOS only applies PPPC grants during app installation.


Jamf Pro

Step 1: Upload PPPC Profile

  1. Navigate to: Computers > Configuration Profiles > + New

  2. Configure:

    • Display Name: Aquilon DLP - Full Disk Access
    • Category: Security
    • Distribution Method: Install Automatically
  3. Click Privacy Preferences Policy Control payload

  4. Click Upload and select deployment/mdm/pppc-jamf.mobileconfig

  5. Verify imported settings:

    • Identifier: dev.aquilon.dlp-plugin
    • System Policy All Files: Checked

Step 2: Scope and Deploy

  1. Click Scope tab
  2. Add target computer groups (start with pilot group)
  3. Click Save

Profile deploys on next check-in (typically 15-30 minutes).

Step 3: Verify Installation

On target Mac:

sudo profiles list | grep -i aquilon
# Expected: com.aquilonsecurity.dlp.pppc.jamf

Step 4: Package and Deploy App

  1. Create PKG installer:

    pkgbuild --root /path/to/aquilon-dlp.app \
             --identifier dev.aquilon.dlp-plugin \
             --version 0.1.0 \
             --install-location /Library/Application\ Support/aquilon-dlp.app \
             aquilon-dlp-0.1.0.pkg
    
  2. Upload to Jamf:

    • Settings > Computer Management > Packages > + New
    • Upload signed package
  3. Create installation policy:

    • Computers > Policies > + New
    • Add package with Install action
    • Scope to same groups as PPPC profile

Timeline

| Event | Timing |
| --- | --- |
| Profile propagates | 15-30 minutes |
| App installs | 15-30 minutes after profile |
| Total | ~30-60 minutes |

Microsoft Intune

Step 1: Upload PPPC Profile

  1. Navigate to: Devices > macOS > Configuration profiles > + Create profile

  2. Select:

    • Platform: macOS
    • Profile type: Templates > Custom
  3. Configure:

    • Name: Aquilon DLP - Full Disk Access
    • Upload deployment/mdm/pppc-intune.mobileconfig
    • Deployment channel: Device channel

Step 2: Assign to Devices

  1. Click Assignments tab
  2. Add target Azure AD device groups
  3. Optionally add filter for macOS 11.0+

Step 3: Package App for Intune

Intune requires .intunemac format:

# Download Intune App Wrapping Tool from:
# https://github.com/msintuneappsdk/intune-app-wrapping-tool-mac

./IntuneAppUtil -c /path/to/aquilon-dlp.app \
                -o aquilon-dlp.intunemac \
                -n "0.1.0" \
                -v "0.1.0"

Step 4: Deploy App

  1. Navigate to: Apps > macOS > + Add
  2. App type: Line-of-business app
  3. Upload .intunemac file
  4. Configure app information
  5. Assign to same device groups as profile

Note: Wait 24 hours after profile deployment before deploying app, or use dynamic groups.

Timeline

| Event | Timing |
|---|---|
| Profile propagates | 1-8 hours |
| App installs | 1-8 hours after profile |
| Total | ~2-16 hours |

Tip: Force sync via Company Portal > Settings > Sync to speed up check-ins.


Kandji

Step 1: Create Custom Profile

  1. Navigate to: Library > Custom Profiles > + Add Profile

  2. Configure:

    • Name: Aquilon DLP - Full Disk Access
    • Upload deployment/mdm/pppc-kandji.mobileconfig
    • Enforcement: Deploy Always
  3. Assign to target blueprints

Step 2: Create Custom App

  1. Navigate to: Library > Custom Apps > + Add App

  2. Upload PKG installer

  3. Configure:

    • Install Type: Package
    • Run as: System
  4. Set PPPC profile as dependency (optional but recommended)

  5. Assign to same blueprints

Timeline

| Event | Timing |
|---|---|
| Profile propagates | 15-60 minutes |
| App installs | 15-60 minutes after profile |
| Total | ~30-120 minutes |

Generic MDM

For SimpleMDM, FileWave, Mosyle, or other platforms:

Profile Deployment

  1. Download deployment/mdm/pppc-generic.mobileconfig
  2. Upload to your MDM’s configuration profile section
  3. Assign to target devices/groups

App Deployment

  1. Package app as .pkg installer
  2. Upload to your MDM’s app distribution
  3. Deploy after confirming profile installation

Key Configuration

The profile must contain:

  • Bundle ID: dev.aquilon.dlp-plugin
  • Service: SystemPolicyAllFiles (Full Disk Access)
  • Code Requirement: Match your signed app

Verification

After deployment, verify on target Mac:

Check Profile Installation

sudo profiles list | grep -i aquilon
# Expected: com.aquilonsecurity.dlp.pppc.<mdm>
# Where <mdm> is: jamf, intune, or kandji

Check FDA Grant

sudo sqlite3 /Library/Application\ Support/com.apple.TCC/TCC.db \
  "SELECT auth_value FROM access
   WHERE service = 'kTCCServiceSystemPolicyAllFiles'
   AND client = 'dev.aquilon.dlp-plugin';"
# Expected: 2
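
The raw query result can be wrapped in a small helper so that scripts report a readable status. This is a minimal sketch: `fda_status` is a hypothetical helper name, and the only value whose meaning is taken from this guide is 2 (granted). Any other value, or no row at all, is reported as not granted.

```shell
#!/bin/bash
# Sketch: translate the auth_value returned by the TCC query into a status line.
# fda_status is a hypothetical helper, not part of Aquilon DLP.
fda_status() {
    case "$1" in
        2)  echo "granted" ;;
        "") echo "not granted (no TCC entry)" ;;
        *)  echo "not granted (auth_value=$1)" ;;
    esac
}

# Example (run as root on the target Mac):
#   AUTH=$(sqlite3 "/Library/Application Support/com.apple.TCC/TCC.db" \
#     "SELECT auth_value FROM access
#      WHERE service = 'kTCCServiceSystemPolicyAllFiles'
#      AND client = 'dev.aquilon.dlp-plugin';" 2>/dev/null)
#   fda_status "$AUTH"
```

Distinguishing "no TCC entry" from a non-2 value helps separate a missing grant from an explicit non-grant when triaging.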

Check App Function

sudo /Library/Application\ Support/aquilon-dlp.app/Contents/MacOS/aquilon-dlp \
  --socket /tmp/osquery.sock

Expected output:

Attempting to initialize Endpoint Security monitoring...
Full Disk Access verified
Endpoint Security client created successfully
Endpoint Security monitoring active

Troubleshooting

FDA Not Granted After Installation

Cause: App installed before PPPC profile

Solution:

# 1. Verify profile is installed
sudo profiles list | grep aquilon

# 2. Remove app
sudo rm -rf /Library/Application\ Support/aquilon-dlp.app

# 3. Reinstall via MDM (triggers on next check-in)

System Settings Shows FDA Unchecked

Cause: Known macOS UI bug - checkbox doesn’t reflect TCC database

Solution: Trust the TCC database query. If auth_value = 2, FDA IS granted.

Warning: Do NOT manually toggle the checkbox - it may revoke the PPPC grant.

“Failed to create ES client” Error

Causes and solutions:

  1. FDA not granted: Check TCC database (see above)

  2. Not running as root: Use sudo

  3. ES entitlement missing: Check code signing

    codesign -d --entitlements - /Library/Application\ Support/aquilon-dlp.app
    

Code Requirement Mismatch

Symptom: Profile installed but TCC has no entry

Solution:

  1. Extract app’s actual code requirement:

    codesign -dr - /Library/Application\ Support/aquilon-dlp.app
    
  2. Update profile to match

  3. Redeploy profile and reinstall app

Profile Won’t Install

Solutions:

  1. Validate profile: plutil -lint deployment/mdm/pppc-*.mobileconfig

  2. Check device enrollment status

  3. Remove conflicting profiles:

    # Replace <mdm> with: jamf, intune, or kandji
    sudo profiles remove -identifier com.aquilonsecurity.dlp.pppc.<mdm>
    

Diagnostic Script

Save this script and run it with sudo on the target Mac (listing device profiles and reading the TCC database both require root):

#!/bin/bash
# FDA Troubleshooting Diagnostic

echo "=== Aquilon DLP FDA Diagnostic ==="
echo

echo "1. Profile Installation:"
profiles list | grep -q "com.aquilonsecurity.dlp.pppc" && \
  echo "✓ Profile installed" || echo "✗ Profile NOT installed (check for .jamf/.intune/.kandji suffix)"

echo "2. TCC Database Entry:"
AUTH=$(sqlite3 /Library/Application\ Support/com.apple.TCC/TCC.db \
  "SELECT auth_value FROM access WHERE service = 'kTCCServiceSystemPolicyAllFiles'
   AND client = 'dev.aquilon.dlp-plugin';" 2>/dev/null)
[ "$AUTH" = "2" ] && echo "✓ FDA granted" || echo "✗ FDA NOT granted"

echo "3. App Bundle:"
[ -d "/Library/Application Support/aquilon-dlp.app" ] && \
  echo "✓ App installed" || echo "✗ App NOT installed"

echo "4. Code Signature:"
codesign --verify /Library/Application\ Support/aquilon-dlp.app 2>/dev/null && \
  echo "✓ Valid signature" || echo "✗ Invalid signature"

echo "5. ES Entitlement:"
codesign -d --entitlements - /Library/Application\ Support/aquilon-dlp.app 2>&1 | \
  grep -q "endpoint-security" && \
  echo "✓ ES entitlement present" || echo "✗ ES entitlement missing"

echo "=== End Diagnostic ==="

Best Practices

Staged Rollout

  1. Pilot (Week 1): Deploy to IT/security team (10-50 devices)
  2. Early Adopters (Week 2): Expand to 100-500 devices
  3. Production (Week 3+): Roll out to all devices

Smart Groups

Create groups to track deployment status:

  • Profile Installed: Devices with PPPC profile
  • App Installed: Devices with app bundle
  • Needs Remediation: App installed but FDA not granted

Remediation Policy

Create automated remediation for FDA issues:

  1. Detect: App installed but FDA not in TCC
  2. Action: Remove app, trigger reinstall
  3. Monitor: Alert on repeated failures
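
The detect and action steps above can be sketched as a single decision function. This is an illustration only: `remediation_action` and its output labels are hypothetical names, and the ordering (profile first, then app) follows the troubleshooting guidance earlier in this guide.

```shell
#!/bin/bash
# remediation_action PROFILE_INSTALLED APP_INSTALLED AUTH_VALUE
#   PROFILE_INSTALLED, APP_INSTALLED: "yes" or "no"
#   AUTH_VALUE: result of the TCC query (empty when there is no row)
# Hypothetical helper: prints the next remediation step for one endpoint.
remediation_action() {
    if [ "$3" = "2" ]; then
        echo "none"             # FDA already granted
    elif [ "$1" != "yes" ]; then
        echo "deploy-profile"   # the profile must be in place before the app
    elif [ "$2" = "yes" ]; then
        echo "reinstall-app"    # app present without a grant: remove it, let MDM reinstall
    else
        echo "install-app"      # profile present, app not yet installed
    fi
}

# Example: remediation_action yes yes ""   prints reinstall-app
```

An MDM script or extension attribute could feed this function and use the result as smart group criteria.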

Enterprise Deployment

Large-scale deployment planning and fleet management for Aquilon DLP across enterprise environments.

Overview

Enterprise deployment addresses:

  • Scaling to hundreds or thousands of endpoints
  • Multi-platform environments (macOS and Linux)
  • Centralized configuration management
  • Compliance reporting and monitoring
  • Fleet health and remediation

Planning

Deployment Scope

Before deploying, define your scope:

| Factor | Considerations |
|---|---|
| Endpoints | Total count, platform mix, geographic distribution |
| Compliance | Required frameworks (HIPAA, PCI DSS, SOX, ISO 27001) |
| Policies | Standard vs custom, per-department variations |
| Monitoring | Alert routing, SIEM integration, dashboards |
| Support | Help desk preparation, escalation paths |

Rollout Strategy

Recommended: Staged rollout

| Phase | Scope | Duration | Goals |
|---|---|---|---|
| Pilot | IT/Security (10-50) | 1 week | Validate deployment, catch issues |
| Early Adopter | Willing teams (100-500) | 1 week | Broader testing, refine process |
| General | All remaining | 2-4 weeks | Full production rollout |

For each phase:

  1. Deploy configuration and profiles
  2. Monitor for issues (24-48 hours)
  3. Address any problems
  4. Proceed to next phase

Success Criteria

Define metrics before deployment:

  • Installation success rate > 99%
  • FDA grant rate (macOS) > 99%
  • Service running rate > 99%
  • First alerts generated within 24 hours of deployment
  • No critical issues in pilot
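
The rate-based criteria above can be checked with integer arithmetic before moving to the next phase. The sketch below assumes you already have per-phase counts from your MDM or osquery fleet. `meets_target` and `phase_gate` are hypothetical helper names.

```shell
#!/bin/bash
# meets_target OK_COUNT TOTAL [THRESHOLD_PCT]
# Succeeds when OK_COUNT/TOTAL is strictly greater than the threshold
# (default 99, matching the "> 99%" criteria). Integer arithmetic only.
meets_target() {
    ok=$1; total=$2; pct=${3:-99}
    [ "$total" -gt 0 ] || return 1
    [ $((ok * 100)) -gt $((total * pct)) ]
}

# phase_gate INSTALL_OK FDA_OK SERVICE_OK TOTAL
# Prints "proceed" only when all three rates clear the threshold.
phase_gate() {
    if meets_target "$1" "$4" && meets_target "$2" "$4" && meets_target "$3" "$4"; then
        echo "proceed"
    else
        echo "hold"
    fi
}

# Example: phase_gate 498 497 499 500
```

Comparing `ok * 100` against `total * pct` avoids floating point, which plain shell arithmetic does not support.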

Configuration Management

Centralized Configuration

For consistent deployment across endpoints, centralize configuration:

Option A: MDM-deployed configuration file

  • Deploy /etc/aquilon-dlp/aquilon_dlp_config.toml via MDM
  • Update by redeploying profile

Option B: Configuration management (Ansible, Chef, Puppet)

# Ansible example
- name: Deploy Aquilon DLP config
  template:
    src: aquilon_dlp_config.toml.j2
    dest: /etc/aquilon-dlp/aquilon_dlp_config.toml
    mode: '0644'
  notify: restart aquilon-dlp

Department-Specific Policies

Different departments may need different policies:

# Example: Finance department config
[policies]
enabled_policies = ["gdpr", "ccpa", "sox", "pci_dss"]

# Other departments would use different policies:
# - Healthcare: ["gdpr", "hipaa"]
# - Engineering: ["gdpr", "ccpa"]

Deploy department-specific configs via:

  • MDM smart groups/blueprints
  • Configuration management role assignments
  • AD group membership
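
Whichever mechanism assigns the department, the config fragment itself can be generated from one mapping. The following sketch uses the three department examples from this section. `policies_for` is a hypothetical helper, and any other department name is rejected rather than guessed.

```shell
#!/bin/bash
# policies_for DEPARTMENT
# Prints a TOML [policies] section for the given department.
# The mapping mirrors the examples in this section; extend it as needed.
policies_for() {
    case "$1" in
        finance)     list='"gdpr", "ccpa", "sox", "pci_dss"' ;;
        healthcare)  list='"gdpr", "hipaa"' ;;
        engineering) list='"gdpr", "ccpa"' ;;
        *) echo "unknown department: $1" >&2; return 1 ;;
    esac
    printf '[policies]\nenabled_policies = [%s]\n' "$list"
}

# Example: policies_for finance
```

Keeping the mapping in one place makes it easier to review which frameworks each group is actually scanned against.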

Tracking Deployment

Track active installations:

  • Use MDM inventory reports
  • Query OSQuery fleet
  • Monitor Prometheus endpoint count

Monitoring and Alerting

OSQuery Fleet Queries

Schedule queries across your fleet:

-- Daily: Deployment health
SELECT
  hostname,
  (SELECT COUNT(*) FROM aquilon_dlp_alerts) AS total_alerts,
  (SELECT COUNT(*) FROM aquilon_dlp_alerts WHERE severity = 'critical') AS critical_alerts
FROM system_info;

-- Hourly: Alert summary
SELECT
  policy,
  severity,
  COUNT(*) AS count
FROM aquilon_dlp_alerts
WHERE timestamp > (strftime('%s', 'now') - 3600)
GROUP BY policy, severity;

Prometheus Metrics

Configure Prometheus scraping:

# prometheus.yml
scrape_configs:
  - job_name: 'aquilon-dlp'
    static_configs:
      - targets: ['host1:9090', 'host2:9090']  # add one entry per endpoint
    # Or use service discovery
    file_sd_configs:
      - files:
        - 'targets/aquilon-dlp/*.json'
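
Files matched by file_sd_configs contain a JSON list of target groups. A minimal sketch for generating one from a host list follows. `make_targets_json` is a hypothetical helper, and the port 9090 is taken from the static example above and may differ in your deployment.

```shell
#!/bin/bash
# make_targets_json HOST...
# Prints a Prometheus file_sd target group for the given hosts on stdout.
make_targets_json() {
    port=9090   # port used in the static_configs example; adjust if yours differs
    targets=""
    for host in "$@"; do
        # Append each "host:port" entry, comma-separated
        targets="${targets:+$targets, }\"$host:$port\""
    done
    printf '[\n  {"targets": [%s]}\n]\n' "$targets"
}

# Example (the output file name is illustrative):
#   make_targets_json host1 host2 > targets/aquilon-dlp/fleet.json
```

Prometheus re-reads these files when they change, so the list can be regenerated from MDM inventory without restarting Prometheus.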

Key metrics to monitor:

  • aquilon_dlp_scans_total - Scan volume by policy
  • aquilon_dlp_alerts_total - Alert count by severity
  • aquilon_dlp_cache_hits_total - Cache efficiency
  • aquilon_dlp_scan_duration_seconds - Performance

Grafana Dashboards

Enterprise customers receive pre-built dashboards:

  • Compliance Overview: Policy coverage across fleet
  • Performance: Scan rates, latency, resource usage
  • Alerts: Real-time alert visualization

Contact support@aquilonsecurity.com for dashboard templates.

SIEM Integration

Forward alerts to your SIEM via:

Structured logging:

# Configure logging via environment variable
export RUST_LOG=info

# Logs are output to stdout in structured JSON format
# Configure your SIEM to ingest from osquery results or log files

Note: Direct syslog forwarding is a planned feature. Currently, integrate via OSQuery scheduled queries.

OSQuery scheduled queries: Configure OSQuery to forward aquilon_dlp_alerts to SIEM.
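
As a sketch of the scheduled-query route, the helper below prints an osquery configuration fragment that runs a recent-alerts query on a fixed interval. The query name `aquilon_dlp_recent_alerts` and the helper `osquery_schedule` are hypothetical. The table and timestamp filter come from the queries shown in this guide. How results reach your SIEM depends on the osquery logger you have configured.

```shell
#!/bin/bash
# osquery_schedule [INTERVAL_SECONDS]
# Prints an osquery "schedule" config fragment; the look-back window
# matches the interval so consecutive runs do not leave gaps.
osquery_schedule() {
    interval=${1:-3600}
    cat <<EOF
{
  "schedule": {
    "aquilon_dlp_recent_alerts": {
      "query": "SELECT * FROM aquilon_dlp_alerts WHERE timestamp > (strftime('%s', 'now') - $interval);",
      "interval": $interval
    }
  }
}
EOF
}

# Example: osquery_schedule 600
```

Merge the fragment into your existing osquery configuration rather than replacing it.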


Fleet Health

Health Checks

Monitor endpoint health:

Service running:

# macOS
sudo launchctl list | grep -q "com.aquilonsecurity.dlp" && echo "Running" || echo "Stopped"

# Linux
systemctl is-active aquilon-dlp

Recent alerts:

SELECT * FROM aquilon_dlp_alerts
WHERE timestamp > (strftime('%s', 'now') - 86400);

FDA status (macOS):

sudo sqlite3 /Library/Application\ Support/com.apple.TCC/TCC.db \
  "SELECT auth_value FROM access
   WHERE service = 'kTCCServiceSystemPolicyAllFiles'
   AND client = 'dev.aquilon.dlp-plugin';"

Common Issues

Service Not Running

Diagnosis:

# macOS
sudo launchctl list | grep aquilon
tail -100 /var/log/aquilon-dlp/stderr.log

# Linux
systemctl status aquilon-dlp
journalctl -u aquilon-dlp -n 100

Causes:

  • Configuration error (run aquilon-dlp --validate-config /etc/aquilon-dlp/aquilon_dlp_config.toml)
  • Database lock (another instance running)
  • Missing permissions

Remediation:

  1. Fix configuration issue
  2. Kill duplicate processes
  3. Restart service

FDA Not Granted (macOS)

Diagnosis:

# Check profile
sudo profiles list | grep aquilon

# Check TCC database
sudo sqlite3 /Library/Application\ Support/com.apple.TCC/TCC.db \
  "SELECT auth_value FROM access
   WHERE service = 'kTCCServiceSystemPolicyAllFiles'
   AND client = 'dev.aquilon.dlp-plugin';"

Remediation:

  1. Verify PPPC profile installed
  2. Remove app bundle
  3. Reinstall via MDM
  4. Verify TCC entry shows auth_value = 2

No Alerts Generated

Diagnosis:

-- Check for recent alerts
SELECT COUNT(*) as alert_count, policy
FROM aquilon_dlp_alerts
WHERE timestamp > (strftime('%s', 'now') - 86400)
GROUP BY policy;

Causes:

  • No sensitive data in monitored paths
  • Policies not enabled in configuration
  • Exclusions too broad

Remediation:

  1. Review enabled policies
  2. Check watch_paths include relevant directories
  3. Review exclude_paths for over-exclusion
  4. Test with known sensitive data

High Resource Usage

Diagnosis:

# Check CPU/memory (the process is aquilon-dlp-enterprise or aquilon-dlp-basic, depending on edition)
top -pid "$(pgrep -o aquilon-dlp)"   # macOS
top -p "$(pgrep -o aquilon-dlp)"     # Linux

# Check alert count
osqueryi "SELECT COUNT(*) FROM aquilon_dlp_alerts;"

Causes:

  • Monitoring high-churn directories
  • Large files without size limits
  • Too many workers

Remediation:

# Add exclusions
exclude_paths = [
    "/Users/*/.cache/%%",
    "/home/*/.npm/%%",
    "**/*.iso",
    "**/*.dmg"
]

# Limit file size
[scan]
max_scan_size_mb = 100

# Reduce workers
[worker]
num_workers = 2  # Default is 4


Automated Remediation

MDM Remediation Policies

Jamf Pro - Extension Attribute for FDA status:

#!/bin/bash
AUTH=$(sqlite3 /Library/Application\ Support/com.apple.TCC/TCC.db \
  "SELECT auth_value FROM access
   WHERE service = 'kTCCServiceSystemPolicyAllFiles'
   AND client = 'dev.aquilon.dlp-plugin';" 2>/dev/null)

if [ "$AUTH" = "2" ]; then
    echo "<result>Granted</result>"
else
    echo "<result>Not Granted</result>"
fi

Smart Group for remediation:

  • Criteria: Extension Attribute “FDA Status” is “Not Granted”
  • Policy: Reinstall Aquilon DLP package

Ansible Remediation Playbook

---
- name: Remediate Aquilon DLP issues
  hosts: dlp_endpoints
  become: true
  tasks:
    - name: Ensure service is running and enabled
      service:
        name: aquilon-dlp
        state: started
        enabled: yes

    - name: Validate configuration
      command: aquilon-dlp --validate-config /etc/aquilon-dlp/aquilon_dlp_config.toml
      register: config_check
      changed_when: false
      failed_when: config_check.rc != 0

    # config_changed must be supplied by the caller (for example with -e config_changed=true)
    - name: Restart if config changed
      service:
        name: aquilon-dlp
        state: restarted
      when: config_changed | default(false) | bool


Compliance Reporting

Generating Reports

Use OSQuery to generate compliance reports:

-- HIPAA compliance summary
SELECT
  date(timestamp, 'unixepoch') AS date,
  COUNT(*) AS total_findings,
  SUM(CASE WHEN severity = 'critical' THEN 1 ELSE 0 END) AS critical,
  SUM(CASE WHEN severity = 'high' THEN 1 ELSE 0 END) AS high
FROM aquilon_dlp_alerts
WHERE policy = 'HIPAA'
GROUP BY date(timestamp, 'unixepoch')
ORDER BY date DESC;

-- PCI DSS cardholder data exposure
SELECT
  path,
  timestamp,
  scanner,
  severity
FROM aquilon_dlp_alerts
WHERE policy = 'PCI_DSS'
  AND scanner IN ('credit_card', 'cvv')
ORDER BY timestamp DESC;

Audit Trail

Maintain audit trails for compliance:

  • Findings: All alerts with timestamps
  • Remediation: Actions taken on findings
  • Coverage: Endpoints monitored

Export from OSQuery or configure SIEM to retain.


Disaster Recovery

Backup

Back up critical data:

  • Configuration files (/etc/aquilon-dlp/)
  • SQLite database (cache)
  • MDM profiles and packages

Recovery

Single endpoint recovery:

  1. Reinstall via MDM or manual deployment
  2. Deploy configuration
  3. Verify service running

Fleet-wide recovery:

  1. Verify MDM profiles and packages available
  2. Trigger reinstall via MDM policy
  3. Monitor deployment dashboard

Version Rollback

To roll back a problematic update:

  1. Upload previous version to MDM
  2. Deploy to affected endpoints
  3. Monitor for issues

Support

Enterprise Support Channels

Support Response Times

| Priority | Response Time |
|---|---|
| Critical (P1) | 4 hours |
| High (P2) | 8 hours |
| Normal (P3) | 24 hours |

Providing Logs

When contacting support, include:

macOS:

# Collect logs
tail -n 500 /var/log/aquilon-dlp/stderr.log > dlp-logs.txt

# System info
system_profiler SPSoftwareDataType >> dlp-logs.txt

# FDA status
sudo sqlite3 /Library/Application\ Support/com.apple.TCC/TCC.db \
  "SELECT * FROM access WHERE client LIKE '%aquilon%';" >> dlp-logs.txt

Linux:

# Collect logs
sudo journalctl -u aquilon-dlp -n 500 > dlp-logs.txt

# System info
uname -a >> dlp-logs.txt
cat /etc/os-release >> dlp-logs.txt

# Service status
systemctl status aquilon-dlp >> dlp-logs.txt


Admin Guide

This section covers system administration tasks for Aquilon DLP, including daily operations, maintenance, backup procedures, and disaster recovery.

Administrative Overview

Daily Operations

| Task | Frequency | Guide |
|---|---|---|
| Check service status | Daily | Operations |
| Review critical alerts | Daily | Monitoring |
| Check disk usage | Daily | Operations |

Weekly Maintenance

| Task | Frequency | Guide |
|---|---|---|
| Review scan statistics | Weekly | Operations |
| Check cache efficiency | Weekly | Operations |
| Verify backups | Weekly | Backup & Restore |

Monthly Tasks

| Task | Frequency | Guide |
|---|---|---|
| Database vacuum | Monthly | Operations |
| Log rotation review | Monthly | Operations |
| Performance audit | Monthly | Operations |

Prerequisites

Administrative tasks require:

  • Root/Administrator access to the system
  • OSQuery installed for monitoring queries
  • SSH access for remote administration

Key File Locations

Linux

| Purpose | Location |
|---|---|
| Configuration | /etc/aquilon-dlp/aquilon_dlp_config.toml |
| Database | /var/lib/aquilon-dlp/aquilon_dlp.db |
| Logs | /var/log/aquilon-dlp/ or systemd journal |
| Service file | /etc/systemd/system/aquilon-dlp.service |

macOS

| Purpose | Location |
|---|---|
| Configuration | /etc/aquilon-dlp/aquilon_dlp_config.toml |
| Database | /var/lib/aquilon-dlp/aquilon_dlp.db |
| Logs | /var/log/aquilon-dlp/ |
| LaunchDaemon | /Library/LaunchDaemons/com.aquilonsecurity.dlp.plist |

Operations

Day-to-day operational tasks for managing Aquilon DLP.

Service Management

Linux (systemd)

# Check status
sudo systemctl status aquilon-dlp

# Start/stop/restart
sudo systemctl start aquilon-dlp
sudo systemctl stop aquilon-dlp
sudo systemctl restart aquilon-dlp

# Enable/disable at boot
sudo systemctl enable aquilon-dlp
sudo systemctl disable aquilon-dlp

# View recent logs
sudo journalctl -u aquilon-dlp -n 100
sudo journalctl -u aquilon-dlp -f  # Follow

macOS (launchd)

# Check status
sudo launchctl list | grep aquilon

# Load/unload (start/stop)
sudo launchctl load /Library/LaunchDaemons/com.aquilonsecurity.dlp.plist
sudo launchctl unload /Library/LaunchDaemons/com.aquilonsecurity.dlp.plist

# View logs
tail -f /var/log/aquilon-dlp/stderr.log

Configuration Reload

After configuration changes:

# Linux
sudo systemctl restart aquilon-dlp

# macOS
sudo launchctl unload /Library/LaunchDaemons/com.aquilonsecurity.dlp.plist
sudo launchctl load /Library/LaunchDaemons/com.aquilonsecurity.dlp.plist

Health Checks

Service Running

# Linux
systemctl is-active aquilon-dlp

# macOS
sudo launchctl list | grep -q "com.aquilonsecurity.dlp" && echo "Running" || echo "Stopped"

Recent Alerts

-- Alerts in last 24 hours
SELECT COUNT(*) as alerts_24h
FROM aquilon_dlp_alerts
WHERE timestamp > (strftime('%s', 'now') - 86400);

Alert Generation

-- Alerts in last hour
SELECT COUNT(*) as recent_alerts
FROM aquilon_dlp_alerts
WHERE timestamp > (strftime('%s', 'now') - 3600);

Log Management

Log Locations

  • Linux (systemd): journalctl -u aquilon-dlp
  • Linux (syslog): /var/log/syslog or /var/log/messages
  • macOS: /var/log/aquilon-dlp/*.log

Log Rotation (Linux)

Create /etc/logrotate.d/aquilon-dlp:

/var/log/aquilon-dlp/*.log {
    daily
    rotate 14
    compress
    delaycompress
    missingok
    notifempty
    create 0640 root root
    postrotate
        systemctl reload aquilon-dlp 2>/dev/null || true
    endscript
}

Log Levels

Adjust log verbosity via environment variable:

# Set in service environment (Linux)
# /etc/systemd/system/aquilon-dlp.service.d/override.conf
# Valid levels: debug, info, warn, error
# (systemd does not allow a trailing comment on a setting line)
[Service]
Environment="RUST_LOG=aquilon_dlp=info"

Resource Monitoring

Disk Usage

# Database size
du -sh /var/lib/aquilon-dlp/aquilon_dlp.db

# Log directory
du -sh /var/log/aquilon-dlp/

Process Resources

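# CPU and memory (the [a] keeps grep from matching itself)
ps aux | grep '[a]quilon-dlp'

# Detailed (Linux); -d, joins multiple PIDs with commas, as top -p expects
top -p "$(pgrep -d, aquilon-dlp)"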

OSQuery Metrics

-- Alert statistics by severity
SELECT severity, COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY severity;

-- Alerts by policy
SELECT policy, COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY policy;

Database Maintenance

Vacuum

Reclaim space and optimize performance:

# Manual vacuum
sqlite3 /var/lib/aquilon-dlp/aquilon_dlp.db "VACUUM;"

# Check size before/after
ls -lh /var/lib/aquilon-dlp/aquilon_dlp.db

Integrity Check

sqlite3 /var/lib/aquilon-dlp/aquilon_dlp.db "PRAGMA integrity_check;"
# Expected: ok
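
For use in scripts, the check can be turned into a pass/fail test instead of a visual comparison. A minimal sketch, where `integrity_passed` is a hypothetical helper that reads the PRAGMA output on stdin and succeeds only when it is exactly the single line ok:

```shell
#!/bin/bash
# integrity_passed
# Reads "PRAGMA integrity_check;" output on stdin.
# Exit status 0 only if the output is exactly "ok".
integrity_passed() {
    result=$(cat)
    [ "$result" = "ok" ]
}

# Example:
#   sqlite3 /var/lib/aquilon-dlp/aquilon_dlp.db "PRAGMA integrity_check;" \
#     | integrity_passed || echo "integrity check failed" >&2
```

Any other output, including an empty result from a failed sqlite3 call, is treated as a failure.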

Query Performance

Refresh the query planner statistics so SQLite can choose efficient query plans:

sqlite3 /var/lib/aquilon-dlp/aquilon_dlp.db "PRAGMA analysis_limit=1000; ANALYZE;"

Cache Management

Alert Statistics

-- Count alerts by severity
SELECT severity, COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY severity;

Target: Review and triage critical/high severity alerts promptly

Clear Cache

To force re-scanning (use cautiously):

# Stop service first
sudo systemctl stop aquilon-dlp

# Clear cache table
sqlite3 /var/lib/aquilon-dlp/aquilon_dlp.db "DELETE FROM scan_cache;"

# Restart
sudo systemctl start aquilon-dlp

Cache Configuration

Tune cache settings:

[cache]
enabled = true
ttl_secs = 86400            # Cache TTL in memory (24 hours)
scan_cache_ttl_days = 7     # Database cache TTL

Alert Statistics

Current Status

-- Alert overview by triage status
SELECT
  triage_status,
  COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY triage_status;

-- Alert by scanner type
SELECT
  scanner,
  COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY scanner
ORDER BY count DESC;

Alert Trend

-- Alert counts by severity, most severe first (all time; add a timestamp filter to limit to recent activity)
SELECT
  severity,
  COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY severity
ORDER BY
  CASE severity
    WHEN 'critical' THEN 1
    WHEN 'high' THEN 2
    WHEN 'medium' THEN 3
    ELSE 4
  END;

Performance Tuning

Worker Configuration

Adjust based on CPU cores:

[work_queue]
max_queue_size = 10000      # Work queue size
submit_timeout_secs = 5     # Timeout for queue submissions

[worker]
num_workers = 4             # Match CPU cores
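
To match the worker count to each machine, the [worker] section can be rendered from the detected core count. This is a sketch under assumptions: `detect_cores` and `render_worker_config` are hypothetical helpers, and the fallback of 4 reuses the default noted elsewhere in this guide.

```shell
#!/bin/bash
# detect_cores
# Prints the number of online CPUs (getconf works on Linux and macOS);
# falls back to 4 if detection fails.
detect_cores() {
    getconf _NPROCESSORS_ONLN 2>/dev/null || echo 4
}

# render_worker_config [CORES]
# Prints a TOML [worker] section; non-numeric input falls back to 4.
render_worker_config() {
    cores=${1:-$(detect_cores)}
    [ "$cores" -ge 1 ] 2>/dev/null || cores=4
    printf '[worker]\nnum_workers = %s\n' "$cores"
}

# Example: render_worker_config        (uses detected cores)
#          render_worker_config 2      (explicit override)
```

On shared or resource-constrained hosts, pass an explicit lower value rather than using every core.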

Reduce I/O Load

[scan]
max_scan_size_mb = 100      # Skip large files

[resource_limits]
enabled = true
nice_level = 10             # Lower CPU priority (0-19)

High-Churn Directory Handling

Exclude directories that change frequently:

exclude_paths = [
    "/tmp/%%",
    "/var/cache/%%",
    "/home/*/.cache/%%"
]

Troubleshooting Operations

Service Won’t Start

  1. Check configuration:

    aquilon-dlp --validate-config /etc/aquilon-dlp/aquilon_dlp_config.toml
    
  2. Check logs for errors:

    sudo journalctl -u aquilon-dlp -n 50
    
  3. Check database lock:

    lsof /var/lib/aquilon-dlp/aquilon_dlp.db
    

High CPU Usage

  1. Check scan rate in logs
  2. Add exclusions for high-churn directories
  3. Increase nice level
  4. Reduce worker count

High Memory Usage

  1. Reduce max_entries in cache config
  2. Reduce max_queue_size in the work_queue config
  3. Restart service to clear memory

Database Corruption

  1. Stop service
  2. Run integrity check
  3. If failed, restore from backup (see Backup & Restore)

Backup & Restore

Procedures for backing up and restoring Aquilon DLP data and configuration.

What to Back Up

| Component | Location | Priority | Notes |
|---|---|---|---|
| Configuration | /etc/aquilon-dlp/aquilon_dlp_config.toml | Critical | Application settings |
| Database | /var/lib/aquilon-dlp/aquilon_dlp.db | High | Findings and cache |
| Custom policies | /etc/aquilon-dlp/policies/ | High | If using custom policies |
| Retention config | /etc/aquilon-dlp/retention_config.toml | Medium | Compliance retention settings |

Backup Procedures

Configuration Backup

# Create backup directory
mkdir -p /backup/aquilon-dlp/$(date +%Y%m%d)

# Backup configuration
cp /etc/aquilon-dlp/aquilon_dlp_config.toml /backup/aquilon-dlp/$(date +%Y%m%d)/

# Backup custom policies (if any)
cp -r /etc/aquilon-dlp/policies/ /backup/aquilon-dlp/$(date +%Y%m%d)/ 2>/dev/null || true

# Backup retention config (if any)
cp /etc/aquilon-dlp/retention_config.toml /backup/aquilon-dlp/$(date +%Y%m%d)/ 2>/dev/null || true

Database Backup

Hot backup (service running):

# SQLite hot backup
sqlite3 /var/lib/aquilon-dlp/aquilon_dlp.db ".backup /backup/aquilon-dlp/$(date +%Y%m%d)/aquilon_dlp.db"

# Verify backup
sqlite3 /backup/aquilon-dlp/$(date +%Y%m%d)/aquilon_dlp.db "PRAGMA integrity_check;"

Cold backup (service stopped):

# Stop service
sudo systemctl stop aquilon-dlp

# Copy database
cp /var/lib/aquilon-dlp/aquilon_dlp.db /backup/aquilon-dlp/$(date +%Y%m%d)/

# Restart service
sudo systemctl start aquilon-dlp

Complete Backup Script

Create /usr/local/bin/aquilon-dlp-backup.sh:

#!/bin/bash
set -e

BACKUP_DIR="/backup/aquilon-dlp/$(date +%Y%m%d_%H%M%S)"
mkdir -p "$BACKUP_DIR"

echo "Backing up Aquilon DLP to $BACKUP_DIR"

# Configuration
cp /etc/aquilon-dlp/aquilon_dlp_config.toml "$BACKUP_DIR/"
cp /etc/aquilon-dlp/retention_config.toml "$BACKUP_DIR/" 2>/dev/null || true
cp -r /etc/aquilon-dlp/policies/ "$BACKUP_DIR/" 2>/dev/null || true

# Database (hot backup)
sqlite3 /var/lib/aquilon-dlp/aquilon_dlp.db ".backup $BACKUP_DIR/aquilon_dlp.db"

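# Verify (abort if the copy is not intact)
sqlite3 "$BACKUP_DIR/aquilon_dlp.db" "PRAGMA integrity_check;" > "$BACKUP_DIR/integrity.txt"
grep -qx "ok" "$BACKUP_DIR/integrity.txt" || { echo "Integrity check failed" >&2; exit 1; }

# Compress
tar -czf "$BACKUP_DIR.tar.gz" -C "$(dirname "$BACKUP_DIR")" "$(basename "$BACKUP_DIR")"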
rm -rf "$BACKUP_DIR"

echo "Backup complete: $BACKUP_DIR.tar.gz"

Automated Backups

Add to crontab:

# Daily backup at 2 AM
0 2 * * * /usr/local/bin/aquilon-dlp-backup.sh >> /var/log/aquilon-dlp-backup.log 2>&1

Restore Procedures

Configuration Restore

# Stop service
sudo systemctl stop aquilon-dlp

# Restore configuration
cp /backup/aquilon-dlp/20240115/aquilon_dlp_config.toml /etc/aquilon-dlp/

# Restore custom policies (if any)
cp -r /backup/aquilon-dlp/20240115/policies/ /etc/aquilon-dlp/ 2>/dev/null || true

# Validate configuration
aquilon-dlp --validate-config /etc/aquilon-dlp/aquilon_dlp_config.toml

# Restart service
sudo systemctl start aquilon-dlp

Database Restore

# Stop service
sudo systemctl stop aquilon-dlp

# Backup current database (in case restore fails)
cp /var/lib/aquilon-dlp/aquilon_dlp.db /var/lib/aquilon-dlp/aquilon_dlp.db.bak

# Restore from backup
cp /backup/aquilon-dlp/20240115/aquilon_dlp.db /var/lib/aquilon-dlp/

# Verify restored database
sqlite3 /var/lib/aquilon-dlp/aquilon_dlp.db "PRAGMA integrity_check;"

# Restart service
sudo systemctl start aquilon-dlp

# Verify service health
sleep 5
systemctl status aquilon-dlp

Complete Restore Script

#!/bin/bash
set -e

BACKUP_FILE="$1"

if [ -z "$BACKUP_FILE" ]; then
    echo "Usage: $0 /path/to/backup.tar.gz"
    exit 1
fi

echo "Restoring from $BACKUP_FILE"

# Extract backup
TEMP_DIR=$(mktemp -d)
tar -xzf "$BACKUP_FILE" -C "$TEMP_DIR"
BACKUP_DIR=$(ls "$TEMP_DIR")

# Stop service
sudo systemctl stop aquilon-dlp

# Backup current state
mkdir -p /backup/aquilon-dlp/pre-restore
cp /etc/aquilon-dlp/aquilon_dlp_config.toml /backup/aquilon-dlp/pre-restore/
cp /var/lib/aquilon-dlp/aquilon_dlp.db /backup/aquilon-dlp/pre-restore/

# Restore configuration
cp "$TEMP_DIR/$BACKUP_DIR/aquilon_dlp_config.toml" /etc/aquilon-dlp/

# Restore database
cp "$TEMP_DIR/$BACKUP_DIR/aquilon_dlp.db" /var/lib/aquilon-dlp/

# Verify
sqlite3 /var/lib/aquilon-dlp/aquilon_dlp.db "PRAGMA integrity_check;"
aquilon-dlp --validate-config /etc/aquilon-dlp/aquilon_dlp_config.toml

# Cleanup
rm -rf "$TEMP_DIR"

# Restart service
sudo systemctl start aquilon-dlp

echo "Restore complete"

Verification

Post-Restore Checklist

  • Service starts successfully
  • Configuration validates without errors
  • Database integrity check passes
  • OSQuery tables return data
  • New findings are being generated

Verification Queries

-- Check database has data
SELECT COUNT(*) as total_alerts FROM aquilon_dlp_alerts;

-- Check recent activity
SELECT MAX(timestamp) as last_alert, COUNT(*) as total
FROM aquilon_dlp_alerts;

Log Review

After restore, check logs for errors:

# Linux
sudo journalctl -u aquilon-dlp -n 50 --no-pager

# macOS
tail -50 /var/log/aquilon-dlp/stderr.log

Retention Policy

Backup Retention

Recommended retention schedule:

| Backup Type | Retention |
|---|---|
| Daily | 7 days |
| Weekly | 4 weeks |
| Monthly | 12 months |

Cleanup Script

#!/bin/bash
BACKUP_DIR="/backup/aquilon-dlp"

# Remove backups older than 7 days (daily tier only; keep weekly and monthly backups in a separate location so they are not deleted)
find "$BACKUP_DIR" -name "*.tar.gz" -mtime +7 -delete

echo "Cleaned up old backups"
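
To apply all three tiers of the retention schedule from one directory, each backup can be classified by its age and date. The sketch below makes two assumptions that are not part of this guide: Sunday backups count as weekly, and first-of-month backups count as monthly. `retention_decision` is a hypothetical helper.

```shell
#!/bin/bash
# retention_decision AGE_DAYS DAY_OF_WEEK DAY_OF_MONTH
#   AGE_DAYS:     age of the backup in whole days
#   DAY_OF_WEEK:  1-7 as printed by `date +%u` for the backup date (7 = Sunday)
#   DAY_OF_MONTH: 01-31 as printed by `date +%d` for the backup date
# Assumption: Sunday backups are the weekly tier, first-of-month the monthly tier.
retention_decision() {
    age=$1; dow=$2; dom=$3
    if [ "$age" -le 7 ]; then
        echo "keep (daily)"          # daily tier: 7 days
    elif [ "$dow" = "7" ] && [ "$age" -le 28 ]; then
        echo "keep (weekly)"         # weekly tier: 4 weeks
    elif [ "$dom" = "01" ] && [ "$age" -le 365 ]; then
        echo "keep (monthly)"        # monthly tier: 12 months
    else
        echo "delete"
    fi
}

# Example: retention_decision 20 7 10   prints keep (weekly)
```

The backup script names archives by timestamp, so the day of week and day of month can be derived from the file name or modification time before calling the function.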

Cloud Backup

AWS S3

# Upload to S3
aws s3 cp /backup/aquilon-dlp/20240115.tar.gz s3://my-bucket/aquilon-dlp/

# Restore from S3
aws s3 cp s3://my-bucket/aquilon-dlp/20240115.tar.gz /tmp/
./restore.sh /tmp/20240115.tar.gz

Azure Blob

# Upload to Azure
az storage blob upload \
  --container-name backups \
  --file /backup/aquilon-dlp/20240115.tar.gz \
  --name aquilon-dlp/20240115.tar.gz

Disaster Recovery

Planning and procedures for recovering Aquilon DLP in disaster scenarios.

Recovery Planning

Recovery Objectives

| Metric | Target | Description |
|---|---|---|
| RTO (Recovery Time Objective) | 1 hour | Time to restore service |
| RPO (Recovery Point Objective) | 24 hours | Maximum data loss acceptable |
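
The RPO target is only met if the newest backup is recent enough, which can be checked automatically. A minimal sketch, where `rpo_met` is a hypothetical helper that compares two epoch timestamps against the 24-hour target:

```shell
#!/bin/bash
# rpo_met LAST_BACKUP_EPOCH NOW_EPOCH [RPO_HOURS]
# Succeeds when the last backup is no older than RPO_HOURS (default 24).
# A backup timestamp in the future is treated as a failure.
rpo_met() {
    last=$1; now=$2; rpo_hours=${3:-24}
    age=$((now - last))
    [ "$age" -ge 0 ] && [ "$age" -le $((rpo_hours * 3600)) ]
}

# Example (Linux, GNU stat; the archive path is illustrative):
#   last=$(stat -c %Y /backup/aquilon-dlp/latest.tar.gz)
#   rpo_met "$last" "$(date +%s)" || echo "RPO exceeded: newest backup is too old" >&2
```

Running this from the same scheduler as the backup job gives early warning when backups silently stop.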

Critical Components

| Component | Recovery Priority | Notes |
|---|---|---|
| Configuration | P1 | Required for service start |
| Service binary | P1 | Application itself |
| Database | P2 | Historical findings |
| Cache | P3 | Can be rebuilt |

Disaster Scenarios

Scenario 1: Single Endpoint Failure

Symptoms: Service down on one machine

Recovery:

  1. Restore from backup (see Backup & Restore)
  2. Or reinstall and reconfigure

# Restore configuration
cp /backup/aquilon-dlp/latest/aquilon_dlp_config.toml /etc/aquilon-dlp/

# Start service
sudo systemctl start aquilon-dlp

# Verify
sudo systemctl status aquilon-dlp

Scenario 2: Database Corruption

Symptoms: Service fails to start with database errors

Recovery:

# Stop service
sudo systemctl stop aquilon-dlp

# Check corruption
sqlite3 /var/lib/aquilon-dlp/aquilon_dlp.db "PRAGMA integrity_check;"

# Option A: if corrupted, restore from backup
cp /backup/aquilon-dlp/latest/aquilon_dlp.db /var/lib/aquilon-dlp/
sudo systemctl start aquilon-dlp

# Option B: only if no backup exists, recreate (loses history)
rm /var/lib/aquilon-dlp/aquilon_dlp.db
sudo systemctl start aquilon-dlp  # Creates new database

Scenario 3: Configuration Loss

Symptoms: Invalid or missing configuration

Recovery:

# Option A: restore from backup
cp /backup/aquilon-dlp/latest/aquilon_dlp_config.toml /etc/aquilon-dlp/

# Option B: only if no backup exists, download the default
curl -o /etc/aquilon-dlp/aquilon_dlp_config.toml \
  https://raw.githubusercontent.com/aquilonsecurity/aquilon-dlp/main/docs/config-examples/aquilon_dlp_config_enterprise.toml

# Validate
aquilon-dlp --validate-config /etc/aquilon-dlp/aquilon_dlp_config.toml

# Restart
sudo systemctl start aquilon-dlp

Scenario 4: Fleet-Wide Outage

Symptoms: Multiple endpoints affected

Recovery:

  1. Identify root cause (bad update, configuration push, etc.)
  2. Prepare fix (rollback version, configuration fix)
  3. Deploy fix via MDM or configuration management
  4. Monitor recovery

Version Rollback

Download Previous Version

Download the previous version from the Aquilon Security portal and save to /tmp/aquilon-dlp-previous.

Rollback Procedure

# Stop current service
sudo systemctl stop aquilon-dlp

# Backup current binary
sudo cp /usr/local/bin/aquilon-dlp-enterprise /usr/local/bin/aquilon-dlp-enterprise.bak

# Install previous version
sudo cp /tmp/aquilon-dlp-previous /usr/local/bin/aquilon-dlp-enterprise
sudo chmod +x /usr/local/bin/aquilon-dlp-enterprise

# Restart
sudo systemctl start aquilon-dlp

# Verify version
aquilon-dlp-enterprise --version

MDM Rollback

  1. Upload previous version to MDM
  2. Deploy to affected endpoints
  3. Monitor deployment status

Recovery Procedures

Minimal Recovery (Configuration Only)

Fastest recovery - loses historical data but restores monitoring:

  1. Download fresh binary from the Aquilon Security portal
  2. Install to /usr/local/bin/aquilon-dlp-enterprise
  3. Restore configuration from backup or use default
  4. Restart aquilon-dlp service

# Restore configuration from backup
cp /backup/aquilon-dlp/latest/aquilon_dlp_config.toml /etc/aquilon-dlp/

# Restart service
sudo systemctl restart aquilon-dlp

Full Recovery (With History)

Complete recovery with all historical data:

# 1. Install binary from Aquilon Security portal
# Save to: /usr/local/bin/aquilon-dlp-enterprise

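# 2. Restore from backup (the archive contains a single timestamped directory)
mkdir -p /tmp/aquilon-dlp-restore
tar -xzf /backup/aquilon-dlp/latest.tar.gz -C /tmp/aquilon-dlp-restore --strip-components=1
cp /tmp/aquilon-dlp-restore/aquilon_dlp_config.toml /etc/aquilon-dlp/
cp /tmp/aquilon-dlp-restore/aquilon_dlp.db /var/lib/aquilon-dlp/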

# 3. Verify integrity
sqlite3 /var/lib/aquilon-dlp/aquilon_dlp.db "PRAGMA integrity_check;"

# 4. Restart service
sudo systemctl restart aquilon-dlp

macOS Recovery

FDA Re-grant After Recovery

After recovery on macOS, Full Disk Access (FDA) may need to be re-granted:

  1. Check profile:

    sudo profiles list | grep aquilon
    
  2. If missing, redeploy PPPC profile via MDM

  3. Reinstall app:

    sudo rm -rf /Library/Application\ Support/aquilon-dlp.app
    # MDM will reinstall on next check-in
    
  4. Verify FDA:

    sudo sqlite3 /Library/Application\ Support/com.apple.TCC/TCC.db \
      "SELECT auth_value FROM access WHERE client = 'dev.aquilon.dlp-plugin';"
    

Verification

Post-Recovery Checklist

  • Service running: systemctl status aquilon-dlp
  • Configuration valid: aquilon-dlp --validate-config /etc/aquilon-dlp/aquilon_dlp_config.toml
  • Database accessible: OSQuery tables return data
  • Findings generating: New alerts appearing
  • Monitoring active: Prometheus metrics available
  • macOS: FDA granted (if applicable)

Recovery Test Queries

-- Service health - verify table exists
SELECT COUNT(*) as total_alerts FROM aquilon_dlp_alerts;

-- Recent activity
SELECT COUNT(*) as alerts_24h
FROM aquilon_dlp_alerts
WHERE timestamp > (strftime('%s', 'now') - 86400);

-- Alert breakdown
SELECT severity, COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY severity;

Automated Recovery

Systemd Auto-Restart

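Configure in the service file. The start-rate limit settings belong in the [Unit] section; systemd ignores StartLimitIntervalSec when it is placed under [Service]:

[Unit]
StartLimitBurst=5
StartLimitIntervalSec=60s

[Service]
Restart=on-failure
RestartSec=10s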

Health Check Script

#!/bin/bash
# /usr/local/bin/aquilon-dlp-healthcheck.sh

if ! systemctl is-active --quiet aquilon-dlp; then
    echo "Service down, attempting restart"
    systemctl start aquilon-dlp
    sleep 10

    if ! systemctl is-active --quiet aquilon-dlp; then
        echo "CRITICAL: Service failed to start"
        # Send alert to monitoring system
        exit 1
    fi
fi

exit 0

Add to root's crontab (the script calls systemctl start, which requires root):

*/5 * * * * /usr/local/bin/aquilon-dlp-healthcheck.sh

Communication Plan

During Outage

  1. Notify security team of reduced DLP coverage
  2. Update incident ticket
  3. Monitor recovery progress

Post-Recovery

  1. Verify all endpoints recovered
  2. Check for data gaps in findings
  3. Document root cause
  4. Update runbooks if needed

Prevention

Regular Testing

  • Monthly: Test restore from backup
  • Quarterly: Full DR drill
  • Annually: Review and update DR plan

Monitoring

Set up alerts for:

  • Service down
  • Database corruption
  • Configuration validation failures
  • Scan rate drops

Architecture

This page provides an architectural overview of Aquilon DLP, including system context, component architecture, and deployment topologies.

System Context

Aquilon DLP operates within an enterprise security ecosystem, integrating with OSQuery for system monitoring and exposing findings to SIEM systems for alerting and compliance reporting.

graph TB
    subgraph "Enterprise Environment"
        SA[Security Analysts]
        SysAdmin[System Administrators]
        SIEM[SIEM/Alerting System]

        subgraph "Monitored System"
            FS[File System]
            OSQ[OSQuery]
            AquilonDLP[Aquilon DLP]
        end
    end

    SA -->|Query Alerts| OSQ
    SA -->|Review Findings| SIEM
    SysAdmin -->|Configure| AquilonDLP

    FS -->|File Events| AquilonDLP
    AquilonDLP -->|Scan Results| OSQ
    OSQ -->|Export| SIEM

    style AquilonDLP fill:#4a90e2,stroke:#2e5c8a,stroke-width:3px,color:#fff
    style OSQ fill:#7cb342,stroke:#558b2f,stroke-width:2px,color:#fff
    style SIEM fill:#f57c00,stroke:#e65100,stroke-width:2px,color:#fff

Key Interactions:

  1. File System Monitoring: Aquilon DLP monitors directories for new/modified files
  2. Scan and Detect: Files are parsed, decompressed (if needed), and scanned for sensitive data
  3. OSQuery Integration: Findings exposed via aquilon_dlp_alerts and related tables
  4. SIEM Export: OSQuery exports alerts to enterprise SIEM systems
  5. Analyst Queries: Security analysts query findings via OSQuery or SIEM dashboards

Component Architecture

Aquilon DLP uses a plugin-based architecture with three primary layers: Scanner Engine, File Handler Layer, and Policy Engine.

graph TB
    subgraph "Aquilon DLP Core"
        FW[File Watcher]
        FH[File Handler Layer]
        SE[Scanner Engine]
        PE[Policy Engine]
        DB[(SQLite Cache)]
        OSE[OSQuery Extension Interface]
    end

    subgraph "File Handlers (9 Formats)"
        ZIP[ZIP Handler]
        TAR[TAR Handler]
        GZIP[GZIP Handler]
        PDF[PDF Handler]
        DOCX[DOCX Handler]
        XLSX[XLSX Handler]
        SEVEN[7-Zip Handler]
        RAR[RAR Handler]
        TEXT[Text Handler]
    end

    subgraph "Scanner Plugins (50+ Scanners)"
        SSN[SSN Scanner]
        CC[Credit Card Scanner]
        EMAIL[Email Scanner]
        PHONE[Phone Scanner]
        PASSPORT[Passport Scanner]
        NATID[National ID Scanners]
        MORE[... 40+ more scanners]
    end

    subgraph "Policy Frameworks"
        HIPAA[HIPAA Framework]
        PCI[PCI DSS Framework]
        GDPR[GDPR Framework]
        CCPA[CCPA Framework]
        SOX[SOX Framework]
        ISO[ISO 27001 Framework]
    end

    FW -->|New/Modified Files| FH
    FH --> ZIP
    FH --> TAR
    FH --> GZIP
    FH --> PDF
    FH --> DOCX
    FH --> XLSX
    FH --> SEVEN
    FH --> RAR
    FH --> TEXT

    ZIP -->|Extracted Text| SE
    TAR -->|Extracted Text| SE
    GZIP -->|Decompressed Text| SE
    PDF -->|Extracted Text| SE
    DOCX -->|Extracted Text| SE
    XLSX -->|Extracted Text| SE
    SEVEN -->|Extracted Text| SE
    RAR -->|Extracted Text| SE
    TEXT -->|Raw Text| SE

    SE --> SSN
    SE --> CC
    SE --> EMAIL
    SE --> PHONE
    SE --> PASSPORT
    SE --> NATID
    SE --> MORE

    SSN -->|Findings| PE
    CC -->|Findings| PE
    EMAIL -->|Findings| PE
    PHONE -->|Findings| PE
    PASSPORT -->|Findings| PE
    NATID -->|Findings| PE
    MORE -->|Findings| PE

    PE --> HIPAA
    PE --> PCI
    PE --> GDPR
    PE --> CCPA
    PE --> SOX
    PE --> ISO

    HIPAA -->|Violations| DB
    PCI -->|Violations| DB
    GDPR -->|Violations| DB
    CCPA -->|Violations| DB
    SOX -->|Violations| DB
    ISO -->|Violations| DB

    DB -->|Query Interface| OSE

    style FW fill:#4a90e2,stroke:#2e5c8a,stroke-width:2px,color:#fff
    style SE fill:#7cb342,stroke:#558b2f,stroke-width:2px,color:#fff
    style PE fill:#f57c00,stroke:#e65100,stroke-width:2px,color:#fff
    style DB fill:#9c27b0,stroke:#6a1b9a,stroke-width:2px,color:#fff

Layer Descriptions:

File Handler Layer (9 Handlers)

Processes various file formats and containers:

  • Archive Handlers: ZIP, TAR, GZIP, 7-Zip, RAR (recursive extraction)
  • Document Handlers: PDF, DOCX, XLSX (text extraction)
  • Text Handler: Plain text, source code, config files

Key Feature: Recursive descent into nested archives (e.g., ZIP inside TAR inside GZIP)

Scanner Engine (50+ Plugins)

Detects sensitive data patterns:

  • National ID Scanners: 28 country-specific national IDs (EU, Americas, Asia-Pacific, Middle East)
  • Identity Scanners: SSN, passport, driver’s license
  • Financial Scanners: Credit cards, bank accounts, IBAN
  • Healthcare Scanners: Medical record numbers, NPI, MBI
  • Contact Scanners: Emails, phone numbers, physical addresses
  • Credential Scanners: API keys, tokens, crypto keys, database connections

Key Feature: Stream-based scanning with O(1) memory usage (constant memory regardless of file size)

Policy Engine (6 Frameworks)

Maps findings to compliance requirements:

  • 🏢 HIPAA: Healthcare PHI detection (Enterprise only)
  • 🏢 PCI DSS: Payment card data (Enterprise only)
  • 🏢 SOX: Financial data (Enterprise only)
  • 🏢 ISO 27001: Information security (Enterprise only)
  • GDPR: EU personal data (All editions)
  • CCPA: California consumer data (All editions)

Key Feature: Multi-framework evaluation (single file can trigger multiple policy violations)

Database Cache (SQLite)

Stores scan results with:

  • Hash-based deduplication: Skip rescanning unchanged files
  • Metadata indexing: Fast lookups by path, policy, severity
  • Retention policies: Configurable cleanup of old findings

Performance: 5.4M operations/sec query throughput


Deployment Topology

Aquilon DLP supports multiple deployment models depending on organizational needs.

graph TB
    subgraph "Single-Node Deployment"
        subgraph "Host System"
            FS1[File System]
            OSQ1[OSQuery]
            AQD1[Aquilon DLP]
            DB1[(SQLite Cache)]
        end

        FS1 --> AQD1
        AQD1 --> DB1
        DB1 --> OSQ1
        OSQ1 -->|Export| SIEM1[SIEM/Alerting]
    end

    subgraph "Enterprise Deployment (Distributed)"
        subgraph "Fleet (100s-1000s of hosts)"
            subgraph "Host 1"
                FS2[File System]
                OSQ2[OSQuery]
                AQD2[Aquilon DLP]
                DB2[(Cache)]
            end

            subgraph "Host 2"
                FS3[File System]
                OSQ3[OSQuery]
                AQD3[Aquilon DLP]
                DB3[(Cache)]
            end

            subgraph "Host N"
                FS4[File System]
                OSQ4[OSQuery]
                AQD4[Aquilon DLP]
                DB4[(Cache)]
            end
        end

        subgraph "Central Infrastructure"
            FLEET[OSQuery Fleet Manager]
            CENTRAL_SIEM[Central SIEM]
            DASHBOARD[Compliance Dashboard]
        end

        OSQ2 --> FLEET
        OSQ3 --> FLEET
        OSQ4 --> FLEET

        FLEET --> CENTRAL_SIEM
        CENTRAL_SIEM --> DASHBOARD
    end

    subgraph "🍎 MDM Deployment (macOS)"
        subgraph "MDM System"
            JAMF[Jamf Pro / Intune / Kandji]
            CONFIG[Configuration Profiles]
            PKG[Aquilon DLP PKG]
        end

        subgraph "macOS Fleet"
            MAC1[MacBook 1]
            MAC2[MacBook 2]
            MACN[MacBook N]
        end

        JAMF -->|Deploy PKG| MAC1
        JAMF -->|Deploy PKG| MAC2
        JAMF -->|Deploy PKG| MACN

        CONFIG -->|Full Disk Access| MAC1
        CONFIG -->|Full Disk Access| MAC2
        CONFIG -->|Full Disk Access| MACN
    end

    style AQD1 fill:#4a90e2,stroke:#2e5c8a,stroke-width:2px,color:#fff
    style AQD2 fill:#4a90e2,stroke:#2e5c8a,stroke-width:2px,color:#fff
    style AQD3 fill:#4a90e2,stroke:#2e5c8a,stroke-width:2px,color:#fff
    style AQD4 fill:#4a90e2,stroke:#2e5c8a,stroke-width:2px,color:#fff

Deployment Models:

Single-Node Deployment

Best for:

  • Small teams (up to 5 servers for Basic Edition)
  • Development/staging environments
  • Proof-of-concept deployments

Architecture:

  • Aquilon DLP runs on each monitored system
  • Local SQLite cache stores findings
  • OSQuery exposes findings locally
  • Optional SIEM export for centralized alerting

Setup Time: ~5 minutes per host

Enterprise Deployment (Distributed)

Best for:

  • Large organizations (100s-1000s of hosts)
  • Multi-site deployments
  • Compliance-driven environments (healthcare, finance)

Architecture:

  • Aquilon DLP deployed on every monitored host
  • OSQuery Fleet Manager aggregates findings across fleet
  • Central SIEM processes alerts and generates compliance reports
  • Compliance Dashboard provides executive visibility

Key Features:

  • Unlimited server licensing (Enterprise Edition)
  • All policy frameworks (HIPAA, PCI DSS, SOX, ISO 27001, GDPR, CCPA)
  • Enterprise support with 4-hour SLA for critical issues

🍎 MDM Deployment (macOS)

Best for:

  • macOS fleet management (Enterprise Edition only)
  • Organizations using Jamf Pro, Microsoft Intune, or Kandji
  • Zero-touch deployment for new devices

Architecture:

  • PKG installer deployed via MDM system
  • Configuration profiles grant Full Disk Access
  • Launch Daemon ensures Aquilon DLP starts on boot
  • Integration with OSQuery for monitoring

Key Features:

  • Automated deployment to 100s-1000s of Macs
  • Centralized configuration management
  • Native Endpoint Security integration
  • User-transparent operation

Data Flow

Understanding how data flows through Aquilon DLP:

1. File Monitoring

macOS (Enterprise Edition):

  • Native Endpoint Security API monitors file system events
  • Events filtered by watch paths and exclusions
  • New/modified files queued for scanning

Linux (All Editions):

  • inotify-based file system monitoring
  • Recursive directory watching with pattern matching
  • Event deduplication to prevent scan storms
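
A burst of events for one file (for example, an editor writing a file in several steps) can be collapsed with a per-path time window. The following is a minimal sketch of that idea, not Aquilon DLP's actual implementation: `Debouncer`, `should_queue`, and the 500 ms window are hypothetical names and values chosen for illustration.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical debouncer: suppresses repeat events for a path inside `window`.
struct Debouncer {
    window: Duration,
    last_queued: HashMap<String, Instant>,
}

impl Debouncer {
    fn new(window: Duration) -> Self {
        Debouncer { window, last_queued: HashMap::new() }
    }

    /// Returns true if the path should be queued for scanning at time `now`.
    fn should_queue(&mut self, path: &str, now: Instant) -> bool {
        let window = self.window;
        let recently_queued = self
            .last_queued
            .get(path)
            .map_or(false, |&prev| now.duration_since(prev) < window);
        if recently_queued {
            return false; // burst event for a file that is already queued
        }
        self.last_queued.insert(path.to_string(), now);
        true
    }
}

fn main() {
    let mut debouncer = Debouncer::new(Duration::from_millis(500));
    let t0 = Instant::now();
    // Simulated event times (milliseconds after the first event) for one file.
    for offset_ms in [0u64, 50, 120, 700] {
        let at = t0 + Duration::from_millis(offset_ms);
        let queued = debouncer.should_queue("/data/report.csv", at);
        println!("+{}ms queued={}", offset_ms, queued);
    }
}
```

Suppressed events do not refresh the stored timestamp in this sketch, so a file that changes continuously is still rescanned once per window.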

2. File Processing

File Detected → File Handler Selection → Format Processing → Text Extraction

Handler Selection:

  • Based on file extension and magic number detection
  • Archive handlers recursively process nested containers
  • Document handlers extract text from structured formats
  • Text handler processes plain text files directly
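
Handler selection by extension and magic number can be sketched as follows. This is an illustrative example, not the product's code: the `Format` enum and both functions are hypothetical. The byte signatures themselves are the published ones for each format. DOCX and XLSX files are ZIP containers, which is why the extension is needed to tell them apart.

```rust
use std::path::Path;

/// The nine formats from the handler list; `Text` doubles as the fallback.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Format { Zip, Tar, Gzip, Pdf, Docx, Xlsx, SevenZip, Rar, Text }

/// Guess the format from well-known leading bytes (magic numbers).
fn detect_by_magic(head: &[u8]) -> Format {
    if head.starts_with(b"PK\x03\x04") {
        Format::Zip
    } else if head.starts_with(&[0x1f, 0x8b]) {
        Format::Gzip
    } else if head.starts_with(b"%PDF-") {
        Format::Pdf
    } else if head.starts_with(&[0x37, 0x7a, 0xbc, 0xaf, 0x27, 0x1c]) {
        Format::SevenZip
    } else if head.starts_with(b"Rar!\x1a\x07") {
        Format::Rar
    } else if head.len() >= 262 && &head[257..262] == b"ustar" {
        Format::Tar // POSIX tar stores "ustar" at byte offset 257
    } else {
        Format::Text // nothing matched: fall back to the text handler
    }
}

/// Combine the magic number with the file extension.
fn select_format(path: &Path, head: &[u8]) -> Format {
    let ext = path
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| e.to_ascii_lowercase());
    match (detect_by_magic(head), ext.as_deref()) {
        // Office documents are ZIP containers; the extension disambiguates.
        (Format::Zip, Some("docx")) => Format::Docx,
        (Format::Zip, Some("xlsx")) => Format::Xlsx,
        (format, _) => format,
    }
}

fn main() {
    let zip_head = b"PK\x03\x04";
    println!("{:?}", select_format(Path::new("report.zip"), zip_head));
    println!("{:?}", select_format(Path::new("summary.docx"), zip_head));
    println!("{:?}", select_format(Path::new("records.txt"), b"plain text"));
}
```

Trusting the magic number over the extension means a renamed archive (for example, a ZIP saved as `.txt`) is still routed to the archive handler in this sketch.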

Example Flow (nested archive):

report.zip → ZIP Handler
  ├─ data.tar → TAR Handler
  │   ├─ records.txt → Text Handler → Scanner Engine
  │   └─ patient.pdf → PDF Handler → Scanner Engine
  └─ summary.docx → DOCX Handler → Scanner Engine

3. Scanning

Stream-Based Processing:

  • Text streamed to scanner plugins (not loaded into memory)
  • All 50+ scanners run concurrently on the same stream
  • Constant O(1) memory usage regardless of file size
  • 5.4M operations/sec throughput
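
To illustrate what constant-memory scanning means in practice, the sketch below reads input in fixed-size chunks and carries a small overlap between chunks, so a match that straddles a chunk boundary is still found. It is a simplified illustration under stated assumptions: `count_ssn_like` and `is_ssn_shape` are hypothetical helpers, and the check is a bare `ddd-dd-dddd` shape test rather than a real scanner plugin.

```rust
use std::io::{self, Read};

const SSN_LEN: usize = 11; // length of "ddd-dd-dddd"

/// True if `w` is exactly 11 bytes shaped like ddd-dd-dddd.
fn is_ssn_shape(w: &[u8]) -> bool {
    w.len() == SSN_LEN
        && w.iter().enumerate().all(|(i, b)| match i {
            3 | 6 => *b == b'-',
            _ => b.is_ascii_digit(),
        })
}

/// Count SSN-shaped matches while holding at most one chunk plus a
/// 10-byte overlap in memory, regardless of the total input size.
fn count_ssn_like<R: Read>(mut reader: R, chunk_size: usize) -> io::Result<usize> {
    let mut buf = vec![0u8; chunk_size];
    let mut window: Vec<u8> = Vec::with_capacity(chunk_size + SSN_LEN);
    let mut count = 0;
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break; // end of input
        }
        window.extend_from_slice(&buf[..n]);
        count += window.windows(SSN_LEN).filter(|w| is_ssn_shape(w)).count();
        // Keep only the last SSN_LEN - 1 bytes: enough to complete a match
        // split across chunks, too short to re-count a match already seen.
        let keep = window.len().min(SSN_LEN - 1);
        let start = window.len() - keep;
        window.drain(..start);
    }
    Ok(count)
}

fn main() -> io::Result<()> {
    let data = "name: Jane, ssn: 123-45-6789; alt: 987-65-4321";
    let matches = count_ssn_like(data.as_bytes(), 8)?;
    println!("matches: {}", matches);
    Ok(())
}
```

The result does not depend on the chunk size, which is the property that lets the buffer stay small for arbitrarily large files.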

Finding Generation:

  • Each scanner reports matches with details (line number, surrounding text)
  • Metadata captured: file path, scanner type, confidence score
  • Findings passed to Policy Engine for evaluation

4. Policy Evaluation

Framework Matching:

  • Each finding evaluated against enabled policy frameworks
  • HIPAA: Checks for PHI patterns (SSN + medical details)
  • PCI DSS: Validates credit card numbers with checksums
  • GDPR/CCPA: Identifies EU and California personal data
  • SOX: Detects financial records requiring retention
  • ISO 27001: Flags sensitive information assets
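
The documentation does not name the checksum used for credit card validation. The Luhn algorithm is the standard check-digit scheme for payment card numbers, so the sketch below assumes it. The function name `luhn_valid` and the accepted length range (12 to 19 digits) are assumptions made for the example.

```rust
/// Luhn check: from the rightmost digit, double every second digit,
/// subtract 9 from any result above 9, and require the sum to be a
/// multiple of 10. Spaces and hyphens are ignored.
fn luhn_valid(candidate: &str) -> bool {
    let digits: Vec<u32> = candidate
        .chars()
        .filter(|c| *c != ' ' && *c != '-')
        .map(|c| c.to_digit(10))
        .collect::<Option<Vec<u32>>>()
        .unwrap_or_default(); // any non-digit character rejects the candidate

    // Assumed plausible card-number length; adjust as needed.
    if digits.len() < 12 || digits.len() > 19 {
        return false;
    }

    let sum: u32 = digits
        .iter()
        .rev()
        .enumerate()
        .map(|(i, &d)| {
            if i % 2 == 1 {
                let doubled = d * 2;
                if doubled > 9 { doubled - 9 } else { doubled }
            } else {
                d
            }
        })
        .sum();
    sum % 10 == 0
}

fn main() {
    // 4111 1111 1111 1111 is a widely used test card number.
    println!("{}", luhn_valid("4111 1111 1111 1111"));
    println!("{}", luhn_valid("4111 1111 1111 1112"));
}
```

A checksum step like this filters out most random 16-digit strings (order numbers, timestamps) before a finding is raised, which reduces false positives.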

Severity Assignment:

  • Critical: SSN, credit cards, passport numbers
  • High: Email addresses, phone numbers (in sensitive contexts)
  • Medium: Generic PII without strong identifiers
  • Low: Informational findings (email domains)

5. Storage and Exposure

SQLite Cache:

  • Hash-based deduplication (skip unchanged files)
  • Indexed by path, policy, severity, timestamp
  • Configurable retention (default 90 days)
  • Vacuum and optimization on schedule

OSQuery Tables:

  • aquilon_dlp_alerts: Findings with policy violations, triage status, and metadata

SIEM Export:

  • OSQuery scheduled queries export to SIEM
  • JSON format with full details
  • Configurable alert thresholds and grouping
  • Integration with Splunk, Elasticsearch, QRadar, etc.

Performance Characteristics

Aquilon DLP is optimized for production workloads with minimal system impact:

Memory Usage

  • O(1) Memory: Constant memory regardless of file size
  • Stream Processing: Files scanned incrementally (no full load)
  • Typical Usage: 50-150MB per process (depending on plugin count)
  • Archive Handling: Temporary extraction cleaned up immediately

Throughput

  • Scanner Engine: 5.4M operations/sec (single-threaded)
  • File Processing: Limited by disk I/O (linear scaling)
  • Concurrent Scanning: Configurable worker pool (default 4 workers)
  • Archive Decompression: Streamed (no disk spooling for small files)

Latency

  • Small Files (< 1MB): Sub-millisecond scan time
  • Medium Files (1-100MB): Milliseconds to seconds
  • Large Files (> 100MB): Seconds (configurable skip threshold)
  • Archives: Proportional to number of contained files

Optimization Strategies

Cache Hit Rate:

  • Hash-based deduplication: ~85-95% cache hits in typical environments
  • Skip rescanning unchanged files
  • Invalidation on modification timestamp change
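
The skip-unchanged-files behavior can be pictured as a map from path to the last scanned content hash. The sketch below is an in-memory stand-in for the SQLite cache, with hypothetical names (`ScanCache`, `needs_scan`). It uses the standard library's `DefaultHasher` only to stay dependency-free; the hash algorithm Aquilon DLP actually uses is not stated here, and a production cache would more plausibly use a cryptographic digest.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical in-memory cache: path -> hash of the content last scanned.
struct ScanCache {
    seen: HashMap<String, u64>,
}

impl ScanCache {
    fn new() -> Self {
        ScanCache { seen: HashMap::new() }
    }

    /// Returns true when the content is new or changed (scan needed),
    /// and records the current hash either way.
    fn needs_scan(&mut self, path: &str, content: &[u8]) -> bool {
        let mut hasher = DefaultHasher::new();
        content.hash(&mut hasher);
        let digest = hasher.finish();
        // `insert` returns the previous value for this path, if any.
        self.seen.insert(path.to_string(), digest) != Some(digest)
    }
}

fn main() {
    let mut cache = ScanCache::new();
    println!("{}", cache.needs_scan("/data/a.txt", b"hello"));       // first sighting
    println!("{}", cache.needs_scan("/data/a.txt", b"hello"));       // unchanged
    println!("{}", cache.needs_scan("/data/a.txt", b"hello world")); // modified
}
```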

Exclusion Patterns:

  • Exclude high-churn directories (caches, temp files)
  • Skip binary-only files (executables, images)
  • Configurable max file size (default 100MB)

Worker Tuning:

  • Adjust num_workers based on CPU cores
  • Default 4 workers balances throughput and system impact
  • Increase for I/O-bound workloads, decrease for CPU-constrained systems
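
For readers unfamiliar with the worker-pool pattern behind num_workers, here is a minimal sketch using only the standard library. It is not Aquilon DLP's scheduler: `scan_all` and `scan_one` are hypothetical, and `scan_one` is a stand-in that returns the path length so the example stays deterministic.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Stand-in for real scanning work (hypothetical): returns the path length.
fn scan_one(path: &str) -> usize {
    path.len()
}

/// Distribute `paths` over `num_workers` threads and collect sorted results.
fn scan_all(paths: Vec<String>, num_workers: usize) -> Vec<(String, usize)> {
    let (job_tx, job_rx) = mpsc::channel::<String>();
    let job_rx = Arc::new(Mutex::new(job_rx)); // shared job queue
    let (result_tx, result_rx) = mpsc::channel();

    let handles: Vec<_> = (0..num_workers.max(1))
        .map(|_| {
            let job_rx = Arc::clone(&job_rx);
            let result_tx = result_tx.clone();
            thread::spawn(move || loop {
                // The lock is released at the end of this statement.
                let job = job_rx.lock().unwrap().recv();
                match job {
                    Ok(path) => {
                        let findings = scan_one(&path);
                        result_tx.send((path, findings)).unwrap();
                    }
                    Err(_) => break, // queue closed: no more work
                }
            })
        })
        .collect();

    for path in paths {
        job_tx.send(path).unwrap();
    }
    drop(job_tx); // close the queue so workers exit
    drop(result_tx); // only worker-held senders remain

    for handle in handles {
        handle.join().unwrap();
    }
    let mut results: Vec<(String, usize)> = result_rx.iter().collect();
    results.sort();
    results
}

fn main() {
    let paths = vec!["/data/a.txt".to_string(), "/data/bb.txt".to_string()];
    for (path, n) in scan_all(paths, 4) {
        println!("{} -> {}", path, n);
    }
}
```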

Plugin Architecture

Aquilon DLP’s extensibility comes from its plugin-based design:

Scanner Plugin Interface

All scanners implement the StreamScanner trait:

pub trait StreamScanner {
    fn scan(&self, content: &str) -> anyhow::Result<Vec<Finding>>;
    fn scanner_type(&self) -> &str;
}

Benefits:

  • New scanners added without modifying core engine
  • Independent testing and versioning
  • Community contributions possible
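
As an illustration of how a scanner might plug into this interface, the sketch below implements a deliberately naive email scanner. Several things here are assumptions: the fields of `Finding` are invented for the example (the real type is not shown in this guide), `anyhow::Result` is swapped for a standard `Result<_, String>` so the code compiles without external crates, and `EmailScanner` is hypothetical rather than the shipped email scanner.

```rust
/// Hypothetical finding type; field names are assumptions.
#[derive(Debug, PartialEq)]
struct Finding {
    scanner: String,
    line: usize,
    matched: String,
}

/// Same shape as the documented trait, using a std `Result` in place of
/// `anyhow::Result` to keep the sketch dependency-free.
trait StreamScanner {
    fn scan(&self, content: &str) -> Result<Vec<Finding>, String>;
    fn scanner_type(&self) -> &str;
}

struct EmailScanner;

impl StreamScanner for EmailScanner {
    fn scan(&self, content: &str) -> Result<Vec<Finding>, String> {
        let mut findings = Vec::new();
        for (idx, line) in content.lines().enumerate() {
            for token in line.split_whitespace() {
                // Strip surrounding punctuation such as "<...>," or a final "."
                let token = token.trim_matches(|c: char| !c.is_ascii_alphanumeric());
                if let Some((local, domain)) = token.split_once('@') {
                    // Naive shape check: something@something.tld
                    if !local.is_empty() && domain.contains('.') && !domain.starts_with('.') {
                        findings.push(Finding {
                            scanner: self.scanner_type().to_string(),
                            line: idx + 1, // 1-based line numbers
                            matched: token.to_string(),
                        });
                    }
                }
            }
        }
        Ok(findings)
    }

    fn scanner_type(&self) -> &str {
        "email"
    }
}

fn main() {
    let text = "contact: alice@example.com\nno match here\nbob@example.org.";
    for finding in EmailScanner.scan(text).unwrap() {
        println!("{:?}", finding);
    }
}
```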

File Handler Plugin Interface

Handlers implement the FileHandler trait:

use std::path::Path;

pub trait FileHandler {
    fn can_handle(&self, path: &Path) -> bool;
    fn process(&self, path: &Path) -> anyhow::Result<Vec<String>>;
}

Benefits:

  • Support new formats without core changes
  • Recursive container handling (archives in archives)
  • Fallback to text handler if format unknown
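
The following sketch shows one way a custom handler and the text fallback could fit together. It rests on assumptions: `CsvHandler`, `TextHandler`, and `extract` are hypothetical, CSV is chosen only as an example of a format outside the built-in list, and `anyhow::Result` is replaced by a standard `Result<_, String>` so the example has no external dependencies.

```rust
use std::fs;
use std::path::Path;

/// Same shape as the documented trait, with a std `Result`.
trait FileHandler {
    fn can_handle(&self, path: &Path) -> bool;
    fn process(&self, path: &Path) -> Result<Vec<String>, String>;
}

/// Hypothetical handler: emits one text chunk per CSV cell.
struct CsvHandler;

impl FileHandler for CsvHandler {
    fn can_handle(&self, path: &Path) -> bool {
        path.extension()
            .and_then(|e| e.to_str())
            .map_or(false, |e| e.eq_ignore_ascii_case("csv"))
    }

    fn process(&self, path: &Path) -> Result<Vec<String>, String> {
        let raw = fs::read_to_string(path).map_err(|e| e.to_string())?;
        Ok(raw
            .lines()
            .flat_map(|line| line.split(','))
            .map(|cell| cell.trim().to_string())
            .collect())
    }
}

/// Fallback: treat the file as plain text, one chunk per line.
struct TextHandler;

impl FileHandler for TextHandler {
    fn can_handle(&self, _path: &Path) -> bool {
        true
    }

    fn process(&self, path: &Path) -> Result<Vec<String>, String> {
        let raw = fs::read_to_string(path).map_err(|e| e.to_string())?;
        Ok(raw.lines().map(String::from).collect())
    }
}

/// The first handler that claims the file wins; otherwise use the fallback.
fn extract(handlers: &[Box<dyn FileHandler>], path: &Path) -> Result<Vec<String>, String> {
    match handlers.iter().find(|h| h.can_handle(path)) {
        Some(handler) => handler.process(path),
        None => TextHandler.process(path),
    }
}

fn main() {
    let path = std::env::temp_dir().join("aquilon_handler_sketch.csv");
    fs::write(&path, "name, ssn\nalice, 123-45-6789\n").unwrap();
    let handlers: Vec<Box<dyn FileHandler>> = vec![Box::new(CsvHandler)];
    println!("{:?}", extract(&handlers, &path));
    let _ = fs::remove_file(&path);
}
```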

Policy Framework Interface

Frameworks implement the PolicyFramework trait:

pub trait PolicyFramework {
    fn evaluate(&self, findings: &[Finding]) -> Vec<PolicyViolation>;
    fn framework_name(&self) -> &str;
}

Benefits:

  • Custom compliance frameworks
  • Org-specific rules via TOML policies
  • Combine multiple frameworks (e.g., HIPAA + PCI DSS)
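
A small sketch of multi-framework evaluation, where one file can produce violations under more than one framework. The `Finding` and `PolicyViolation` fields, the `ScannerSetFramework` type, and the scanner lists are assumptions for illustration. The PCI DSS scanner names are borrowed from the query examples elsewhere in this guide; the GDPR list is invented.

```rust
/// Hypothetical finding and violation types; field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct Finding {
    scanner: String,
    path: String,
}

#[derive(Debug, PartialEq)]
struct PolicyViolation {
    framework: String,
    scanner: String,
    path: String,
}

trait PolicyFramework {
    fn evaluate(&self, findings: &[Finding]) -> Vec<PolicyViolation>;
    fn framework_name(&self) -> &str;
}

/// Hypothetical framework that flags findings from a fixed set of scanners.
struct ScannerSetFramework {
    name: &'static str,
    scanners: &'static [&'static str],
}

impl PolicyFramework for ScannerSetFramework {
    fn evaluate(&self, findings: &[Finding]) -> Vec<PolicyViolation> {
        findings
            .iter()
            .filter(|f| self.scanners.iter().any(|s| *s == f.scanner.as_str()))
            .map(|f| PolicyViolation {
                framework: self.name.to_string(),
                scanner: f.scanner.clone(),
                path: f.path.clone(),
            })
            .collect()
    }

    fn framework_name(&self) -> &str {
        self.name
    }
}

/// Illustrative framework set; the GDPR scanner list is an assumption.
fn demo_frameworks() -> Vec<Box<dyn PolicyFramework>> {
    vec![
        Box::new(ScannerSetFramework {
            name: "PCI_DSS",
            scanners: &["credit_card", "magnetic_stripe", "cvv"],
        }),
        Box::new(ScannerSetFramework {
            name: "GDPR",
            scanners: &["email", "phone", "passport"],
        }),
    ]
}

/// Run every enabled framework over the same findings.
fn evaluate_all(frameworks: &[Box<dyn PolicyFramework>], findings: &[Finding]) -> Vec<PolicyViolation> {
    frameworks.iter().flat_map(|fw| fw.evaluate(findings)).collect()
}

fn main() {
    let path = "/data/export.csv".to_string();
    let findings = vec![
        Finding { scanner: "credit_card".to_string(), path: path.clone() },
        Finding { scanner: "email".to_string(), path },
    ];
    for violation in evaluate_all(&demo_frameworks(), &findings) {
        println!("{:?}", violation);
    }
}
```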

Security Considerations

Data Handling

  • No External Transmission: All scanning happens locally
  • Local Cache Only: Findings stored in local SQLite database
  • Configurable Retention: Auto-delete old findings (compliance requirement)
  • Access Control: Cache file permissions restrict access to root/admin

macOS Endpoint Security

🍎 Enterprise Edition:

  • Native Endpoint Security framework (requires entitlements)
  • System Extension approval required
  • Full Disk Access permission for comprehensive monitoring
  • Code signed and notarized for enterprise deployment

Linux Security

  • inotify Limits: Configurable watch limits (sysctl tuning)
  • File Permissions: Respects existing file ACLs
  • Systemd Integration: Runs as systemd service with restart policies
  • SELinux Support: Compatible with enforcing mode (policy module available)

OSQuery Integration

Aquilon DLP exposes security findings through OSQuery virtual tables. This guide covers the available tables, column schemas, query examples, and alert triage workflows.

Overview

Aquilon DLP registers as an OSQuery extension, providing custom tables that can be queried using standard SQL. All interaction with Aquilon DLP data occurs through OSQuery queries.

Prerequisites

  • OSQuery installed and running
  • Aquilon DLP extension loaded (automatic with package installation)

aquilon_dlp_alerts Table

The primary table for accessing DLP findings and managing alert triage.

Column Reference

| Column | Type | Description |
|---|---|---|
| id | TEXT | UUID (finding_id) for row identification |
| timestamp | BIGINT | Unix timestamp of detection |
| path | TEXT | Full path to the file containing the finding |
| scanner | TEXT | Scanner that detected the data (e.g., ssn, credit_card, iban) |
| severity | TEXT | Alert severity: critical, high, medium, low, info |
| policy | TEXT | Policy that generated the violation (e.g., HIPAA, PCI, GDPR) |
| data_type | TEXT | Category of sensitive data detected |
| pattern | TEXT | Pattern or regex that matched |
| confidence | INTEGER | Scanner confidence (0-100) |
| match_count | INTEGER | Number of matches found in file |
| frameworks | TEXT | Applicable compliance frameworks |
| triage_status | TEXT | Triage state: new, acknowledged, resolved, ignored |
| triage | TEXT | JSON object with triage details (see below) |
| context | TEXT | JSON object with file metadata and context (see below) |

JSON Columns

The triage and context columns contain JSON data for flexible querying.

triage Column

Contains triage workflow information:

{
  "owner": "analyst@company.com",
  "comment": "False positive - test data file",
  "timestamp": 1727794245
}

Empty fields are omitted. An alert with no triage data has an empty object: {}

context Column

Contains file metadata, text snippets, and container information:

{
  "snippet": "...text around the match...",
  "keywords": ["ssn", "pii"],
  "file": {
    "hash": "d7c4529ffe273e1dc...",
    "size": 20666,
    "container": {
      "path": "archive.zip/inner.txt",
      "depth": 1
    }
  },
  "metadata": {"gdpr_article": "Article-4"}
}

The container object is only present when the finding is inside an archive (depth > 0).

Querying JSON Columns

Use SQLite JSON_EXTRACT to query JSON fields:

-- Extract file hash from context
SELECT path, JSON_EXTRACT(context, '$.file.hash') as file_hash
FROM aquilon_dlp_alerts
LIMIT 5;

-- Filter by file size
SELECT path, scanner, JSON_EXTRACT(context, '$.file.size') as size
FROM aquilon_dlp_alerts
WHERE CAST(JSON_EXTRACT(context, '$.file.size') AS INTEGER) > 10000;

-- Find findings inside containers
SELECT path, JSON_EXTRACT(context, '$.file.container.path') as container_path
FROM aquilon_dlp_alerts
WHERE JSON_EXTRACT(context, '$.file.container.depth') > 0;

-- Query triage owner
SELECT path, JSON_EXTRACT(triage, '$.owner') as owner
FROM aquilon_dlp_alerts
WHERE JSON_EXTRACT(triage, '$.owner') IS NOT NULL;

Basic Queries

View Recent Alerts

-- All alerts from last 24 hours
SELECT path, scanner, severity, policy, timestamp
FROM aquilon_dlp_alerts
WHERE timestamp > (strftime('%s', 'now') - 86400)
ORDER BY timestamp DESC;

Filter by Severity

-- Critical and high severity alerts
SELECT path, scanner, policy, confidence, match_count
FROM aquilon_dlp_alerts
WHERE severity IN ('critical', 'high')
ORDER BY severity, timestamp DESC;

Filter by Policy

-- HIPAA violations
SELECT path, scanner, severity, data_type
FROM aquilon_dlp_alerts
WHERE policy = 'HIPAA'
ORDER BY timestamp DESC;

-- PCI DSS violations
SELECT path, scanner, severity, match_count
FROM aquilon_dlp_alerts
WHERE policy = 'PCI_DSS'
ORDER BY timestamp DESC;

Group by Scanner Type

-- Count findings by scanner
SELECT scanner, COUNT(*) as count,
       AVG(confidence) as avg_confidence
FROM aquilon_dlp_alerts
GROUP BY scanner
ORDER BY count DESC;

Files with Multiple Finding Types

-- Files containing multiple types of sensitive data
SELECT path,
       GROUP_CONCAT(DISTINCT scanner) as scanners,
       COUNT(*) as total_findings
FROM aquilon_dlp_alerts
GROUP BY path
HAVING COUNT(DISTINCT scanner) > 1
ORDER BY total_findings DESC
LIMIT 20;

Container/Archive Findings

-- Findings within archives (ZIP, TAR, etc.)
SELECT path,
       JSON_EXTRACT(context, '$.file.container.path') as container_path,
       JSON_EXTRACT(context, '$.file.container.depth') as container_depth,
       scanner, severity
FROM aquilon_dlp_alerts
WHERE JSON_EXTRACT(context, '$.file.container.depth') > 0
ORDER BY container_depth DESC, timestamp DESC;

Compliance Reporting

HIPAA PHI Summary

SELECT
  scanner,
  severity,
  COUNT(*) as count
FROM aquilon_dlp_alerts
WHERE policy = 'HIPAA'
GROUP BY scanner, severity
ORDER BY
  CASE severity
    WHEN 'critical' THEN 1
    WHEN 'high' THEN 2
    WHEN 'medium' THEN 3
    ELSE 4
  END;

PCI DSS Cardholder Data

SELECT path, scanner, match_count, timestamp
FROM aquilon_dlp_alerts
WHERE policy = 'PCI_DSS'
  AND scanner IN ('credit_card', 'magnetic_stripe', 'cvv')
ORDER BY timestamp DESC;

GDPR Personal Data

SELECT
  scanner,
  COUNT(*) as exposures,
  COUNT(DISTINCT path) as files_affected
FROM aquilon_dlp_alerts
WHERE policy = 'GDPR'
GROUP BY scanner
ORDER BY exposures DESC;

Alert Triage

The aquilon_dlp_alerts table supports UPDATE operations for managing alert lifecycle. This allows security analysts to acknowledge, investigate, and resolve alerts directly through OSQuery.

Triage Status Values

| Status | Description |
|---|---|
| new | Just detected, needs review (default) |
| acknowledged | Analyst is investigating |
| resolved | Issue has been handled |
| ignored | Intentionally skipped (false positive, acceptable risk) |

Updating Triage Status

Use OSQuery UPDATE statements to manage alert triage. The triage_status column is a flat column, while triage is a JSON column containing owner, comment, and timestamp.

Acknowledge an Alert

UPDATE aquilon_dlp_alerts
SET triage_status = 'acknowledged',
    triage = JSON_OBJECT('owner', 'analyst@company.com', 'comment', 'Investigating potential data exposure')
WHERE path = '/data/reports/customer_export.csv'
  AND scanner = 'ssn';

Resolve an Alert

UPDATE aquilon_dlp_alerts
SET triage_status = 'resolved',
    triage = JSON_OBJECT('owner', JSON_EXTRACT(triage, '$.owner'), 'comment', 'File moved to secure location and access restricted')
WHERE path = '/data/reports/customer_export.csv'
  AND scanner = 'ssn';

Mark as False Positive

UPDATE aquilon_dlp_alerts
SET triage_status = 'ignored',
    triage = JSON_OBJECT('owner', 'security-team', 'comment', 'False positive - test data file with synthetic SSNs')
WHERE path = '/test/fixtures/sample_data.txt';

Bulk Triage by Policy

-- Acknowledge all new PCI alerts for investigation
UPDATE aquilon_dlp_alerts
SET triage_status = 'acknowledged',
    triage = JSON_OBJECT('owner', 'pci-compliance-team')
WHERE policy = 'PCI_DSS'
  AND triage_status = 'new';

Triage Workflow Queries

Alerts Needing Review

-- New alerts requiring triage
SELECT path, scanner, severity, policy, timestamp
FROM aquilon_dlp_alerts
WHERE triage_status = 'new'
ORDER BY
  CASE severity
    WHEN 'critical' THEN 1
    WHEN 'high' THEN 2
    ELSE 3
  END,
  timestamp DESC;

My Assigned Alerts

-- Alerts assigned to specific analyst
SELECT path, scanner, severity, triage_status,
       JSON_EXTRACT(triage, '$.comment') as triage_comment
FROM aquilon_dlp_alerts
WHERE JSON_EXTRACT(triage, '$.owner') = 'analyst@company.com'
  AND triage_status IN ('new', 'acknowledged')
ORDER BY timestamp DESC;

Triage Summary

-- Overview of triage status
SELECT
  triage_status,
  COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY triage_status
ORDER BY
  CASE triage_status
    WHEN 'new' THEN 1
    WHEN 'acknowledged' THEN 2
    WHEN 'resolved' THEN 3
    WHEN 'ignored' THEN 4
  END;

Recently Resolved

-- Alerts resolved in last 7 days
SELECT path, scanner,
       JSON_EXTRACT(triage, '$.owner') as triage_owner,
       JSON_EXTRACT(triage, '$.comment') as triage_comment,
       JSON_EXTRACT(triage, '$.timestamp') as triage_timestamp
FROM aquilon_dlp_alerts
WHERE triage_status = 'resolved'
  AND JSON_EXTRACT(triage, '$.timestamp') > (strftime('%s', 'now') - 604800)
ORDER BY triage_timestamp DESC;

Triage Notes

  • The triage.timestamp is automatically set when you update triage fields
  • INSERT and DELETE operations are not supported - alerts are generated only by the scanner
  • Triage updates persist in the SQLite database
  • Multiple alerts for the same file/scanner combination can be updated individually or in bulk

aquilon_config Table (Enterprise)

Enterprise Only: This table is only available in Aquilon DLP Enterprise edition.

The configuration table exposes all Aquilon DLP settings as queryable rows with a simple key-value schema. Scalar values are stored as strings, and arrays are stored as JSON arrays.

Column Reference

| Column | Type | Description |
|---|---|---|
| key | TEXT | Configuration key (dot-notation, e.g., scan.max_scan_size_mb) |
| value | TEXT | Current value (scalars as strings, arrays as JSON arrays) |

Basic Queries

View All Configuration

-- View all configuration keys and their current values
SELECT key, value
FROM aquilon_config
ORDER BY key;

Query Scan Settings

-- All scan-related configuration
SELECT key, value
FROM aquilon_config
WHERE key LIKE 'scan.%'
ORDER BY key;

View Array Configuration (as JSON)

Arrays are stored as JSON arrays. Use json_each() to expand them:

-- View array keys (value contains JSON array)
SELECT key, value
FROM aquilon_config
WHERE key IN ('watch_paths', 'exclude_paths', 'policies.enabled_policies')
ORDER BY key;

-- Expand array to individual rows
SELECT c.key, j.value AS item
FROM aquilon_config c, json_each(c.value) j
WHERE c.key = 'watch_paths';

Check Specific Setting

-- Get current value of a specific setting
SELECT key, value
FROM aquilon_config
WHERE key = 'scan.max_scan_size_mb';

Modifying Configuration

The aquilon_config table supports runtime modifications for mutable settings. Changes take effect immediately and persist to the configuration file.

Operation Rules:

  • UPDATE: Set value for any mutable key (scalars as strings, arrays as JSON)
  • DELETE: Reset mutable key to its compiled default value
  • INSERT: Not supported (keys are predefined)
  • Immutable keys: Cannot be modified at runtime

Update Settings

Use UPDATE to modify any mutable configuration value:

-- Update max scan size (integer setting)
UPDATE aquilon_config
SET value = '100'
WHERE key = 'scan.max_scan_size_mb';

-- Enable/disable cache (boolean setting)
UPDATE aquilon_config
SET value = 'false'
WHERE key = 'cache.enabled';

-- Update CPU limit (float setting)
UPDATE aquilon_config
SET value = '75.5'
WHERE key = 'resource_limits.max_cpu_percent';

Update Array Settings

Arrays are stored as JSON. Use UPDATE with a JSON array value:

-- Set exclusion paths (replaces entire array)
UPDATE aquilon_config
SET value = '["/var/log/*", "/tmp/*", "*.swp"]'
WHERE key = 'exclude_paths';

-- Set watch paths
UPDATE aquilon_config
SET value = '["/home/%%", "/data/sensitive/*"]'
WHERE key = 'watch_paths';

Reset to Default

Use DELETE to reset a mutable key to its compiled default value:

-- Reset exclusion paths to default
DELETE FROM aquilon_config
WHERE key = 'exclude_paths';

-- Reset cache TTL to default
DELETE FROM aquilon_config
WHERE key = 'cache.ttl_secs';

Configuration Operation Reference

| Operation | Mutable Keys | Immutable Keys |
|---|---|---|
| SELECT | ✓ | ✓ |
| UPDATE | ✓ (value as string or JSON array) | ✗ |
| DELETE | ✓ (resets to default) | ✗ |
| INSERT | ✗ | ✗ |

Common mutable scalar keys:

  • scan.max_scan_size_mb - Maximum file size to scan (MB)
  • scan.max_findings_per_scanner - Limit findings per scanner
  • cache.enabled - Enable/disable file hash cache
  • cache.ttl_secs - Cache time-to-live
  • resource_limits.enabled - Enable CPU/memory limits
  • resource_limits.max_cpu_percent - CPU usage limit

Common mutable array keys (use JSON arrays):

  • watch_paths - Paths to monitor for changes
  • exclude_paths - Glob patterns for paths to skip
  • policies.enabled_policies - Active policy names

Error Cases

Understanding error messages helps diagnose configuration issues. The examples below show common mistakes and the errors they produce.

Immutable Key Error

Attempting to modify or delete an immutable key:

-- This will fail: database_path is immutable
UPDATE aquilon_config
SET value = '/new/path/aquilon.db'
WHERE key = 'database_path';

Error: Key 'database_path' is immutable

-- This will also fail
DELETE FROM aquilon_config
WHERE key = 'database_path';

Error: Key 'database_path' is immutable

INSERT Not Supported

INSERT operations are not supported (keys are predefined):

INSERT INTO aquilon_config (key, value)
VALUES ('custom_key', 'some_value');

Error: INSERT not supported, use UPDATE to modify values

Invalid Value Type

Providing wrong type for a key:

-- This will fail: expects integer
UPDATE aquilon_config
SET value = 'not_a_number'
WHERE key = 'scan.max_scan_size_mb';

Error: Invalid value for 'scan.max_scan_size_mb': expected Integer

Invalid JSON for Array

Providing invalid JSON for an array key:

-- This will fail: not valid JSON array
UPDATE aquilon_config
SET value = '/path1, /path2'
WHERE key = 'exclude_paths';

Error: Invalid value for 'exclude_paths': expected JSON array

Value Out of Range

Providing a value outside allowed range:

-- This will fail: max_scan_size_mb has limits
UPDATE aquilon_config
SET value = '999999'
WHERE key = 'scan.max_scan_size_mb';

Error: Value 999999 out of range for 'scan.max_scan_size_mb' (1-10000)
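
The type and range checks behind the last two error cases can be pictured as a parse step followed by a bounds check. This is a sketch only: `validate_int` is a hypothetical helper, not part of Aquilon DLP, and the error strings are copied from the examples above so the output is comparable.

```rust
/// Hypothetical validator for an integer key with an inclusive range.
/// Error text mirrors the messages documented in this section.
fn validate_int(key: &str, raw: &str, min: i64, max: i64) -> Result<i64, String> {
    // Values arrive as strings, so the first failure mode is a parse error.
    let value: i64 = raw
        .trim()
        .parse()
        .map_err(|_| format!("Invalid value for '{}': expected Integer", key))?;

    // The second failure mode is a well-formed number outside the allowed range.
    if value < min || value > max {
        return Err(format!(
            "Value {} out of range for '{}' ({}-{})",
            value, key, min, max
        ));
    }
    Ok(value)
}

fn main() {
    let key = "scan.max_scan_size_mb";
    for raw in ["100", "not_a_number", "999999"] {
        match validate_int(key, raw, 1, 10000) {
            Ok(v) => println!("accepted: {}", v),
            Err(e) => println!("Error: {}", e),
        }
    }
}
```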

Troubleshooting with Config Table

The config table helps diagnose scanning issues by exposing current settings.

Why Isn’t My File Being Scanned?

Check if the file path matches an exclusion pattern:

-- Check current exclusion patterns (stored as JSON array)
SELECT key, value
FROM aquilon_config
WHERE key = 'exclude_paths';

-- Expand exclusion patterns for easier reading
SELECT j.value AS excluded_pattern
FROM aquilon_config c, json_each(c.value) j
WHERE c.key = 'exclude_paths';
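Once the `exclude_paths` value has been retrieved, a path can be tested against it offline. This sketch assumes the patterns behave like simple shell globs and approximates them with Python's `fnmatch`; Aquilon's actual matching rules (for example, handling of `**`) may differ, and `matching_exclusions` is a hypothetical helper.

```python
import json
from fnmatch import fnmatchcase


def matching_exclusions(path: str, exclude_paths_json: str) -> list:
    """Return the exclusion patterns (from the JSON array) that match path."""
    patterns = json.loads(exclude_paths_json)
    # fnmatchcase lets '*' span '/' characters and does no case folding.
    return [pattern for pattern in patterns if fnmatchcase(path, pattern)]


# Illustrative patterns, not defaults shipped with the product.
hits = matching_exclusions(
    "/home/dev/app/node_modules/pkg/index.js",
    '["*/node_modules/*", "*.log"]',
)
print(hits)
```

An empty result suggests the file is not excluded, so the cause is likely elsewhere (for example, the path is missing from `watch_paths`).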

Check Active Policies

Verify which policies are enabled:

-- List enabled policies (stored as JSON array)
SELECT key, value
FROM aquilon_config
WHERE key = 'policies.enabled_policies';

-- Expand policies for easier reading
SELECT j.value AS policy_name
FROM aquilon_config c, json_each(c.value) j
WHERE c.key = 'policies.enabled_policies';

Diagnose Performance Issues

Check resource limits and scan settings:

-- Check resource and scan limits
SELECT key, value
FROM aquilon_config
WHERE key LIKE 'resource_limits.%'
   OR key LIKE 'scan.%'
ORDER BY key;

Verify Cache Configuration

Check if caching is enabled and its settings:

-- Check cache settings
SELECT key, value
FROM aquilon_config
WHERE key LIKE 'cache.%'
ORDER BY key;

Command Line Usage

Interactive Queries

# Start OSQuery interactive shell
osqueryi

# Run a query
osqueryi "SELECT * FROM aquilon_dlp_alerts LIMIT 10;"

JSON Output

# Get results as JSON for scripting
osqueryi --json "SELECT path, scanner, severity FROM aquilon_dlp_alerts WHERE severity = 'critical';"
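For scripting, the JSON output can be consumed directly. The sketch below assumes `osqueryi` is on the PATH and that `--json` prints a JSON array of row objects; the helper names and the sample row are illustrative only.

```python
import json
import subprocess


def parse_alerts(raw_json: str) -> list:
    """Parse osqueryi --json output: a JSON array with one object per row."""
    return json.loads(raw_json)


def critical_alerts() -> list:
    """Run the query from the example above and return the parsed rows."""
    query = ("SELECT path, scanner, severity FROM aquilon_dlp_alerts "
             "WHERE severity = 'critical';")
    result = subprocess.run(
        ["osqueryi", "--json", query],
        capture_output=True, text=True, check=True,
    )
    return parse_alerts(result.stdout)


# Illustrative row, not real output.
sample = '[{"path": "/tmp/a.csv", "scanner": "ssn", "severity": "critical"}]'
rows = parse_alerts(sample)
```

Keeping parsing separate from the subprocess call makes the parsing logic testable on hosts without osquery installed.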

Scheduled Queries

Configure scheduled queries in /etc/osquery/osquery.conf:

{
  "schedule": {
    "dlp_critical_alerts": {
      "query": "SELECT * FROM aquilon_dlp_alerts WHERE severity = 'critical' AND triage_status = 'new'",
      "interval": 300,
      "description": "Critical DLP alerts needing triage"
    },
    "dlp_daily_summary": {
      "query": "SELECT scanner, severity, COUNT(*) as count FROM aquilon_dlp_alerts WHERE timestamp > (strftime('%s', 'now') - 86400) GROUP BY scanner, severity",
      "interval": 86400,
      "description": "Daily DLP finding summary"
    }
  }
}

Compliance Overview

Aquilon DLP includes built-in compliance policy frameworks that automatically classify findings and generate violations according to regulatory requirements.

Available Frameworks

| Framework | Description | Key Controls | Edition |
|---|---|---|---|
| GDPR | EU General Data Protection Regulation | Articles 5, 32, 33 | All |
| CCPA | California Consumer Privacy Act | §1798.100-199 | All |
| HIPAA | Health Insurance Portability and Accountability Act | §164.306, §164.312 | Enterprise |
| PCI DSS | Payment Card Industry Data Security Standard | Requirements 3, 4, 12 | Enterprise |
| SOX | Sarbanes-Oxley Act | Sections 302, 404, 409 | Enterprise |
| ISO 27001 | Information Security Management | Controls A.8.12, A.5.12, A.8.11 | Enterprise |
| CUI | Controlled Unclassified Information | NIST SP 800-171 | Enterprise |
| CMMC | Cybersecurity Maturity Model Certification | DFARS 252.204-7012 | Enterprise |
| FedRAMP | Federal Risk and Authorization Management Program | NIST SP 800-53 | Enterprise |
| FISMA | Federal Information Security Modernization Act | FIPS 199, NIST SP 800-53 | Enterprise |

How Policy Frameworks Work

Each policy framework:

  1. Evaluates scan findings from all enabled scanner plugins
  2. Applies regulatory logic to determine violations
  3. Classifies severity based on data type and context
  4. Generates metadata for compliance reporting

Example Flow

File scanned → SSN detected → HIPAA evaluates → PHI violation (Critical)
                           → PCI DSS evaluates → No violation (SSN not PAN)
                           → GDPR evaluates → Personal data violation (High)
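The fan-out in the flow above can be sketched as a lookup in which every policy independently decides whether a finding is a violation. This is an illustration only: `POLICY_RULES` and `evaluate` are hypothetical, and the real policies apply far richer logic than a scanner-to-severity table.

```python
# Hypothetical, heavily simplified rule table: policy -> {scanner: severity}.
POLICY_RULES = {
    "HIPAA": {"ssn": "critical"},
    "PCI_DSS": {"credit_card": "critical"},
    "GDPR": {"ssn": "high"},
}


def evaluate(scanner: str) -> dict:
    """Return {policy: severity} for each policy that flags this scanner."""
    return {
        policy: rules[scanner]
        for policy, rules in POLICY_RULES.items()
        if scanner in rules
    }


print(evaluate("ssn"))
```

An SSN finding yields a HIPAA and a GDPR violation but nothing for PCI DSS, matching the flow shown above.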

Enabling Policies

Configure policies in aquilon_dlp_config.toml:

[policies]
enabled_policies = ["gdpr", "hipaa", "pci_dss", "sox", "iso27001", "cui", "cmmc", "fedramp", "fisma"]

# Optional: customize specific policies
# [policies.policy_configs.hipaa]
# settings = { covered_entity = "true" }

# [policies.policy_configs.pci_dss]
# settings = { merchant_level = "2" }

# [policies.policy_configs.cmmc]
# settings = { level = "2" }

Policy Configuration Options

Each policy supports configuration options:

| Option | Description | Default |
|---|---|---|
| enabled | Enable/disable the policy | true |
| confidence_threshold | Minimum scanner confidence to generate violation | 0.7 |
| sensitivity_level | Adjust severity calculation | 2 (1-3) |

Framework-Specific Settings

HIPAA:

  • covered_entity: Whether organization is a HIPAA covered entity

PCI DSS:

  • merchant_level: PCI merchant level (1-4)
  • version: PCI DSS version (3.2.1 or 4.0)

ISO 27001:

  • enforce_data_masking: Require data masking for violations
  • classification_level: Default classification (restricted/confidential/internal/public)

Violation Severity Levels

All frameworks use consistent severity levels:

| Level | Description | Typical Response |
|---|---|---|
| Critical | Immediate breach risk | Immediate investigation |
| High | Significant exposure | Investigate within 24 hours |
| Medium | Moderate risk | Investigate within 7 days |
| Low | Minor concern | Review during regular audit |

Compliance Reporting

OSQuery Queries

Query violations by policy:

-- All HIPAA critical findings
SELECT * FROM aquilon_dlp_alerts
WHERE policy = 'HIPAA' AND severity = 'critical';

-- Policy violation summary
SELECT policy, severity, COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY policy, severity;

-- Recent violations by framework
SELECT policy, path, scanner, severity, timestamp
FROM aquilon_dlp_alerts
WHERE timestamp > (strftime('%s', 'now') - 86400)
ORDER BY timestamp DESC;
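To try the summary query without a live agent, it can be run against an in-memory SQLite table. This sketch uses a stand-in table with only the columns referenced above; the real `aquilon_dlp_alerts` table is served through osquery and has more columns, and the rows here are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in schema covering only the columns used by the queries above.
conn.execute(
    "CREATE TABLE aquilon_dlp_alerts "
    "(policy TEXT, path TEXT, scanner TEXT, severity TEXT, timestamp INTEGER)"
)
conn.executemany(
    "INSERT INTO aquilon_dlp_alerts VALUES (?, ?, ?, ?, ?)",
    [
        ("HIPAA", "/data/a.txt", "ssn", "critical", 1700000000),
        ("HIPAA", "/data/b.txt", "email", "medium", 1700000100),
        ("GDPR", "/data/a.txt", "ssn", "high", 1700000000),
    ],
)

# Policy violation summary, with an ORDER BY added for stable output.
summary = conn.execute(
    "SELECT policy, severity, COUNT(*) AS count "
    "FROM aquilon_dlp_alerts "
    "GROUP BY policy, severity "
    "ORDER BY policy, severity"
).fetchall()
print(summary)
```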

Audit Trail

Each violation includes metadata for audit purposes:

  • Policy: Framework that generated the violation
  • Severity: Risk classification
  • Scanner: Detection method
  • Context: Surrounding text for validation
  • Timestamp: Detection time
  • File path: Location of finding

Custom Policies

Beyond built-in frameworks, create custom policies for:

  • Company-specific identifiers
  • Internal compliance requirements
  • Industry-specific patterns

See Policy Frameworks for custom policy creation.

Next Steps

  • HIPAA - Healthcare data protection
  • PCI DSS - Payment card security
  • SOX - Financial controls
  • ISO 27001 - Information security management
  • GDPR - EU data protection
  • CCPA - California consumer privacy
  • CUI - Controlled Unclassified Information (NIST SP 800-171)
  • CMMC - DoD contractor certification
  • FedRAMP - Federal cloud authorization
  • FISMA - Federal agency security

HIPAA Compliance

Note: HIPAA policy framework requires Enterprise Edition.

The Health Insurance Portability and Accountability Act (HIPAA) policy framework detects Protected Health Information (PHI) exposure and generates violations according to HIPAA Security Rule requirements.

Overview

HIPAA establishes standards for protecting sensitive patient health information. Aquilon DLP’s HIPAA policy helps covered entities and business associates comply with:

  • §164.306 - Security standards: General rules
  • §164.312 - Technical safeguards
  • §164.308 - Administrative safeguards

Protected Health Information (PHI)

The HIPAA policy detects the following PHI categories:

| PHI Category | Scanners | Severity |
|---|---|---|
| Social Security Numbers | ssn | Critical |
| Medical Record Numbers | medical_record_number | Critical |
| Health Plan IDs | health_plan_id | Critical |
| National Provider IDs (NPI) | npi | High |
| Date of Birth | date_of_birth | High |
| Email (patient contact) | email | Medium |
| Phone Numbers | phone | Medium |
| Addresses | address | Medium |

International Patient Populations

Healthcare organizations serving international patients may encounter national identification numbers from other countries. Aquilon DLP includes 28 country-specific national ID scanners for comprehensive coverage.

Common International IDs in Healthcare

| Region | Scanners | Use Case |
|---|---|---|
| Europe | france_nir, germany_steurid, uk_nino, + 11 more | EU/EEA patients, medical tourism |
| Americas | brazil_cpf, canada_sin, + 2 more | Cross-border healthcare |
| Asia-Pacific | india_aadhaar, japan_my_number, + 6 more | International patients |

Note: While SSN remains the primary identifier for US healthcare, organizations with international patient populations should enable additional national ID scanners. Each scanner applies country-specific validation: a checksum where the identifier defines one, format rules otherwise.

See Policy Frameworks for the complete list of all 28 national ID scanners.

Scanner Mappings

Critical Severity

These findings always generate Critical violations under HIPAA:

  • SSN: Direct patient identifier
  • Medical Record Number: Unique patient identifier
  • Health Plan Beneficiary Number: Insurance identifier

High Severity

  • NPI: Healthcare provider identifier (may indicate patient-provider relationship)
  • Date of Birth: Combined with other data enables patient identification
  • Biometric Data: Fingerprints, retinal scans, voice prints

Medium Severity

  • Contact Information: Email and phone numbers found in a healthcare context
  • Geographic Data: Address, ZIP codes (smaller than state)

Configuration

Basic Configuration

[policies]
enabled_policies = ["hipaa"]

Advanced Configuration

[policies.policy_configs.hipaa]
settings = { covered_entity = "true", confidence_threshold = "0.8", sensitivity_level = "3" }

Configuration Options

| Option | Description | Default |
|---|---|---|
| covered_entity | Indicates organization is a HIPAA covered entity | false |
| confidence_threshold | Minimum scanner confidence (0.0-1.0) | 0.7 |
| sensitivity_level | Severity multiplier (1=low, 2=medium, 3=high) | 2 |

Context Detection

The HIPAA policy elevates severity when a healthcare context is detected:

Healthcare Context Keywords

  • Medical terms: patient, diagnosis, prescription, treatment
  • Healthcare entities: hospital, clinic, pharmacy, physician
  • Insurance terms: claim, coverage, beneficiary, EOB

Example

Finding: SSN "122-15-6289"
Context: "Patient record for treatment on 03/15/2024"

Result: Severity elevated from High → Critical due to healthcare context
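Keyword-based context detection of this kind can be sketched in a few lines. The keyword set is copied from the list above; the tokenisation and the `has_healthcare_context` helper are assumptions for illustration, not the product's actual matcher.

```python
import re

# Keywords taken from the "Healthcare Context Keywords" list, lower-cased.
HEALTHCARE_KEYWORDS = {
    "patient", "diagnosis", "prescription", "treatment",
    "hospital", "clinic", "pharmacy", "physician",
    "claim", "coverage", "beneficiary", "eob",
}


def has_healthcare_context(context: str) -> bool:
    """True if any keyword appears as a whole word in the surrounding text."""
    words = set(re.findall(r"[a-z]+", context.lower()))
    return not words.isdisjoint(HEALTHCARE_KEYWORDS)


print(has_healthcare_context("Patient record for treatment on 03/15/2024"))
```

Matching whole words avoids accidental hits inside longer words, at the cost of missing inflected forms such as "patients".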

Violation Metadata

Each HIPAA violation includes:

{
  "policy": "HIPAA",
  "severity": "critical",
  "phi_category": "ssn",
  "safeguard": "technical",
  "requirement": "164.312(a)(1)",
  "breach_notification_required": true
}

Compliance Reporting

Query PHI Exposures

-- Critical PHI exposures (candidates for breach notification review)
SELECT path, scanner, severity, timestamp
FROM aquilon_dlp_alerts
WHERE policy = 'HIPAA'
  AND severity = 'critical';

-- PHI by category
SELECT scanner, COUNT(*) as count
FROM aquilon_dlp_alerts
WHERE policy = 'HIPAA'
GROUP BY scanner
ORDER BY count DESC;

Breach Risk Assessment

Under the HIPAA Breach Notification Rule, unauthorized access to PHI requires a risk assessment that considers:

  1. Nature and extent of PHI involved
  2. Unauthorized person who accessed PHI
  3. Whether PHI was actually viewed or acquired
  4. Extent to which risk has been mitigated

Aquilon DLP findings provide evidence for factors 1 and 4.

Best Practices

Monitoring Strategy

  1. Alert on Critical immediately: SSN, MRN, Health Plan IDs
  2. Daily review of High: NPI, DOB exposures
  3. Weekly audit of Medium: Contact information in healthcare contexts

Remediation Workflow

  1. Identify: Aquilon DLP detects PHI exposure
  2. Assess: Determine if breach occurred
  3. Contain: Remove or encrypt exposed data
  4. Document: Record incident for compliance
  5. Notify: Follow breach notification requirements if applicable

Integration with Incident Response

Forward HIPAA critical alerts to your incident response system:

-- Real-time HIPAA breach candidates (last hour)
-- Compares timestamp as Unix epoch seconds, as in the scheduled queries
SELECT * FROM aquilon_dlp_alerts
WHERE policy = 'HIPAA'
  AND severity = 'critical'
  AND timestamp > (strftime('%s', 'now') - 3600);

PCI DSS Compliance

Note: PCI DSS policy framework requires Enterprise Edition.

The Payment Card Industry Data Security Standard (PCI DSS) policy framework detects cardholder data exposure and generates violations according to PCI DSS requirements.

Overview

PCI DSS protects cardholder data during payment card transactions. Aquilon DLP’s PCI DSS policy helps merchants and service providers comply with:

  • Requirement 3: Protect stored cardholder data
  • Requirement 4: Encrypt transmission of cardholder data
  • Requirement 12: Maintain information security policy

Cardholder Data Elements

The PCI DSS policy detects:

| Data Element | Scanner | Severity | PCI Category |
|---|---|---|---|
| Primary Account Number (PAN) | credit_card | Critical | CHD |
| Cardholder Name | credit_card | High | CHD |
| Service Code | credit_card | High | CHD |
| Expiration Date | credit_card | Medium | CHD |
| CVV/CVC/CVV2 | cvv | Critical | SAD |
| PIN/PIN Block | pin | Critical | SAD |
| Magnetic Stripe Data | magnetic_stripe | Critical | SAD |

CHD = Cardholder Data (may be stored if protected)

SAD = Sensitive Authentication Data (must never be stored)

KYC and International Compliance

Payment processors and card issuers operating internationally often collect national identification numbers for Know Your Customer (KYC) verification. Aquilon DLP includes 28 country-specific national ID scanners to detect this data.

International Identity Verification

| Region | Scanners | KYC Use Case |
|---|---|---|
| Europe | germany_steurid, uk_nino, france_nir, + 11 more | EU PSD2 compliance, strong customer authentication |
| Americas | brazil_cpf, canada_sin, + 2 more | Cross-border merchant onboarding |
| Asia-Pacific | india_aadhaar, india_pan, + 6 more | Regional payment network compliance |

Note: While PCI DSS focuses on cardholder data, organizations subject to anti-money laundering (AML) and KYC regulations should monitor for national IDs collected during identity verification.

See Policy Frameworks for the complete list of all 28 national ID scanners.

Scanner Mappings

Critical Severity

Always Critical under PCI DSS:

  • CVV/CVC: Sensitive authentication data - must never be stored
  • Full PAN: Primary account number without masking
  • Magnetic Stripe: Track data must never be stored

High Severity

  • Masked PAN: Partial card numbers (first 6/last 4 may be stored)
  • Cardholder Name: When associated with PAN

Medium Severity

  • Expiration Date: Lower risk when isolated
  • Partial Card Data: Fragments that may indicate CHD

Configuration

Basic Configuration

[policies]
enabled_policies = ["pci_dss"]

Advanced Configuration

[policies.policy_configs.pci_dss]
settings = { merchant_level = "2", version = "4.0", confidence_threshold = "0.85" }

Configuration Options

| Option | Description | Default |
|---|---|---|
| merchant_level | PCI merchant level (1-4) | 2 |
| version | PCI DSS version (3.2.1 or 4.0) | 4.0 |
| confidence_threshold | Minimum scanner confidence | 0.8 |
| detect_test_cards | Flag test card numbers | false |

PAN Detection

Supported Card Networks

  • Visa (4xxx)
  • Mastercard (51-55xx, 2221-2720)
  • American Express (34xx, 37xx)
  • Discover (6011, 644-649, 65xx)
  • JCB (3528-3589)
  • Diners Club (36xx, 38xx)
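The prefix ranges above translate directly into a lookup. The following sketch covers only the ranges listed in this section; `card_network` is a hypothetical helper, and real issuer ranges are broader and change over time.

```python
from typing import Optional


def card_network(pan: str) -> Optional[str]:
    """Map a card number to a network using only the prefixes listed above."""
    digits = "".join(ch for ch in pan if ch.isdigit())
    if len(digits) < 4:
        return None
    p2, p3, p4 = int(digits[:2]), int(digits[:3]), int(digits[:4])
    if digits[0] == "4":
        return "visa"
    if 51 <= p2 <= 55 or 2221 <= p4 <= 2720:
        return "mastercard"
    if p2 in (34, 37):
        return "amex"
    if p4 == 6011 or 644 <= p3 <= 649 or p2 == 65:
        return "discover"
    if 3528 <= p4 <= 3589:
        return "jcb"
    if p2 in (36, 38):
        return "diners"
    return None


print(card_network("4111 1111 1111 1111"))
```

Prefix matching only identifies the network; it does not establish that the number is a valid PAN.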

Luhn Validation

All detected PANs are validated using the Luhn algorithm to reduce false positives.
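For reference, the Luhn check doubles every second digit counting from the right, subtracts 9 from any doubled value above 9, and requires the total to be divisible by 10. This is a generic implementation of the algorithm, not Aquilon's scanner code.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in number pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit; double every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(luhn_valid("4111111111111111"))
```

Roughly one in ten random digit strings passes Luhn, which is why the policy also analyzes the surrounding context.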

Context Analysis

The policy analyzes the surrounding context to determine whether a number is an actual PAN:

"Order #4111111111111111" → Likely PAN (Critical)
"Transaction ID: 4111111111111111" → Needs review (High)

Violation Metadata

Each PCI DSS violation includes:

{
  "policy": "PCI_DSS",
  "severity": "critical",
  "data_element": "pan",
  "card_network": "visa",
  "requirement": "3.4",
  "is_sad": false,
  "masked_value": "411111******1111"
}
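The `masked_value` format shown above keeps the first six and last four digits. A sketch of that masking follows; the `mask_pan` helper and the 13-digit minimum length are assumptions for illustration.

```python
def mask_pan(pan: str) -> str:
    """Keep the first 6 and last 4 digits; replace the rest with '*'."""
    digits = "".join(ch for ch in pan if ch.isdigit())
    if len(digits) < 13:
        raise ValueError("too short to be a PAN")
    return digits[:6] + "*" * (len(digits) - 10) + digits[-4:]


print(mask_pan("4111111111111111"))
```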

Compliance Reporting

Query Cardholder Data Exposures

-- All critical PCI DSS findings (unmasked PANs and SAD)
SELECT path, scanner, timestamp
FROM aquilon_dlp_alerts
WHERE policy = 'PCI_DSS'
  AND severity = 'critical';

-- SAD storage violations (immediate action required)
SELECT * FROM aquilon_dlp_alerts
WHERE policy = 'PCI_DSS'
  AND scanner IN ('cvv', 'pin', 'magnetic_stripe');

-- CHD exposure by file type
SELECT
  SUBSTR(path, -4) as extension,
  COUNT(*) as count
FROM aquilon_dlp_alerts
WHERE policy = 'PCI_DSS'
GROUP BY extension;

QSA Audit Support

Generate reports for Qualified Security Assessor (QSA) audits:

-- Cardholder Data Environment (CDE) scope
SELECT
  rtrim(path, replace(path, '/', '')) as directory,
  COUNT(*) as finding_count
FROM aquilon_dlp_alerts
WHERE policy = 'PCI_DSS'
GROUP BY directory;
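The `rtrim(path, replace(path, '/', ''))` expression is a SQLite idiom for taking the directory part of a path: the second argument is the set of every non-slash character in the path, so `rtrim` strips characters from the right until it reaches the last `/`. It can be checked in isolation with Python's built-in `sqlite3` module; the `directory_of` helper and the sample path are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def directory_of(path: str) -> str:
    """Evaluate the rtrim/replace idiom from the query above for one path."""
    row = conn.execute(
        "SELECT rtrim(?, replace(?, '/', ''))", (path, path)
    ).fetchone()
    return row[0]


print(directory_of("/srv/pay/cards.csv"))
```

The result keeps the trailing slash, so `/srv/pay/` and `/srv/pay/archive/` group as separate directories.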

Best Practices

Monitoring Strategy

  1. Immediate alert: CVV, magnetic stripe, PIN data
  2. Same-day review: Full PAN exposures
  3. Weekly audit: Partial PAN, cardholder names

SAD Handling

Sensitive Authentication Data must never be stored:

CVV found → Immediate deletion required
Mag stripe found → Immediate deletion required
PIN found → Immediate deletion required

CDE Scope Reduction

Use findings to identify and reduce Cardholder Data Environment:

  1. Locate all CHD storage
  2. Determine if storage is necessary
  3. Delete or encrypt as appropriate
  4. Update CDE documentation

SOX Compliance

Note: SOX policy framework requires Enterprise Edition.

The Sarbanes-Oxley Act (SOX) policy framework detects exposure of financial data and internal controls information that could impact financial reporting integrity.

Overview

SOX establishes requirements for public company financial reporting and internal controls. Aquilon DLP’s SOX policy helps organizations comply with:

  • Section 302: Corporate responsibility for financial reports
  • Section 404: Management assessment of internal controls
  • Section 409: Real-time issuer disclosures

Protected Data Categories

The SOX policy detects:

| Data Category | Scanners | Severity | SOX Section |
|---|---|---|---|
| Financial Account Numbers | bank_account, iban, aba_routing | Critical | 302, 404 |
| Tax Identifiers | ein, ssn | Critical | 302 |
| Internal Financial Data | financial_keyword | High | 404 |
| Audit Documentation | audit_keyword | High | 404 |
| Executive Communications | exec_keyword | Medium | 302 |

International Subsidiaries

Multinational corporations with global subsidiaries must protect employee and financial data across jurisdictions. Aquilon DLP includes 28 country-specific national ID scanners for comprehensive coverage.

Global Employee and Tax Data

| Region | Scanners | SOX Relevance |
|---|---|---|
| Europe | germany_steurid, france_nir, uk_nino, + 11 more | EU subsidiary employee tax data |
| Americas | brazil_cpf, argentina_cuit, + 2 more | Latin American subsidiary payroll |
| Asia-Pacific | india_pan, japan_my_number, + 6 more | APAC subsidiary financial records |

Note: SOX Section 404 internal controls extend to material subsidiaries. Unauthorized exposure of subsidiary employee tax identifiers or financial data may indicate control deficiencies.

See Policy Frameworks for the complete list of all 28 national ID scanners.

Scanner Mappings

Critical Severity

Financial data requiring immediate protection:

  • Bank Account Numbers: Direct access to company funds
  • IBAN/SWIFT: International financial identifiers
  • ABA Routing Numbers: US bank routing information
  • EIN: Employer Identification Number
  • Tax Documents: Tax returns, W-2s, 1099s

High Severity

Internal controls and audit information:

  • Financial Statements: Balance sheets, P&L, cash flow
  • Audit Working Papers: Internal audit documentation
  • Control Documentation: SOX control matrices, test results
  • Material Information: Pre-earnings, M&A data

Medium Severity

  • Executive Communications: C-suite financial discussions
  • Budget Data: Forecasts, projections
  • Vendor Financial Data: AP/AR information

Configuration

Basic Configuration

[policies]
enabled_policies = ["sox"]

Advanced Configuration

[policies.policy_configs.sox]
settings = { confidence_threshold = "0.75", sensitivity_level = "3", detect_material_info = "true" }

Configuration Options

| Option | Description | Default |
|---|---|---|
| confidence_threshold | Minimum scanner confidence | 0.7 |
| sensitivity_level | Severity multiplier | 2 |
| detect_material_info | Flag material non-public information | true |
| quiet_period_days | Days before earnings (heightened sensitivity) | 14 |

Context Detection

The SOX policy elevates severity when financial context is detected. Enable the sox_financial context profile for automatic keyword detection:

[context]
enabled_profiles = ["sox_financial"]  # Add to existing profiles

Financial Context Keywords

The sox_financial profile detects:

  • Strong indicators: 10-K, 10-Q, 8-K, SEC filing, GAAP, IFRS, PCAOB, balance sheet, income statement, SOX 404, material weakness
  • Weak indicators: revenue, earnings, quarterly, annual, profit, EBITDA, margin, budget, forecast, audit

Note: The SOX policy requires explicit financial context signals for financial_figures findings to avoid false positives on arbitrary currency amounts (e.g., $10 in shell scripts).
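One way to picture the strong/weak distinction is a rule where a single strong indicator is enough but weak indicators must co-occur. The threshold of two weak indicators is an assumption made for this sketch, as is the `has_financial_context` helper; the documentation does not state how the `sox_financial` profile actually weighs them.

```python
import re

STRONG = ["10-k", "10-q", "8-k", "sec filing", "gaap", "ifrs", "pcaob",
          "balance sheet", "income statement", "sox 404", "material weakness"]
WEAK = ["revenue", "earnings", "quarterly", "annual", "profit", "ebitda",
        "margin", "budget", "forecast", "audit"]


def _present(term: str, text: str) -> bool:
    """Whole-word (or whole-phrase) match of term inside lower-cased text."""
    return re.search(r"\b" + re.escape(term) + r"\b", text) is not None


def has_financial_context(text: str, weak_needed: int = 2) -> bool:
    """Assumed rule: any strong indicator, or at least weak_needed weak ones."""
    lowered = text.lower()
    if any(_present(term, lowered) for term in STRONG):
        return True
    return sum(_present(term, lowered) for term in WEAK) >= weak_needed


print(has_financial_context("echo $10 > out.txt"))
```

Under this rule a bare currency amount in a shell script carries no financial context, which is the false positive the note above describes.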

Quiet Period Detection

During earnings quiet periods, severity is elevated for:

  • Financial projections
  • Earnings estimates
  • Material business changes
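The quiet-period window can be expressed as simple date arithmetic on `quiet_period_days`. How Aquilon learns the earnings date is not described here, so the `earnings_date` argument, the `in_quiet_period` helper, and the choice to exclude the earnings day itself are all assumptions of this sketch.

```python
from datetime import date, timedelta


def in_quiet_period(today: date, earnings_date: date,
                    quiet_period_days: int = 14) -> bool:
    """True if today falls in the quiet_period_days before earnings_date."""
    start = earnings_date - timedelta(days=quiet_period_days)
    return start <= today < earnings_date


print(in_quiet_period(date(2024, 4, 20), date(2024, 4, 30)))
```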

Violation Metadata

Each SOX violation includes:

{
  "policy": "SOX",
  "severity": "critical",
  "data_category": "financial_account",
  "sox_section": "302",
  "material_info": false,
  "control_impact": "financial_reporting"
}

Compliance Reporting

Query Financial Data Exposures

-- All critical financial data exposures
SELECT path, scanner, severity, timestamp
FROM aquilon_dlp_alerts
WHERE policy = 'SOX'
  AND severity = 'critical';

-- Internal controls documentation exposure
SELECT * FROM aquilon_dlp_alerts
WHERE policy = 'SOX'
  AND scanner LIKE '%audit%';

-- Financial data by department (based on path)
SELECT
  CASE
    WHEN path LIKE '%/finance/%' THEN 'Finance'
    WHEN path LIKE '%/accounting/%' THEN 'Accounting'
    WHEN path LIKE '%/treasury/%' THEN 'Treasury'
    ELSE 'Other'
  END as department,
  COUNT(*) as count
FROM aquilon_dlp_alerts
WHERE policy = 'SOX'
GROUP BY department;

Audit Committee Reporting

Generate reports for audit committee:

-- SOX control deficiency indicators (last 30 days)
-- Treats timestamp as Unix epoch seconds, as in the scheduled queries
SELECT
  date(timestamp, 'unixepoch') as date,
  severity,
  COUNT(*) as findings
FROM aquilon_dlp_alerts
WHERE policy = 'SOX'
  AND timestamp > (strftime('%s', 'now') - 30 * 86400)
GROUP BY date, severity
ORDER BY date DESC;

Best Practices

Monitoring Strategy

  1. Immediate alert: Bank accounts, tax IDs, material info
  2. Daily review: Financial statements, audit documentation
  3. Weekly audit: Executive communications, budget data

Control Environment

Use findings to strengthen internal controls:

  1. Identify: Where financial data is stored
  2. Assess: Whether storage is appropriate
  3. Remediate: Move to secure locations
  4. Document: Update control documentation

Segregation of Duties

Monitor for inappropriate access patterns:

-- Financial data in non-finance directories
SELECT * FROM aquilon_dlp_alerts
WHERE policy = 'SOX'
  AND severity = 'critical'
  AND path NOT LIKE '%/finance/%'
  AND path NOT LIKE '%/accounting/%';

ISO 27001 Compliance

Note: ISO 27001 policy framework requires Enterprise Edition.

The ISO 27001:2022 policy framework implements information security management controls with a focus on data leakage prevention.

Overview

ISO 27001:2022 is the international standard for information security management. Aquilon DLP’s ISO 27001 policy implements key controls:

  • A.8.12: Data leakage prevention (NEW in 2022 revision)
  • A.5.12: Classification of information
  • A.8.11: Data masking

Note: Control A.8.12 explicitly mandates DLP capabilities, making this a core requirement for ISO 27001:2022 certification.

Data Classification Levels

The ISO 27001 policy uses a four-level classification system:

| Level | Description | Examples | Severity |
|---|---|---|---|
| Restricted | Highest sensitivity | Cryptographic keys, master passwords | Critical |
| Confidential | Business-critical | Financial data, PII, trade secrets | High |
| Internal | Internal use only | Employee data, internal policies | Medium |
| Public | No restrictions | Marketing materials, public docs | Low |

Scanner Classifications

All scanners are automatically classified:

Restricted (Critical)

  • private_key, api_key, jwt, aws_access_key
  • credit_card, cvv
  • ssn (in certain contexts)

Confidential (High)

  • ssn, passport, drivers_license
  • bank_account, iban
  • health_record, medical_record_number

Internal (Medium)

  • email, phone, address
  • date_of_birth
  • employee_id

Public (Low)

  • Generic patterns without sensitive context
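The scanner-to-classification mapping above can be pictured as a lookup table. This sketch places `ssn` at its default Confidential level (the context-dependent promotion to Restricted is not modelled) and treats unknown scanners as Public; both choices, and the names `LEVELS` and `classify`, are assumptions for illustration.

```python
# Scanner lists copied from the classification lists above.
LEVELS = {
    "restricted": ["private_key", "api_key", "jwt", "aws_access_key",
                   "credit_card", "cvv"],
    "confidential": ["ssn", "passport", "drivers_license", "bank_account",
                     "iban", "health_record", "medical_record_number"],
    "internal": ["email", "phone", "address", "date_of_birth", "employee_id"],
}
SEVERITY = {"restricted": "critical", "confidential": "high",
            "internal": "medium", "public": "low"}

# Invert to scanner -> level for direct lookup.
SCANNER_LEVEL = {
    scanner: level for level, scanners in LEVELS.items() for scanner in scanners
}


def classify(scanner: str) -> tuple:
    """Return (classification level, severity) for a scanner name."""
    level = SCANNER_LEVEL.get(scanner, "public")  # assumed fallback
    return level, SEVERITY[level]


print(classify("private_key"))
```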

Global PII Coverage

ISO 27001 is an international standard. Organizations operating across multiple jurisdictions need comprehensive national ID detection. Aquilon DLP includes 28 country-specific national ID scanners with checksum validation.

Europe (14 scanners)

| Country | Scanner | Format | Validation |
|---|---|---|---|
| France | france_nir | 15 digits (NIR) | Mod 97 |
| Germany | germany_steurid | 11 digits (Steuer-ID) | Format rules |
| Italy | italy_cf | 16 chars (Codice Fiscale) | Mod 26 |
| Spain | spain_dni | 8-9 chars (DNI/NIE) | Mod 23 |
| Poland | poland_pesel | 11 digits (PESEL) | Weighted mod 10 |
| Netherlands | netherlands_bsn | 9 digits (BSN) | 11-proof |
| Belgium | belgium_nrn | 11 digits (NRN) | Mod 97 |
| UK | uk_nino | 9 chars (NINO) | Format rules |
| Sweden | sweden_personnummer | 10-12 digits | Luhn |
| Norway | norway_fodselsnummer | 11 digits | Dual mod-11 |
| Finland | finland_hetu | 11 chars (HETU) | Mod 31 |
| Portugal | portugal_nif | 9 digits (NIF) | Weighted mod 11 |
| Romania | romania_cnp | 13 digits (CNP) | Weighted mod 11 |
| Czech/Slovakia | czech_rodne_cislo | 9-10 digits | Mod 11 |

Americas (4 scanners)

| Country | Scanner | Format | Validation |
|---|---|---|---|
| Brazil | brazil_cpf | 11 digits (CPF) | Dual mod 11 |
| Canada | canada_sin | 9 digits (SIN) | Luhn |
| Chile | chile_rut | 8-9 chars (RUT) | Mod 11 |
| Argentina | argentina_cuit | 11 digits (CUIT/CUIL) | Weighted mod 11 |

Asia-Pacific (8 scanners)

| Country | Scanner | Format | Validation |
|---|---|---|---|
| Australia | australia_tfn | 9 digits (TFN) | Weighted mod 11 |
| India | india_aadhaar | 12 digits (Aadhaar) | Verhoeff |
| India | india_pan | 10 chars (PAN) | Format rules |
| South Korea | south_korea_rrn | 13 digits (RRN) | Weighted mod 11 |
| Japan | japan_my_number | 12 digits | Government checksum |
| China | china_resident_id | 18 chars | ISO 7064 MOD 11-2 |
| Taiwan | taiwan_national_id | 10 chars | Weighted mod 10 |
| New Zealand | new_zealand_ird | 8-9 digits (IRD) | Mod 11 |

Middle East & Africa (2 scanners)

| Country | Scanner | Format | Validation |
|---|---|---|---|
| Israel | israel_teudat_zehut | 9 digits | Luhn variant |
| Turkey | turkey_tc_kimlik | 11 digits (TC Kimlik) | Two-step checksum |

Note: All national ID scanners use country-specific context keywords to increase detection confidence and reduce false positives.
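To make the Validation column concrete, here are two of the listed checks written out: the Dutch BSN "11-proof" and the Spanish DNI mod 23 letter. These are generic implementations of the published algorithms, not Aquilon's scanner code; NIE numbers (which start with a letter) are not handled, and the function names are illustrative.

```python
def bsn_valid(bsn: str) -> bool:
    """Netherlands BSN 11-proof: weights 9..2 then -1, sum divisible by 11."""
    if len(bsn) != 9 or not bsn.isdigit():
        return False
    weights = [9, 8, 7, 6, 5, 4, 3, 2, -1]
    total = sum(int(digit) * weight for digit, weight in zip(bsn, weights))
    return total != 0 and total % 11 == 0


DNI_LETTERS = "TRWAGMYFPDXBNJZSQVHLCKE"


def dni_valid(dni: str) -> bool:
    """Spain DNI: the check letter is indexed by the 8-digit number mod 23."""
    dni = dni.upper()
    if len(dni) != 9 or not dni[:8].isdigit():
        return False
    return DNI_LETTERS[int(dni[:8]) % 23] == dni[8]


print(bsn_valid("111222333"), dni_valid("12345678Z"))
```

Checksums like these filter out most random digit runs, and context keywords then raise confidence further.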

See Policy Frameworks for detailed scanner documentation.

Configuration

Basic Configuration

[policies]
enabled_policies = ["iso27001"]

Advanced Configuration

[policies.policy_configs.iso27001]
settings = { confidence_threshold = "0.7", enforce_data_masking = "true", classification_level = "confidential" }

Configuration Options

| Option | Description | Default |
|---|---|---|
| confidence_threshold | Minimum scanner confidence | 0.7 |
| enforce_data_masking | Require data masking in output | false |
| classification_level | Default classification level | confidential |
| control_a812_strict | Strict A.8.12 enforcement | true |

Control Implementation

Control A.8.12 - Data Leakage Prevention

Aquilon DLP directly implements A.8.12 by:

  1. Monitoring data at rest: Scans file systems for sensitive data
  2. Classification: Automatically classifies detected data
  3. Alerting: Generates violations for inappropriate storage
  4. Reporting: Provides audit trails for compliance

Control A.5.12 - Classification of Information

Each finding includes classification metadata:

{
  "classification_level": "confidential",
  "classification_reason": "Contains SSN (direct identifier)",
  "handling_requirements": ["encryption_at_rest", "access_logging"]
}

Control A.8.11 - Data Masking

When enforce_data_masking is enabled, detected values are masked:

Original: 122-45-6789
Masked: ***-**-6789
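The masking shown above keeps only the last four digits. A regular-expression sketch of that transformation for SSN-formatted values follows; `mask_ssn` is a hypothetical helper, and other data types would need their own masking rules.

```python
import re

# Matches NNN-NN-NNNN and captures the last four digits.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-(\d{4})\b")


def mask_ssn(text: str) -> str:
    """Replace the first five digits of each SSN-formatted value with '*'."""
    return SSN_PATTERN.sub(r"***-**-\1", text)


print(mask_ssn("122-45-6789"))
```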

Violation Metadata

Each ISO 27001 violation includes:

{
  "policy": "ISO27001",
  "severity": "high",
  "classification": "confidential",
  "iso_control": "A.8.12",
  "control_name": "Data leakage prevention",
  "handling_requirements": [
    "encrypt_at_rest",
    "restrict_access",
    "audit_logging"
  ]
}

Compliance Reporting

Query by Classification Level

-- All restricted data exposures (immediate action)
SELECT path, scanner, timestamp
FROM aquilon_dlp_alerts
WHERE policy = 'ISO27001'
  AND severity = 'critical';

-- Classification distribution
SELECT severity as classification, COUNT(*) as count
FROM aquilon_dlp_alerts
WHERE policy = 'ISO27001'
GROUP BY severity
ORDER BY count DESC;

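-- Control A.8.12 compliance status (findings per day)
-- Treats timestamp as Unix epoch seconds, as in the scheduled queries
SELECT
  date(timestamp, 'unixepoch') as date,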
  COUNT(*) as findings
FROM aquilon_dlp_alerts
WHERE policy = 'ISO27001'
GROUP BY date
ORDER BY date DESC
LIMIT 30;

Certification Audit Support

Generate reports for ISO 27001 auditors:

-- Data leakage prevention evidence (Control A.8.12)
SELECT
  'Files with Findings' as metric,
  (SELECT COUNT(DISTINCT path) FROM aquilon_dlp_alerts WHERE policy = 'ISO27001') as value
UNION ALL
SELECT
  'Total Findings',
  (SELECT COUNT(*) FROM aquilon_dlp_alerts WHERE policy = 'ISO27001')
UNION ALL
SELECT
  'Critical Findings',
  (SELECT COUNT(*) FROM aquilon_dlp_alerts
   WHERE policy = 'ISO27001' AND severity = 'critical');

Best Practices

Monitoring Strategy

  1. Immediate alert: Restricted classification findings
  2. Daily review: Confidential data exposures
  3. Weekly audit: Internal data, classification accuracy

Information Security Management System (ISMS)

Use Aquilon DLP findings to support ISMS:

  1. Risk Assessment: Identify data exposure risks
  2. Risk Treatment: Implement controls based on classification
  3. Monitoring: Continuous compliance monitoring
  4. Improvement: Refine policies based on findings

Statement of Applicability (SoA)

Document control implementation:

| Control | Implementation | Aquilon DLP Support |
|---|---|---|
| A.8.12 | DLP monitoring | Primary implementation |
| A.5.12 | Classification | Automatic classification |
| A.8.11 | Data masking | Optional enforcement |

Certification Support

Pre-Audit Checklist

  • ISO 27001 policy enabled and configured
  • All data locations included in watch_paths
  • Classification levels match organization’s scheme
  • Historical findings retained for audit period
  • Remediation process documented

Evidence Collection

Collect evidence for auditors:

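-- Export findings for audit period (calendar year 2024)
-- Treats timestamp as Unix epoch seconds, as in the scheduled queries
SELECT * FROM aquilon_dlp_alerts
WHERE policy = 'ISO27001'
  AND timestamp >= CAST(strftime('%s', '2024-01-01') AS INTEGER)
  AND timestamp < CAST(strftime('%s', '2025-01-01') AS INTEGER)
ORDER BY timestamp;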

GDPR Compliance

The General Data Protection Regulation (GDPR) policy framework detects personal data exposure and generates violations according to EU data protection requirements.

Availability: GDPR policy is included in all editions (Basic and Enterprise).

Overview

GDPR establishes requirements for protecting personal data of EU residents. Aquilon DLP’s GDPR policy helps data controllers and processors comply with:

  • Article 5: Principles relating to processing of personal data
  • Article 32: Security of processing
  • Article 33: Notification of personal data breach

Personal Data Categories

The GDPR policy detects:

| Data Category | Scanners | Severity | GDPR Article |
|---|---|---|---|
| National ID Numbers | ssn, EU national IDs (see below) | Critical | 9 (Special) |
| Financial Data | iban, credit_card, bank_account | High | 9 |
| Health Data | health_record, medical_record_number | Critical | 9 |
| Biometric Data | biometric | Critical | 9 |
| Email | email | Medium | 4 |
| Phone | phone | Medium | 4 |
| Address | address | Medium | 4 |
| Date of Birth | date_of_birth | Medium | 4 |
| Passport | passport | High | 4 |

EU/EEA National ID Coverage

The GDPR policy includes 15 specialized scanners for national identification numbers across EU and EEA member states. Each scanner validates country-specific checksum algorithms to reduce false positives.

European National IDs

| Country | Scanner | Format | Validation |
|---|---|---|---|
| France | france_nir | 15 digits (NIR) | Mod 97 |
| Germany | germany_steurid | 11 digits (Steuer-ID) | Format rules |
| Italy | italy_cf | 16 chars (Codice Fiscale) | Mod 26 |
| Spain | spain_dni | 8-9 chars (DNI/NIE) | Mod 23 |
| Poland | poland_pesel | 11 digits (PESEL) | Weighted mod 10 |
| Netherlands | netherlands_bsn | 9 digits (BSN) | 11-proof |
| Belgium | belgium_nrn | 11 digits (NRN) | Mod 97 |
| UK | uk_nino | 9 chars (NINO) | Format rules |
| Sweden | sweden_personnummer | 10-12 digits | Luhn |
| Norway | norway_fodselsnummer | 11 digits | Dual mod-11 |
| Finland | finland_hetu | 11 chars (HETU) | Mod 31 |
| Portugal | portugal_nif | 9 digits (NIF) | Weighted mod 11 |
| Romania | romania_cnp | 13 digits (CNP) | Weighted mod 11 |
| Czech/Slovakia | czech_rodne_cislo | 9-10 digits | Mod 11 |
| Turkey | turkey_tc_kimlik | 11 digits (TC Kimlik) | Two-step checksum |

Note: Turkey’s KVKK (Kişisel Verilerin Korunması Kanunu) is modeled on GDPR. Turkish national IDs are included for organizations processing Turkish residents’ data under GDPR-equivalent requirements.

Context Detection

Each national ID scanner uses country-specific context keywords to increase detection confidence:

  • Nordic: personnummer, fødselsnummer, henkilötunnus, Skatteverket, Folkeregisteret
  • Western Europe: NIR, Steuer-ID, codice fiscale, DNI, BSN, NRN, NINO
  • Eastern Europe: PESEL, CNP, rodné číslo
  • Turkey: TC Kimlik, Kimlik No, Nüfus

See the Policy Frameworks guide for the complete list of all 28 national ID scanners across all regions.

Special Category Data

Article 9 special category data receives elevated severity:

  • Racial or ethnic origin
  • Political opinions
  • Religious beliefs
  • Trade union membership
  • Genetic data
  • Biometric data
  • Health data
  • Sex life or sexual orientation

Scanner Mappings

Critical Severity

Special category data under Article 9:

  • Health Data: Medical records, health information
  • Biometric Data: Fingerprints, facial recognition
  • National IDs: SSN, government-issued identifiers (when combined with health context)

High Severity

Personal data enabling direct identification:

  • Financial Identifiers: IBAN, credit cards, bank accounts
  • Travel Documents: Passport numbers
  • National IDs: In general contexts

Medium Severity

Personal data whose severity depends on context:

  • Contact Information: Email, phone, address
  • Dates: Date of birth
  • Names: When combined with other data

Configuration

Basic Configuration

[policies]
enabled_policies = ["gdpr"]

Advanced Configuration

[policies.policy_configs.gdpr]
settings = { confidence_threshold = "0.7", sensitivity_level = "2", detect_special_categories = "true" }

Configuration Options

| Option | Description | Default |
| --- | --- | --- |
| confidence_threshold | Minimum scanner confidence | 0.7 |
| sensitivity_level | Severity multiplier (1-3) | 2 |
| detect_special_categories | Elevate Article 9 data | true |
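
The configuration examples above quote every settings value as a string. As a rough illustration of what that implies for tooling that reads the file, the sketch below coerces those strings into typed values and applies the defaults from the options table. The function name, the 0.0-1.0 range check, and the error messages are assumptions made for this example, not Aquilon DLP's parser.

```python
def coerce_gdpr_settings(raw: dict) -> dict:
    """Convert string-valued GDPR settings into typed values with documented defaults."""
    defaults = {
        "confidence_threshold": "0.7",
        "sensitivity_level": "2",
        "detect_special_categories": "true",
    }
    merged = {**defaults, **raw}

    threshold = float(merged["confidence_threshold"])
    if not 0.0 <= threshold <= 1.0:  # assumed range
        raise ValueError("confidence_threshold must be between 0.0 and 1.0")

    level = int(merged["sensitivity_level"])
    if level not in (1, 2, 3):
        raise ValueError("sensitivity_level must be 1, 2, or 3")

    flag = merged["detect_special_categories"].lower()
    if flag not in ("true", "false"):
        raise ValueError("detect_special_categories must be 'true' or 'false'")

    return {
        "confidence_threshold": threshold,
        "sensitivity_level": level,
        "detect_special_categories": flag == "true",
    }
```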

Context Detection

The GDPR policy analyzes context to determine severity. Enable the gdpr_phone context profile for phone number classification:

[context]
enabled_profiles = ["gdpr_phone"]  # Add to existing profiles

Phone Number Context

The gdpr_phone profile distinguishes personal from business phone numbers:

  • Personal indicators (triggers violation): mobile, cell, home phone, personal, private, emergency contact
  • Business indicators (suppresses violation): office, fax, support, helpdesk, extension, toll-free, switchboard

Note: Phone numbers without personal context do NOT trigger GDPR violations. A bare phone number like 555-123-4567 requires nearby keywords like “mobile” or “cell” to be flagged.
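
The personal-versus-business rule can be sketched in Python as follows. The keyword lists come from the bullets above; the 40-character window, the phone regex, the substring matching, and the tie-break (a business keyword wins when both kinds appear) are assumptions for illustration only and may differ from how the gdpr_phone profile actually behaves.

```python
import re

PERSONAL = ("mobile", "cell", "home phone", "personal", "private", "emergency contact")
BUSINESS = ("office", "fax", "support", "helpdesk", "extension", "toll-free", "switchboard")

# Simplified pattern for numbers shaped like 555-123-4567.
PHONE_RE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")


def classify_phone_context(text: str, window: int = 40) -> list:
    """Return (number, verdict) pairs; verdict is 'violation' or 'suppressed'."""
    results = []
    for match in PHONE_RE.finditer(text):
        start = max(0, match.start() - window)
        nearby = text[start:match.end() + window].lower()
        has_personal = any(keyword in nearby for keyword in PERSONAL)
        has_business = any(keyword in nearby for keyword in BUSINESS)
        # Bare numbers and business contexts are both suppressed.
        verdict = "violation" if has_personal and not has_business else "suppressed"
        results.append((match.group(), verdict))
    return results
```

Plain substring matching is crude ("cell" also appears inside "cancelled"), so a real profile would likely match on word boundaries.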

EU Context Keywords

  • EU member states: Germany, France, Italy, Spain, etc.
  • GDPR terms: data subject, controller, processor, consent
  • Languages: Non-English European languages increase confidence

Employee vs Customer Context

Employee data in HR systems may have reduced severity (legitimate interest):

Finding: Email "employee@company.com"
Context: "HR records for performance review"

Result: Severity reduced from High → Medium (employee context)

Customer data maintains full severity.

Violation Metadata

Each GDPR violation includes:

{
  "policy": "GDPR",
  "severity": "high",
  "data_category": "personal_data",
  "special_category": false,
  "gdpr_article": "5(1)(f)",
  "lawful_basis_required": true,
  "breach_notification_hours": 72
}
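
Downstream tooling can act on these fields directly. The following Python sketch reads the breach_notification_hours field and computes the latest notification time. The function name is hypothetical, and the sketch assumes the clock starts at the detection time passed in; under Article 33 the period runs from when the controller becomes aware of the breach, which may be a different moment.

```python
import json
from datetime import datetime, timedelta, timezone

EXAMPLE = '{"policy": "GDPR", "severity": "high", "breach_notification_hours": 72}'


def notification_deadline(violation_json: str, detected_at: datetime) -> datetime:
    """Add the violation's notification window to the detection time."""
    violation = json.loads(violation_json)
    return detected_at + timedelta(hours=violation["breach_notification_hours"])


detected = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
deadline = notification_deadline(EXAMPLE, detected)  # 2024-01-04 09:00 UTC
```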

Compliance Reporting

Query Personal Data Exposures

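-- All GDPR violations requiring attention, most severe first
SELECT path, scanner, severity, timestamp
FROM aquilon_dlp_alerts
WHERE policy = 'GDPR'
ORDER BY
  -- Rank explicitly: sorting the severity text alone would be alphabetical
  CASE severity WHEN 'critical' THEN 0 WHEN 'high' THEN 1 WHEN 'medium' THEN 2 ELSE 3 END,
  timestamp DESC;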

Special Category Data (Article 9)

-- Special category data requiring elevated protection
SELECT * FROM aquilon_dlp_alerts
WHERE policy = 'GDPR'
  AND severity = 'critical';

Personal Data by Type

-- Personal data grouped by scanner type
SELECT scanner, COUNT(*) as count
FROM aquilon_dlp_alerts
WHERE policy = 'GDPR'
GROUP BY scanner
ORDER BY count DESC;

Breach Notification Support

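Under Article 33, a controller must notify the supervisory authority of a personal data breach without undue delay and, where feasible, within 72 hours of becoming aware of it. Recent critical findings are a starting point for that assessment: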

-- Recent critical findings (potential breach)
SELECT * FROM aquilon_dlp_alerts
WHERE policy = 'GDPR'
  AND severity = 'critical'
  AND timestamp > datetime('now', '-72 hours');
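
To try the 72-hour filter without a live deployment, the query can be run against an in-memory SQLite table. This is a sketch under stated assumptions: the table layout below is a hypothetical stand-in with only the columns used in this chapter, and timestamp is assumed to be stored as 'YYYY-MM-DD HH:MM:SS' text in UTC, which is what makes the comparison with datetime('now', ...) meaningful.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the real table; only the columns used here.
conn.execute(
    "CREATE TABLE aquilon_dlp_alerts ("
    "path TEXT, scanner TEXT, severity TEXT, policy TEXT, timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO aquilon_dlp_alerts VALUES (?, ?, ?, 'GDPR', datetime('now', ?))",
    [
        ("/srv/hr/records.csv", "health_record", "critical", "-2 hours"),
        ("/srv/old/export.csv", "biometric", "critical", "-100 hours"),
        ("/srv/crm/contacts.csv", "email", "medium", "-1 hours"),
    ],
)

# Same filter as the breach notification query above.
recent_critical = conn.execute(
    "SELECT path, scanner FROM aquilon_dlp_alerts "
    "WHERE policy = 'GDPR' AND severity = 'critical' "
    "AND timestamp > datetime('now', '-72 hours')"
).fetchall()
```

Only the two-hour-old critical row is returned. If a deployment stores timestamp as a Unix epoch integer instead, the text comparison would not behave as intended and the filter would need to compare against an epoch value.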

Data Subject Rights

Aquilon DLP findings support data subject rights:

Article 15 - Right of Access

Locate all personal data for a data subject:

-- Find all data for specific identifier
SELECT path, scanner, JSON_EXTRACT(context, '$.snippet') as snippet
FROM aquilon_dlp_alerts
WHERE JSON_EXTRACT(context, '$.snippet') LIKE '%email@example.com%';

Article 17 - Right to Erasure

Verify deletion completeness:

-- Confirm no remaining data after erasure request
SELECT * FROM aquilon_dlp_alerts
WHERE JSON_EXTRACT(context, '$.snippet') LIKE '%data_subject_id%';
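
When the erasure check is scripted, the identifier should be passed as a bound parameter, and any LIKE wildcards inside it should be escaped; an underscore in an email address would otherwise match any character. The Python sketch below shows one way to do this with the standard sqlite3 module. The helper names are hypothetical, and it assumes the context column holds JSON text and that the SQLite build includes the JSON functions.

```python
import sqlite3


def escape_like(term: str) -> str:
    """Escape LIKE wildcards so the identifier is matched literally."""
    return term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def remaining_findings(conn: sqlite3.Connection, identifier: str) -> list:
    """Return (path, scanner) rows whose snippet still contains the identifier."""
    pattern = "%" + escape_like(identifier) + "%"
    return conn.execute(
        "SELECT path, scanner FROM aquilon_dlp_alerts "
        "WHERE json_extract(context, '$.snippet') LIKE ? ESCAPE '\\'",
        (pattern,),
    ).fetchall()
```

An empty result supports the claim that no scanned location still holds the identifier; it does not cover locations the agent has not scanned.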

Article 20 - Right to Data Portability

Identify structured personal data:

-- Portable data formats
SELECT path, scanner
FROM aquilon_dlp_alerts
WHERE policy = 'GDPR'
  AND (path LIKE '%.json'
       OR path LIKE '%.csv'
       OR path LIKE '%.xml');

Best Practices

Monitoring Strategy

  1. Immediate alert: Special category data (Article 9)
  2. Daily review: High severity personal data
  3. Weekly audit: Medium severity, context accuracy

Data Mapping

Use findings to maintain data mapping:

  1. Identify: Where personal data is stored
  2. Classify: By data category and lawful basis
  3. Document: In Records of Processing Activities
  4. Review: Regularly for accuracy

Privacy by Design

Integrate findings into development:

-- Personal data in development environments
SELECT * FROM aquilon_dlp_alerts
WHERE policy = 'GDPR'
  AND (path LIKE '%/dev/%'
       OR path LIKE '%/test/%'
       OR path LIKE '%/staging/%');

Cross-Border Considerations

EU-Specific Context

The policy detects EU context to determine applicability:

  • EU country names or codes
  • EU-specific identifiers (IBAN, national IDs)
  • EU languages

International Transfers

Monitor for personal data stored in locations outside the EU:

-- Potential international transfers
SELECT * FROM aquilon_dlp_alerts
WHERE policy = 'GDPR'
  AND path LIKE '%/external/%';

CCPA Compliance

The California Consumer Privacy Act (CCPA) and California Privacy Rights Act (CPRA) policy framework detects California consumer personal information and generates violations according to California privacy requirements.

Overview

CCPA/CPRA establishes privacy rights for California consumers and obligations for businesses handling their personal information. Aquilon DLP’s CCPA policy helps organizations comply with:

  • 1798.100 - Right to Know (consumer data collection disclosure)
  • 1798.105 - Right to Delete
  • 1798.120 - Right to Opt-Out (sale of personal information)
  • CPRA 2023 - Enhanced sensitive personal information categories

Personal Information Categories

The CCPA policy detects the following personal information categories:

| Category | Scanners | Severity |
| --- | --- | --- |
| Direct Identifiers | ssn, drivers_license | Critical |
| Contact Information | email, phone, address | High |
| Financial Information | credit_card, bank_account | Critical |
| Geolocation Data | ip_address, address | High |
| Biometric Information | biometric | Critical |
| Professional/Employment | Context-based detection | Medium |

CPRA Sensitive Personal Information

CPRA (effective 2023) added enhanced protections for sensitive personal information:

  • Social Security numbers
  • Driver’s license and state ID numbers
  • Financial account credentials
  • Precise geolocation
  • Racial/ethnic origin
  • Religious beliefs
  • Biometric data
  • Health information
  • Sexual orientation

Configuration

Basic Configuration

[policies]
enabled_policies = ["ccpa"]

Advanced Configuration

[policies.policy_configs.ccpa]
settings = { california_business = "true", sensitivity_level = "2", detect_sensitive_pi = "true", confidence_threshold = "0.7" }

Configuration Options

| Option | Description | Default |
| --- | --- | --- |
| california_business | Whether organization does business in California | true |
| sensitivity_level | Compliance strictness (1=basic, 2=standard, 3=strict) | 2 |
| detect_sensitive_pi | Detect CPRA sensitive personal information | true |
| detect_consumer_data | Detect commercial/behavioral data | true |
| confidence_threshold | Minimum scanner confidence (0.0-1.0) | 0.7 |

Context Detection

The CCPA policy uses context signals to determine applicability and severity:

California Context Keywords

  • Location terms: California, CA, Calif
  • Regulation terms: CCPA, CPRA, consumer privacy
  • Business terms: consumer, customer, resident

Consumer Context Keywords

  • Consumer terms: consumer, customer, subscriber, member
  • Commercial terms: purchase, transaction, order, account
  • Marketing terms: profile, preference, behavioral, targeting

Violation Metadata

Each CCPA violation includes:

{
  "policy": "CCPA",
  "severity": "high",
  "pi_category": "direct_identifier",
  "cpra_sensitive": true,
  "consumer_rights": ["right_to_know", "right_to_delete"],
  "section": "1798.100"
}

Compliance Reporting

Query Consumer PI Exposures

-- All CCPA findings
SELECT path, scanner, severity, timestamp
FROM aquilon_dlp_alerts
WHERE policy = 'CCPA'
ORDER BY timestamp DESC;

Sensitive PI Detection

-- Critical findings (sensitive PI under CPRA)
SELECT path, scanner, severity, timestamp
FROM aquilon_dlp_alerts
WHERE policy = 'CCPA'
  AND severity = 'critical';

Best Practices

Consumer Rights Support

CCPA grants consumers specific rights. Aquilon DLP findings help you:

Right to Know (1798.100):

  • Identify what personal information you’ve collected
  • Document categories of PI by data type

Right to Delete (1798.105):

  • Locate all instances of a consumer’s data
  • Verify deletion completeness

Right to Opt-Out (1798.120):

  • Identify data used for sales/sharing
  • Track third-party data exposure

Monitoring Strategy

  1. Alert on Critical immediately: SSN, financial data, biometrics
  2. Daily review of High: Contact information, geolocation
  3. Weekly audit of Medium: Professional/employment context

Remediation Workflow

  1. Identify: Aquilon DLP detects PI exposure
  2. Classify: Determine PI category and CPRA sensitivity
  3. Assess: Evaluate consumer rights implications
  4. Remediate: Secure or delete exposed data
  5. Document: Record for compliance audit

CCPA vs GDPR

Both policies protect personal data but have different scopes:

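| Aspect | CCPA | GDPR |
| --- | --- | --- |
| Scope | California residents | EU residents |
| Threshold | Revenue/data volume based | Any processing |
| Consent | Opt-out model | Opt-in model |
| Penalties | Up to $7,500 per intentional violation | Up to 4% of annual worldwide turnover or EUR 20 million, whichever is higher |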

Organizations serving both jurisdictions should enable both policies:

[policies]
enabled_policies = ["gdpr", "ccpa"]

CUI Compliance

Note: CUI policy framework requires Enterprise Edition.

The Controlled Unclassified Information (CUI) policy framework detects CUI exposure and generates violations according to NIST SP 800-171 requirements for federal contractors.

Overview

CUI is government-created or government-possessed information that requires safeguarding per 32 CFR Part 2002 and NIST SP 800-171. Aquilon DLP’s CUI policy helps federal contractors comply with:

  • 3.1.1 - Limit system access to authorized users
  • 3.1.2 - Limit system access to permitted transactions/functions
  • 3.1.3 - Control CUI flow per authorizations
  • 3.8.1 - Protect system media (physical and digital)
  • 3.8.2 - Limit CUI access to authorized users

CUI Categories

The CUI policy detects multiple categories defined by the CUI Registry:

| Category | Description | Severity |
| --- | --- | --- |
| Basic CUI | Standard CUI without specified handling | High |
| Specified CUI (SP-*) | CUI with additional safeguard requirements | Critical |
| FCI | Federal Contract Information | High |
| CDI | Covered Defense Information (DFARS 252.204-7012) | Critical |
| CTI | Controlled Technical Information (DoD 5230.24) | Critical |

Detection Methods

Government-Specific Scanners

| Scanner | Detects | Severity |
| --- | --- | --- |
| cui_marking | CUI banners, markings (CUI, CUI//SP-*, CONTROLLED) | Critical |
| export_control | ITAR, EAR, ECCN markings | Critical |
| gov_identifier | DoD EDI-PI identifiers | High |

PII in Government Context

When PII appears with government context signals, it triggers CUI violations:

| Scanner | Government Context Required | Severity |
| --- | --- | --- |
| ssn | Federal employee/contractor context | Critical |
| email | .gov/.mil domain or federal context | Medium |
| api_key | Government system credentials | Critical |
| database_connection | Federal database strings | Critical |
| crypto | Encryption keys in government context | Critical |

Configuration

Basic Configuration

[policies]
enabled_policies = ["cui"]

Advanced Configuration

[policies.policy_configs.cui]
settings = { detect_basic_cui = "true", detect_specified_cui = "true", detect_fci = "true", detect_cdi = "true", detect_cti = "true", confidence_threshold = "0.7" }

Configuration Options

| Option | Description | Default |
| --- | --- | --- |
| detect_basic_cui | Detect standard CUI markings | true |
| detect_specified_cui | Detect CUI//SP-* specified markings | true |
| detect_fci | Detect Federal Contract Information | true |
| detect_cdi | Detect Covered Defense Information | true |
| detect_cti | Detect Controlled Technical Information | true |
| confidence_threshold | Minimum scanner confidence (0.0-1.0) | 0.7 |

Context Detection

The CUI policy uses context signals to determine CUI category and severity:

Government Context Keywords

  • Federal terms: federal, government, agency, DoD, contractor, grantee
  • Contract terms: contract, DFARS, FAR, solicitation, RFP, task order, IDIQ
  • Classification: controlled, unclassified, FOUO, CUI, SBU, LES
  • Document types: statement of work, SOW, PWS, CDRL, DD254

Defense Context Keywords

  • Defense terms: defense, military, DoD, Pentagon, armed forces, warfighter
  • Contractor terms: prime, subcontractor, DIB, defense industrial base, CAGE code
  • Technical terms: technical data, specifications, engineering, schematics, drawings
  • Programs: ACAT, PEO, PM, acquisition, milestone

CUI Marking Patterns

The policy detects standard CUI banner and footer markings:

  • Banner formats: CUI, CONTROLLED, CUI//SP-*
  • Specified markings: CUI//SP-EXPT, CUI//SP-CTI, CUI//SP-PRVCY
  • Legacy markings: FOUO, SBU, LES (mapped to CUI categories)
  • Distribution statements: Distribution A-F, EXPORT CONTROLLED
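
As an illustration of how the banner formats above might be recognized, the Python sketch below classifies a line of text as carrying a specified, basic, or legacy marking. The regular expressions and the function name are hypothetical simplifications (distribution statements and multi-category banners are not handled) and are not the patterns used by the cui_marking scanner.

```python
import re

# "CUI" optionally followed by one //SP-<CATEGORY> segment, or the word CONTROLLED.
CUI_BANNER = re.compile(r"\b(?:CUI(?://SP-[A-Z]+)?|CONTROLLED)\b")
# Legacy markings that the policy maps to CUI categories.
LEGACY_MARKING = re.compile(r"\b(?:FOUO|SBU|LES)\b")


def classify_marking(line: str) -> str:
    """Return 'specified', 'basic', 'legacy', or 'none' for a line of text."""
    match = CUI_BANNER.search(line)
    if match:
        return "specified" if "//SP-" in match.group() else "basic"
    if LEGACY_MARKING.search(line):
        return "legacy"
    return "none"
```

Matching is case-sensitive on purpose, since banner markings are written in capitals; even so, a capitalized word such as CONTROLLED in ordinary text would match, which is why context signals matter.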

Source Code Context

CUI in development environments receives elevated severity:

  • Repository indicators: .git, src/, lib/, include/
  • Code file extensions: .c, .cpp, .h, .py, .java, .rs
  • Build systems: Makefile, CMakeLists.txt, Cargo.toml

Example Context Flow

Finding: SSN "123-45-6789"
Context: "DFARS contractor employee records for contract W911NF-20-C-0001"

Result: Severity elevated to Critical (CDI context - DFARS contract number)

Finding: CUI marking "CUI//SP-CTI"
Context: Found in file.cpp within git repository

Result: Critical violation (CUI spillage into source code)

Violation Metadata

Each CUI violation includes:

{
  "policy": "CUI",
  "severity": "critical",
  "cui_category": "cdi",
  "nist_control": "3.8.1",
  "regulation": "DFARS 252.204-7012"
}

Compliance Reporting

Query CUI Exposures

-- All CUI exposures
SELECT path, scanner, severity, timestamp
FROM aquilon_dlp_alerts
WHERE policy = 'CUI'
ORDER BY timestamp DESC;

Defense Contract Compliance

For DFARS compliance reporting:

-- CDI and CTI exposures (DFARS 252.204-7012)
SELECT path, scanner, severity, timestamp
FROM aquilon_dlp_alerts
WHERE policy = 'CUI'
  AND severity = 'critical';

Best Practices

Monitoring Strategy

  1. Alert on Critical immediately: CUI markings, export control, CDI/CTI
  2. Daily review of High: FCI, government identifiers
  3. Weekly audit: PII in government context

CUI Category Prioritization

Different CUI categories require different response times:

Immediate Response (< 1 hour):

  • CUI//SP-* (Specified CUI with additional safeguards)
  • CDI (Covered Defense Information under DFARS)
  • CTI (Controlled Technical Information)
  • Export-controlled data (ITAR/EAR)

Same-Day Response:

  • Basic CUI markings
  • FCI (Federal Contract Information)
  • Government credentials/API keys

Weekly Review:

  • PII with government context (no explicit CUI marking)
  • Legacy markings (FOUO, SBU) requiring reclassification

Spillage Response Procedures

When CUI is detected outside authorized boundaries:

  1. Contain: Immediately restrict access to the file/location
  2. Preserve: Do not delete - preserve for incident investigation
  3. Notify: Alert your Facility Security Officer (FSO) or ISSO
  4. Document: Record in incident tracking system
  5. Assess: Determine if spillage constitutes a reportable incident
  6. Remediate: Securely delete or move to authorized storage
  7. Report: Update SPRS score if control failure identified

NIST SP 800-171 Assessment Support

Use Aquilon DLP findings to support your NIST assessment:

  • 3.1.1/3.1.2 (Access Control): Unauthorized access detected by CUI exposure
  • 3.8.1 (Media Protection): CUI on unprotected storage locations
  • 3.8.2 (Media Access): CUI accessible to unauthorized users
  • 3.13.1 (Boundary Protection): CUI spillage outside authorization boundary

Remediation Workflow

  1. Identify: Aquilon DLP detects CUI exposure
  2. Classify: Determine CUI category (Basic, Specified, CDI, CTI)
  3. Assess: Evaluate spillage scope and potential impact
  4. Contain: Move to authorized storage or encrypt
  5. Document: Record for NIST SP 800-171 assessment
  6. Report: Include in SPRS score if applicable
  7. Prevent: Implement controls to prevent recurrence

CMMC Compliance

Note: CMMC policy framework requires Enterprise Edition.

The Cybersecurity Maturity Model Certification (CMMC) policy framework helps the 350,000+ contractors in the Defense Industrial Base (DIB) work toward the CMMC compliance that DoD contract eligibility depends on.

Overview

CMMC 2.0 establishes cybersecurity requirements for organizations handling Federal Contract Information (FCI) and Controlled Unclassified Information (CUI) in defense contracts. Aquilon DLP’s CMMC policy helps contractors comply with:

  • FAR 52.204-21 - Basic Safeguarding of Covered Contractor Information Systems
  • DFARS 252.204-7012 - Safeguarding Covered Defense Information
  • DFARS 252.204-7019 - Notice of NIST SP 800-171 Assessment
  • DFARS 252.204-7020 - NIST SP 800-171 DoD Assessment

CMMC Levels

| Level | Data Types | Controls | Assessment |
| --- | --- | --- | --- |
| Level 1 | FCI only | 17 practices | Self-assessment |
| Level 2 | FCI + CUI | 110 practices (NIST SP 800-171) | Self or third-party |
| Level 3 | FCI + CUI + Enhanced | 110+ practices (includes SP 800-172) | Government-led |

Detection Methods

Government-Specific Scanners

| Scanner | Detects | CMMC Level | Severity |
| --- | --- | --- | --- |
| cui_marking | CUI banners, markings | 2+ | Critical |
| export_control | ITAR, EAR, ECCN markings | 2+ | Critical |
| gov_identifier | DoD EDI-PI identifiers | All | High |

PII Relevant to Defense Contracts

| Scanner | Relevance | Severity |
| --- | --- | --- |
| ssn | Employee/subcontractor PII | Critical |
| email | Government communications | Medium |
| api_key | System credentials | Critical |
| crypto | Encryption keys | Critical |
| bank_account | Contract payment data | High |

Configuration

Basic Configuration (Level 2)

[policies]
enabled_policies = ["cmmc"]

Level-Specific Configuration

[policies.policy_configs.cmmc]
settings = { level = "2", confidence_threshold = "0.7" }

Configuration Options

| Option | Description | Default |
| --- | --- | --- |
| level | CMMC level (1, 2, or 3) | 2 |
| detect_cui_markings | Detect CUI banners/markings | true |
| detect_export_control | Detect ITAR/EAR markings | true |
| detect_pii | Detect PII in defense context | true |
| detect_credentials | Detect API keys, database strings | true |
| confidence_threshold | Minimum scanner confidence (0.0-1.0) | 0.7 |

Context Detection

Defense Industrial Base Context

  • Contract terms: prime contractor, subcontractor, DIB, defense contract, teaming agreement
  • DoD terms: DoD, Department of Defense, Pentagon, armed forces, military branch names
  • Program terms: CAGE code, DUNS, SAM registration, UEI, contract number (W/N prefixes)
  • Roles: contracting officer, COR, COTR, program manager, DCMA

Technical Context

  • Technical data: engineering drawings, specifications, schematics, BOMs, ICDs
  • Export control: ITAR, EAR, ECCN, defense article, USML category
  • System terms: CDS, cross-domain, classified system, enclave, authorization boundary
  • Development: source code, firmware, software, algorithm, design document

Contract Vehicle Context

Different contract types affect CMMC applicability:

  • Prime contracts: Direct DoD contracts requiring flow-down
  • Subcontracts: DFARS flow-down requirements apply
  • SBIR/STTR: Small business innovation research with CUI potential
  • GSA Schedule: May include DoD task orders
  • OTA: Other Transaction Agreements with DoD

Supply Chain Context

Multi-tier supply chain indicators:

  • Tier references: Tier 1, Tier 2, subcontractor, supplier
  • Flow-down terms: DFARS flow-down, 252.204-7012, prime requirements
  • Assessment references: SPRS, NIST assessment, POA&M, SSP

Example Context Flow

Finding: Database connection string with credentials
Context: "DFARS contract W52P1J-21-C-0045 subcontractor portal"

Result: Critical violation (CMMC Level 2 - credentials in defense contract context)

Finding: Technical drawing (.dwg file)
Context: File metadata contains "CAGE: 1ABC2" and "ECCN: 9A515"

Result: Critical violation (export-controlled technical data)

Violation Metadata

Each CMMC violation includes:

{
  "policy": "CMMC",
  "severity": "critical",
  "cmmc_level": 2,
  "data_type": "cui",
  "dfars_clause": "DFARS 252.204-7012",
  "sprs_relevant": true
}

Compliance Reporting

SPRS Score Support

Query findings that may affect your Supplier Performance Risk System (SPRS) score:

-- All CMMC findings by severity
SELECT severity, COUNT(*) as count
FROM aquilon_dlp_alerts
WHERE policy = 'CMMC'
GROUP BY severity
ORDER BY count DESC;

Pre-Assessment Audit

Before a CMMC assessment:

-- Critical CUI exposures requiring remediation
SELECT path, scanner, timestamp
FROM aquilon_dlp_alerts
WHERE policy = 'CMMC'
  AND severity = 'critical'
ORDER BY timestamp DESC;

Best Practices

By CMMC Level

Level 1 (FCI only):

  • Focus on basic PII protection
  • Monitor for accidental data spillage
  • Self-assessment annually with affirmation

Level 2 (CUI):

  • Enable all CUI detection settings
  • Implement continuous monitoring
  • Document findings for POA&M
  • Prepare for third-party assessment (C3PAO)

Level 3 (Enhanced):

  • Strict alerting on any detection
  • Integration with SIEM
  • Real-time incident response
  • Government-led assessment preparation

SPRS Score Impact Assessment

Aquilon DLP findings can identify gaps affecting your SPRS score:

Score-Impacting Findings:

  • Unencrypted CUI storage → impacts AC.L2-3.1.19 (-5 points)
  • Credentials in plaintext → impacts IA.L2-3.5.10 (-5 points)
  • Missing access controls → impacts AC.L2-3.1.1 (-5 points)

Using DLP for SPRS Improvement:

  1. Query critical findings to identify control gaps
  2. Map findings to NIST SP 800-171 controls
  3. Document remediation in POA&M
  4. Re-scan to verify remediation
  5. Update SPRS score with improved controls
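
The arithmetic behind such an estimate is simple: the DoD assessment methodology starts from a maximum of 110 and subtracts the weight of each requirement that is not met. The Python sketch below applies that subtraction using the point values quoted in the list above; those weights and the mapping from finding type to practice are taken from this page as-is, and the function name is hypothetical. Treat the result as a planning aid, not an official score.

```python
MAX_SPRS_SCORE = 110  # score when every NIST SP 800-171 requirement is met

# Point values as quoted in the score-impacting findings list above.
DEDUCTIONS = {
    "AC.L2-3.1.19": 5,
    "IA.L2-3.5.10": 5,
    "AC.L2-3.1.1": 5,
}


def estimate_sprs_score(unmet_practices: list) -> int:
    """Subtract each unmet practice once, however many findings map to it."""
    return MAX_SPRS_SCORE - sum(DEDUCTIONS[p] for p in set(unmet_practices))


# Two findings against the same practice count once: 110 - 5 - 5 = 100.
score = estimate_sprs_score(["AC.L2-3.1.1", "AC.L2-3.1.1", "IA.L2-3.5.10"])
```

Because a requirement is either met or not, ten plaintext-credential findings cost the same as one; the number of findings affects remediation effort rather than the score.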

Level-Based Remediation Priorities

Level 1 Remediation Focus:

  1. Remove FCI from unauthorized locations
  2. Ensure basic access controls on FCI systems
  3. Document FCI boundaries

Level 2 Remediation Focus:

  1. Eliminate CUI spillage outside enclave
  2. Implement encryption for CUI at rest and in transit
  3. Remove hardcoded credentials from CUI systems
  4. Document in System Security Plan (SSP)

Level 3 Remediation Focus:

  1. Zero tolerance for any critical findings
  2. Implement advanced threat detection
  3. Enhanced logging and monitoring
  4. Prepare for government assessment evidence

Assessment Preparation

  1. Inventory: Use Aquilon to discover where CUI resides
  2. Categorize: Map findings to CMMC practice requirements
  3. Scope: Define assessment boundary using DLP data
  4. Remediate: Address critical exposures before assessment
  5. Document: Export findings for POA&M evidence
  6. Evidence: Generate compliance reports for assessors
  7. Monitor: Maintain continuous compliance post-assessment

FedRAMP Compliance

Note: FedRAMP policy framework requires Enterprise Edition.

The Federal Risk and Authorization Management Program (FedRAMP) policy framework helps cloud service providers (CSPs) protect federal data and achieve FedRAMP authorization.

Overview

FedRAMP provides a standardized approach to security assessment for cloud products and services used by federal agencies. Aquilon DLP’s FedRAMP policy implements NIST SP 800-53 control families:

  • AC - Access Control: System access and authorization
  • AU - Audit and Accountability: Audit logging and review
  • IA - Identification and Authentication: User/device identification
  • MP - Media Protection: Digital and physical media safeguards
  • SC - System and Communications Protection: Communication security
  • SI - System and Information Integrity: Integrity protection

FedRAMP Baselines

| Baseline | Impact Level | Controls | Use Cases |
| --- | --- | --- | --- |
| Low | Low impact | ~125 | Public-facing sites, low-sensitivity data |
| Moderate | Moderate impact | ~325 | Most federal applications, PII |
| High | High impact | ~421 | Law enforcement, emergency services, financial |

Detection Methods

Cloud-Specific Scanners

| Scanner | Detects | Severity |
| --- | --- | --- |
| cui_marking | CUI in cloud storage | Critical |
| api_key | Cloud service credentials | Critical |
| database_connection | Database connection strings | Critical |
| crypto | Encryption keys | Critical |

Federal Data in Cloud Context

| Scanner | Cloud Context Required | Severity |
| --- | --- | --- |
| ssn | Multi-tenant cloud environment | Critical |
| email | .gov domain or federal agency | Medium |
| ip_address | Federal network ranges | High |
| gov_identifier | DoD EDI-PI in cloud systems | High |

Configuration

Basic Configuration (Moderate Baseline)

[policies]
enabled_policies = ["fedramp"]

Baseline-Specific Configuration

[policies.policy_configs.fedramp]
settings = { baseline = "moderate", confidence_threshold = "0.7" }

Configuration Options

| Option | Description | Default |
| --- | --- | --- |
| baseline | FedRAMP baseline (low, moderate, high) | moderate |
| detect_cui | Detect CUI in cloud storage | true |
| detect_credentials | Detect API keys, connection strings | true |
| detect_pii | Detect PII in multi-tenant environments | true |
| confidence_threshold | Minimum scanner confidence (0.0-1.0) | 0.7 |

Context Detection

Cloud Context Keywords

  • Cloud terms: cloud, SaaS, IaaS, PaaS, tenant, multi-tenant, serverless
  • Provider terms: AWS, Azure, GCP, FedRAMP authorized, GovCloud, Azure Government
  • Service terms: API, endpoint, webhook, microservice, Lambda, Functions
  • Storage: S3, Blob, object storage, bucket, container registry

Federal Agency Context

  • Agency terms: federal, agency, government, GSA, FedRAMP PMO
  • Authorization terms: ATO, authorization, JAB, P-ATO, agency ATO
  • Compliance terms: continuous monitoring, ConMon, POA&M, 3PAO, SSP

Cloud Infrastructure Context

Multi-tenant and infrastructure indicators:

  • Tenant isolation: tenant ID, account ID, subscription, organization
  • Network: VPC, VNET, security group, NSG, firewall rules
  • Identity: IAM, RBAC, service principal, managed identity
  • Secrets: Key Vault, Secrets Manager, Parameter Store

Authorization Boundary Context

FedRAMP authorization boundaries require clear data classification:

  • Boundary terms: authorization boundary, system boundary, enclave
  • Data flow: ingress, egress, data flow diagram, DFD
  • Interconnection: ISA, MOU, interconnection security agreement
  • External: external system, third-party, SaaS integration

Example Context Flow

Finding: API key "AKIA..." in configuration file
Context: "AWS GovCloud deployment for agency.gov"

Result: Critical violation (cloud credentials in federal context)

Finding: SSN in database export
Context: "Multi-tenant SaaS platform, FedRAMP Moderate ATO"

Result: Critical violation (PII in shared cloud environment)

Violation Metadata

Each FedRAMP violation includes:

{
  "policy": "FedRAMP",
  "severity": "critical",
  "baseline": "moderate",
  "nist_control": "SC-28",
  "control_family": "System and Communications Protection"
}

Compliance Reporting

Authorization Boundary Monitoring

-- All FedRAMP findings
SELECT severity, scanner, COUNT(*) as count
FROM aquilon_dlp_alerts
WHERE policy = 'FedRAMP'
GROUP BY severity, scanner
ORDER BY count DESC;

Continuous Monitoring Support

FedRAMP requires continuous monitoring (ConMon). Query for recent issues:

-- Last 30 days of FedRAMP findings
SELECT path, scanner, severity, timestamp
FROM aquilon_dlp_alerts
WHERE policy = 'FedRAMP'
  AND timestamp > datetime('now', '-30 days')
ORDER BY timestamp DESC;

Best Practices

By Baseline

Low Baseline:

  • Monitor for basic data exposure
  • Focus on API key and credential leaks
  • Annual assessment with ConMon

Moderate Baseline:

  • Enable full PII detection
  • Monitor CUI in cloud storage
  • Integrate with SIEM for ConMon
  • Monthly vulnerability scanning integration

High Baseline:

  • Strict alerting on any detection
  • Real-time incident response integration
  • Enhanced audit logging
  • Weekly vulnerability correlation

Baseline-Specific Remediation

Low Baseline Remediation:

  1. Remove exposed credentials from repositories
  2. Rotate any detected API keys
  3. Document in POA&M if not immediately remediable

Moderate Baseline Remediation:

  1. Encrypt PII at rest and in transit
  2. Implement tenant isolation for sensitive data
  3. Remove CUI from unauthorized storage locations
  4. Enable audit logging for all access
  5. Update SSP with control implementations

High Baseline Remediation:

  1. Zero tolerance - immediate remediation required
  2. Incident response activation for any critical finding
  3. Document in 24-hour significant change report
  4. Review authorization boundary for spillage

ConMon Integration Patterns

Integrate DLP findings into your Continuous Monitoring program:

Daily Operations:

  • Query critical findings for immediate response
  • Correlate with vulnerability scan results
  • Update incident tracking system

Monthly Reporting:

  • Generate finding trends for ConMon report
  • Map findings to NIST SP 800-53 controls
  • Update POA&M with remediation progress

Annual Assessment:

  • Export historical findings for 3PAO review
  • Demonstrate control effectiveness
  • Support reauthorization evidence
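
For the monthly reporting step, finding trends can be produced by grouping alerts by calendar month. The Python sketch below does this with SQLite's strftime function. It assumes timestamp is stored as ISO-8601 text and that findings are reachable through a SQLite connection; the function name is hypothetical, and only columns already used in the queries above are referenced.

```python
import sqlite3


def monthly_trend(conn: sqlite3.Connection, policy: str = "FedRAMP") -> list:
    """Return (month, severity, count) rows, oldest month first."""
    return conn.execute(
        "SELECT strftime('%Y-%m', timestamp) AS month, severity, COUNT(*) "
        "FROM aquilon_dlp_alerts "
        "WHERE policy = ? "
        "GROUP BY month, severity "
        "ORDER BY month, severity",
        (policy,),
    ).fetchall()
```

Each row can be copied into the ConMon report as one data point, so a falling count of critical findings month over month gives assessors direct evidence of remediation progress.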

Authorization Maintenance

  1. Discover: Use Aquilon to identify sensitive data in cloud boundaries
  2. Classify: Map findings to NIST SP 800-53 controls
  3. Scope: Validate authorization boundary accuracy
  4. Remediate: Address findings before assessment
  5. Report: Include DLP findings in ConMon reports
  6. Evidence: Generate assessment-ready reports
  7. Maintain: Continuous monitoring for authorization renewal

FISMA Compliance

Note: FISMA policy framework requires Enterprise Edition.

The Federal Information Security Modernization Act (FISMA) policy framework helps federal agencies and contractors protect federal information systems according to NIST guidelines.

Overview

FISMA requires federal agencies to develop, document, and implement information security programs. Aquilon DLP’s FISMA policy implements FIPS 199 categorization and NIST SP 800-53 controls:

  • AC - Access Control: Limit system access to authorized users
  • AU - Audit and Accountability: Create and retain audit records
  • IA - Identification and Authentication: Identify users and devices
  • MP - Media Protection: Protect digital and physical media
  • SC - System and Communications Protection: Protect communications
  • SI - System and Information Integrity: Protect information integrity

FIPS 199 Impact Levels

| Impact Level | Potential Impact | Description | Controls |
| --- | --- | --- | --- |
| Low | Limited adverse effect | Public-facing systems | ~127 |
| Moderate | Serious adverse effect | Most agency systems | ~325 |
| High | Severe/catastrophic effect | National security, financial | ~421 |

Detection Methods

Federal System Scanners

| Scanner | Detects | Severity |
| --- | --- | --- |
| cui_marking | CUI in federal systems | Critical |
| gov_identifier | DoD EDI-PI, federal IDs | High |
| export_control | ITAR/EAR controlled data | Critical |

PII in Federal Context

| Scanner | Federal Context Required | Severity |
| --- | --- | --- |
| ssn | Federal employee/citizen records | Critical |
| email | .gov/.mil communications | Medium |
| address | Federal facility addresses | Medium |
| date_of_birth | Personnel records | High |
| api_key | Federal system credentials | Critical |

Configuration

Basic Configuration (Moderate Impact)

[policies]
enabled_policies = ["fisma"]

Impact Level Configuration

[policies.policy_configs.fisma]
settings = { impact_level = "moderate", confidence_threshold = "0.7" }

Configuration Options

| Option | Description | Default |
|--------|-------------|---------|
| impact_level | FIPS 199 impact level (low, moderate, high) | moderate |
| detect_cui | Detect CUI markings | true |
| detect_pii | Detect PII in federal context | true |
| detect_credentials | Detect system credentials | true |
| confidence_threshold | Minimum scanner confidence (0.0-1.0) | 0.7 |
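
Before deploying a changed configuration, it can help to check the FISMA settings against the documented options. The following is a minimal pre-flight sketch in Python, not Aquilon DLP's own validation logic. The option names and defaults come from the table above, while the helper `resolve_fisma_settings` is hypothetical.

```python
# Defaults taken from the options table above.
DEFAULTS = {
    "impact_level": "moderate",
    "detect_cui": True,
    "detect_pii": True,
    "detect_credentials": True,
    "confidence_threshold": 0.7,
}
VALID_LEVELS = {"low", "moderate", "high"}

def resolve_fisma_settings(settings: dict) -> dict:
    """Merge user settings over the defaults and check value ranges."""
    unknown = set(settings) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")
    merged = {**DEFAULTS, **settings}
    if merged["impact_level"] not in VALID_LEVELS:
        raise ValueError(f"impact_level must be one of {sorted(VALID_LEVELS)}")
    # The sample config passes the threshold as a string, so accept both forms.
    threshold = float(merged["confidence_threshold"])
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("confidence_threshold must be between 0.0 and 1.0")
    merged["confidence_threshold"] = threshold
    return merged

print(resolve_fisma_settings({"impact_level": "high", "confidence_threshold": "0.8"}))
```

Rejecting unknown option names catches typos early instead of letting them be silently ignored.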

Context Detection

Federal Agency Context

  • Agency terms: federal, agency, government, bureau, department, administration
  • Specific agencies: DoD, VA, HHS, DHS, Treasury, DOJ, DOE, NASA, USDA
  • System terms: FISMA, ATO, authorization boundary, system owner, ISSO, ISSM
  • Roles: authorizing official, AO, system security officer, privacy officer

Contractor Context

  • Contractor terms: contractor, grantee, subrecipient, awardee
  • Contract terms: FAR, DFARS, task order, contract vehicle, BPA, IDIQ
  • Compliance: NIST, RMF, POA&M, SSP, SAR, CAP
  • Oversight: DCAA, OIG, GAO, inspector general

State/Local Government

  • Terms: state, county, municipal, local government, tribal
  • Programs: grants.gov, federal funding, pass-through, SLFRF
  • Compliance: Single Audit, 2 CFR 200, Uniform Guidance

System Categorization Context

FIPS 199 categorization indicators:

  • Impact terms: confidentiality, integrity, availability, CIA
  • Levels: low impact, moderate impact, high impact
  • Categories: national security, PII, financial, law enforcement
  • Documents: system security plan, SSP, contingency plan, BIA

Personnel Context

Federal personnel data receives elevated severity:

  • HR terms: SF-86, OPM, personnel file, background investigation
  • Clearance: security clearance, TS, SCI, Q clearance, L clearance
  • Benefits: FEHB, TSP, retirement, pension, FERS, CSRS
  • Records: eOPF, employee record, personnel action, SF-50

Example Context Flow

Finding: SSN "123-45-6789"
Context: "OPM background investigation SF-86 supplemental"

Result: Critical violation (PII in federal personnel context - high sensitivity)

Finding: Email list with .gov addresses
Context: "DHS employee directory for FISMA-moderate system"

Result: High violation (federal employee PII requiring protection)
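
The two examples above can be approximated with a small keyword check. This Python sketch assumes a simple rule, namely that any federal context term raises the scanner's base severity by one step. That rule, the function names, and the shortened term list are illustrative assumptions, not Aquilon DLP's actual enrichment logic.

```python
import re

SEVERITIES = ["low", "medium", "high", "critical"]

# A few terms copied from the context lists above; not the full product set.
FEDERAL_TERMS = ["OPM", "SF-86", "background investigation", "DHS", "FISMA", "eOPF"]

def has_federal_context(text: str) -> bool:
    """True if any term appears as a whole word or phrase (case-insensitive)."""
    for term in FEDERAL_TERMS:
        # Lookarounds stop short terms from matching inside longer words.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            return True
    return False

def elevate(base_severity: str, context: str) -> str:
    """Assumed rule: raise severity one step in federal context, capped at critical."""
    idx = SEVERITIES.index(base_severity.lower())
    if has_federal_context(context):
        idx = min(idx + 1, len(SEVERITIES) - 1)
    return SEVERITIES[idx]

print(elevate("medium", "DHS employee directory for FISMA-moderate system"))
```

Under this assumption the `email` scanner's Medium base severity becomes High in the second example, while the `ssn` finding is already Critical and stays there.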

Violation Metadata

Each FISMA violation includes:

{
  "policy": "FISMA",
  "severity": "critical",
  "fips_199_level": "moderate",
  "nist_control": "AC-3",
  "control_family": "Access Control",
  "rmf_step": "assess"
}
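
Downstream tooling may want to sanity-check this metadata. The Python sketch below derives the control family from the `nist_control` prefix and compares it with the reported `control_family`. The lookup covers only the six families listed in the overview, and `family_of` is an illustrative helper.

```python
import json

# The six families listed in the overview.
FAMILIES = {
    "AC": "Access Control",
    "AU": "Audit and Accountability",
    "IA": "Identification and Authentication",
    "MP": "Media Protection",
    "SC": "System and Communications Protection",
    "SI": "System and Information Integrity",
}

def family_of(control_id: str) -> str:
    """Map a control ID such as 'AC-3' to its family name."""
    prefix = control_id.split("-", 1)[0].upper()
    return FAMILIES.get(prefix, "Unknown")

violation = json.loads("""
{
  "policy": "FISMA",
  "severity": "critical",
  "fips_199_level": "moderate",
  "nist_control": "AC-3",
  "control_family": "Access Control",
  "rmf_step": "assess"
}
""")

# Cross-check the reported family against the control ID prefix.
consistent = family_of(violation["nist_control"]) == violation["control_family"]
print(consistent)
```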

Compliance Reporting

FISMA Metrics

-- FISMA findings by severity for reporting
SELECT severity, COUNT(*) as count
FROM aquilon_dlp_alerts
WHERE policy = 'FISMA'
GROUP BY severity
ORDER BY count DESC;

POA&M Support

Query findings for Plan of Action and Milestones:

-- Critical findings for POA&M
SELECT path, scanner, severity, timestamp
FROM aquilon_dlp_alerts
WHERE policy = 'FISMA'
  AND severity IN ('critical', 'high')
ORDER BY timestamp DESC;
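
To try the POA&M query without a live endpoint, it can be run against an in-memory SQLite table. In this Python sketch the table is a stand-in that holds only the columns used above, the rows are invented sample data, and storing `timestamp` as an integer is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in table with only the columns the queries above use.
conn.execute(
    "CREATE TABLE aquilon_dlp_alerts "
    "(path TEXT, scanner TEXT, severity TEXT, policy TEXT, timestamp INTEGER)"
)
# Invented sample rows, for illustration only.
conn.executemany(
    "INSERT INTO aquilon_dlp_alerts VALUES (?, ?, ?, ?, ?)",
    [
        ("/data/hr/roster.csv", "ssn", "critical", "FISMA", 1700000300),
        ("/data/mail/list.txt", "email", "medium", "FISMA", 1700000200),
        ("/data/eng/keys.env", "api_key", "critical", "FISMA", 1700000100),
        ("/data/eu/users.csv", "email", "high", "GDPR", 1700000000),
    ],
)

def poam_candidates(db: sqlite3.Connection) -> list:
    """Run the POA&M query from above and return the rows."""
    return db.execute(
        "SELECT path, scanner, severity, timestamp "
        "FROM aquilon_dlp_alerts "
        "WHERE policy = 'FISMA' AND severity IN ('critical', 'high') "
        "ORDER BY timestamp DESC"
    ).fetchall()

rows = poam_candidates(conn)
for row in rows:
    print(row)
```

Only the two critical FISMA rows are returned: the medium finding fails the severity filter and the GDPR finding fails the policy filter.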

Best Practices

By Impact Level

Low Impact:

  • Monitor for basic data exposure
  • Focus on public-facing system boundaries
  • Annual assessment cycle

Moderate Impact:

  • Enable full PII detection
  • Monitor for CUI spillage
  • Regular compliance reporting
  • Quarterly POA&M updates

High Impact:

  • Strict alerting on any detection
  • Real-time incident response
  • Enhanced audit trail integration
  • Weekly security status reporting

RMF Step-by-Step Integration

Step 1 - Categorize:

Use Aquilon DLP to support system categorization:

  • Discover data types stored and processed
  • Identify PII, CUI, and sensitive data locations
  • Document information types for FIPS 199 assessment
  • Generate evidence for categorization decision

Step 2 - Select:

Map DLP findings to control requirements:

  • AC-3 (Access Enforcement): Unauthorized access detection
  • MP-2 (Media Access): Sensitive data on removable media
  • SC-28 (Protection of Information at Rest): Unencrypted sensitive data
  • SI-4 (Information System Monitoring): DLP as monitoring control

Step 3 - Implement:

Deploy DLP as part of control implementation:

  • Configure policies matching system impact level
  • Integrate alerts with security operations
  • Document DLP coverage in SSP

Step 4 - Assess:

Use findings for control assessment:

  • Generate reports for Security Assessment Report (SAR)
  • Provide evidence of control effectiveness
  • Document findings requiring POA&M entries

Step 5 - Authorize:

Include DLP in authorization package:

  • Control implementation evidence
  • Monitoring capability documentation
  • Risk acceptance for any open findings

Step 6 - Monitor:

Continuous monitoring with Aquilon:

  • Ongoing detection of new exposures
  • Trend analysis for ISCM reporting
  • POA&M remediation verification

ATO Package Documentation

Generate DLP reports for authorization packages:

Required Documentation:

  • System boundary sensitive data inventory
  • Control implementation evidence (AC, MP, SC, SI families)
  • Monitoring capability description
  • Incident detection and response integration

Assessment Evidence:

  • Historical finding trends
  • Remediation timelines
  • False positive rates and tuning

Annual FISMA Reporting

Aquilon findings support FISMA metrics including:

  • Number of systems with sensitive PII
  • Data spillage incidents
  • Remediation timelines
  • Control effectiveness measures
  • CIO FISMA metrics

Troubleshooting

Common issues and solutions for Aquilon DLP.

Installation Issues

“osquery not found” during installation

Aquilon DLP requires osquery 5.0.1 or later. Install osquery first:

macOS:

# Using Homebrew
brew install --cask osquery

# Or download PKG from https://github.com/osquery/osquery/releases

Ubuntu/Debian:

wget https://pkg.osquery.io/deb/osquery_5.10.2-1.linux_amd64.deb
sudo apt install ./osquery_5.10.2-1.linux_amd64.deb

CentOS/RHEL:

wget https://pkg.osquery.io/rpm/osquery-5.10.2-1.linux.x86_64.rpm
sudo dnf install ./osquery-5.10.2-1.linux.x86_64.rpm

Verify installation:

osqueryd --version
# Expected: osqueryd version 5.0.1 or later
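
When scripting this check, compare the version numerically rather than as text, because a plain string comparison ranks "5.10.2" below "5.9.0". A minimal Python sketch follows. It assumes the output format shown above, and the helper names are illustrative.

```python
import re

# Minimum osquery version stated above.
MINIMUM = (5, 0, 1)

def parse_version(output: str) -> tuple:
    """Extract (major, minor, patch) from text like 'osqueryd version 5.10.2'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"no version found in {output!r}")
    return tuple(int(part) for part in match.groups())

def is_supported(output: str) -> bool:
    """Tuple comparison is numeric per component, so 5.10.2 ranks above 5.9.0."""
    return parse_version(output) >= MINIMUM

print(is_supported("osqueryd version 5.10.2"))
```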

“Unsupported osquery version”

Upgrade osquery to 5.0.1 or later from osquery releases.

“Signature verification failed” (macOS)

The PKG may be corrupted. Re-download from the official source and verify:

spctl -a -v --type install aquilon-dlp-enterprise.pkg

“Installation already in progress” (macOS)

Another installation is running. If a previous installation crashed and no installer is still active, remove the stale lock file:

sudo rm -rf /var/run/aquilon-install.lock


macOS Issues

Full Disk Access Not Granted

Aquilon DLP requires Full Disk Access for file monitoring.

Diagnosis:

# Check TCC database
sudo sqlite3 /Library/Application\ Support/com.apple.TCC/TCC.db \
  "SELECT auth_value FROM access
   WHERE service = 'kTCCServiceSystemPolicyAllFiles'
   AND client = 'dev.aquilon.dlp-plugin';"
# Expected: 2 (granted)

Solution:

  1. Open System Settings > Privacy & Security > Full Disk Access
  2. Click the lock icon and authenticate
  3. Click + and navigate to /opt/aquilon/aquilon-dlp.app
  4. Ensure the toggle is enabled
  5. Restart the service

For MDM deployments: Deploy a PPPC (Privacy Preferences Policy Control) profile. See MDM Deployment.

Extension Not Loading in osquery

Diagnosis:

# Check extension registered
cat /var/osquery/extensions.load

# Check osquery sees extension
osqueryi --connect /var/osquery/osquery.sock 'SELECT * FROM osquery_extensions;'

Solutions:

  1. Verify Full Disk Access (see above)

  2. Restart osqueryd:

    sudo launchctl unload /Library/LaunchDaemons/io.osquery.agent.plist
    sudo launchctl load /Library/LaunchDaemons/io.osquery.agent.plist
    

    Note: osquery 5.0.1 and later use io.osquery.agent.plist. Older versions use com.facebook.osqueryd.plist.

  3. Check logs:

    tail -f /var/log/aquilon/aquilon-dlp.log
    

“Unsupported macOS version”

Aquilon DLP requires macOS 11.0 (Big Sur) or later:

sw_vers -productVersion


Linux Issues

Extension Not Loading

Diagnosis:

# Check extension registered
cat /etc/osquery/extensions.load

# Check osquery status
sudo systemctl status osqueryd

# Check logs
journalctl -u osqueryd -f

Solutions:

  1. Restart osqueryd:

    sudo systemctl restart osqueryd
    
  2. Check extension permissions:

    ls -la /usr/lib/osquery/extensions/aquilon-dlp-*.ext
    # Should be: -rwxr-xr-x root root
    

SELinux Blocking Access (RHEL/CentOS)

Diagnosis:

# Check for SELinux denials
sudo ausearch -m avc -ts recent

# Check SELinux status
getenforce

Solution:

# Restore security contexts
sudo restorecon -Rv /usr/lib/osquery/extensions/
sudo restorecon -Rv /etc/aquilon/

Permission Denied Errors

Diagnosis:

ls -la /usr/lib/osquery/extensions/aquilon-dlp-*.ext

Solution:

sudo chmod 755 /usr/lib/osquery/extensions/aquilon-dlp-*.ext
sudo chown root:root /usr/lib/osquery/extensions/aquilon-dlp-*.ext


OSQuery Integration Issues

Table Not Found

If the aquilon_dlp_alerts table is not available:

# Verify extension is loaded
osqueryi --connect /var/osquery/osquery.sock "SELECT * FROM osquery_extensions WHERE name LIKE '%aquilon%';"

# Check table exists
osqueryi 'PRAGMA table_info(aquilon_dlp_alerts);'

If empty, the extension isn’t loaded. See platform-specific extension loading issues above.

SQL Query Errors

Common column name mistakes:

| Wrong | Correct |
|-------|---------|
| created_at | timestamp |
| policy_name | policy |
| pattern_matched | pattern |
| confidence_score | confidence |
| hash | JSON_EXTRACT(context, '$.file.hash') |

Correct query example:

SELECT path, scanner, severity, timestamp
FROM aquilon_dlp_alerts
ORDER BY timestamp DESC
LIMIT 10;

Configuration Issues

Configuration Not Loading

Diagnosis:

# Validate configuration
aquilon-dlp --validate-config /etc/aquilon/config.toml

Common errors:

  • Unknown field: Check field names match exactly (e.g., max_scan_size_mb not max_file_size_mb)
  • Unknown section: Use correct section names ([scan] not [scanner], [resource_limits] not [resources])
  • Invalid TOML: Check syntax with a TOML validator

watch_paths Not Working

Ensure watch_paths is at the top level of your config, not under a section:

# CORRECT - at top level
watch_paths = ["/home/%%", "/var/data/%%"]

[policies]
enabled_policies = ["gdpr", "ccpa"]

# WRONG - under [policies] section
[policies]
enabled_policies = ["gdpr", "ccpa"]
watch_paths = ["/home/%%"]  # This won't work!


Performance Issues

High CPU Usage

Diagnosis:

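# Check process (macOS; top accepts one PID per -pid flag)
top -pid "$(pgrep -f aquilon | head -n 1)"

# Check process (Linux; -p takes a comma-separated PID list)
top -p "$(pgrep -d, -f aquilon)"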

# Check alert volume
osqueryi "SELECT COUNT(*) FROM aquilon_dlp_alerts;"

Solutions:

  1. Add exclusions for high-churn directories:

    exclude_paths = [
        "/home/*/.cache/%%",
        "/home/*/.npm/%%",
        "/var/log/%%",
        "**/*.iso",
        "**/*.dmg"
    ]
    
  2. Limit file size:

    [scan]
    max_scan_size_mb = 40
    
  3. Reduce workers:

    [worker]
    num_workers = 2
    
  4. Enable resource limits:

    [resource_limits]
    enabled = true
    max_cpu_percent = 50.0
    max_memory_mb = 512
    

High Memory Usage

Solutions:

  1. Shorten the cache TTL so entries expire sooner:

    [cache]
    ttl_secs = 1800  # 30 minutes
    
  2. Limit memory:

    [resource_limits]
    enabled = true
    max_memory_mb = 256
    

No Alerts Generated

Diagnosis

-- Check for any alerts
SELECT COUNT(*) FROM aquilon_dlp_alerts;

-- Check recent alerts
SELECT * FROM aquilon_dlp_alerts
ORDER BY timestamp DESC
LIMIT 5;

Common Causes

  1. No sensitive data: Test with known sensitive data in a monitored location (if /tmp is not covered by watch_paths, write the file to a path that is):

    echo "SSN: 122-15-6289" > /tmp/test-sensitive.txt
    
  2. Policies not enabled: Check configuration:

    [policies]
    enabled_policies = ["gdpr", "ccpa", "hipaa"]
    
  3. Wrong watch paths: Ensure paths are monitored:

    watch_paths = ["/home/%%", "/var/data/%%"]
    
  4. Exclusions too broad: Review exclude_paths

  5. Cache returning old results: Clear cache or wait for TTL expiry


Debugging Enrichment

When investigating false positives (legitimate data flagged as sensitive) or false negatives (sensitive data not detected), use context trace mode to understand enrichment decisions.

Enable Context Trace

Add to your configuration:

[context_trace]
enabled = true

Understanding Trace Output

With tracing enabled, the logs show each enrichment decision in JSON format:

{
  "event": "context_enrichment",
  "scanner": "ssn",
  "original_confidence": 0.75,
  "context_profiles_matched": ["personal_data", "employment"],
  "adjustments": [
    {"profile": "personal_data", "boost": 0.15, "reason": "SSN keyword in context"},
    {"profile": "employment", "boost": 0.10, "reason": "W-2 form indicator"}
  ],
  "final_confidence": 0.95
}

Key fields:

| Field | Description |
|-------|-------------|
| original_confidence | Scanner's base confidence before context analysis |
| context_profiles_matched | Which context profiles found relevant keywords |
| adjustments | Individual confidence boosts with reasoning |
| final_confidence | Result after all adjustments applied |
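
A trace event can be re-checked by adding the boosts to the base confidence. In the sample above the boosts bring 0.75 up to 1.00 while `final_confidence` is 0.95, which suggests an upper cap. The cap value in this Python sketch is inferred from that one sample and is an assumption, not documented behavior.

```python
import json

# Assumed upper bound; inferred from the sample trace, not documented.
ASSUMED_CAP = 0.95

def recompute_confidence(trace: dict, cap: float = ASSUMED_CAP) -> float:
    """Add every boost to the base confidence, then clamp to [0.0, cap]."""
    total = trace["original_confidence"]
    for adjustment in trace["adjustments"]:
        total += adjustment["boost"]
    return max(0.0, min(total, cap))

trace = json.loads("""
{
  "event": "context_enrichment",
  "scanner": "ssn",
  "original_confidence": 0.75,
  "context_profiles_matched": ["personal_data", "employment"],
  "adjustments": [
    {"profile": "personal_data", "boost": 0.15, "reason": "SSN keyword in context"},
    {"profile": "employment", "boost": 0.10, "reason": "W-2 form indicator"}
  ],
  "final_confidence": 0.95
}
""")

print(recompute_confidence(trace))
```

If a recomputed value disagrees with the logged `final_confidence`, the cap assumption is the first thing to revisit.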

Common Debugging Scenarios

False Positive Investigation:

  1. Enable context trace
  2. Scan the file generating false positives
  3. Check which context profiles are boosting confidence
  4. Consider adjusting enabled_profiles in [context] config

False Negative Investigation:

  1. Enable context trace
  2. Scan the file that should generate alerts
  3. Look for low original_confidence (scanner issue) vs no adjustments (context issue)
  4. Consider enabling additional context profiles
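
The distinction in step 3 can be turned into a rough triage helper for trace events. This Python sketch is a heuristic only: the 0.7 threshold is borrowed from the `confidence_threshold` default and should match your own setting, and the function name and messages are illustrative.

```python
def triage(trace: dict, threshold: float = 0.7) -> str:
    """Classify a trace event for a finding that produced no alert."""
    if trace["final_confidence"] >= threshold:
        # Enrichment did its job; the alert was lost somewhere else.
        return "above threshold: check policies and exclusions instead"
    if not trace["adjustments"]:
        return "context issue: no profile adjusted the score"
    return "scanner issue: base confidence is low even with context boosts"

# Hypothetical trace events, for illustration only.
no_context = {"original_confidence": 0.6, "adjustments": [], "final_confidence": 0.6}
weak_scanner = {
    "original_confidence": 0.3,
    "adjustments": [{"profile": "personal_data", "boost": 0.15}],
    "final_confidence": 0.45,
}

print(triage(no_context))
print(triage(weak_scanner))
```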

Disable After Debugging

Context tracing generates significant log volume. Disable when done:

[context_trace]
enabled = false

Getting Help

Collecting Diagnostic Information

macOS:

# Collect logs
tail -n 500 /var/log/aquilon/aquilon-dlp.log > dlp-diagnostics.txt

# System info
sw_vers >> dlp-diagnostics.txt

# FDA status
sudo sqlite3 /Library/Application\ Support/com.apple.TCC/TCC.db \
  "SELECT * FROM access WHERE client LIKE '%aquilon%';" >> dlp-diagnostics.txt

Linux:

# Collect logs
sudo journalctl -u osqueryd -n 500 > dlp-diagnostics.txt

# System info
uname -a >> dlp-diagnostics.txt
cat /etc/os-release >> dlp-diagnostics.txt

# Service status
systemctl status aquilon-dlp >> dlp-diagnostics.txt

Support Channels

Changelog

[Unreleased]

Added

  • National ID Scanners: Added 28 national ID scanners across 4 regions with country-specific checksum validation
    • Europe (14): France NIR, Germany Steuer-ID, Italy Codice Fiscale, Spain DNI/NIE, Poland PESEL, Netherlands BSN, Belgium NRN, UK NINO, Sweden Personnummer, Norway Fødselsnummer, Finland HETU, Portugal NIF, Romania CNP, Czech Rodné číslo
    • Americas (4): Brazil CPF, Canada SIN, Chile RUT, Argentina CUIT
    • Asia-Pacific (8): Australia TFN, India Aadhaar, India PAN, South Korea RRN, Japan My Number, China Resident ID, Taiwan National ID, New Zealand IRD
    • Middle East & Africa (2): Israel Teudat Zehut, Turkey TC Kimlik
  • GDPR policy now automatically detects EU/EEA national IDs with specialized validation
  • Turkey TC Kimlik included for KVKK (GDPR-equivalent) compliance