Aquilon DLP Documentation
Welcome to the Aquilon DLP documentation.
🍎 macOS | 🐧 Linux | 🏢 Enterprise
Note: Features marked with 🏢 are available in Enterprise edition only.
About This Documentation
This documentation covers:
- Getting Started: Quick overview and setup guides
- Installation: Platform-specific installation instructions
- User Guide: Day-to-day configuration and usage
- Deployment: Production deployment strategies
- Administration: Operations, backup, and disaster recovery
- Technical Reference: Architecture and API integration
- Compliance: Regulatory framework implementations
- Support: Troubleshooting and changelog
Editions
Aquilon DLP is available in two editions:
- Basic Edition (🐧 Linux only): Up to 5 servers, GDPR/CCPA policies
- Enterprise Edition (🍎 macOS + 🐧 Linux): Unlimited servers, full compliance suite (HIPAA, PCI DSS, SOX, ISO 27001) plus government/defense frameworks (CUI, CMMC, FedRAMP, FISMA)
Choose the appropriate guide for your edition throughout this documentation.
Overview
Aquilon DLP is a production-grade data leak prevention solution built in Rust.
Key Features
- Real-time Monitoring: Detect sensitive data as files are created or modified
- Deep Content Analysis: Parse archives, Office documents, and PDFs
- Pattern Detection: 35 scanner plugins for PII, secrets, and compliance patterns
- OSQuery Integration: Query findings through standard osquery tables
Use Cases
Compliance Monitoring
Monitor endpoints for sensitive data that violates compliance requirements:
- Healthcare (HIPAA): Detect protected health information (PHI) including medical records, insurance IDs, and patient data
- Financial Services (PCI DSS, SOX): Find credit card numbers, CVVs, and financial records
- Privacy Regulations (GDPR, CCPA): Identify personal data including names, addresses, and government IDs
Data Breach Prevention
Prevent data leaks before they become incidents:
- Real-time Detection: Alert immediately when sensitive data appears in monitored directories
- Removable Media Scanning: Automatically scan USB drives when mounted to detect exfiltration attempts
- File Sharing Oversight: Monitor shared folders and collaboration directories
Security Auditing
Discover where sensitive data resides across your infrastructure:
- Data Discovery: Scan endpoints to map sensitive data locations
- Risk Assessment: Identify files with multiple policy violations
- Coverage Verification: Ensure all endpoints are protected and reporting
Incident Response
Rapidly assess affected systems during security incidents:
- Targeted Scanning: Query specific directories or file types
- Historical Analysis: Review past alerts for patterns
- Triage Workflow: Acknowledge, investigate, and resolve findings with audit trail
Quick Start
Get up and running with Aquilon DLP in 5 minutes.
Prerequisites
Before installing Aquilon DLP, ensure you have:
- OSQuery: Version 5.0.1 or later (download from GitHub releases)
- Operating System:
  - 🍎 macOS 11.0 (Big Sur) or later
  - 🐧 Linux (Ubuntu 22.04+, RHEL 9+, Debian 11+, CentOS Stream 9+, Fedora 38+)
- Privileges: Administrator (macOS) or root/sudo (Linux)
- Resources: 2GB RAM minimum, 500MB disk space
Choose Your Edition
Aquilon DLP is available in two editions:
- 🐧 Basic Edition (Linux only): GDPR and CCPA policies, up to 5 servers
- 🏢 Enterprise Edition (macOS + Linux): All compliance frameworks (HIPAA, PCI DSS, SOX, ISO 27001, GDPR, CCPA)
Select your quick start path below:
🏢 macOS Enterprise Quick Start
Time: ~5 minutes
1. Install OSQuery
# Using Homebrew (recommended)
brew install --cask osquery
# Or download PKG from https://github.com/osquery/osquery/releases
2. Install Aquilon DLP Enterprise
Download the Enterprise Edition PKG installer from your organization’s portal and install:
# Install using PKG installer
sudo installer -pkg aquilon-dlp-enterprise-VERSION.pkg -target /
# Verify installation
aquilon-dlp --version
3. Configure
# Configuration is installed by PKG at /etc/aquilon/config.toml
# Edit as needed for your environment
# Grant Full Disk Access
# Open System Settings → Privacy & Security → Full Disk Access
# Click + and add the Aquilon DLP application
4. Start Monitoring
Aquilon DLP runs as an osquery extension. Start osquery to begin monitoring:
# Start osquery (Aquilon DLP extension loads automatically)
sudo osqueryd
5. Verify
# In a new terminal, query OSQuery
osqueryi --connect /var/osquery/osquery.sock 'SELECT * FROM aquilon_dlp_alerts LIMIT 5;'
Next Steps:
- See Installation Guide for LaunchDaemon setup
- See Deployment Guide for MDM deployment
🐧 Linux Basic Edition Quick Start
Time: ~5 minutes
1. Install OSQuery
# Ubuntu/Debian
export OSQUERY_KEY=1484120AC4E9F8A1A577AEEE97A80C63C9D8B80B
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys $OSQUERY_KEY
sudo add-apt-repository 'deb [arch=amd64] https://pkg.osquery.io/deb deb main'
sudo apt-get update
sudo apt-get install osquery
# RHEL/CentOS
curl -L https://pkg.osquery.io/rpm/GPG | sudo tee /etc/pki/rpm-gpg/RPM-GPG-KEY-osquery
sudo yum-config-manager --add-repo https://pkg.osquery.io/rpm/osquery-s3-rpm.repo
sudo yum-config-manager --enable osquery-s3-rpm
sudo yum install osquery
2. Install Aquilon DLP Basic
Download the Basic Edition package from your organization’s portal:
# Ubuntu/Debian
sudo apt install ./aquilon-dlp-basic_VERSION_amd64.deb
# RHEL/CentOS
sudo dnf install ./aquilon-dlp-basic-VERSION.x86_64.rpm
# Verify
aquilon-dlp --version
3. Configure
# Configuration is installed at /etc/aquilon/config.toml
# Edit as needed for your environment
# Validate configuration
aquilon-dlp --validate-config /etc/aquilon/config.toml
4. Start Monitoring
Aquilon DLP runs as an osquery extension. Start osquery to begin monitoring:
# Start osquery (Aquilon DLP extension loads automatically)
sudo systemctl start osqueryd
5. Verify
# In a new terminal, query OSQuery
osqueryi --connect /var/osquery/extensions.sock 'SELECT * FROM aquilon_dlp_alerts LIMIT 5;'
Next Steps:
- See Installation Guide for systemd service setup
- See Deployment Guide for production deployment
🏢 Linux Enterprise Quick Start
Time: ~5 minutes
1. Install OSQuery
# Ubuntu/Debian
curl -L https://pkg.osquery.io/deb/osquery_5.10.2-1.linux_amd64.deb -o osquery.deb
sudo dpkg -i osquery.deb
# RHEL/CentOS
sudo yum install https://pkg.osquery.io/rpm/osquery-5.10.2-1.linux.x86_64.rpm
2. Install Aquilon DLP Enterprise
Download the Enterprise Edition package from your organization’s portal:
# Ubuntu/Debian
sudo apt install ./aquilon-dlp-enterprise_VERSION_amd64.deb
# RHEL/CentOS
sudo dnf install ./aquilon-dlp-enterprise-VERSION.x86_64.rpm
# Verify
aquilon-dlp --version
3. Configure
# Configuration is installed at /etc/aquilon/config.toml
# Edit as needed for your environment
# Validate configuration
aquilon-dlp --validate-config /etc/aquilon/config.toml
4. Start Monitoring
Aquilon DLP runs as an osquery extension. Start osquery to begin monitoring:
# Start osquery (Aquilon DLP extension loads automatically)
sudo systemctl start osqueryd
5. Verify
# In a new terminal, query OSQuery for HIPAA violations
osqueryi --connect /var/osquery/extensions.sock "SELECT * FROM aquilon_dlp_alerts WHERE policy = 'HIPAA' LIMIT 5;"
# Query PCI DSS findings
osqueryi --connect /var/osquery/extensions.sock "SELECT * FROM aquilon_dlp_alerts WHERE policy = 'PCI_DSS' LIMIT 5;"
Next Steps:
- See Installation Guide for systemd service setup
- See Deployment Guide for distributed deployment
- See Compliance Documentation for policy-specific guidance
What’s Next?
After completing the quick start:
- Production Setup: Configure systemd service (Linux) or LaunchDaemon (macOS) for automatic startup
- Customize Policies: Edit /etc/aquilon/config.toml to add watch paths and exclude directories
- Monitor Alerts: Integrate with your SIEM or set up OSQuery scheduled queries
- Review Architecture: Understand the system architecture and plugin system
Troubleshooting
OSQuery extension not loading?
- Verify OSQuery is running:
ps aux | grep osquery
- Check socket path matches configuration
- Review OSQuery logs for extension errors
Permission errors (macOS)?
- Ensure Full Disk Access is granted in System Settings
- Restart LaunchDaemon after granting permissions
Policy not available (Basic Edition)?
- Basic Edition only includes GDPR and CCPA
- Remove enterprise policies (HIPAA, PCI DSS, SOX, ISO 27001) from configuration
- Upgrade to Enterprise Edition for full policy support
High CPU usage?
- Add exclusions for cache directories and system paths
- Reduce num_workers in configuration
- See Troubleshooting Guide for performance tuning
Support
- Basic Edition: GitHub issues at https://github.com/aquilonsecurity/aquilon-dlp/issues
- Enterprise Edition: support@aquilonsecurity.com (4-hour SLA for critical issues)
Installation
Aquilon DLP is available in two editions to meet different organizational needs. This section covers installation for all platforms and editions.
Edition Comparison
| Feature | Basic Edition | Enterprise Edition |
|---|---|---|
| Platforms | Linux only | macOS + Linux |
| GDPR | Yes | Yes |
| CCPA | Yes | Yes |
| HIPAA | No | Yes |
| PCI DSS | No | Yes |
| SOX | No | Yes |
| ISO 27001 | No | Yes |
| Custom TOML Policies | Yes | Yes |
| Support | Community | Enterprise SLA |
| macOS Endpoint Security | No | Yes |
| MDM Deployment | No | Yes |
Quick Installation Reference
macOS (Enterprise Only)
Note: macOS support requires the Enterprise Edition.
# Download PKG installer, then:
sudo installer -pkg aquilon-dlp-enterprise-VERSION.pkg -target /
See macOS Installation Guide for complete instructions including Full Disk Access setup.
Linux (Basic or Enterprise)
Ubuntu/Debian:
sudo apt install ./aquilon-dlp-{edition}_VERSION_amd64.deb
CentOS/RHEL:
sudo dnf install ./aquilon-dlp-{edition}-VERSION.x86_64.rpm
See Linux Basic Edition or Linux Enterprise Edition for complete instructions.
Prerequisites
All installations require:
- osquery 5.0.1 or later - Download from GitHub releases
- Administrator privileges - Installation requires root/sudo access
Platform-specific requirements:
| Platform | Additional Requirements |
|---|---|
| macOS | macOS 11.0 (Big Sur) or later, Full Disk Access permission |
| Ubuntu/Debian | Ubuntu 22.04+ or Debian 11+ |
| CentOS/RHEL | CentOS Stream 9+, RHEL 9+, or Fedora 38+ |
Choosing Your Edition
Basic Edition
Perfect for:
- Small teams and startups
- Organizations needing GDPR/CCPA compliance only
- Evaluation and testing purposes
Enterprise Edition
Required for:
- macOS deployment
- HIPAA, PCI DSS, SOX, or ISO 27001 compliance
- MDM-based deployment (Jamf, Intune, Kandji)
Contact
- Sales: sales@aquilonsecurity.com
- Website: https://aquilonsecurity.com
macOS Installation
Enterprise Edition Only: macOS support requires the Enterprise Edition of Aquilon DLP.
This guide covers installing Aquilon DLP on macOS using the PKG installer, including the required Full Disk Access configuration.
Prerequisites
Before installing Aquilon DLP, ensure you have:
- macOS 11.0 (Big Sur) or later
- osquery 5.0.1 or later - Download from GitHub releases
- Administrator privileges
Install osquery
Download and install osquery from the official releases:
# Download the PKG installer from osquery.io
# Then install:
sudo installer -pkg osquery-5.10.2.pkg -target /
Verify the installation:
osqueryd --version
# Expected: osqueryd version 5.10.2 (or later)
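The minimum-version requirement can also be checked from a script. The following is a minimal sketch, assuming osqueryd --version prints a line of the form shown above (osqueryd version 5.10.2). The function names are hypothetical and are not part of Aquilon DLP or osquery.

```python
import re

MINIMUM_OSQUERY = (5, 0, 1)  # minimum version stated in the prerequisites


def parse_osquery_version(output: str) -> tuple:
    """Extract (major, minor, patch) from `osqueryd --version` output."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"no version number found in: {output!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(output: str, minimum: tuple = MINIMUM_OSQUERY) -> bool:
    """Return True if the reported version is at least `minimum`."""
    # Tuples compare element by element, so (5, 10, 2) sorts after (5, 9, 0).
    # Comparing the raw strings would get this wrong.
    return parse_osquery_version(output) >= minimum


print(meets_minimum("osqueryd version 5.10.2"))
print(meets_minimum("osqueryd version 4.9.0"))
```

Comparing integer tuples rather than version strings matters here because "5.10.2" sorts before "5.9.0" as text.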
Installation
Step 1: Download the Installer
Download the signed PKG installer for macOS from the Aquilon Security portal:
- File: aquilon-dlp-enterprise-VERSION.pkg
Step 2: Install via GUI or Command Line
GUI Installation: Double-click the PKG file and follow the installation wizard.
Command Line Installation:
sudo installer -pkg aquilon-dlp-enterprise-VERSION.pkg -target /
Step 3: Verify Installation
Check that all components were installed correctly:
# Verify app bundle
ls -la /opt/aquilon/aquilon-dlp.app
# Verify configuration directory
ls -la /etc/aquilon/
# Verify data directory
ls -la /var/aquilon/dlp/
# Verify extension registered with osquery
cat /var/osquery/extensions.load
What Gets Installed:
| Component | Location |
|---|---|
| App bundle | /opt/aquilon/aquilon-dlp.app |
| Configuration | /etc/aquilon/ |
| Database | /var/db/aquilon/ |
| Logs | /var/log/aquilon/ |
| osquery extension | Registered in /var/osquery/extensions.load |
Endpoint Security Setup
Aquilon DLP uses Apple’s Endpoint Security framework for real-time file monitoring. This requires granting Full Disk Access permission.
Grant Full Disk Access
- Open System Settings (or System Preferences on older macOS)
- Navigate to Privacy & Security > Full Disk Access
- Click the lock icon and authenticate
- Click + to add an application
- Navigate to /opt/aquilon/aquilon-dlp.app and add it
- Ensure the toggle is enabled
Verify Endpoint Security
After granting Full Disk Access, verify the extension loads correctly:
# Check osquery sees the extension
osqueryi --connect /var/osquery/osquery.sock 'SELECT * FROM osquery_extensions;'
# Query DLP tables
osqueryi --connect /var/osquery/osquery.sock 'SELECT * FROM aquilon_dlp_alerts LIMIT 5;'
MDM Deployment (Enterprise)
For enterprise environments, automate Full Disk Access grants via MDM using PPPC (Privacy Preferences Policy Control) profiles:
Supported MDM Platforms:
- Jamf Pro
- Microsoft Intune
- Kandji
- SimpleMDM, FileWave, Mosyle
Quick Setup:
- Upload the PPPC profile from deployment/mdm/ to your MDM
- Deploy profile to target devices
- Deploy the Aquilon DLP PKG
See the Deployment Guide for platform-specific MDM instructions.
Post-Installation
Initial Configuration
The installer creates a default configuration at /etc/aquilon/config.toml. Edit this file to customize:
sudo nano /etc/aquilon/config.toml
Key configuration options:
- Watch paths: Directories to monitor for sensitive data
- Enabled policies: HIPAA, PCI DSS, SOX, ISO 27001, GDPR, CCPA
- Removable media scanning: Auto-scan USB drives on mount
See the Configuration Guide for complete options.
Verify DLP is Working
Test that Aquilon DLP is detecting files:
# Create a test file with sensitive data
echo "SSN: 223-41-1189" > /tmp/test-sensitive.txt
# Wait a moment for scanning, then query alerts
osqueryi --connect /var/osquery/osquery.sock 'SELECT * FROM aquilon_dlp_alerts;'
Upgrading
To upgrade to a new version:
# Download new PKG installer
# Install over existing installation
sudo installer -pkg aquilon-dlp-enterprise-NEW_VERSION.pkg -target /
Your configuration in /etc/aquilon/config.toml is preserved during upgrades.
Uninstalling
To completely remove Aquilon DLP:
# Remove the application
sudo rm -rf /opt/aquilon
# Remove configuration (optional; skip this step to keep your settings)
sudo rm -rf /etc/aquilon
# Remove data and logs (optional)
sudo rm -rf /var/aquilon /var/log/aquilon
# Remove from osquery extensions
sudo sed -i '' '/aquilon/d' /var/osquery/extensions.load
Troubleshooting
Common Issues
“Unsupported macOS version”
Aquilon DLP requires macOS 11.0 or later. Check your version:
sw_vers -productVersion
“Unsupported osquery version”
Aquilon DLP requires osquery 5.0.1 or later. Upgrade from osquery releases.
“Signature verification failed”
The PKG may be corrupted. Re-download from the official source and verify:
spctl -a -v --type install aquilon-dlp-enterprise-VERSION.pkg
Extension not loading in osquery
- Verify Full Disk Access is granted (see Endpoint Security Setup)
- Restart osqueryd:
sudo launchctl unload /Library/LaunchDaemons/io.osquery.agent.plist
sudo launchctl load /Library/LaunchDaemons/io.osquery.agent.plist
Note: OSQuery 5.0.1+ uses io.osquery.agent.plist. Older versions use com.facebook.osqueryd.plist.
- Check logs:
tail -f /var/log/aquilon/aquilon-dlp.log
“Installation already in progress”
Another installation is running. If a previous installation crashed, automatic stale lock detection should clean up. If not:
sudo rm -rf /var/run/aquilon-install.lock
Getting Help
- Documentation: Troubleshooting Guide
- Community Support: GitHub Issues
- Enterprise Support: Contact your account representative
Linux Installation (Basic Edition)
Basic Edition Features: GDPR, CCPA, and custom TOML policies. Community support.
This guide covers installing Aquilon DLP Basic Edition on Linux using DEB or RPM packages.
Prerequisites
Before installing Aquilon DLP, ensure you have:
- Supported Linux Distribution:
- Ubuntu 22.04 LTS or later
- Debian 11 or later
- CentOS Stream 9 or later
- RHEL 9 or later
- Fedora 38 or later
- osquery 5.0.1 or later
- Administrator (root) privileges
Install osquery
Ubuntu/Debian:
# Download osquery DEB package
wget https://pkg.osquery.io/deb/osquery_5.10.2-1.linux_amd64.deb
# Install osquery
sudo apt install ./osquery_5.10.2-1.linux_amd64.deb
CentOS/RHEL:
# Download osquery RPM package
wget https://pkg.osquery.io/rpm/osquery-5.10.2-1.linux.x86_64.rpm
# Install osquery
sudo dnf install ./osquery-5.10.2-1.linux.x86_64.rpm
Verify the installation:
osqueryd --version
# Expected: osqueryd version 5.10.2 (or later)
Installation
Ubuntu/Debian
Step 1: Download the Package
Download the Basic Edition DEB package from the Aquilon Security portal:
- File: aquilon-dlp-basic_VERSION_amd64.deb
Step 2: Install
sudo apt install ./aquilon-dlp-basic_VERSION_amd64.deb
Expected output:
Reading package lists... Done
Building dependency tree... Done
[INFO] Validating osquery installation...
[INFO] osquery validation passed
[INFO] Creating application directories...
[INFO] Extension binary permissions set: /usr/lib/osquery/extensions/aquilon-dlp-basic.ext
[INFO] Added extension to /etc/osquery/extensions.load
[INFO] Installation completed successfully
Step 3: Verify Installation
# Check binary location
ls -lh /usr/lib/osquery/extensions/aquilon-dlp-basic.ext
# Expected: -rwxr-xr-x 1 root root 9.3M ... aquilon-dlp-basic.ext
# Check osquery configuration
cat /etc/osquery/extensions.load
# Expected: /usr/lib/osquery/extensions/aquilon-dlp-basic.ext
# Restart osqueryd
sudo systemctl restart osqueryd
sudo systemctl status osqueryd
# Expected: active (running)
# Verify extension loaded
osqueryi --json "SELECT * FROM aquilon_dlp_alerts LIMIT 1;"
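The registration check above can be scripted for use in deployment tooling. This is a minimal sketch, assuming extensions.load lists one extension path per line, as the expected output above suggests. The helper name is_registered is hypothetical.

```python
EXTENSION_PATH = "/usr/lib/osquery/extensions/aquilon-dlp-basic.ext"


def is_registered(load_file_text: str, extension_path: str = EXTENSION_PATH) -> bool:
    """Return True if `extension_path` appears as a line of extensions.load."""
    entries = [line.strip() for line in load_file_text.splitlines()]
    # Blank lines are ignored; every remaining line is treated as a path.
    return extension_path in [entry for entry in entries if entry]


# Sample file contents, matching the expected output shown above.
sample = "/usr/lib/osquery/extensions/aquilon-dlp-basic.ext\n"
print(is_registered(sample))
```

On a live system, pass the contents of /etc/osquery/extensions.load instead of the sample string.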
CentOS/RHEL
Step 1: Download the Package
Download the Basic Edition RPM package from the Aquilon Security portal:
- File: aquilon-dlp-basic-VERSION.x86_64.rpm
Step 2: Install
sudo dnf install ./aquilon-dlp-basic-VERSION.x86_64.rpm
Expected output:
Last metadata expiration check: ...
Dependencies resolved.
Installing:
aquilon-dlp-basic x86_64 VERSION @commandline 9.3 M
[INFO] Validating osquery installation...
[INFO] osquery validation passed
[INFO] Creating application directories...
[INFO] Extension binary permissions set: /usr/lib/osquery/extensions/aquilon-dlp-basic.ext
[INFO] Added extension to /etc/osquery/extensions.load
[INFO] Installation completed successfully
Step 3: Verify Installation
# Check binary location
ls -lh /usr/lib/osquery/extensions/aquilon-dlp-basic.ext
# Check osquery configuration
cat /etc/osquery/extensions.load
# Restart osqueryd
sudo systemctl restart osqueryd
sudo systemctl status osqueryd
# Verify extension loaded
osqueryi --json "SELECT * FROM aquilon_dlp_alerts LIMIT 1;"
SELinux Considerations (RHEL/CentOS)
On systems with SELinux enabled, the installation script automatically restores security contexts. If issues occur:
# Check SELinux status
getenforce
# Manually restore contexts if needed
sudo restorecon -Rv /usr/lib/osquery/extensions/
sudo restorecon -Rv /etc/aquilon/
Post-Installation
Configuration
Copy the default configuration and customize:
sudo cp /etc/aquilon/config.toml.default /etc/aquilon/config.toml
sudo nano /etc/aquilon/config.toml
Basic Edition Policies:
The Basic Edition includes these compliance policies:
- GDPR - EU General Data Protection Regulation
- CCPA - California Consumer Privacy Act
- Custom TOML Policies - Define your own detection rules
Example configuration:
watch_paths = ["/home/%%", "/var/data/%%", "/srv/%%"]
[policies]
enabled_policies = ["gdpr", "ccpa"]
See the Configuration Guide for complete options.
Verify DLP is Working
Test that Aquilon DLP is detecting files:
# Create a test file with sensitive data
echo "SSN: 223-41-6711" > /tmp/test-sensitive.txt
# Wait a moment for scanning, then query alerts
osqueryi --connect /var/osquery/osquery.sock 'SELECT * FROM aquilon_dlp_alerts;'
Upgrading
Ubuntu/Debian:
# Stop osqueryd (optional)
sudo systemctl stop osqueryd
# Install new package
sudo apt install ./aquilon-dlp-basic_NEW_VERSION_amd64.deb
# Start osqueryd
sudo systemctl start osqueryd
CentOS/RHEL:
# Stop osqueryd (optional)
sudo systemctl stop osqueryd
# Upgrade package
sudo dnf upgrade ./aquilon-dlp-basic-NEW_VERSION.x86_64.rpm
# Start osqueryd
sudo systemctl start osqueryd
Your configuration in /etc/aquilon/config.toml is preserved during upgrades.
Uninstalling
Ubuntu/Debian:
# Remove package
sudo apt remove aquilon-dlp-basic
# Clean up configuration (optional)
sudo rm -rf /etc/aquilon /var/lib/aquilon /var/log/aquilon
CentOS/RHEL:
# Remove package
sudo dnf remove aquilon-dlp-basic
# Clean up configuration (optional)
sudo rm -rf /etc/aquilon /var/lib/aquilon /var/log/aquilon
Upgrading to Enterprise Edition
Need HIPAA, PCI DSS, SOX, or ISO 27001 compliance? Upgrade to the Enterprise Edition:
- Uninstall the Basic Edition
- Install the Enterprise Edition
- Your configuration is preserved
Contact sales@aquilonsecurity.com for Enterprise Edition access.
Troubleshooting
Common Issues
“osquery not found” during installation
Install osquery before installing Aquilon DLP:
# Ubuntu/Debian
sudo apt install ./osquery_5.10.2-1.linux_amd64.deb
# CentOS/RHEL
sudo dnf install ./osquery-5.10.2-1.linux.x86_64.rpm
Extension not loading
- Check extension is registered:
cat /etc/osquery/extensions.load
- Restart osqueryd:
sudo systemctl restart osqueryd
- Check logs:
journalctl -u osqueryd -f
Permission denied errors
Verify the extension has correct permissions:
ls -la /usr/lib/osquery/extensions/aquilon-dlp-basic.ext
# Should be: -rwxr-xr-x root root
Getting Help
- Documentation: Troubleshooting Guide
- Community Support: GitHub Issues
Linux Installation (Enterprise Edition)
Enterprise Edition Features: All compliance policies (HIPAA, PCI DSS, SOX, ISO 27001, GDPR, CCPA), unlimited servers, enterprise SLA support.
This guide covers installing Aquilon DLP Enterprise Edition on Linux using DEB or RPM packages.
Prerequisites
Before installing Aquilon DLP, ensure you have:
- Supported Linux Distribution:
- Ubuntu 22.04 LTS or later
- Debian 11 or later
- CentOS Stream 9 or later
- RHEL 9 or later
- Fedora 38 or later
- osquery 5.0.1 or later
- Administrator (root) privileges
Install osquery
Ubuntu/Debian:
# Download osquery DEB package
wget https://pkg.osquery.io/deb/osquery_5.10.2-1.linux_amd64.deb
# Install osquery
sudo apt install ./osquery_5.10.2-1.linux_amd64.deb
CentOS/RHEL:
# Download osquery RPM package
wget https://pkg.osquery.io/rpm/osquery-5.10.2-1.linux.x86_64.rpm
# Install osquery
sudo dnf install ./osquery-5.10.2-1.linux.x86_64.rpm
Verify the installation:
osqueryd --version
# Expected: osqueryd version 5.10.2 (or later)
Installation
Ubuntu/Debian
Step 1: Download the Package
Download the Enterprise Edition DEB package from the Aquilon Security portal:
- File: aquilon-dlp-enterprise_VERSION_amd64.deb
Step 2: Install
sudo apt install ./aquilon-dlp-enterprise_VERSION_amd64.deb
Expected output:
Reading package lists... Done
Building dependency tree... Done
[INFO] Validating osquery installation...
[INFO] osquery validation passed
[INFO] Creating application directories...
[INFO] Extension binary permissions set: /usr/lib/osquery/extensions/aquilon-dlp-enterprise.ext
[INFO] Added extension to /etc/osquery/extensions.load
[INFO] Installation completed successfully
Step 3: Verify Installation
# Check binary location
ls -lh /usr/lib/osquery/extensions/aquilon-dlp-enterprise.ext
# Expected: -rwxr-xr-x 1 root root 9.3M ... aquilon-dlp-enterprise.ext
# Check osquery configuration
cat /etc/osquery/extensions.load
# Expected: /usr/lib/osquery/extensions/aquilon-dlp-enterprise.ext
# Restart osqueryd
sudo systemctl restart osqueryd
sudo systemctl status osqueryd
# Expected: active (running)
# Verify extension loaded
osqueryi --json "SELECT * FROM aquilon_dlp_alerts LIMIT 1;"
CentOS/RHEL
Step 1: Download the Package
Download the Enterprise Edition RPM package from the Aquilon Security portal:
- File: aquilon-dlp-enterprise-VERSION.x86_64.rpm
Step 2: Install
sudo dnf install ./aquilon-dlp-enterprise-VERSION.x86_64.rpm
Expected output:
Last metadata expiration check: ...
Dependencies resolved.
Installing:
aquilon-dlp-enterprise x86_64 VERSION @commandline 9.3 M
[INFO] Validating osquery installation...
[INFO] osquery validation passed
[INFO] Creating application directories...
[INFO] Extension binary permissions set: /usr/lib/osquery/extensions/aquilon-dlp-enterprise.ext
[INFO] Added extension to /etc/osquery/extensions.load
[INFO] Installation completed successfully
Step 3: Verify Installation
# Check binary location
ls -lh /usr/lib/osquery/extensions/aquilon-dlp-enterprise.ext
# Check osquery configuration
cat /etc/osquery/extensions.load
# Restart osqueryd
sudo systemctl restart osqueryd
sudo systemctl status osqueryd
# Verify extension loaded
osqueryi --json "SELECT * FROM aquilon_dlp_alerts LIMIT 1;"
SELinux Considerations (RHEL/CentOS)
On systems with SELinux enabled, the installation script automatically restores security contexts. If issues occur:
# Check SELinux status
getenforce
# Verify extension details
ls -Z /usr/lib/osquery/extensions/aquilon-dlp-enterprise.ext
# Manually restore contexts if needed
sudo restorecon -Rv /usr/lib/osquery/extensions/
sudo restorecon -Rv /etc/aquilon/
Post-Installation
Configuration
Copy the default configuration and customize:
sudo cp /etc/aquilon/config.toml.default /etc/aquilon/config.toml
sudo nano /etc/aquilon/config.toml
Enterprise Edition Policies:
The Enterprise Edition includes all compliance policies:
- GDPR - EU General Data Protection Regulation
- CCPA - California Consumer Privacy Act
- HIPAA - Health Insurance Portability and Accountability Act
- PCI DSS - Payment Card Industry Data Security Standard
- SOX - Sarbanes-Oxley Act
- ISO 27001 - Information Security Management
- Custom TOML Policies - Define your own detection rules
Example configuration for healthcare organization:
watch_paths = ["/home/%%", "/var/data/%%", "/srv/%%", "/mnt/medical-records/%%"]
[policies]
enabled_policies = ["hipaa", "gdpr", "pci_dss"]
[policies.policy_configs.hipaa]
enabled = true
settings = { confidence_threshold = "0.8" }
Example configuration for financial services:
watch_paths = ["/home/%%", "/var/data/%%", "/srv/transactions/%%"]
[policies]
enabled_policies = ["pci_dss", "sox", "gdpr", "ccpa"]
[policies.policy_configs.pci_dss]
enabled = true
settings = { alert_on_test_data = "false" }
See the Configuration Guide for complete options and the Compliance Documentation for policy context.
Verify DLP is Working
Test that Aquilon DLP is detecting files:
# Create a test file with sensitive data
echo "SSN: 223-41-6729" > /tmp/test-sensitive.txt
# Wait a moment for scanning, then query alerts
osqueryi --connect /var/osquery/osquery.sock 'SELECT * FROM aquilon_dlp_alerts;'
Enterprise Features
Unlimited Server Deployment
The Enterprise Edition supports unlimited servers. For large-scale deployments:
- Use configuration management (Ansible, Puppet, Chef) for consistent deployment
- Consider centralized logging aggregation
- Use osquery fleet management tools like Fleet or Kolide
Enterprise Support
Enterprise customers receive:
- Priority support with SLA guarantees
- Direct access to engineering team
- Custom policy development assistance
- Deployment and integration consulting
Contact your account representative for support.
Upgrading
Ubuntu/Debian:
# Stop osqueryd (optional)
sudo systemctl stop osqueryd
# Install new package
sudo apt install ./aquilon-dlp-enterprise_NEW_VERSION_amd64.deb
# Start osqueryd
sudo systemctl start osqueryd
CentOS/RHEL:
# Stop osqueryd (optional)
sudo systemctl stop osqueryd
# Upgrade package
sudo dnf upgrade ./aquilon-dlp-enterprise-NEW_VERSION.x86_64.rpm
# Start osqueryd
sudo systemctl start osqueryd
Your configuration in /etc/aquilon/config.toml is preserved during upgrades. The RPM package uses %config(noreplace) to ensure this.
Uninstalling
Ubuntu/Debian:
# Remove package
sudo apt remove aquilon-dlp-enterprise
# Clean up configuration (optional)
sudo rm -rf /etc/aquilon /var/lib/aquilon /var/log/aquilon
CentOS/RHEL:
# Remove package
sudo dnf remove aquilon-dlp-enterprise
# Clean up configuration (optional)
sudo rm -rf /etc/aquilon /var/lib/aquilon /var/log/aquilon
Troubleshooting
Common Issues
“osquery not found” during installation
Install osquery before installing Aquilon DLP:
# Ubuntu/Debian
sudo apt install ./osquery_5.10.2-1.linux_amd64.deb
# CentOS/RHEL
sudo dnf install ./osquery-5.10.2-1.linux.x86_64.rpm
Extension not loading
- Check extension is registered:
cat /etc/osquery/extensions.load
- Restart osqueryd:
sudo systemctl restart osqueryd
- Check logs:
journalctl -u osqueryd -f
SELinux blocking access
On RHEL/CentOS with SELinux enforcing:
# Check for denials
sudo ausearch -m avc -ts recent
# Restore contexts
sudo restorecon -Rv /usr/lib/osquery/extensions/
sudo restorecon -Rv /etc/aquilon/
Permission denied errors
Verify the extension has correct permissions:
ls -la /usr/lib/osquery/extensions/aquilon-dlp-enterprise.ext
# Should be: -rwxr-xr-x root root
Getting Help
- Documentation: Troubleshooting Guide
- Enterprise Support: Contact your account representative
- GitHub Issues: github.com/aquilonsecurity/aquilon-dlp/issues
User Guide
This guide covers the day-to-day configuration and usage of Aquilon DLP. Whether you’re setting up initial monitoring, configuring compliance policies, or analyzing alerts, you’ll find the information you need here.
Sections
Configuration
Learn how to configure Aquilon DLP for your environment:
- Configuration file location and format
- Watch paths and file monitoring
- Caching and performance settings
- Removable media auto-scanning
- Performance tuning options
Policy Frameworks
Understand and configure compliance policies:
- Built-in compliance frameworks (GDPR, CCPA, HIPAA, PCI DSS, SOX, ISO 27001)
- Edition-specific policy availability
- Policy configuration options
- Creating custom TOML policies and scanners
- Rule types and composition
Monitoring
Monitor Aquilon DLP operation and analyze findings:
- Querying the osquery tables
- Interpreting alert data
- Cache status and performance metrics
- Log analysis and troubleshooting
- Integration with SIEM systems
Getting Started
After installing Aquilon DLP (see Installation), follow these steps:
- Configure watch paths - Define which directories to monitor for sensitive data
- Enable policies - Select compliance frameworks appropriate for your organization
- Verify operation - Create test files and query alerts to confirm detection
- Monitor ongoing - Review alerts, tune confidence thresholds, add exclusions
Quick Reference
Configuration File Location
| Platform | Location |
|---|---|
| macOS | /etc/aquilon/config.toml |
| Linux | /etc/aquilon/config.toml |
Common osquery Queries
-- View recent alerts
SELECT * FROM aquilon_dlp_alerts
ORDER BY timestamp DESC LIMIT 10;
-- Count alerts by policy
SELECT policy, COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY policy;
-- View alert details
SELECT path, scanner, severity, timestamp
FROM aquilon_dlp_alerts LIMIT 10;
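For scripted reporting, the per-policy count can also be computed from osqueryi --json output. The sketch below assumes the JSON output is an array with one object per row and uses the column names from the queries above. The sample rows, including the policy and severity values, are invented for illustration and are not real Aquilon DLP output.

```python
import json
from collections import Counter

# Hypothetical sample rows; real input would come from
# `osqueryi --json 'SELECT ... FROM aquilon_dlp_alerts;'`.
SAMPLE_OUTPUT = """
[
  {"path": "/home/alice/patients.csv", "policy": "hipaa", "severity": "high"},
  {"path": "/home/bob/cards.txt", "policy": "pci_dss", "severity": "critical"},
  {"path": "/home/bob/export.csv", "policy": "hipaa", "severity": "medium"}
]
"""


def count_by_policy(json_text: str) -> dict:
    """Count alert rows per policy, mirroring the GROUP BY query above."""
    rows = json.loads(json_text)
    return dict(Counter(row["policy"] for row in rows))


print(count_by_policy(SAMPLE_OUTPUT))
```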
Edition Policy Availability
| Policy | Basic Edition | Enterprise Edition |
|---|---|---|
| GDPR | Yes | Yes |
| CCPA | Yes | Yes |
| HIPAA | No | Yes |
| PCI DSS | No | Yes |
| SOX | No | Yes |
| ISO 27001 | No | Yes |
| Custom TOML | Yes | Yes |
Support
- Basic Edition: GitHub Issues
- Enterprise Edition: Contact your account representative
- Documentation: Troubleshooting Guide
Configuration
Aquilon DLP is configured through a TOML file that controls all aspects of operation including watch paths, policies, caching, and performance settings.
Configuration File
Location
| Platform | Default Location |
|---|---|
| macOS | /etc/aquilon/config.toml |
| Linux | /etc/aquilon/config.toml |
Initial Setup
After installation, copy the default configuration and customize:
sudo cp /etc/aquilon/config.toml.default /etc/aquilon/config.toml
sudo nano /etc/aquilon/config.toml
Core Configuration
Watch Paths
Define which directories Aquilon DLP monitors for sensitive data. Use %% to recursively watch all subdirectories:
watch_paths = [
"/home/%%",
"/var/data/%%",
"/srv/%%",
"/Users/%%"
]
Path Syntax:
- /path/to/dir/%% - Watch directory and all subdirectories recursively
- /path/to/dir - Watch only the directory itself (no recursion)
Best Practices:
- Include directories where users store documents
- Include shared drives and collaboration folders
- Exclude system directories (already excluded by default)
- Exclude known safe directories like source code repos
Exclusions
Exclude specific paths from monitoring:
watch_paths = ["/home/%%", "/var/data/%%"]
# Exclude specific directories
exclude_paths = [
"/home/*/.cache/%%",
"/home/*/Downloads/%%",
"/var/log/%%"
]
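The interplay between watch_paths and exclude_paths can be sketched in Python. This is an illustration only, not Aquilon's matcher: it assumes that `*` matches exactly one path component, that a trailing `%%` matches any depth, and that exclusions take precedence over watch paths. The names `path_matches` and `is_monitored` are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def path_matches(pattern: str, path: str) -> bool:
    """Return True if `path` falls under `pattern` (assumed semantics)."""
    pat = PurePosixPath(pattern).parts
    target = PurePosixPath(path).parts
    if pat[-1] == "%%":
        # Recursive: the file may sit at any depth below the directory.
        pat = pat[:-1]
        if len(target) <= len(pat):
            return False
    elif len(target) != len(pat) + 1:
        # Non-recursive: only files directly inside the directory.
        return False
    # Compare component by component; "*" stands for one component.
    return all(fnmatchcase(t, p) for p, t in zip(pat, target))


def is_monitored(path, watch_paths, exclude_paths):
    """A path is monitored if it is watched and not excluded."""
    watched = any(path_matches(p, path) for p in watch_paths)
    excluded = any(path_matches(p, path) for p in exclude_paths)
    return watched and not excluded


watch = ["/home/%%", "/var/data/%%"]
exclude = ["/home/*/.cache/%%", "/home/*/Downloads/%%", "/var/log/%%"]

print(is_monitored("/home/alice/docs/report.txt", watch, exclude))      # True
print(is_monitored("/home/alice/.cache/thumbs/a.png", watch, exclude))  # False
```

Running a handful of representative paths through a check like this before deployment is a cheap way to confirm that an exclusion does not accidentally hide a directory you meant to watch.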
Policy Configuration
Enable Policies
Select which compliance frameworks to enable:
[policies]
enabled_policies = ["gdpr", "ccpa", "hipaa", "pci_dss", "sox", "iso27001"]
Available Policies:
| Policy | Description | Edition |
|---|---|---|
| gdpr | EU General Data Protection Regulation | All |
| ccpa | California Consumer Privacy Act | All |
| hipaa | Health Insurance Portability and Accountability | Enterprise |
| pci_dss | Payment Card Industry Data Security Standard | Enterprise |
| sox | Sarbanes-Oxley Act | Enterprise |
| iso27001 | Information Security Management | Enterprise |
Policy-Specific Settings
Configure individual policy behavior:
[policies.policy_configs.hipaa]
enabled = true
settings = { confidence_threshold = "0.8" }
[policies.policy_configs.pci_dss]
enabled = true
settings = { alert_on_test_data = "false" }
[policies.policy_configs.iso27001]
enabled = true
settings = { confidence_threshold = "0.7", enforce_data_masking = "true" }
Caching Configuration
Aquilon DLP uses a two-tier caching system to minimize redundant scanning:
[cache]
# Enable/disable caching (default: true)
enabled = true
# In-memory cache TTL in seconds (default: 0 = no expiry)
ttl_secs = 3600
# Database scan cache TTL in days (default: 7)
scan_cache_ttl_days = 7
Note: The database location is configured via the top-level database_path field:
# Linux: /var/lib/aquilon/aquilon.db
# macOS: /var/db/aquilon/aquilon.db
database_path = "/var/lib/aquilon/aquilon.db"
Platform Note: Default database paths differ by platform. The macOS PKG installer automatically configures the macOS path at /var/db/aquilon/aquilon.db.
Cache Performance
- Cache hit on clean file: <5ms p99
- Cache hit with alerts: <20ms p95
- Cache vs full scan: 10-100x faster
Cache Behavior
| File State | Cache Behavior |
|---|---|
| Clean (no findings) | Fully cached, subsequent scans skipped |
| Has alerts | Alert details cached as JSON (up to 25 alerts) |
| Modified | Cache entry invalidated, full rescan |
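The table above can be read as a small decision procedure. The sketch below is a hypothetical model, not the actual cache: it assumes a file counts as "modified" when its size or modification time differs from the cached values (the documentation does not say how modification is detected), and the names `CacheEntry`, `lookup`, and `store` are invented for illustration.

```python
import json
from dataclasses import dataclass
from typing import Optional

MAX_CACHED_ALERTS = 25  # limit taken from the table above


@dataclass
class CacheEntry:
    size: int
    mtime_ns: int
    alerts_json: Optional[str]  # None means the file was clean


def lookup(cache, path, size, mtime_ns):
    """Return ("rescan" | "skip" | "cached_alerts", alerts)."""
    entry = cache.get(path)
    if entry is None or (entry.size, entry.mtime_ns) != (size, mtime_ns):
        cache.pop(path, None)  # modified or unseen: invalidate
        return "rescan", []
    if entry.alerts_json is None:
        return "skip", []      # clean file: nothing to do
    return "cached_alerts", json.loads(entry.alerts_json)


def store(cache, path, size, mtime_ns, alerts):
    """Record a scan result, keeping at most MAX_CACHED_ALERTS alerts."""
    payload = json.dumps(alerts[:MAX_CACHED_ALERTS]) if alerts else None
    cache[path] = CacheEntry(size, mtime_ns, payload)


cache = {}
store(cache, "/tmp/a.txt", 10, 1, [])
print(lookup(cache, "/tmp/a.txt", 10, 1))  # ('skip', [])
print(lookup(cache, "/tmp/a.txt", 11, 2))  # ('rescan', [])
```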
Removable Media
Configure automatic scanning of USB drives and external media:
[removable_media]
# Automatically scan removable media when mounted (default: false)
auto_scan_on_mount = true
Platform-Specific Behavior:
| Platform | Detection Method | Monitored Paths |
|---|---|---|
| macOS | Endpoint Security mount events | /Volumes/* (excluding system) |
| Linux | /proc/self/mounts polling | /media/*, /mnt/*, /run/media/* |
Use Cases:
- Data exfiltration detection
- Compliance monitoring for removable media
- Incident response device scanning
Performance Note: Large external drives (8TB+) with significant data will take time to scan. Consider the resource impact before enabling.
Performance Tuning
Scan Settings
[scan]
# Maximum findings per scanner per file (default: 5)
max_findings_per_scanner = 5
# Maximum file size in MB to scan (default: 40)
max_scan_size_mb = 40
# Maximum recursion depth for nested archives (default: 5)
max_recursion_depth = 5
# Regex size limits
regex_size_limit_mb = 10
regex_dfa_size_limit_mb = 2
# File update cooldown in minutes (default: 30)
file_update_cooldown_mins = 30
# Event coalesce delay in seconds (default: 120)
event_coalesce_delay_secs = 120
Resource Limits
[resource_limits]
# Enable resource limiting (default: false)
enabled = true
# Maximum CPU usage percentage (default: 50.0)
max_cpu_percent = 50.0
# Maximum memory in MB (default: 512)
max_memory_mb = 512
# Maximum disk I/O in MB/s (default: 50.0)
max_disk_io_mbps = 50.0
# Process nice level (default: 10)
nice_level = 10
# Throttle delay between scans in ms (default: 10)
throttle_delay_ms = 10
Worker Configuration
[worker]
# Number of worker threads (default: 0 = auto-detect CPU cores)
num_workers = 4
# Timeout for receiving work items in ms (default: 1000)
recv_timeout_ms = 1000
[work_queue]
# Maximum queue size (default: 10000)
max_queue_size = 10000
# Submit timeout in seconds (default: 5)
submit_timeout_secs = 5
Context Configuration
[context]
# Context window size in bytes for surrounding text capture (default: 200)
# Larger values capture more surrounding detail but slow down scanning
window_size = 200
# Enable specific context profiles
# Available profiles:
# - healthcare: Medical terms (patient, diagnosis, HIPAA keywords)
# - payment: Financial transaction terms (credit card, payment, PCI keywords)
# - personal_data: PII identifiers (SSN, address, contact info)
# - employment: HR/payroll terms (employee, salary, W-2)
# - sox_financial: SOX compliance terms (revenue, earnings, 10-K, quarterly)
# - gdpr_phone: Personal vs business phone context (mobile, cell, office)
enabled_profiles = ["healthcare", "payment", "personal_data", "employment", "sox_financial", "gdpr_phone"]
Context Trace
Enable debug tracing for context enrichment decisions. When enabled, detailed JSON logs are emitted showing how each finding’s confidence was adjusted based on surrounding context.
Note: This feature generates verbose output and should only be enabled when debugging enrichment behavior (e.g., investigating false positives or negatives).
[context_trace]
# Enable context enrichment debug tracing (default: false)
# When enabled, emits JSON logs showing enrichment decisions:
# - Original confidence scores
# - Context profiles matched
# - Confidence adjustments applied
# - Final enriched confidence
enabled = false
See Troubleshooting: Debugging Enrichment for usage guidance.
CPU Debugging
Enable detailed performance metrics for troubleshooting:
[cpu_debugging]
# Enable CPU debugging features (default: true)
enabled = true
# Histogram buckets for latency tracking in ms (must be ascending)
histogram_buckets = [10, 50, 100, 500, 1000, 5000, 10000, 30000]
# Threshold for slow file warnings in ms (default: 1000)
slow_file_threshold_ms = 1000
# Maximum slow files to track (default: 10)
max_slow_files = 10
# Enable worker thread status tracking (default: true)
worker_tracking_enabled = true
# Enable performance alerting (default: false)
alerting_enabled = false
# Scanner processing time alert threshold in ms (default: 5000)
scanner_alert_threshold_ms = 5000
# Work queue pending items alert threshold (default: 1000)
queue_alert_threshold = 1000
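One way to picture why histogram_buckets must be ascending is a binary search over the bucket bounds. The sketch below is an assumption about how such a histogram could work, not a description of Aquilon's metrics code: it treats each value as an inclusive upper bound, sends larger samples to an overflow bucket, and treats a file as slow when it exceeds the threshold. `bucket_for` and `is_slow` are hypothetical names.

```python
from bisect import bisect_left

BUCKETS_MS = [10, 50, 100, 500, 1000, 5000, 10000, 30000]

# Binary search only works on a sorted list, hence the ascending rule.
assert BUCKETS_MS == sorted(BUCKETS_MS)


def bucket_for(latency_ms: float) -> str:
    """Label of the first bucket whose upper bound covers the sample."""
    i = bisect_left(BUCKETS_MS, latency_ms)
    if i < len(BUCKETS_MS):
        return f"<= {BUCKETS_MS[i]} ms"
    return f"> {BUCKETS_MS[-1]} ms"


def is_slow(latency_ms: float, slow_file_threshold_ms: int = 1000) -> bool:
    """Assumed rule: strictly above the threshold counts as slow."""
    return latency_ms > slow_file_threshold_ms


print(bucket_for(1200), is_slow(1200))  # <= 5000 ms True
```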
Database Maintenance
Aquilon DLP includes automatic database maintenance to manage disk usage and keep the local database cache healthy. The local database is designed as a cache—your SIEM should handle long-term retention.
⚠️ Compliance Warning
The default findings_max_age_days of 7 days is far shorter than common compliance retention periods:
- HIPAA: 6 years (2190 days)
- SOX: 7 years (2555 days)
- PCI-DSS: 1 year (365 days)
Ensure your SIEM captures findings for long-term retention before enabling aggressive cleanup. The local database is intended as a cache, not permanent storage.
Basic Configuration
[maintenance]
# Enable background maintenance thread (default: true)
enabled = true
# Interval between maintenance runs in seconds (default: 3600 = 1 hour)
# Minimum: 60 seconds
interval_secs = 3600
Retention Settings
Configure how long data is retained before cleanup:
[maintenance.retention]
# Maximum age for findings before soft-delete (default: 7 days)
# Minimum: 1 day
findings_max_age_days = 7
# Maximum age for scan cache entries (default: 7 days)
# Minimum: 1 day
cache_max_age_days = 7
# Days to wait before hard-deleting soft-deleted findings (default: 1)
# Set to 0 for immediate hard delete (when SIEM has captured data)
hard_delete_grace_days = 1
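The two-stage cleanup described by these settings can be sketched as follows. This is a minimal model under stated assumptions: the comparison operators, the function name `retention_action`, and the idea that each maintenance run evaluates one finding at a time are all illustrative rather than taken from the implementation.

```python
from datetime import datetime, timedelta, timezone


def retention_action(created_at, soft_deleted_at, now,
                     findings_max_age_days=7, hard_delete_grace_days=1):
    """Return "keep", "soft_delete", or "hard_delete" for one finding."""
    if soft_deleted_at is not None:
        # Already soft-deleted: remove for good once the grace period ends.
        grace = timedelta(days=hard_delete_grace_days)
        return "hard_delete" if now - soft_deleted_at >= grace else "keep"
    if now - created_at > timedelta(days=findings_max_age_days):
        return "soft_delete"
    return "keep"


now = datetime(2025, 1, 10, tzinfo=timezone.utc)
print(retention_action(now - timedelta(days=8), None, now))  # soft_delete
```

With hard_delete_grace_days = 0 the grace check passes as soon as a finding is soft-deleted, which matches the "immediate hard delete" behavior described in the comments above.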
Vacuum Settings
Configure incremental vacuum to reclaim disk space:
[maintenance.vacuum]
# Pages to reclaim per incremental vacuum run (default: 1000)
# Each page is ~4KB, so 1000 pages = ~4MB per run
# Set to 0 to disable vacuum operations
incremental_pages = 1000
Manual Maintenance
Run maintenance immediately without starting the daemon:
# Run maintenance once and exit
aquilon-dlp --maintenance-now --config /etc/aquilon/config.toml
# Output is JSON with counts and duration:
# {
# "soft_deleted": 42,
# "hard_deleted": 15,
# "cache_evicted": 128,
# "pages_vacuumed": 1000,
# "duration_ms": 234,
# "errors": []
# }
See Operations for additional database management commands.
Logging Configuration
Logging is configured via the RUST_LOG environment variable:
# Set log level
export RUST_LOG=info
# Set per-module log levels
export RUST_LOG=aquilon_dlp=debug,warn
# Available levels: error, warn, info, debug, trace
The application uses structured logging via the tracing crate. Logs are written to stdout/stderr and can be redirected as needed by your init system.
OSQuery Integration
Aquilon DLP exposes alerts via an OSQuery virtual table. Configure behavior:
[osquery]
# Maximum rows returned for alerts table without explicit LIMIT clause
# Prevents memory exhaustion from unbounded queries
# Default: 10000, set to 0 for unlimited (not recommended)
max_alert_rows = 10000
Note: When querying large alert sets, use WHERE clauses to filter results. Unbounded SELECT * FROM aquilon_dlp_alerts queries will be truncated at this limit.
Example Configurations
Healthcare Organization (HIPAA Focus)
watch_paths = ["/home/%%", "/var/data/%%", "/srv/%%", "/mnt/medical-records/%%"]
exclude_paths = ["/home/*/.cache/%%"]
database_path = "/var/lib/aquilon/aquilon.db"
[policies]
enabled_policies = ["hipaa", "gdpr", "pci_dss"]
[policies.policy_configs.hipaa]
enabled = true
settings = { confidence_threshold = "0.8" }
[removable_media]
auto_scan_on_mount = true
[cache]
enabled = true
ttl_secs = 3600
scan_cache_ttl_days = 7
[scan]
max_findings_per_scanner = 10
max_scan_size_mb = 100
[resource_limits]
enabled = true
max_cpu_percent = 75.0
max_memory_mb = 1024
Financial Services (PCI DSS/SOX Focus)
watch_paths = ["/home/%%", "/var/data/%%", "/srv/transactions/%%"]
exclude_paths = ["/home/*/Downloads/%%"]
# Linux: /var/lib/aquilon/aquilon.db
# macOS: /var/db/aquilon/aquilon.db
database_path = "/var/lib/aquilon/aquilon.db"
[policies]
enabled_policies = ["pci_dss", "sox", "gdpr", "ccpa"]
[policies.policy_configs.pci_dss]
enabled = true
settings = { alert_on_test_data = "false" }
[policies.policy_configs.sox]
enabled = true
settings = { confidence_threshold = "0.85" }
[removable_media]
auto_scan_on_mount = true
[cache]
enabled = true
ttl_secs = 7200
scan_cache_ttl_days = 14
[worker]
num_workers = 8
[resource_limits]
enabled = true
max_cpu_percent = 60.0
max_memory_mb = 1024
Small Business (Basic Edition)
watch_paths = ["/home/%%", "/var/data/%%"]
exclude_paths = ["/home/*/.cache/%%"]
# Linux: /var/lib/aquilon/aquilon.db
# macOS: /var/db/aquilon/aquilon.db
database_path = "/var/lib/aquilon/aquilon.db"
[policies]
enabled_policies = ["gdpr", "ccpa"]
[cache]
enabled = true
ttl_secs = 3600
scan_cache_ttl_days = 7
[scan]
max_findings_per_scanner = 5
max_scan_size_mb = 40
[worker]
num_workers = 2 # Conservative for small systems
[resource_limits]
enabled = true
max_cpu_percent = 30.0
max_memory_mb = 256
Complete Example Configurations
For complete, production-ready configuration examples, see:
- Basic Edition: docs/config-examples/aquilon_dlp_config_basic.toml
- Enterprise Edition: docs/config-examples/aquilon_dlp_config_enterprise.toml
Custom Scanners and Policies
Custom scanners and policies are defined directly in the main configuration file using [[scanners]] and [[custom_policies]] sections. See Policy Frameworks for creating custom policies.
Example custom scanner (add to your main config):
[[scanners]]
name = "employee_id"
description = "ACME Corp employee IDs (format: EMP-######)"
regex = "EMP-([0-9]{6})"
redaction_pattern = "EMP-XXXXXX"
base_confidence = 0.85
Validate your configuration:
sudo aquilon-dlp --config /etc/aquilon/config.toml --validate-config
Applying Configuration Changes
After modifying the configuration file, restart the osqueryd service:
macOS:
sudo launchctl unload /Library/LaunchDaemons/io.osquery.agent.plist
sudo launchctl load /Library/LaunchDaemons/io.osquery.agent.plist
Note: OSQuery 5.0.1+ uses io.osquery.agent.plist. Older versions use com.facebook.osqueryd.plist.
Linux:
sudo systemctl restart osqueryd
Validating Configuration
Check for configuration errors in the logs:
# macOS
tail -f /var/log/aquilon/aquilon-dlp.log | grep -i error
# Linux
journalctl -u osqueryd -f | grep -i aquilon
Common validation errors:
- Invalid TOML syntax
- Unknown policy names
- Invalid regex patterns in custom scanners
- Missing required fields
Environment Variables
Override configuration settings with environment variables using the AQUILON_DLP_ prefix:
# Worker configuration
export AQUILON_DLP_WORKER_NUM_WORKERS=8
# Resource limits
export AQUILON_DLP_RESOURCE_LIMITS_ENABLED=true
export AQUILON_DLP_RESOURCE_LIMITS_MAX_CPU_PERCENT=75.0
# Cache configuration
export AQUILON_DLP_CACHE_ENABLED=true
export AQUILON_DLP_CACHE_TTL_SECS=3600
# Watch paths (JSON array format)
export AQUILON_DLP_WATCH_PATHS='["/home/%%","/var/data/%%"]'
# Database path
export AQUILON_DLP_DATABASE_PATH=/var/lib/aquilon/aquilon.db
Environment variables override TOML configuration using underscore-separated paths. For example, [resource_limits] max_cpu_percent becomes AQUILON_DLP_RESOURCE_LIMITS_MAX_CPU_PERCENT.
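The naming rule can be expressed as a short helper. This is a sketch of the mapping described above, not code from Aquilon; `env_var_name` is a hypothetical function, and the JSON parsing at the end only mirrors the "JSON array format" mentioned for watch paths.

```python
import json


def env_var_name(section: str, key: str, prefix: str = "AQUILON_DLP") -> str:
    """Build the override variable name for [section] key (top level: section="")."""
    parts = [prefix] + ([section] if section else []) + [key]
    return "_".join(p.upper() for p in parts)


print(env_var_name("resource_limits", "max_cpu_percent"))
# AQUILON_DLP_RESOURCE_LIMITS_MAX_CPU_PERCENT
print(env_var_name("", "watch_paths"))
# AQUILON_DLP_WATCH_PATHS

# Array-valued settings are passed as a JSON string.
paths = json.loads('["/home/%%","/var/data/%%"]')
print(paths)  # ['/home/%%', '/var/data/%%']
```

Because section and key names themselves contain underscores, the mapping is unambiguous only in the TOML-to-variable direction, so it is safer to derive names from the config than to guess them.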
Configuration Reference
For complete configuration reference and schema documentation, see the comments in the default configuration file:
cat /etc/aquilon/config.toml.default
Custom Scanners
Custom scanners let you define organization-specific detection patterns using regular expressions. Use them when built-in scanners don’t cover your proprietary identifiers like employee IDs, project codes, or internal account numbers.
Key features:
- Regex-based pattern matching with bounded quantifiers
- Confidence tuning via keyword proximity (boost/reduce)
- Validation rules with checksums and invalid patterns
- Multi-capture group redaction
Custom scanners integrate automatically with policies using the custom: prefix (e.g., custom:employee_id).
For integrating custom scanners with policies, SIEM systems, and fleet deployment, see Custom Scanner Integration.
Quick Start
Add a custom scanner to your configuration file:
[[scanners]]
name = "employee_id"
regex = "EMP-([0-9]{6})"
redaction_pattern = "EMP-XXXXXX"
base_confidence = 0.85
description = "ACME Corp employee IDs"
Test your scanner:
# Validate configuration
sudo aquilon-dlp --config /etc/aquilon/config.toml --validate-config
# Scan a test file
echo "Employee ID: EMP-123456" > /tmp/test.txt
sudo aquilon-dlp --config /etc/aquilon/config.toml --scan /tmp/test.txt
Discovering Built-in Scanners
Before creating a custom scanner, check if a built-in scanner already covers your use case. Aquilon DLP includes 30+ built-in scanners for common sensitive data types.
List Available Scanners
Use the CLI to see all available scanners (built-in and custom):
aquilon-dlp --list-scanners
Example output:
Built-in Scanners:
ssn - US Social Security Numbers
credit_card - Credit/debit card numbers (Visa, MC, Amex, etc.)
email - Email addresses
phone - Phone numbers (US, international)
iban - International Bank Account Numbers
passport - Passport numbers
drivers_license - Driver's license numbers
...
Custom Scanners:
custom:employee_id - ACME Corp employee IDs
custom:project_code - Internal project codes
Built-in Scanner Categories
Built-in scanners are organized by data type:
| Category | Scanners | Use Case |
|---|---|---|
| PII | ssn, email, phone, address, date_of_birth | Personal data protection |
| Financial | credit_card, iban, bank_account, aba_routing | PCI DSS, financial compliance |
| Healthcare | medical_record_number, npi, health_plan_id | HIPAA compliance |
| Government | passport, drivers_license, ein | Identity documents |
| Technical | api_key, private_key, database_connection | Secret detection |
For complete scanner-to-compliance mappings, see Policy Frameworks.
When to Create Custom Scanners
Create custom scanners when:
- Organization-specific identifiers: Employee IDs, project codes, internal account numbers
- Industry-specific formats: Your company’s unique document numbering scheme
- Regional identifiers not built-in: Some EU national IDs require custom patterns
Configuration Reference
All fields for [[scanners]] entries:
| Field | Required | Type | Description |
|---|---|---|---|
| name | Yes | String | Unique identifier (alphanumeric + underscore, max 64 chars). Referenced as custom:{name} in policies. |
| regex | Yes | String | Pattern to match. Must use bounded quantifiers (see Pattern Safety). |
| redaction_pattern | Yes | String | Template for redacting matches. X sequences map to capture group lengths. |
| base_confidence | Yes | Float | Base confidence score (0.0 - 1.0). Higher values = more confident the match is real. |
| description | No | String | Human-readable description for documentation. |
| context_signals | No | Array | Keywords attached to findings for classification (e.g., ["hr", "confidential"]). |
| confidence_boost | No | Object | Boost confidence when positive keywords found nearby. See Confidence Tuning. |
| confidence_reduce | No | Object | Reduce confidence when negative keywords found nearby. See Confidence Tuning. |
| validation | No | Object | Additional validation rules. See Validation Rules. |
Pattern Safety
All regex patterns must be bounded to prevent performance issues. Unbounded patterns like \d+, .*, or [A-Z]+ will be rejected.
# SAFE - bounded patterns
[[scanners]]
name = "fixed_length"
regex = "EMP-([0-9]{6})" # Fixed length: exactly 6 digits
redaction_pattern = "EMP-XXXXXX"
base_confidence = 0.85
[[scanners]]
name = "max_length"
regex = "ID-([0-9]{1,20})" # Maximum 20 digits
redaction_pattern = "ID-XXXX"
base_confidence = 0.85
[[scanners]]
name = "range_length"
regex = "CODE-([A-Z]{3,6})" # 3 to 6 uppercase letters
redaction_pattern = "CODE-XXXX"
base_confidence = 0.85
Unsafe patterns that will be rejected:
- \d+ (unbounded digits)
- .* (unbounded anything)
- [A-Z]+ (unbounded letters)
- (.*) (unbounded capture)
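To pre-check patterns before running --validate-config, a rough approximation of the bounded-quantifier rule can be written in a few lines of Python. This is a simplified stand-in for whatever check Aquilon performs: it flags `+`, `*`, and open-ended `{n,}` outside character classes and ignores corner cases such as a literal `]` placed first in a class. `find_unbounded` is a hypothetical name.

```python
import re

# Open-ended range such as {3,} (no upper bound).
_OPEN_RANGE = re.compile(r"\{\d+,\}")


def find_unbounded(pattern: str) -> list[str]:
    """Return the unbounded quantifiers found outside character classes."""
    found = []
    in_class = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2  # skip the escaped character
            continue
        if in_class:
            in_class = ch != "]"
        elif ch == "[":
            in_class = True
        elif ch in "+*":
            found.append(ch)
        elif ch == "{":
            m = _OPEN_RANGE.match(pattern, i)
            if m:
                found.append(m.group())
        i += 1
    return found


print(find_unbounded(r"ID-\d+"))          # ['+']
print(find_unbounded("EMP-([0-9]{6})"))   # []
```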
Regex Escaping in TOML
TOML strings require backslash escaping. Use one of these approaches:
# Option 1: Escape backslashes (double them)
[[scanners]]
name = "escaped_digits"
regex = "ID-(\\d{6})" # \d becomes \\d in double quotes
redaction_pattern = "ID-XXXXXX"
base_confidence = 0.85
# Option 2: Use literal strings (single quotes)
[[scanners]]
name = "literal_digits"
regex = 'ID-(\d{6})' # No escaping needed in single quotes
redaction_pattern = "ID-XXXXXX"
base_confidence = 0.85
# Option 3: Use character classes (no escaping)
[[scanners]]
name = "char_class"
regex = "ID-([0-9]{6})" # [0-9] instead of \d
redaction_pattern = "ID-XXXXXX"
base_confidence = 0.85
Confidence Tuning
Adjust confidence scores based on nearby keywords to reduce false positives and improve accuracy.
Boosting Confidence
Increase confidence when positive keywords appear near a match:
[[scanners]]
name = "employee_id"
regex = "EMP-([0-9]{6})"
redaction_pattern = "EMP-XXXXXX"
base_confidence = 0.70
[scanners.confidence_boost]
keywords = ["employee", "badge", "payroll", "personnel", "HR"]
boost_amount = 0.20
proximity = 200
Effect: When “employee” or “payroll” appears within 200 bytes, confidence increases from 0.70 to 0.90.
Reducing Confidence
Decrease confidence when negative keywords appear near a match:
[[scanners]]
name = "account_number"
regex = "ACC-([0-9]{8})"
redaction_pattern = "ACC-XXXXXXXX"
base_confidence = 0.80
[scanners.confidence_reduce]
keywords = ["example", "test", "fake", "sample", "demo"]
boost_amount = 0.50
proximity = 100
Effect: When “example” or “test” appears within 100 bytes, confidence decreases from 0.80 to 0.30.
Combining Boost and Reduce
Use both on the same scanner for nuanced confidence:
[[scanners]]
name = "project_code"
regex = "PROJ-([A-Z]{3})-([0-9]{4})"
redaction_pattern = "PROJ-XXX-XXXX"
base_confidence = 0.65
[scanners.confidence_boost]
keywords = ["confidential", "restricted", "internal"]
boost_amount = 0.25
proximity = 150
[scanners.confidence_reduce]
keywords = ["example", "documentation", "template"]
boost_amount = 0.35
proximity = 100
Confidence calculation:
| Context | Calculation | Result |
|---|---|---|
| No keywords nearby | 0.65 (base) | 0.65 |
| “confidential” nearby | 0.65 + 0.25 | 0.90 |
| “template” nearby | 0.65 - 0.35 | 0.30 |
| Both nearby | 0.65 + 0.25 - 0.35 (each applied independently) | 0.55 |
Confidence Adjustment Fields
| Field | Type | Description |
|---|---|---|
| keywords | Array | Words to search for in proximity to match |
| boost_amount | Float | Amount to add (boost) or subtract (reduce) from confidence (0.0 - 1.0) |
| proximity | Integer | Maximum distance in bytes to search for keywords (1 - 10000) |
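The arithmetic behind these fields can be sketched in Python using the project_code scanner from above. Several details are assumptions rather than documented behavior: keyword matching is case-insensitive, each rule applies at most once no matter how many keywords match, the window excludes the match itself, and the result is clamped to 0.0 - 1.0. The function names are hypothetical.

```python
import re


def keyword_nearby(data: bytes, start: int, end: int, rule: dict) -> bool:
    """True if any rule keyword occurs within `proximity` bytes of the match."""
    prox = rule["proximity"]
    before = data[max(0, start - prox):start]
    after = data[end:end + prox]
    window = (before + b" " + after).lower()
    return any(k.lower().encode() in window for k in rule["keywords"])


def adjusted_confidence(text, regex, base, boost=None, reduce=None):
    """Confidence of the first match after boost/reduce, or None if no match."""
    data = text.encode("utf-8")  # proximity is measured in bytes
    m = re.search(regex.encode(), data)
    if m is None:
        return None
    score = base
    if boost and keyword_nearby(data, m.start(), m.end(), boost):
        score += boost["boost_amount"]
    if reduce and keyword_nearby(data, m.start(), m.end(), reduce):
        score -= reduce["boost_amount"]
    return round(min(1.0, max(0.0, score)), 2)


REGEX = "PROJ-([A-Z]{3})-([0-9]{4})"
BOOST = {"keywords": ["confidential", "restricted", "internal"],
         "boost_amount": 0.25, "proximity": 150}
REDUCE = {"keywords": ["example", "documentation", "template"],
          "boost_amount": 0.35, "proximity": 100}

print(adjusted_confidence("Confidential: PROJ-ABC-1234", REGEX, 0.65, BOOST, REDUCE))  # 0.9
print(adjusted_confidence("Template PROJ-ABC-1234", REGEX, 0.65, BOOST, REDUCE))       # 0.3
```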
Validation Rules
Add validation rules to filter out false positives with checksums and pattern exclusions.
[[scanners]]
name = "company_account"
regex = "ACCT-([0-9]{10})"
redaction_pattern = "ACCT-XXXXXXXXXX"
base_confidence = 0.85
[scanners.validation]
min_confidence = 0.70
invalid_patterns = ["^ACCT-0{10}$", "^ACCT-1234567890$"]
validator = "luhn"
Validation Fields
| Field | Type | Description |
|---|---|---|
| min_confidence | Float | Minimum confidence threshold. Matches below this are discarded. |
| invalid_patterns | Array | Regex patterns to reject (e.g., all zeros, test sequences). |
| validator | String | Checksum validator to apply: luhn, mod10, mod11, or iban. |
Available Validators
| Validator | Algorithm | Use Case |
|---|---|---|
| luhn | Luhn (mod 10) | Credit cards, IMEI numbers, some account numbers |
| mod10 | Modulo 10 | Various identifiers with check digits |
| mod11 | Modulo 11 | ISBN-10, some national IDs |
| iban | IBAN checksum | International Bank Account Numbers |
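For generating or checking test values, the standard Luhn algorithm is short enough to reproduce. The function below implements the well-known checksum itself; how Aquilon extracts the digits from a match (for example, whether prefixes such as "FA-" are stripped first) is not documented here, so the helper takes a plain digit string and `luhn_valid` is a hypothetical name.

```python
def luhn_valid(digits: str) -> bool:
    """Luhn (mod 10) check over a string of ASCII decimal digits."""
    if not (digits.isascii() and digits.isdigit()):
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


print(luhn_valid("4111111111111111"))  # True
print(luhn_valid("123456789012"))      # False
```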
Example: Filtering Test Data
[[scanners]]
name = "customer_id"
regex = "CUST-([0-9]{8})"
redaction_pattern = "CUST-XXXXXXXX"
base_confidence = 0.80
[scanners.validation]
# Reject common test patterns
invalid_patterns = [
"^CUST-0{8}$", # All zeros
"^CUST-1{8}$", # All ones
"^CUST-12345678$", # Sequential
"^CUST-99999999$" # All nines
]
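The effect of these patterns can be previewed with Python's re module. The sketch assumes that each invalid pattern is tested against the matched text only (which is what the ^ and $ anchors suggest); `accepted_matches` is a hypothetical helper and the sample IDs are made up.

```python
import re

SCANNER_REGEX = re.compile(r"CUST-([0-9]{8})")
INVALID_PATTERNS = [re.compile(p) for p in (
    r"^CUST-0{8}$", r"^CUST-1{8}$", r"^CUST-12345678$", r"^CUST-99999999$",
)]


def accepted_matches(text: str) -> list[str]:
    """Scanner matches that are not rejected by any invalid pattern."""
    hits = (m.group(0) for m in SCANNER_REGEX.finditer(text))
    return [h for h in hits if not any(p.search(h) for p in INVALID_PATTERNS)]


print(accepted_matches("CUST-00000000 CUST-48213907 CUST-12345678"))
# ['CUST-48213907']
```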
Example: Luhn Checksum Validation
[[scanners]]
name = "loyalty_card"
regex = "([0-9]{4})([0-9]{4})([0-9]{4})([0-9]{4})"
redaction_pattern = "XXXX-XXXX-XXXX-XXXX"
base_confidence = 0.80
description = "16-digit loyalty card numbers with Luhn check"
[scanners.validation]
validator = "luhn"
invalid_patterns = ["^0{16}$", "^1{16}$"]
This configuration:
- Matches any run of 16 consecutive digits, captured as four groups of four
- Validates it passes the Luhn checksum
- Rejects all-zeros and all-ones patterns
- Reports only valid matches
Redaction Patterns
Redaction patterns control how matched text appears in alerts and logs. X sequences map to capture groups.
Single Capture Group
[[scanners]]
name = "employee_id"
regex = "EMP-([0-9]{6})" # One capture group
redaction_pattern = "EMP-XXXXXX" # 6 X's for the 6-digit capture
base_confidence = 0.85
| Input | Redacted Output |
|---|---|
| EMP-123456 | EMP-XXXXXX |
| EMP-987654 | EMP-XXXXXX |
Multiple Capture Groups
[[scanners]]
name = "project_code"
regex = "PROJ-([A-Z]{3})-([0-9]{4})" # Two capture groups
redaction_pattern = "PROJ-XXX-XXXX" # 3 X's, then 4 X's
base_confidence = 0.90
| Input | Redacted Output |
|---|---|
| PROJ-ABC-1234 | PROJ-XXX-XXXX |
| PROJ-XYZ-9999 | PROJ-XXX-XXXX |
Variable Length Captures
For variable-length captures, use a fixed number of X’s as a placeholder:
[[scanners]]
name = "order_number"
regex = "ORD-([0-9]{4,10})" # 4 to 10 digits
redaction_pattern = "ORD-XXXX" # Fixed placeholder
base_confidence = 0.85
| Input | Redacted Output |
|---|---|
| ORD-1234 | ORD-XXXX |
| ORD-1234567890 | ORD-XXXX |
Redaction Best Practices
- Match X count to expected capture length when possible
- Use fixed placeholders for variable-length captures
- Keep redaction patterns recognizable (preserve prefixes/formatting)
- Don’t include actual data in the pattern string
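The input/output tables above are consistent with replacing each match by the literal template. The sketch below models that reading and adds a capture-group count check; both are assumptions about behavior, not Aquilon's implementation. Counting runs of X is a simplification that would miscount a template whose fixed prefix itself contains an X. `check_redaction` and `redact` are hypothetical names.

```python
import re


def check_redaction(regex: str, redaction_pattern: str) -> bool:
    """True if the number of X runs equals the number of capture groups."""
    x_runs = re.findall(r"X+", redaction_pattern)
    return len(x_runs) == re.compile(regex).groups


def redact(regex: str, redaction_pattern: str, text: str) -> str:
    """Replace every match with the literal redaction template."""
    return re.sub(regex, lambda _m: redaction_pattern, text)


print(redact("ORD-([0-9]{4,10})", "ORD-XXXX", "Order ORD-1234567890 shipped"))
# Order ORD-XXXX shipped
print(check_redaction("PROJ-([A-Z]{3})-([0-9]{4})", "PROJ-XXX-XXXX"))  # True
```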
Real-World Examples
Complete, production-ready configurations for common use cases.
Healthcare: Patient ID Detection
Detect patient identifiers with healthcare context boosting:
[[scanners]]
name = "patient_id"
regex = "PAT-([0-9]{8})"
redaction_pattern = "PAT-XXXXXXXX"
base_confidence = 0.60
description = "Healthcare patient identifiers"
context_signals = ["healthcare", "phi", "hipaa"]
[scanners.confidence_boost]
keywords = ["patient", "medical", "diagnosis", "treatment", "healthcare", "hospital", "clinic"]
boost_amount = 0.30
proximity = 250
[scanners.confidence_reduce]
keywords = ["example", "test", "sample", "demo", "mock"]
boost_amount = 0.40
proximity = 100
Why this works:
- Low base confidence (0.60) prevents false positives on similar numeric patterns
- Healthcare keywords boost confidence significantly when in medical context
- Test/sample keywords reduce confidence to filter documentation examples
- Context signals (phi, hipaa) integrate with SIEM for compliance workflows
Financial: Account Number with Validation
Detect account numbers using Luhn checksum validation:
[[scanners]]
name = "financial_account"
regex = "FA-([0-9]{12})"
redaction_pattern = "FA-XXXXXXXXXXXX"
base_confidence = 0.75
description = "Financial account numbers with check digit"
context_signals = ["financial", "pci", "account"]
[scanners.confidence_boost]
keywords = ["account", "balance", "transaction", "payment", "transfer", "deposit"]
boost_amount = 0.20
proximity = 200
[scanners.validation]
validator = "luhn"
min_confidence = 0.60
invalid_patterns = [
"^FA-0{12}$",
"^FA-123456789012$",
"^FA-9{12}$"
]
Why this works:
- Luhn validator rejects numbers that fail checksum (random digit sequences)
- Invalid patterns filter known test data
- Minimum confidence threshold adds another layer of filtering
- Financial keywords boost real occurrences in transaction contexts
Engineering: Multi-Part Project Code
Detect complex identifiers with multiple capture groups:
[[scanners]]
name = "internal_project"
regex = "IPROJ-([A-Z]{2})-([0-9]{4})-([A-Z]{1})"
redaction_pattern = "IPROJ-XX-XXXX-X"
base_confidence = 0.80
description = "Internal project codes (region-number-phase)"
context_signals = ["internal", "project", "confidential"]
[scanners.confidence_boost]
keywords = ["confidential", "restricted", "internal", "proprietary"]
boost_amount = 0.15
proximity = 150
[scanners.confidence_reduce]
keywords = ["template", "example", "placeholder", "documentation"]
boost_amount = 0.30
proximity = 100
Pattern breakdown:
- ([A-Z]{2}) - Two-letter region code (e.g., US, EU, AP)
- ([0-9]{4}) - Four-digit project number
- ([A-Z]{1}) - Single-letter phase indicator (A-Z)
Redaction mapping:
| Input | Output |
|---|---|
| IPROJ-US-1234-A | IPROJ-XX-XXXX-X |
| IPROJ-EU-9999-C | IPROJ-XX-XXXX-X |
Legal: Document ID with Full Feature Set
Comprehensive scanner combining all advanced features:
[[scanners]]
name = "legal_document"
regex = "DOC-([A-Z]{3})-([0-9]{6})"
redaction_pattern = "DOC-XXX-XXXXXX"
base_confidence = 0.55
description = "Legal document identifiers"
context_signals = ["legal", "confidential", "privileged"]
[scanners.confidence_boost]
keywords = ["attorney", "legal", "privileged", "confidential", "counsel", "litigation"]
boost_amount = 0.35
proximity = 300
[scanners.confidence_reduce]
keywords = ["example", "sample", "test", "template", "draft"]
boost_amount = 0.45
proximity = 150
[scanners.validation]
min_confidence = 0.50
invalid_patterns = [
"^DOC-AAA-000000$",
"^DOC-XXX-[0-9]{6}$",
"^DOC-[A-Z]{3}-123456$"
]
Confidence scenarios:
| Context | Base | Boost | Reduce | Final |
|---|---|---|---|---|
| No keywords | 0.55 | — | — | 0.55 |
| “attorney-client” nearby | 0.55 | +0.35 | — | 0.90 |
| “example document” nearby | 0.55 | — | -0.45 | 0.10 (rejected) |
| “confidential draft” | 0.55 | +0.35 | -0.45 | 0.45 (rejected) |
The low base confidence (0.55) combined with aggressive reduce (-0.45) ensures that example/template documents are filtered even when “confidential” appears nearby.
Testing Custom Scanners
Validate your scanner configuration before deployment.
Validate Configuration
Check for syntax errors and unsafe patterns:
sudo aquilon-dlp --config /etc/aquilon/config.toml --validate-config
Successful validation:
Configuration valid
Loaded 3 custom scanners:
- patient_id (bounded regex, confidence 0.60)
- financial_account (bounded regex, confidence 0.75, validator: luhn)
- internal_project (bounded regex, confidence 0.80)
Failed validation (unsafe pattern):
Configuration error: Scanner 'bad_scanner' has unsafe regex pattern
Pattern: "ID-\d+"
Error: Unbounded repetition detected
Suggestion: Use bounded quantifiers like {1,20} instead of +
Scan Test Files
Test your scanner against sample data:
# Create test file
cat > /tmp/scanner_test.txt << 'EOF'
Patient PAT-12345678 visited on 2024-01-15.
Financial account FA-374245455406 balance.
Project IPROJ-US-1234-A is confidential.
Legal document DOC-ABC-654321 under attorney review.
EOF
# Run scan
sudo aquilon-dlp --config /etc/aquilon/config.toml --scan /tmp/scanner_test.txt
Expected output:
Scanning: /tmp/scanner_test.txt
Results:
[patient_id] PAT-XXXXXXXX (confidence: 0.90, line 1)
Context signals: healthcare, phi, hipaa
Confidence boosted by: "patient"
[financial_account] FA-XXXXXXXXXXXX (confidence: 0.95, line 2)
Context signals: financial, pci, account
Validation: luhn passed
Confidence boosted by: "account"
[internal_project] IPROJ-XX-XXXX-X (confidence: 0.95, line 3)
Context signals: internal, project, confidential
Confidence boosted by: "confidential"
[legal_document] DOC-XXX-XXXXXX (confidence: 0.90, line 4)
Context signals: legal, confidential, privileged
Confidence boosted by: "attorney"
Summary: 4 findings in 1 file
Testing Confidence Adjustments
Verify boost and reduce behavior:
# Test with boost keywords
cat > /tmp/boost_test.txt << 'EOF'
Patient medical record: PAT-12345678
EOF
sudo aquilon-dlp --config /etc/aquilon/config.toml --scan /tmp/boost_test.txt
# Expected: confidence 0.90 (0.60 base + 0.30 boost from "patient", "medical")
# Test with reduce keywords
cat > /tmp/reduce_test.txt << 'EOF'
Example ID: PAT-12345678 (test data)
EOF
sudo aquilon-dlp --config /etc/aquilon/config.toml --scan /tmp/reduce_test.txt
# Expected: confidence 0.20 (0.60 base - 0.40 reduce from "example", "test")
Testing Validation Rules
Verify checksum validation:
# Valid Luhn number (12 digits to match the scanner regex, passes checksum)
echo "FA-374245455406" > /tmp/valid_luhn.txt
sudo aquilon-dlp --config /etc/aquilon/config.toml --scan /tmp/valid_luhn.txt
# Expected: Match found
# Invalid Luhn number (fails checksum)
echo "FA-123456789012" > /tmp/invalid_luhn.txt
sudo aquilon-dlp --config /etc/aquilon/config.toml --scan /tmp/invalid_luhn.txt
# Expected: No match (fails Luhn validation)
Using Policy Integration
Test custom scanners through policies:
# Policy referencing custom scanner
cat > /tmp/policy_test.toml << 'EOF'
watch_paths = ["/tmp"]
exclude_paths = []
[[scanners]]
name = "patient_id"
regex = "PAT-([0-9]{8})"
redaction_pattern = "PAT-XXXXXXXX"
base_confidence = 0.60
[policies]
enabled_policies = ["test_policy"]
[policies.policy_configs.test_policy]
enabled = true
scanners = ["custom:patient_id"]
min_confidence = 0.5
[work_queue]
max_queue_size = 10000
submit_timeout_secs = 5
[worker]
num_workers = 0
[resource_limits]
enabled = false
[metrics]
bind_address = "127.0.0.1"
port = 9000
[cache]
enabled = true
ttl_secs = 0
[scan]
max_scan_size_mb = 40
max_recursion_depth = 5
EOF
sudo aquilon-dlp --config /tmp/policy_test.toml --scan /tmp/scanner_test.txt
Note the custom: prefix when referencing custom scanners in policies.
Troubleshooting
Common issues and solutions when working with custom scanners.
Configuration Errors
| Error Message | Cause | Solution |
|---|---|---|
| Unsafe regex pattern: unbounded repetition | Pattern uses +, *, or unbounded {n,} | Use bounded quantifiers: {1,20} instead of +, {0,100} instead of * |
| Invalid regex syntax | Malformed regular expression | Check TOML escaping: use \\d or 'single quotes' or [0-9] |
| Mismatched capture groups | Regex capture count doesn’t match X sequences | Align capture groups with redaction X runs |
| Scanner name already exists | Duplicate name field | Each scanner needs a unique name |
| Invalid base_confidence | Value outside 0.0-1.0 range | Use values between 0.0 and 1.0 |
Pattern Not Matching
Symptom: Scanner configured but no matches found.
Diagnostic steps:
- Test the regex separately:
echo "EMP-123456" | grep -E "EMP-([0-9]{6})"
- Check TOML escaping:
# These are all equivalent:
regex = "\\d{6}" # Double backslash in double quotes
regex = '\d{6}' # Single quotes (literal)
regex = "[0-9]{6}" # Character class (recommended)
- Verify the file is being scanned:
  - Check watch_paths includes the file location
  - Check exclude_paths doesn’t exclude it
  - Verify file size is under max_scan_size_mb
- Check the confidence threshold:
  - If using policies, verify min_confidence isn’t filtering matches
  - Check if confidence_reduce keywords are nearby
False Positives
Symptom: Scanner matches too many non-relevant patterns.
Solutions:
- Add validation rules:
[scanners.validation]
invalid_patterns = ["^ACCT-0{10}$", "^ACCT-12345"]
min_confidence = 0.70
- Use confidence reduce:
[scanners.confidence_reduce]
keywords = ["example", "test", "sample", "demo"]
boost_amount = 0.40
proximity = 100
- Add checksum validation:
[scanners.validation]
validator = "luhn" # or "mod10", "mod11", "iban"
Policy Integration Issues
| Error Message | Cause | Solution |
|---|---|---|
| Unknown scanner 'employee_id' | Missing custom: prefix | Use custom:employee_id in policy scanners list |
| Scanner 'custom:foo' not found | Scanner not defined | Add [[scanners]] entry with name = "foo" |
| Policy references disabled scanner | Scanner defined but not enabled | Check scanner configuration is complete |
Performance Issues
Symptom: Scanning is slow after adding custom scanners.
Solutions:
- Check pattern complexity:
  - Avoid nested alternations: (a|b|c) is fine, ((a|b)|(c|d)) is slow
  - Avoid overlapping patterns: [A-Za-z]+[a-z] creates backtracking
- Reduce proximity search:
[scanners.confidence_boost]
proximity = 100 # Smaller = faster (default is 200)
- Simplify validation:
  - invalid_patterns with simple patterns are fast
  - Complex regex in invalid_patterns can slow scanning
Redaction Issues
Symptom: Redacted output looks wrong.
| Issue | Cause | Solution |
|---|---|---|
| Partial redaction | Capture group mismatch | Ensure X count matches capture group length |
| XXX for variable data | Variable-length capture | Use fixed placeholder or document behavior |
| No prefix in output | Prefix not in pattern | Add prefix outside capture group: PREFIX-([0-9]{6}) |
Example fix:
# Wrong - captures everything including prefix
regex = "(EMP-[0-9]{6})"
redaction_pattern = "XXXXXXXXXX" # Loses prefix
# Correct - captures only sensitive part
regex = "EMP-([0-9]{6})"
redaction_pattern = "EMP-XXXXXX" # Preserves prefix
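The relationship between capture groups and X runs can be checked before a scanner is deployed. The following Python sketch is only an illustration of that idea: `check_redaction` and `redact` are hypothetical helpers, Python's `re` module stands in for Aquilon's regex engine (the simple patterns in this guide behave the same in both), and the real redaction logic may differ.

```python
import re


def check_redaction(regex: str, redaction_pattern: str) -> tuple[int, int]:
    """Return (capture_groups, x_runs) so the two counts can be compared."""
    groups = re.compile(regex).groups
    x_runs = len(re.findall(r"X+", redaction_pattern))
    return groups, x_runs


def redact(regex: str, redaction_pattern: str, text: str) -> str:
    """Replace every match with the redaction template (illustrative only)."""
    return re.sub(regex, lambda _match: redaction_pattern, text)


print(check_redaction("EMP-([0-9]{6})", "EMP-XXXXXX"))
print(redact("EMP-([0-9]{6})", "EMP-XXXXXX", "badge EMP-123456 issued"))
```

A mismatch between the two counts is the situation the "Mismatched capture groups" error describes.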
Best Practices
Guidelines for building effective, maintainable custom scanners.
Pattern Design
- Always use bounded quantifiers
  - {6} for fixed length
  - {1,20} for variable length with maximum
  - Never use +, *, or {n,} (unbounded)
- Use character classes over escape sequences
  - [0-9] instead of \d (avoids TOML escaping issues)
  - [A-Za-z0-9_] instead of \w
  - [^a-z] for negation
- Capture only sensitive data
# Good: prefix preserved, only digits captured
regex = "EMP-([0-9]{6})"
# Bad: entire match captured
regex = "(EMP-[0-9]{6})"
- Test patterns before deployment
echo "EMP-123456" | grep -E "EMP-([0-9]{6})"
Confidence Strategy
- Start with low base confidence (0.50-0.70)
  - Prevents over-alerting before context analysis
  - Allows boost/reduce to have meaningful effect
- Use boost for high-value context
  - Domain-specific keywords that indicate real data
  - Proximity 150-300 bytes for document context
- Use reduce aggressively for noise
  - Test, example, sample, demo, placeholder
  - Proximity 50-150 bytes for nearby indicators
- Document your confidence rationale
description = "Patient IDs: low base (0.60) + medical boost (0.30) = 0.90 in healthcare docs"
Validation Rules
- Always add invalid_patterns for test data
  - Common sequences: all zeros, all ones, sequential (123456)
  - Known test values from documentation
- Use checksums when available
  - Financial accounts often have Luhn/mod10 digits
  - Reduces false positives sharply: a single mod-10 check digit rejects about 90% of random digit strings
- Set appropriate min_confidence
  - 0.50-0.60 for high-recall (find everything)
  - 0.70-0.80 for balanced precision/recall
  - 0.85+ for high-precision (minimize false positives)
Organization and Maintenance
- Use descriptive names
name = "patient_mrn" # Good: specific
name = "id" # Bad: too generic
- Always include description
description = "Medical Record Numbers: MRN-XXXXXXXX format, HIPAA-regulated"
- Use context_signals for SIEM integration
context_signals = ["healthcare", "phi", "hipaa"]
These tags appear in alerts and enable filtering/routing in your SIEM.
- Group related scanners
# Healthcare scanners
[[scanners]]
name = "patient_mrn"
# ...
[[scanners]]
name = "patient_ssn"
# ...
# Financial scanners
[[scanners]]
name = "account_number"
# ...
Performance Optimization
- Order patterns by specificity
  - Most specific patterns first (fewer false matches)
  - Generic patterns last
- Minimize proximity for boost/reduce
  - Start with 100-150 bytes
  - Increase only if needed for context
- Avoid complex alternations
# Slow: nested alternations
regex = "((EMP|STAFF)-(ID|NUM))-([0-9]{6})"
# Fast: separate scanners
[[scanners]]
name = "emp_id"
regex = "EMP-ID-([0-9]{6})"
[[scanners]]
name = "staff_num"
regex = "STAFF-NUM-([0-9]{6})"
Security Considerations
- Never log sensitive data in tests
  - Use obviously fake test data
  - Don’t use real examples in documentation
- Review patterns for over-matching
  - Simple patterns like [0-9]{9} match too broadly
  - Always include prefix/format markers
- Test with production-like data volume
  - Performance issues emerge at scale
  - Run against large sample files before deployment
Custom Scanner Integration
This guide covers integrating custom scanners with policies, SIEM systems, and fleet deployments. For creating custom scanners, see Custom Scanners.
Combining Built-in and Custom Scanners
Custom scanners work alongside built-in scanners in policies. The key difference is the naming convention:
- Built-in scanners: Use the scanner name directly (e.g., ssn, email, iban)
- Custom scanners: Use the custom: prefix (e.g., custom:employee_id)
[policies]
enabled_policies = ["data_protection"]
[policies.policy_configs.data_protection]
enabled = true
settings = { confidence_threshold = "0.7" }
For advanced policy composition (AND/OR rules, thresholds), see Policy Frameworks.
Example: GDPR with Custom Identifiers
A common scenario is extending GDPR compliance with organization-specific identifiers. This example combines built-in EU data scanners with custom project codes:
# Custom scanner for internal project codes
[[scanners]]
name = "project_code"
regex = "PROJ-([A-Z]{2})-([0-9]{4})"
redaction_pattern = "PROJ-XX-XXXX"
base_confidence = 0.80
context_signals = ["internal", "confidential"]
[scanners.confidence_boost]
keywords = ["confidential", "restricted", "internal"]
boost_amount = 0.15
proximity = 150
[scanners.confidence_reduce]
keywords = ["example", "template", "documentation"]
boost_amount = 0.30
proximity = 100
[policies]
enabled_policies = ["gdpr_extended"]
[policies.policy_configs.gdpr_extended]
enabled = true
settings = { confidence_threshold = "0.7" }
For complete GDPR scanner mappings and compliance guidance, see GDPR Compliance.
Reducing False Positives from Test Files
Development environments often contain test fixtures with fake sensitive data. Use these strategies to reduce false positives.
Path-Based Exclusions
Exclude entire directories from scanning using global exclude_paths:
watch_paths = ["/home/%%", "/var/data/%%"]
exclude_paths = [
# Test directories
"/home/*/projects/*/tests/%%",
"/home/*/projects/*/test/%%",
"/home/*/projects/*/__tests__/%%",
# Test fixtures and mock data
"/home/*/projects/*/fixtures/%%",
"/home/*/projects/*/mock-data/%%",
"/home/*/projects/*/testdata/%%",
# Build artifacts
"/home/*/projects/*/node_modules/%%",
"/home/*/projects/*/target/%%",
"/home/*/projects/*/.git/%%"
]
Keyword-Based Confidence Reduction
For files that can’t be excluded by path, use confidence_reduce to lower confidence when test-related keywords appear nearby:
[[scanners]]
name = "customer_id"
regex = "CUST-([0-9]{8})"
redaction_pattern = "CUST-XXXXXXXX"
base_confidence = 0.80
[scanners.confidence_reduce]
keywords = [
# Test indicators
"test", "spec", "mock", "fake", "dummy",
# Documentation indicators
"example", "sample", "demo", "placeholder",
# Development indicators
"fixture", "seed", "factory"
]
boost_amount = 0.50
proximity = 100
With base_confidence = 0.80 and boost_amount = 0.50, matches near test keywords drop to 0.30 confidence, which typically falls below policy thresholds.
For detailed confidence tuning patterns, see Custom Scanners - Confidence Tuning. For global configuration options, see Configuration.
SIEM Integration
Custom scanner findings flow to your SIEM through the OSQuery aquilon_dlp_alerts table.
How context_signals Flow to Alerts
The context_signals you define on custom scanners appear in alert metadata:
[[scanners]]
name = "patient_id"
regex = "PAT-([0-9]{8})"
redaction_pattern = "PAT-XXXXXXXX"
base_confidence = 0.85
context_signals = ["healthcare", "phi", "hipaa"] # These flow to alerts
Key Alert Fields for Custom Scanners
Query custom scanner alerts via OSQuery:
SELECT
timestamp,
path,
scanner,
confidence,
policy,
severity,
context
FROM aquilon_dlp_alerts
WHERE scanner LIKE 'custom:%'
ORDER BY timestamp DESC
LIMIT 100;
The context JSON field contains context_signals for SIEM filtering and routing.
Splunk Integration Example
Schedule OSQuery to export alerts, then query in Splunk:
index=osquery sourcetype=osquery:results name=aquilon_dlp_alerts
| spath input=columns.context
| search context_signals="*healthcare*"
| stats count by scanner, severity, policy
For complete SIEM integration including Elastic Stack, see Monitoring - SIEM Integration. For the full alert schema, see API Integration.
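For pipelines that post-process exported alerts before they reach the SIEM, the same tag-based filtering can be sketched in Python. Everything here is an assumption for illustration: each input line is taken to be an osquery result record with a `columns` object, the `context` column is taken to be a JSON string with a `context_signals` list (as the Splunk query implies), and the routing destinations are made-up names.

```python
import json
from collections import defaultdict


def route_by_signal(result_lines, routes):
    """Group alert rows by the first route whose tag appears in context_signals."""
    routed = defaultdict(list)
    for line in result_lines:
        columns = json.loads(line).get("columns", {})
        context = json.loads(columns.get("context") or "{}")
        signals = set(context.get("context_signals", []))
        for tag, destination in routes.items():
            if tag in signals:
                routed[destination].append(columns)
                break
        else:
            # No configured tag matched this alert.
            routed["default"].append(columns)
    return dict(routed)


# Hypothetical exported record for a custom scanner alert.
sample = json.dumps({
    "name": "aquilon_dlp_alerts",
    "columns": {
        "scanner": "custom:patient_id",
        "severity": "high",
        "context": json.dumps({"context_signals": ["healthcare", "phi", "hipaa"]}),
    },
})
routes = {"healthcare": "privacy-team", "financial": "finance-team"}
print(route_by_signal([sample], routes))
```

Check the actual alert schema in API Integration before relying on these field names.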
Fleet Deployment
Deploy custom scanner configurations across your fleet using MDM or configuration management tools.
Centralized Configuration
- Create a base configuration with your custom scanners and policies
- Deploy via MDM (Jamf, Intune, Kandji) to managed devices
- Verify deployment using OSQuery fleet queries
Example verification query to confirm custom scanners are active:
SELECT
name,
version,
status
FROM aquilon_dlp_status
WHERE status = 'running';
Deployment Resources
- MDM deployment guide: MDM Deployment - PPPC profiles, staged rollout
- Enterprise patterns: Enterprise Deployment - pilot groups, success metrics
- macOS requirements: macOS Installation - Full Disk Access, entitlements
Performance Considerations
Custom scanners add minimal overhead, but keep these guidelines in mind for large fleets:
Scanner Count
- 10-20 custom scanners: Negligible performance impact
- 20-50 custom scanners: Monitor scan latency metrics
- 50+ custom scanners: Consider splitting into multiple policies by use case
Proximity Search Tuning
Large proximity values in confidence_boost/confidence_reduce increase memory usage per scan:
[scanners.confidence_boost]
keywords = ["confidential"]
boost_amount = 0.20
proximity = 100 # Recommended: 100-200 bytes
# proximity = 1000 # Avoid: increases memory per match
Monitoring Performance
Track scanner performance via Prometheus metrics:
- aquilon_scan_duration_seconds - Per-file scan time
- aquilon_scanner_matches_total - Matches by scanner name
- aquilon_queue_depth - Work queue backlog
For metrics setup, see Monitoring.
CLI Reference
Aquilon DLP provides command-line tools for testing, validating, and debugging your DLP configuration before deploying to production.
Quick Reference
| Command | Purpose |
|---|---|
| --validate-config | Validate configuration file syntax and references |
| --list-scanners | List all available scanners (built-in + custom) |
| --list-policies | List all available policies (built-in + custom) |
| --test-scanner | Test a specific scanner against a file |
| --test-policy | Test a specific policy against a file |
| --dry-run | Scan a file without database persistence |
| --maintenance-now | Run database maintenance immediately |
Configuration Validation
--validate-config
Validates your configuration file for syntax errors, invalid regex patterns, and missing scanner references.
Syntax:
aquilon-dlp --validate-config <config-file>
Example:
aquilon-dlp --validate-config /etc/aquilon/config.toml
Output: Returns exit code 0 if valid, non-zero with error details if invalid.
Use when:
- After editing configuration files
- Before deploying configuration changes
- Validating custom scanner regex patterns
Discovery Commands
--list-scanners
Lists all available scanners including built-in scanners and any custom scanners defined in your configuration.
Syntax:
aquilon-dlp --list-scanners [--config <config-file>]
Example:
aquilon-dlp --list-scanners
Sample output:
ssn
credit_card
email
phone
iban
ip_address
aws_key
custom:employee_id
Built-in scanners include:
- ssn - US Social Security Numbers
- credit_card - Credit/debit card numbers (Visa, MC, Amex, etc.)
- email - Email addresses
- phone - Phone numbers (US and international)
- iban - International Bank Account Numbers
- ip_address - IPv4 and IPv6 addresses
- aws_key - AWS access keys and secrets
- And more…
--list-policies
Lists all available policies including built-in compliance frameworks and any custom policies defined in your configuration.
Syntax:
aquilon-dlp --list-policies [--config <config-file>]
Example:
aquilon-dlp --list-policies
Sample output:
hipaa
gdpr
pci_dss
sox
ccpa
iso27001
custom:internal_data
Built-in policies include:
- hipaa - Health Insurance Portability and Accountability Act
- gdpr - General Data Protection Regulation
- pci_dss - Payment Card Industry Data Security Standard
- sox - Sarbanes-Oxley Act
- ccpa - California Consumer Privacy Act
- iso27001 - ISO/IEC 27001 Information Security
Testing Commands
--test-scanner
Tests a specific scanner against a file and outputs JSON results with any findings.
Syntax:
aquilon-dlp --test-scanner <scanner-name> --test-file <file-path>
Example:
aquilon-dlp --test-scanner ssn --test-file /var/test-data/sample-data.csv
Output: JSON with findings array:
{
"scanner": "ssn",
"file": "/var/test-data/sample-data.csv",
"findings": [
{
"matched_text": "123-45-6789",
"position": 10,
"confidence": 0.85,
"redacted_text": "XXX-XX-6789"
}
],
"duration_ms": 5
}
Use when:
- Developing custom scanners
- Debugging detection issues
- Verifying scanner behavior
--test-policy
Tests a specific policy against a file and outputs JSON results with any violations.
Syntax:
aquilon-dlp --test-policy <policy-name> --test-file <file-path>
Example:
aquilon-dlp --test-policy gdpr --test-file /var/test-data/sample-data.csv
Output: JSON with policy evaluation results:
{
"policy": "gdpr",
"file": "/var/test-data/sample-data.csv",
"matched": true,
"violations": [
{
"rule_id": "Article-4",
"description": "Unprotected personal data detected - email address violates GDPR requirements",
"severity": "Medium",
"evidence_count": 1
},
{
"rule_id": "Article-32",
"description": "Unprotected financial personal data detected - violates GDPR security requirements",
"severity": "High",
"evidence_count": 1
}
],
"total_findings": 2,
"scan_duration_ms": 19
}
Use when:
- Developing custom policies
- Testing policy rules
- Verifying compliance detection
--dry-run
Scans a file using all configured scanners and policies without persisting to the database. Outputs JSON results to stdout.
Syntax:
aquilon-dlp --dry-run <file-path> [--config <config-file>]
Example:
aquilon-dlp --dry-run /var/test-data/sample-data.csv
Output: JSON with complete scan results:
{
"file": "/var/test-data/sample-data.csv",
"mime_type": "application/octet-stream",
"file_size_bytes": 2929,
"scan_duration_ms": 1648,
"findings": [
{
"scanner": "PCI-DSS_policy",
"matched_text": "xxxx-xxxx-xxxx-5516, xxxx-xxxx-xxxx-3020, xxxx-xxxx-xxxx-6147",
"position": 0,
"confidence": 1.0,
"pattern_type": "cc",
"redacted": "xxxx-xxxx-xxxx-5516, xxxx-xxxx-xxxx-3020, xxxx-xxxx-xxxx-6147"
},
{
"scanner": "GDPR_policy",
"matched_text": "xxxx-xxxx-xxxx-5516, xxxx-xxxx-xxxx-3020, xxxx-xxxx-xxxx-6147",
"position": 0,
"confidence": 1.0,
"pattern_type": "cc",
"redacted": "xxxx-xxxx-xxxx-5516, xxxx-xxxx-xxxx-3020, xxxx-xxxx-xxxx-6147"
}
],
"policies_matched": [
"GDPR",
"PCI-DSS"
],
"total_findings": 2
}
Use when:
- Testing files before enabling monitoring
- Debugging why files are (or aren’t) flagged
- One-off scans without affecting database
- CI/CD pipeline integration
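For the CI/CD case, a pipeline step might wrap `--dry-run` and fail the job when findings exceed a chosen confidence. The Python sketch below is one possible shape, not an official integration: the `findings`, `scanner`, and `confidence` fields come from the sample output above, while the 0.7 cutoff, the function names, and the assumption that a non-zero exit status means the scan itself failed are all hypothetical.

```python
import json
import subprocess
import sys


def evaluate(report: dict, min_confidence: float = 0.7) -> list[dict]:
    """Return findings at or above min_confidence from a --dry-run report."""
    return [
        finding for finding in report.get("findings", [])
        if finding.get("confidence", 0.0) >= min_confidence
    ]


def gate(path: str, binary: str = "aquilon-dlp") -> int:
    """Run a dry-run scan and return an exit code for the CI job."""
    proc = subprocess.run([binary, "--dry-run", path], capture_output=True, text=True)
    if proc.returncode != 0:
        # Assumption: non-zero means the scan could not run, not that data was found.
        print(proc.stderr, file=sys.stderr)
        return 2
    blocking = evaluate(json.loads(proc.stdout))
    for finding in blocking:
        print(f"{path}: {finding['scanner']} (confidence {finding['confidence']})")
    return 1 if blocking else 0
```

A pipeline script could then call `sys.exit(gate(path))` for each changed file, substituting the platform-specific binary name where needed.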
Maintenance Commands
--maintenance-now
Runs database maintenance tasks immediately and exits. This includes cleanup of old findings, cache eviction, and vacuum operations.
Syntax:
aquilon-dlp --maintenance-now [--config <config-file>]
Example:
aquilon-dlp --maintenance-now --config /etc/aquilon/config.toml
Output: JSON with maintenance results:
{
"soft_deleted": 42,
"hard_deleted": 15,
"cache_evicted": 128,
"pages_vacuumed": 1000,
"duration_ms": 234,
"errors": []
}
Use when:
- Before database backups
- After bulk data imports
- To reclaim disk space immediately
- Troubleshooting database issues
Testing Workflow
When developing custom scanners or policies, use this recommended workflow:
- Validate configuration after any changes:
aquilon-dlp --validate-config /etc/aquilon/config.toml
- List available scanners to verify custom scanners loaded:
aquilon-dlp --list-scanners --config /etc/aquilon/config.toml
- Test individual scanner against sample files:
aquilon-dlp --test-scanner my_custom_scanner --test-file sample.txt
- Test policy to verify detection rules:
aquilon-dlp --test-policy my_policy --test-file sample.txt
- Dry-run scan to see full results:
aquilon-dlp --dry-run sample.txt --config /etc/aquilon/config.toml
Platform Notes
The binary name varies by platform and edition:
| Platform | Edition | Binary Name |
|---|---|---|
| Linux | Basic | aquilon-dlp-basic |
| Linux | Enterprise | aquilon-dlp-enterprise |
| macOS | Enterprise | aquilon-dlp (in app bundle) |
Examples in this documentation use aquilon-dlp for simplicity. Substitute with your platform-specific binary name.
Policy Frameworks
Aquilon DLP includes built-in compliance policy frameworks that automatically classify findings and generate violations according to regulatory requirements. You can also create custom policies using TOML configuration.
Built-in Compliance Frameworks
Overview
| Framework | Standard | Key Controls | Edition |
|---|---|---|---|
| GDPR | EU General Data Protection Regulation | Articles 5, 32, 33 | All |
| CCPA | California Consumer Privacy Act | Sections 1798.100-199 | All |
| HIPAA | Health Insurance Portability and Accountability Act | Sections 164.306, 164.312 | Enterprise |
| PCI DSS | Payment Card Industry Data Security Standard | Requirements 3, 4, 12 | Enterprise |
| SOX | Sarbanes-Oxley Act | Sections 302, 404, 409 | Enterprise |
| ISO 27001 | Information Security Management | Controls A.8.12, A.5.12, A.8.11 | Enterprise |
| CUI | Controlled Unclassified Information | NIST SP 800-171 | Enterprise |
| CMMC | Cybersecurity Maturity Model Certification | DFARS 252.204-7012 | Enterprise |
| FedRAMP | Federal Risk and Authorization Management | NIST SP 800-53 | Enterprise |
| FISMA | Federal Information Security Modernization Act | FIPS 199, NIST SP 800-53 | Enterprise |
GDPR (General Data Protection Regulation)
The GDPR policy detects EU personal data subject to data protection regulations.
Detected Data Types:
- Personal identifiers (names, addresses, phone numbers)
- Email addresses
- National identification numbers
- Financial account data
- Health information
Configuration:
[policies]
enabled_policies = ["gdpr"]
[policies.policy_configs.gdpr]
enabled = true
settings = { confidence_threshold = "0.7", requires_cc_context = "true" }
Context-Aware Credit Card Detection:
By default, the GDPR policy reports a credit card number only when payment context keywords appear with it. This reduces false positives from Luhn-valid numbers appearing in non-payment contexts (JSON logs, test files, etc.).
| Setting | Default | Effect |
|---|---|---|
| requires_cc_context | "true" | CC findings require payment context keywords |
Payment context keywords: payment, card, merchant, transaction, billing, invoice
To restore legacy behavior (alert on all Luhn-valid credit cards regardless of context):
settings = { requires_cc_context = "false" }
CCPA (California Consumer Privacy Act)
The CCPA policy detects California consumer personal information.
Detected Data Types:
- Personal identifiers
- Social Security numbers
- Driver’s license numbers
- Financial information
- Geolocation data
- Biometric information
Configuration:
[policies]
enabled_policies = ["ccpa"]
[policies.policy_configs.ccpa]
enabled = true
settings = { confidence_threshold = "0.7" }
HIPAA (Health Insurance Portability and Accountability Act)
Enterprise Edition Only
The HIPAA policy detects Protected Health Information (PHI).
Detected Data Types:
- Medical record numbers
- Health plan beneficiary numbers
- Social Security numbers
- Names with medical details
- Dates of service
- Provider information
Configuration:
[policies]
enabled_policies = ["hipaa"]
[policies.policy_configs.hipaa]
enabled = true
settings = { confidence_threshold = "0.8" }
PCI DSS (Payment Card Industry Data Security Standard)
Enterprise Edition Only
The PCI DSS policy detects payment card data.
Detected Data Types:
- Credit card numbers (validated with Luhn algorithm)
- Card security codes (CVV/CVC)
- Cardholder names
- Expiration dates
- Magnetic stripe data
Configuration:
[policies]
enabled_policies = ["pci_dss"]
[policies.policy_configs.pci_dss]
enabled = true
settings = { alert_on_test_data = "false", requires_cc_context = "true" }
Context-Aware Credit Card Detection:
By default, the PCI DSS policy reports a credit card number only when payment context keywords appear with it. This reduces false positives from Luhn-valid numbers appearing in non-payment contexts (JSON logs, test files, etc.).
| Setting | Default | Effect |
|---|---|---|
| requires_cc_context | "true" | CC findings require payment context keywords |
Payment context keywords: payment, card, merchant, transaction, billing, invoice
To restore legacy behavior (alert on all Luhn-valid credit cards regardless of context):
settings = { requires_cc_context = "false" }
SOX (Sarbanes-Oxley Act)
Enterprise Edition Only
The SOX policy detects financial data subject to internal controls.
Detected Data Types:
- Financial statements
- Account numbers
- Transaction identifiers
- Audit information
- Executive communications
Configuration:
[policies]
enabled_policies = ["sox"]
[policies.policy_configs.sox]
enabled = true
settings = { confidence_threshold = "0.85" }
ISO 27001:2022
Enterprise Edition Only
The ISO 27001:2022 policy implements information security management controls, particularly Control A.8.12 (Data leakage prevention), which explicitly mandates DLP capabilities.
Features:
- 4-level data classification: Restricted, Confidential, Internal, Public
- Automatic classification of all 33 scanners by sensitivity
- Configurable controls for data masking, encryption, access
Detected Data Types:
- All categories classified by sensitivity level
- Automatic assignment based on scanner type
Configuration:
[policies]
enabled_policies = ["iso27001"]
[policies.policy_configs.iso27001]
enabled = true
settings = { confidence_threshold = "0.7", enforce_data_masking = "true" }
Enabling Multiple Policies
You can enable multiple policies simultaneously:
[policies]
enabled_policies = ["gdpr", "hipaa", "pci_dss", "sox", "ccpa", "iso27001"]
Each policy evaluates scan findings independently and generates violations according to its regulatory framework. A single file might trigger alerts from multiple policies if it contains different types of sensitive data.
Custom Policies
Aquilon DLP supports custom policies and scanners to detect company-specific data patterns without writing code.
Creating Custom Scanners
Define scanners for proprietary identifiers:
[[scanners]]
name = "employee_id"
regex = "EMP-([0-9]{6})"
redaction_pattern = "EMP-XXXXXX"
base_confidence = 0.85
description = "ACME Corp employee IDs"
context_signals = ["hr", "confidential", "personnel"]
[scanners.confidence_boost]
keywords = ["employee", "personnel", "payroll", "badge"]
boost_amount = 0.10
proximity = 200
Scanner Fields:
| Field | Required | Description |
|---|---|---|
| name | Yes | Unique identifier (alphanumeric + underscore) |
| regex | Yes | Pattern to match (must be bounded) |
| redaction_pattern | Yes | Template for redacting matches |
| base_confidence | Yes | Base confidence score (0.0 - 1.0) |
| description | No | Human-readable description |
| context_signals | No | Keywords for classification |
| confidence_boost | No | Boost confidence when keywords found nearby |
Pattern Safety
All regex patterns must be bounded to prevent performance issues:
# SAFE - bounded patterns
[[scanners]]
name = "fixed_length"
regex = "EMP-([0-9]{6})" # Fixed length
Unsafe patterns (unbounded) will be rejected:
\d+, .*, [A-Z]+
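A rough pre-check for unbounded quantifiers can be sketched in Python and run before `--validate-config`. This is an illustration only, not Aquilon's validator: the function name `find_unbounded` is hypothetical, and the scan ignores edge cases such as a literal `]` placed first inside a character class.

```python
import re


def find_unbounded(pattern: str) -> list[str]:
    """Return unbounded quantifiers (+, *, {n,}) found outside character classes."""
    hits = []
    index = 0
    in_class = False
    while index < len(pattern):
        char = pattern[index]
        if char == "\\":
            index += 2  # skip the escaped character, e.g. \d or \+
            continue
        if in_class:
            if char == "]":
                in_class = False
        elif char == "[":
            in_class = True
        elif char in "+*":
            hits.append(char)
        elif char == "{":
            open_ended = re.match(r"\{\d+,\}", pattern[index:])
            if open_ended:
                hits.append(open_ended.group(0))
        index += 1
    return hits


for candidate in ["EMP-([0-9]{6})", r"\d+", ".*", "[A-Z]+", "[0-9]{2,}"]:
    print(candidate, find_unbounded(candidate))
```

An empty list means no unbounded quantifier was spotted; the configuration validator remains the authority on what is accepted.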
Dictionary Scanners
Dictionary scanners detect words and phrases from configurable inline lists using the Aho-Corasick algorithm for efficient O(n) multi-pattern matching.
When to Use Dictionary Scanners
- Detect lists of keywords or terms (medical terms, project codes, product names)
- Match multi-word phrases (e.g., “social security number”, “patient record”)
- Domain-specific vocabulary that doesn’t follow a regex pattern
Basic Configuration
[[dictionary_scanners]]
name = "medical_terms"
words = [
"diagnosis",
"prescription",
"patient record",
"medical history"
]
case_sensitive = false
match_whole_words = true
base_confidence = 0.85
Configuration Fields
| Field | Type | Default | Description |
|---|---|---|---|
| name | String | Required | Unique scanner identifier (alphanumeric + underscore) |
| words | Array | Required | Words and phrases to detect |
| case_sensitive | Boolean | false | Case-sensitive matching |
| match_whole_words | Boolean | true | Match only at word boundaries |
| base_confidence | Float | 0.8 | Base confidence score (0.0-1.0) |
| min_matches | Integer | None | Minimum matches required to report |
| match_proximity | Integer | None | Maximum bytes between matches |
| description | String | None | Human-readable description |
| context_signals | Array | None | Keywords for classification |
Advanced: Match Constraints
Use min_matches and match_proximity to reduce false positives by requiring multiple terms to appear together:
[[dictionary_scanners]]
name = "hipaa_terms"
words = [
"protected health information",
"PHI",
"patient",
"medical record",
"diagnosis",
"treatment"
]
base_confidence = 0.75
min_matches = 2
match_proximity = 500
This configuration only reports findings when at least 2 terms appear within 500 bytes of each other.
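The effect of these two constraints can be approximated in a few lines of Python. This sketch is not the scanner's implementation: it uses a plain regex alternation rather than Aho-Corasick, treats character offsets as bytes (true for ASCII text), counts every match rather than distinct terms, and measures proximity between match start offsets. Those choices, and the name `dictionary_hits`, are assumptions.

```python
import re

HIPAA_WORDS = [
    "protected health information", "PHI", "patient",
    "medical record", "diagnosis", "treatment",
]


def dictionary_hits(text, words, min_matches=1, match_proximity=None):
    """Return (offset, word) hits if the match constraints are met, else []."""
    # Longest alternatives first so multi-word phrases win over shorter terms.
    alternatives = "|".join(re.escape(w) for w in sorted(words, key=len, reverse=True))
    pattern = re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)
    hits = [(m.start(), m.group(0)) for m in pattern.finditer(text)]
    if len(hits) < min_matches:
        return []
    if match_proximity is None:
        return hits
    # Slide a window of min_matches consecutive hits and check its span.
    for i in range(len(hits) - min_matches + 1):
        window = hits[i:i + min_matches]
        if window[-1][0] - window[0][0] <= match_proximity:
            return hits
    return []


print(dictionary_hits("The patient received a diagnosis today.", HIPAA_WORDS, 2, 500))
print(dictionary_hits("one patient only", HIPAA_WORDS, 2, 500))
```

The second call returns an empty list because a single term does not satisfy `min_matches = 2`.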
Advanced: Confidence Adjustments
Boost or reduce confidence based on nearby keywords:
[[dictionary_scanners]]
name = "project_codenames"
words = ["Project Alpha", "Operation Gamma", "Initiative Delta"]
base_confidence = 0.70
boost_keywords = ["confidential", "restricted", "internal only"]
boost_amount = 0.20
reduce_keywords = ["example", "test", "demo", "sample"]
reduce_amount = 0.30
When “confidential” appears nearby, confidence increases from 0.70 to 0.90. When “test” appears nearby, confidence decreases from 0.70 to 0.40.
Referencing Dictionary Scanners in Policies
Dictionary scanners use the custom: prefix when referenced in policies:
[[custom_policies]]
name = "healthcare_data"
enabled = true
required_scanners = ["custom:medical_terms", "ssn", "email"]
[[custom_policies.rules]]
id = "phi_exposure"
severity = "high"
[custom_policies.rules.composition]
operator = "AND"
proximity = 500
[[custom_policies.rules.composition.conditions]]
scanner = "custom:medical_terms"
min_confidence = 0.70
[[custom_policies.rules.composition.conditions]]
scanner = "ssn"
min_confidence = 0.75
Built-in Validators
Validators provide checksum or format validation for regex matches, significantly reducing false positives by verifying that detected patterns are mathematically valid.
Available Validators
| Validator | Algorithm | Use Case |
|---|---|---|
| luhn | Luhn (mod 10) | Credit cards, IMEI numbers |
| mod10 | Modulo 10 | Various identifiers with check digits |
| mod11 | Modulo 11 | ISBN-10, some national IDs |
| iban | IBAN checksum | International Bank Account Numbers |
Using Validators in Custom Scanners
Add a validator to filter out matches that fail checksum validation:
[[scanners]]
name = "company_account"
regex = "ACCT-([0-9]{10})"
redaction_pattern = "ACCT-XXXXXXXXXX"
base_confidence = 0.85
[scanners.validation]
validator = "luhn"
min_confidence = 0.70
invalid_patterns = ["^0+$", "1234567890$"]
Validation Configuration Fields
| Field | Type | Description |
|---|---|---|
| validator | String | Checksum validator: luhn, mod10, mod11, iban |
| min_confidence | Float | Minimum confidence threshold (0.0-1.0) |
| invalid_patterns | Array | Regex patterns to reject (e.g., all zeros) |
Example: Credit Card with Luhn Validation
The built-in credit card scanner already uses Luhn validation internally. For custom patterns that should use Luhn:
[[scanners]]
name = "loyalty_card"
regex = "([0-9]{4})([0-9]{4})([0-9]{4})([0-9]{4})"
redaction_pattern = "XXXX-XXXX-XXXX-XXXX"
base_confidence = 0.80
description = "16-digit loyalty card numbers with Luhn check"
[scanners.validation]
validator = "luhn"
invalid_patterns = ["^0{16}$", "^1{16}$"]
This configuration:
- Matches any 16-digit number
- Validates it passes the Luhn checksum
- Rejects all-zeros and all-ones patterns
- Reports only valid matches
Confidence Scoring
Aquilon DLP uses weighted confidence scoring to reduce false positives. Confidence can be boosted by nearby keywords or reduced by negative indicators.
How Confidence Works
Each scanner assigns a base_confidence score (0.0 to 1.0). This score can be adjusted based on:
- Nearby positive keywords → Boost confidence (more likely a real match)
- Nearby negative keywords → Reduce confidence (likely a false positive)
- Validator success → Maintains or boosts confidence
- Validator failure → Match is discarded
Boosting Confidence with Keywords
When specific keywords appear near a match, boost the confidence:
[[scanners]]
name = "employee_id"
regex = "EMP-([0-9]{6})"
redaction_pattern = "EMP-XXXXXX"
base_confidence = 0.75
[scanners.confidence_boost]
keywords = ["employee", "badge", "payroll", "personnel", "HR"]
boost_amount = 0.15
proximity = 200
If “employee” or “payroll” appears within 200 bytes, confidence increases from 0.75 to 0.90.
Reducing Confidence with Negative Keywords
When negative keywords appear near a match, reduce the confidence to suppress likely false positives:
[[scanners]]
name = "ssn_custom"
regex = "([0-9]{3})-([0-9]{2})-([0-9]{4})"
redaction_pattern = "XXX-XX-XXXX"
base_confidence = 0.80
[scanners.confidence_reduce]
keywords = ["example", "test", "fake", "sample", "xxx", "000-00-0000"]
boost_amount = 0.50
proximity = 100
If “example” or “test” appears within 100 bytes, confidence is reduced by 0.50 (from 0.80 to 0.30).
Combining Boost and Reduce
You can use both boost and reduce on the same scanner:
[[scanners]]
name = "account_number"
regex = "ACC-([0-9]{8})"
redaction_pattern = "ACC-XXXXXXXX"
base_confidence = 0.70
[scanners.confidence_boost]
keywords = ["account", "balance", "statement", "transaction"]
boost_amount = 0.20
proximity = 150
[scanners.confidence_reduce]
keywords = ["example", "test", "demo", "documentation"]
boost_amount = 0.40
proximity = 100
Confidence calculation:
- Base: 0.70
- With “account” nearby: 0.70 + 0.20 = 0.90
- With “test” nearby: 0.70 - 0.40 = 0.30
- With both: Boost and reduce are applied independently based on proximity
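The arithmetic above can be reproduced with a small Python sketch. It rests on assumptions the text does not confirm: proximity is measured in characters on either side of the match, keywords are matched as case-insensitive substrings, boost and reduce are simply added and subtracted, and the result is clamped to the range 0.0 to 1.0. The helper names are hypothetical.

```python
def keyword_near(text, start, end, keywords, proximity):
    """True if any keyword occurs within `proximity` characters of the match."""
    before = text[max(0, start - proximity):start]
    after = text[end:end + proximity]
    window = (before + " " + after).lower()
    return any(k.lower() in window for k in keywords)


def adjusted_confidence(text, start, end, base, boost=None, reduce=None):
    """Apply boost and reduce independently, then clamp (assumed behaviour)."""
    score = base
    if boost and keyword_near(text, start, end, boost["keywords"], boost["proximity"]):
        score += boost["boost_amount"]
    if reduce and keyword_near(text, start, end, reduce["keywords"], reduce["proximity"]):
        # The config above reuses the boost_amount key for the reduction size.
        score -= reduce["boost_amount"]
    return round(min(1.0, max(0.0, score)), 2)


BOOST = {"keywords": ["account", "balance", "statement", "transaction"],
         "boost_amount": 0.20, "proximity": 150}
REDUCE = {"keywords": ["example", "test", "demo", "documentation"],
          "boost_amount": 0.40, "proximity": 100}
```

Under these assumptions, a match with both "account" and "test" nearby would score 0.70 + 0.20 - 0.40 = 0.50.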
Creating Custom Policies
Define policies to enforce business rules:
[[custom_policies]]
name = "employee_data_protection"
description = "Detects employee PII exposure"
enabled = true
required_scanners = ["ssn", "custom:employee_id", "email"]
[[custom_policies.rules]]
id = "employee_pii_leak"
severity = "high"
remediation = "Contact HR compliance - do not share file"
[custom_policies.rules.composition]
operator = "AND"
proximity = 500
[[custom_policies.rules.composition.conditions]]
scanner = "custom:employee_id"
min_confidence = 0.70
[[custom_policies.rules.composition.conditions]]
scanner = "ssn"
min_confidence = 0.75
[custom_policies.rules.exclusions]
file_patterns = ["*/hr/authorized/*", "*/payroll/approved/*"]
Rule Types
Composition Rules (AND/OR Logic)
Alert when multiple data types appear together:
[custom_policies.rules.composition]
operator = "AND" # All conditions must match
proximity = 500 # Within 500 characters
[[custom_policies.rules.composition.conditions]]
scanner = "ssn"
min_confidence = 0.75
[[custom_policies.rules.composition.conditions]]
scanner = "email"
min_confidence = 0.70
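To make the AND/OR and proximity semantics concrete, here is a minimal Python sketch of how such a rule could be evaluated. The `Finding` structure and its `offset` field are hypothetical, and treating proximity as the maximum distance between match offsets is an assumption rather than documented behaviour.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Finding:
    scanner: str       # e.g. "ssn" or "custom:employee_id"
    offset: int        # position of the match in the file (hypothetical)
    confidence: float  # 0.0 to 1.0


def composition_matches(findings, conditions, operator="AND", proximity=500):
    if not conditions:
        return False
    # For each condition, keep findings from that scanner above min_confidence.
    per_condition = [
        [f for f in findings
         if f.scanner == c["scanner"] and f.confidence >= c["min_confidence"]]
        for c in conditions
    ]
    if operator == "OR":
        return any(per_condition)
    if not all(per_condition):
        return False
    # AND: one finding per condition must fit inside the proximity window.
    return any(
        max(f.offset for f in combo) - min(f.offset for f in combo) <= proximity
        for combo in product(*per_condition)
    )
```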
Threshold Rules (Count-Based)
Alert when count exceeds threshold (bulk export detection):
[custom_policies.rules.threshold]
scanner = "custom:employee_id"
operator = "greater_equal"
count = 10
Supported comparisons: >, >=, <, <=, == (the example above uses greater_equal, which corresponds to >=).
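A threshold rule amounts to counting findings from one scanner and comparing the count. The Python sketch below illustrates this; only the name greater_equal appears in this guide, so the table maps that name plus the symbolic forms and makes no claim about other operator names the configuration accepts. `threshold_met` is a hypothetical helper.

```python
import operator

OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    "greater_equal": operator.ge,  # name used in the threshold example
}


def threshold_met(scanner_names, scanner, op, count):
    """Count findings produced by `scanner` and compare against `count`."""
    observed = sum(1 for name in scanner_names if name == scanner)
    return OPERATORS[op](observed, count)
```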
Context Rules (Exclusions)
Control when rules fire based on context signals and file paths:
[custom_policies.rules.context]
requires_any = ["external", "public", "shared"]
[custom_policies.rules.exclusions]
file_patterns = ["*/hr/authorized/*"]
requires_context_signals = ["approved", "authorized"]
Scanner References
When referencing scanners in policies:
- Built-in scanners: Use the direct name (ssn, email, cc)
- Custom scanners: Use the custom: prefix (custom:employee_id)
# Scanner references in required_scanners
required_scanners = [
"ssn", # Built-in
"email", # Built-in
"custom:employee_id", # Custom
"custom:project_code" # Custom
]
Adding Custom Policies
Custom scanners and policies are defined directly in your main configuration file using [[scanners]] and [[custom_policies]] sections:
# In aquilon_dlp_config.toml
[[scanners]]
name = "employee_id"
regex = "EMP-([0-9]{6})"
base_confidence = 0.85
[[custom_policies]]
name = "employee_data_protection"
enabled = true
required_scanners = ["custom:employee_id", "ssn"]
Validate your configuration:
sudo aquilon-dlp --config /etc/aquilon/aquilon_dlp_config.toml --validate-config
Built-in Scanners
Aquilon DLP includes 50+ built-in scanner plugins across multiple categories:
| Category | Scanner Count | Examples |
|---|---|---|
| National IDs | 28 | EU, Americas, Asia-Pacific, Middle East national IDs |
| PII | 8 | SSN, email, phone, address, date of birth |
| Financial | 5 | Credit card, bank account, IBAN, CVV |
| Medical | 6 | MRN, NPI, MBI, medical device IDs |
| Government | 3 | Passport, driver’s license, vehicle identifier |
| Technical | 3 | API keys, database connections, crypto keys |
| Business | 5 | Executive communications, financial figures, audit docs |
All scanners integrate automatically with compliance policies.
National ID Scanners
Aquilon DLP includes comprehensive national ID detection with country-specific checksum validation:
Europe (14 scanners):
| Country | Scanner | Format | Validation |
|---|---|---|---|
| France | france_nir | 15 digits (NIR) | Mod 97 |
| Germany | germany_steurid | 11 digits (Steuer-ID) | Format rules |
| Italy | italy_cf | 16 chars (Codice Fiscale) | Mod 26 |
| Spain | spain_dni | 8-9 chars (DNI/NIE) | Mod 23 |
| Poland | poland_pesel | 11 digits (PESEL) | Weighted mod 10 |
| Netherlands | netherlands_bsn | 9 digits (BSN) | 11-proof |
| Belgium | belgium_nrn | 11 digits (NRN) | Mod 97 |
| UK | uk_nino | 9 chars (NINO) | Format rules |
| Sweden | sweden_personnummer | 10-12 digits | Luhn |
| Norway | norway_fodselsnummer | 11 digits | Dual mod-11 |
| Finland | finland_hetu | 11 chars (HETU) | Mod 31 |
| Portugal | portugal_nif | 9 digits (NIF) | Weighted mod 11 |
| Romania | romania_cnp | 13 digits (CNP) | Weighted mod 11 |
| Czech/Slovakia | czech_rodne_cislo | 9-10 digits | Mod 11 |
Americas (4 scanners):
| Country | Scanner | Format | Validation |
|---|---|---|---|
| Brazil | brazil_cpf | 11 digits (CPF) | Dual mod 11 |
| Canada | canada_sin | 9 digits (SIN) | Luhn |
| Chile | chile_rut | 8-9 chars (RUT) | Mod 11 |
| Argentina | argentina_cuit | 11 digits (CUIT/CUIL) | Weighted mod 11 |
Asia-Pacific (8 scanners):
| Country | Scanner | Format | Validation |
|---|---|---|---|
| Australia | australia_tfn | 9 digits (TFN) | Weighted mod 11 |
| India | india_aadhaar | 12 digits (Aadhaar) | Format rules |
| India | india_pan | 10 chars (PAN) | Format rules |
| South Korea | south_korea_rrn | 13 digits (RRN) | Weighted mod 11 |
| Japan | japan_my_number | 12 digits | Government checksum |
| China | china_resident_id | 18 chars | ISO 7064 MOD 11-2 |
| Taiwan | taiwan_national_id | 10 chars | Weighted mod 10 |
| New Zealand | new_zealand_ird | 8-9 digits (IRD) | Mod 11 |
Middle East & Africa (2 scanners):
| Country | Scanner | Format | Validation |
|---|---|---|---|
| Israel | israel_teudat_zehut | 9 digits | Luhn variant |
| Turkey | turkey_tc_kimlik | 11 digits (TC Kimlik) | Two-step checksum |
Other Scanners
PII: ssn, email, phone, address, date_of_birth, biometric, facial_photo, ip_address
Financial: credit_card, cvv, bank_account, iban, account_number
Medical: mrn, medical_id, npi, mbi, medical_device, certificate_license
Government: passport, drivers_license, vehicle_identifier
Technical: api_key, crypto, database_connection
Business: business_ip, audit_docs, executive_comms, financial_figures, material_info
Web: web_url
Policy Metadata
Add metadata for compliance tracking:
[[custom_policies]]
name = "employee_data_protection"
enabled = true
required_scanners = ["ssn", "custom:employee_id"]
[custom_policies.metadata]
compliance_framework = "ACME_DATA_PROTECTION_2024"
owner = "hr-compliance@acme.com"
review_date = "2025-01-15"
Severity Levels
Policy violations can have severity levels:
| Severity | Description | Example |
|---|---|---|
| critical | Immediate action required | Bulk SSN export |
| high | Urgent investigation | PII with contact info |
| medium | Review required | Single finding in unexpected location |
| low | Informational | Context-appropriate finding |
Example: Complete Custom Configuration
# Custom scanner for employee IDs
[[scanners]]
name = "employee_id"
regex = "EMP-([0-9]{6})"
redaction_pattern = "EMP-XXXXXX"
base_confidence = 0.85
description = "ACME Corp employee IDs"
context_signals = ["hr", "personnel"]
[scanners.confidence_boost]
keywords = ["employee", "badge", "payroll"]
boost_amount = 0.10
proximity = 200
# Custom policy for employee protection
[[custom_policies]]
name = "employee_data_protection"
enabled = true
required_scanners = ["ssn", "custom:employee_id", "email"]
[custom_policies.metadata]
owner = "security@acme.com"
review_date = "2025-06-01"
# Rule 1: Employee ID with SSN
[[custom_policies.rules]]
id = "employee_pii_leak"
severity = "high"
remediation = "Contact HR compliance immediately"
[custom_policies.rules.composition]
operator = "AND"
proximity = 500
[[custom_policies.rules.composition.conditions]]
scanner = "custom:employee_id"
min_confidence = 0.70
[[custom_policies.rules.composition.conditions]]
scanner = "ssn"
min_confidence = 0.75
[custom_policies.rules.exclusions]
file_patterns = ["*/hr/authorized/*"]
# Rule 2: Bulk employee export
[[custom_policies.rules]]
id = "bulk_employee_export"
severity = "critical"
remediation = "Investigate potential data breach"
[custom_policies.rules.threshold]
scanner = "custom:employee_id"
operator = "greater_equal"
count = 50
Troubleshooting Policies
Common Errors
“Unsafe regex pattern” - Pattern is unbounded. Add length limits.
“Reserved policy name” - Cannot use HIPAA, PCI-DSS, GDPR, etc. as custom policy names.
“Unknown scanner” - Check scanner name and custom: prefix.
No Alerts Appearing
- Verify policy is enabled (enabled = true)
- Check confidence thresholds aren’t too high
- Verify rule conditions are met
- Check exclusions aren’t blocking alerts
See the Configuration guide for applying changes and checking logs.
Monitoring
Aquilon DLP exposes findings through osquery tables, enabling powerful querying, alerting, and integration with existing security infrastructure.
osquery Tables
Aquilon DLP provides the following table for monitoring:
| Table | Description |
|---|---|
| aquilon_dlp_alerts | Primary alert table with all findings and triage status |
For complete table schema and triage workflow, see OSQuery Integration.
Querying Alerts
Basic Queries
View recent alerts:
SELECT * FROM aquilon_dlp_alerts
ORDER BY timestamp DESC
LIMIT 10;
View alerts for specific file:
SELECT policy, severity, data_type, pattern
FROM aquilon_dlp_alerts
WHERE path LIKE '%specific-file-test%';
View alerts from last 24 hours:
SELECT * FROM aquilon_dlp_alerts
WHERE timestamp > (strftime('%s', 'now') - 86400);
Analyzing Patterns
Count alerts by policy:
SELECT policy, COUNT(*) as alert_count
FROM aquilon_dlp_alerts
GROUP BY policy
ORDER BY alert_count DESC;
Count alerts by severity:
SELECT severity, COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY severity
ORDER BY
CASE severity
WHEN 'critical' THEN 1
WHEN 'high' THEN 2
WHEN 'medium' THEN 3
WHEN 'low' THEN 4
END;
Find most affected directories:
SELECT
rtrim(path, replace(path, '/', '')) as directory,
COUNT(*) as alert_count
FROM aquilon_dlp_alerts
GROUP BY directory
ORDER BY alert_count DESC
LIMIT 10;
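The rtrim/replace expression in this query strips everything after the last slash, leaving the directory. The Python sketch below runs the same SQL against an in-memory SQLite table to show the effect; the table here is a stand-in with only a path column, not the real osquery table, and `top_directories` is a hypothetical helper.

```python
import sqlite3


def top_directories(paths, limit=10):
    """Group sample paths by directory using the same SQL expression."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE aquilon_dlp_alerts (path TEXT)")
    conn.executemany(
        "INSERT INTO aquilon_dlp_alerts VALUES (?)", [(p,) for p in paths]
    )
    rows = conn.execute(
        """
        SELECT rtrim(path, replace(path, '/', '')) AS directory,
               COUNT(*) AS alert_count
        FROM aquilon_dlp_alerts
        GROUP BY directory
        ORDER BY alert_count DESC
        LIMIT ?
        """,
        (limit,),
    ).fetchall()
    conn.close()
    return rows
```

The second argument to rtrim is the set of every non-slash character in the path, so trimming stops at the last slash.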
View data types found:
SELECT data_type, COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY data_type
ORDER BY count DESC;
Investigation Queries
Find all alerts under a specific directory (for example, a user's home or watch directory):
SELECT * FROM aquilon_dlp_alerts
WHERE path LIKE '/var/watch/%'
ORDER BY timestamp DESC;
Find files with multiple policy violations:
SELECT path, COUNT(DISTINCT policy) as policy_count
FROM aquilon_dlp_alerts
GROUP BY path
HAVING policy_count > 1
ORDER BY policy_count DESC;
Find the most recent critical and high-severity alerts:
SELECT path, policy, data_type, confidence
FROM aquilon_dlp_alerts
WHERE severity IN ('critical', 'high')
ORDER BY timestamp DESC
LIMIT 20;
Alert Fields
The aquilon_dlp_alerts table contains these fields:
| Field | Type | Description |
|---|---|---|
| id | TEXT | UUID (finding_id) for row identification |
| timestamp | BIGINT | Unix timestamp of detection |
| path | TEXT | Full path to scanned file |
| scanner | TEXT | Scanner that detected the finding |
| severity | TEXT | Alert severity (critical/high/medium/low) |
| policy | TEXT | Policy that triggered the alert |
| data_type | TEXT | Type of sensitive data found |
| pattern | TEXT | Redacted pattern that matched |
| confidence | INTEGER | Confidence level (0-100) |
| match_count | INTEGER | Number of matches in file |
| frameworks | TEXT | Applicable compliance frameworks |
| triage_status | TEXT | Triage status (new/acknowledged/resolved/ignored) |
| triage | TEXT | JSON object with triage details (owner, comment, timestamp) |
| context | TEXT | JSON object with file metadata and context |
For details on the JSON column structure and querying, see OSQuery Integration.
Alert Triage
Aquilon DLP supports updating alert triage status directly through OSQuery UPDATE statements. This allows security analysts to acknowledge, investigate, and resolve alerts.
Triage Status Values
| Status | Description |
|---|---|
| new | Just detected, needs review (default) |
| acknowledged | Analyst is investigating |
| resolved | Issue has been handled |
| ignored | Intentionally skipped (false positive) |
Example Triage Queries
View alerts needing review:
SELECT path, scanner, severity, policy
FROM aquilon_dlp_alerts
WHERE triage_status = 'new'
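ORDER BY CASE severity WHEN 'critical' THEN 1 WHEN 'high' THEN 2
WHEN 'medium' THEN 3 ELSE 4 END; -- severity is TEXT, so rank it explicitly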
Acknowledge an alert:
UPDATE aquilon_dlp_alerts
SET triage_status = 'acknowledged',
triage = JSON_OBJECT('owner', 'analyst@company.com')
WHERE path LIKE '%ack-test%';
Resolve an alert:
UPDATE aquilon_dlp_alerts
SET triage_status = 'resolved',
triage = JSON_OBJECT('comment', 'File removed from system')
WHERE path LIKE '%ack-test%';
For complete triage workflow documentation, see OSQuery Integration.
Log Analysis
Log Locations
| Platform | Log Location |
|---|---|
| macOS | /var/log/aquilon/aquilon-dlp.log |
| Linux | /var/log/aquilon/aquilon-dlp.log |
Viewing Logs
Real-time log monitoring:
# macOS/Linux
tail -f /var/log/aquilon/aquilon-dlp.log
Filter for errors:
grep -i error /var/log/aquilon/aquilon-dlp.log
Filter for specific file:
grep "document.pdf" /var/log/aquilon/aquilon-dlp.log
Log Levels
Configure log level using the RUST_LOG environment variable:
# Set log level
export RUST_LOG=info
# Available levels: error, warn, info, debug, trace
| Level | Description |
|---|---|
| error | Only critical errors |
| warn | Errors and warnings |
| info | General operational messages |
| debug | Detailed debugging information |
| trace | Extremely verbose tracing |
SIEM Integration
JSON Log Format
Configure JSON logs for SIEM ingestion using environment variables:
# Set log level and format
export RUST_LOG=info
# Logs are written to stdout in structured JSON format
# Redirect output as needed for your SIEM
The application logs through the tracing crate, which emits structured JSON fields that are easy to parse. Configure your init system (for example, systemd or launchd) to capture stdout so that your SIEM can ingest the output.
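As a rough sketch of what SIEM-side pre-filtering could look like, the Python snippet below parses JSON log lines and keeps those at or above a chosen severity. It assumes each record carries a top-level "level" field; the exact field layout of Aquilon DLP's JSON output is not specified in this guide, so verify against real log lines first. `filter_log_lines` is a hypothetical helper.

```python
import json

# Ordered from most to least severe, matching the Log Levels table.
LEVELS = ["error", "warn", "info", "debug", "trace"]


def filter_log_lines(lines, min_level="warn"):
    """Return parsed records whose level is at least as severe as min_level."""
    cutoff = LEVELS.index(min_level)
    kept = []
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines
        if not isinstance(record, dict):
            continue
        level = str(record.get("level", "")).lower()  # assumed field name
        if level in LEVELS and LEVELS.index(level) <= cutoff:
            kept.append(record)
    return kept
```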
osquery Fleet Management
Aquilon DLP integrates with osquery fleet management tools:
- Fleet - kolide/fleet or fleetdm
- Kolide - kolide.com
- osquery directly - via distributed queries
Example distributed query:
SELECT * FROM aquilon_dlp_alerts
WHERE severity IN ('critical', 'high')
AND timestamp > (strftime('%s', 'now') - 3600);
Splunk Integration
Forward osquery results to Splunk:
- Configure osquery logger to file
- Use Splunk Universal Forwarder to ingest logs
- Create dashboards for DLP alerts
Example Splunk query:
index=osquery sourcetype=osquery:results name=aquilon_dlp_alerts
| stats count by policy, severity
Elastic Stack Integration
Forward to Elasticsearch:
- Configure osquery with Kafka or file logger
- Use Filebeat/Logstash to ingest
- Create Kibana dashboards
Alerting
osquery Scheduled Queries
Schedule regular alert checks:
{
"schedule": {
"dlp_critical_alerts": {
"query": "SELECT * FROM aquilon_dlp_alerts WHERE severity = 'critical' AND triage_status = 'new'",
"interval": 300
}
}
}
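When many scheduled queries are managed at once, the schedule block can be generated rather than hand-edited. The following Python sketch builds the same JSON structure shown above; `build_schedule` is a hypothetical helper, and the only keys it emits are the "query" and "interval" keys used in the example.

```python
import json


def build_schedule(entries):
    """entries maps a schedule name to a (query, interval_seconds) pair."""
    schedule = {}
    for name, (query, interval) in entries.items():
        if not isinstance(interval, int) or interval <= 0:
            raise ValueError(f"interval for {name} must be a positive integer")
        schedule[name] = {"query": query, "interval": interval}
    return {"schedule": schedule}


config = build_schedule({
    "dlp_critical_alerts": (
        "SELECT * FROM aquilon_dlp_alerts "
        "WHERE severity = 'critical' AND triage_status = 'new'",
        300,
    ),
})
print(json.dumps(config, indent=2))
```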
External Alerting
Integrate with external systems by:
- Using osquery scheduled queries to export results
- Configuring SIEM to forward alerts to PagerDuty/Slack
- Triggering SOAR playbooks on critical alerts
Performance Metrics
System Resource Usage
Monitor Aquilon DLP resource consumption:
- Memory usage: Check process memory
- CPU usage: Monitor during active scanning
- Disk I/O: Cache database writes
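# macOS/Linux - find Aquilon DLP process
ps aux | grep aquilon
# Linux - watch resource usage (joins multiple PIDs with commas)
top -p "$(pgrep -d, aquilon)"
# macOS - top takes -pid and a single PID
top -pid "$(pgrep aquilon | head -n 1)"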
Operational Dashboards
Key Metrics to Track
| Metric | Query | Alert Threshold |
|---|---|---|
| Critical alerts/hour | COUNT WHERE severity='critical' | >0 |
| High alerts/hour | COUNT WHERE severity='high' | >10 |
| New alerts needing triage | COUNT WHERE triage_status='new' | >50 |
Example Dashboard Queries
Alerts by policy:
SELECT policy, COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY policy
ORDER BY count DESC
LIMIT 5;
Triage status summary:
SELECT triage_status, COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY triage_status;
Health Checks
Verify Extension Loading
SELECT * FROM osquery_extensions
WHERE name LIKE '%aquilon%';
Verify Table Availability
PRAGMA table_info(aquilon_dlp_alerts);
Test Query
osqueryi --connect /var/osquery/osquery.sock 'SELECT COUNT(*) as alert_count FROM aquilon_dlp_alerts;'
Troubleshooting Monitoring
No Data in Tables
- Verify extension is loaded:
  SELECT * FROM osquery_extensions;
- Check osquery logs:
  journalctl -u osqueryd -f
- Verify configuration is loaded:
  cat /etc/aquilon/config.toml
Stale Data
- Check if files are being monitored (watch paths configured)
- Verify cache isn’t returning old results
- Check system time is correct
Performance Issues
- Reduce log verbosity
- Increase cache TTL
- Adjust concurrent scan limits
- Review excluded paths
See Configuration for performance tuning options.
Deployment Guide
This section covers production deployment strategies for Aquilon DLP, from single workstation installations to enterprise-wide fleet deployments.
Deployment Options
Single Node
Manual installation on individual machines. Best for:
- Personal use and evaluation
- Small teams (< 10 machines)
- Development and testing environments
MDM Deployment
Automated deployment via Mobile Device Management. Best for:
- Enterprise macOS fleets
- Automated compliance enforcement
- Zero-touch provisioning
Covers: Jamf Pro, Microsoft Intune, Kandji, and generic MDM platforms.
Enterprise Deployment
Large-scale deployment planning and fleet management. Best for:
- Organizations with 100+ endpoints
- Multi-platform environments (macOS + Linux)
- Centralized monitoring and compliance reporting
Edition Differences
| Feature | Basic | Enterprise |
|---|---|---|
| Platforms | Linux only | macOS + Linux |
| Policy frameworks | GDPR, CCPA | All frameworks |
| Support | Community | Enterprise SLA |
| MDM deployment | N/A | Full support |
Planning Checklist
Before deployment, ensure you have:
- Identified target endpoints and their platforms
- Selected appropriate edition (Basic or Enterprise)
- Planned deployment method (manual, MDM, or scripted)
- Prepared configuration for your environment
- Defined compliance policies to enable
- Planned monitoring and alerting strategy
Deployment Prerequisites
All Platforms
- OSQuery 5.x installed (for table integration)
- Network access to download binaries
- Administrative/root privileges for installation
macOS (Enterprise Edition)
- macOS 11.0 (Big Sur) or later
- Full Disk Access permission
- MDM enrollment (for automated deployment)
Linux
- Ubuntu 22.04+, RHEL 9+, Debian 11+, CentOS Stream 9+, or Fedora 38+
- x86_64 architecture
- systemd for service management
Next Steps
- Evaluation: Start with Single Node to test on one machine
- Pilot: Deploy to 10-50 devices to validate in your environment
- Production: Use MDM or Enterprise guides for full rollout
Single Node Deployment
Manual installation of Aquilon DLP on individual workstations. This guide covers both Linux (Basic Edition) and macOS (Enterprise Edition) deployments.
Overview
Single node deployment is ideal for:
- Evaluating Aquilon DLP before enterprise rollout
- Small teams with fewer than 10 machines
- Development and testing environments
- Personal data protection
Linux Deployment
Prerequisites
- Operating System: Ubuntu 20.04+, RHEL 8+, Debian 11+
- Architecture: x86_64
- Memory: 2GB RAM minimum
- Disk Space: 500MB for application and database
- Permissions: Root or sudo access
Installation Steps
Step 1: Download
Download the Basic Edition package for your distribution from your organization’s portal:
- Ubuntu/Debian: aquilon-dlp-basic_VERSION_amd64.deb
- RHEL/CentOS: aquilon-dlp-basic-VERSION.x86_64.rpm
Step 2: Verify Checksum
# Verify checksum (SHA256 file provided with download)
sha256sum -c aquilon-dlp-basic-linux.sha256
Expected output: aquilon-dlp-basic-linux: OK
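On systems where sha256sum is unavailable, the same check can be done with Python's standard library. This sketch assumes the checksum file uses the usual sha256sum layout (hex digest, whitespace, file name); the helper names are hypothetical.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=1 << 20):
    """Hash a binary stream in chunks so large downloads need little memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_checksum_line(stream, checksum_line):
    """Compare against the first field of a sha256sum-style line."""
    expected = checksum_line.split()[0].lower()
    return sha256_of_stream(stream) == expected


# In-memory demonstration; for a real download pass open(path, "rb") instead.
sample_line = hashlib.sha256(b"hello").hexdigest() + "  aquilon-dlp-basic-linux"
print(matches_checksum_line(io.BytesIO(b"hello"), sample_line))
```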
Step 3: Install Binary
# Make executable
chmod +x aquilon-dlp-basic
# Move to system path
sudo mv aquilon-dlp-basic /usr/local/bin/
# Verify installation
aquilon-dlp-basic --version
Step 4: Create Configuration
# Create config directory
sudo mkdir -p /etc/aquilon-dlp
# Download sample configuration
sudo curl -o /etc/aquilon-dlp/aquilon_dlp_config.toml \
https://raw.githubusercontent.com/aquilonsecurity/aquilon-dlp/main/docs/config-examples/aquilon_dlp_config_basic.toml
# Set permissions
sudo chmod 644 /etc/aquilon-dlp/aquilon_dlp_config.toml
Step 5: Configure Watch Paths
Edit /etc/aquilon-dlp/aquilon_dlp_config.toml:
# Monitor these directories
watch_paths = [
"/home/%%", # All user home directories
"/var/www/%%", # Web server files
"/data/%%" # Data directory
]
# Exclude unnecessary paths
exclude_paths = [
"/home/*/.cache/%%", # User caches
"/home/*/.local/%%" # Local application data
]
# Enable policies (Basic Edition: GDPR, CCPA only)
[policies]
enabled_policies = ["gdpr", "ccpa"]
[policies.policy_configs.gdpr]
enabled = true
[policies.policy_configs.ccpa]
enabled = true
Step 6: Validate Configuration
aquilon-dlp-basic --validate-config /etc/aquilon-dlp/aquilon_dlp_config.toml
Expected output: Configuration is valid.
Running as a Service
Create systemd service file /etc/systemd/system/aquilon-dlp.service:
[Unit]
Description=Aquilon DLP Basic Edition
After=network.target
[Service]
Type=simple
ExecStart=/usr/local/bin/aquilon-dlp-basic --config /etc/aquilon-dlp/aquilon_dlp_config.toml
Restart=on-failure
RestartSec=10s
User=root
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
Enable and start:
sudo systemctl daemon-reload
sudo systemctl enable aquilon-dlp
sudo systemctl start aquilon-dlp
sudo systemctl status aquilon-dlp
Verification
# Check service status
sudo systemctl status aquilon-dlp
# View logs
sudo journalctl -u aquilon-dlp -f
# Query OSQuery tables (if OSQuery installed)
osqueryi "SELECT * FROM aquilon_dlp_alerts LIMIT 10;"
macOS Deployment
Note: macOS requires Enterprise Edition for native Endpoint Security monitoring.
Prerequisites
- Operating System: macOS 11.0 (Big Sur) or later
- Architecture: x86_64 or Apple Silicon
- Memory: 2GB RAM minimum, 4GB recommended
- Disk Space: 1GB for application and database
- Permissions: Full Disk Access, Administrator privileges
Installation Steps
Step 1: Download
Download the Enterprise Edition package for macOS from your organization’s portal:
- macOS: aquilon-dlp-enterprise-VERSION.pkg
Step 2: Verify Code Signature
# Verify Apple Developer ID signature
codesign -dvv aquilon-dlp-enterprise
# Expected output should include:
# Authority=Developer ID Application: Aquilon Security, LLC
Step 3: Install Binary
# Make executable
chmod +x aquilon-dlp-enterprise
# Move to system path
sudo cp aquilon-dlp-enterprise /usr/local/bin/
# Verify installation
aquilon-dlp-enterprise --version
Step 4: Grant Full Disk Access
- Open System Settings > Privacy & Security > Full Disk Access
- Click + to add /usr/local/bin/aquilon-dlp-enterprise
- Enable the checkbox for Aquilon DLP
Important: Full Disk Access is required for Endpoint Security file monitoring. Without it, the application cannot scan protected directories.
Step 5: Create Configuration
# Create config directory
sudo mkdir -p /etc/aquilon-dlp
# Download sample configuration
sudo curl -o /etc/aquilon-dlp/aquilon_dlp_config.toml \
https://raw.githubusercontent.com/aquilonsecurity/aquilon-dlp/main/docs/config-examples/aquilon_dlp_config_enterprise.toml
# Set permissions
sudo chmod 644 /etc/aquilon-dlp/aquilon_dlp_config.toml
Step 6: Configure Watch Paths
Edit /etc/aquilon-dlp/aquilon_dlp_config.toml:
# Monitor these directories
watch_paths = [
"/Users/%%", # All user home directories
"/Volumes/%%", # External drives
"/data/%%" # Data directories
]
# Exclude unnecessary paths
exclude_paths = [
"/Users/*/.cache/%%", # User caches
"/Users/*/Library/%%" # Library (optional)
]
# Enable all Enterprise policy frameworks
[policies]
enabled_policies = ["gdpr", "ccpa", "hipaa", "pci_dss", "sox", "iso27001"]
[policies.policy_configs.gdpr]
enabled = true
[policies.policy_configs.ccpa]
enabled = true
[policies.policy_configs.hipaa]
enabled = true
[policies.policy_configs.pci_dss]
enabled = true
[policies.policy_configs.sox]
enabled = true
[policies.policy_configs.iso27001]
enabled = true
Running as a LaunchDaemon
Create /Library/LaunchDaemons/com.aquilonsecurity.dlp.plist:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.aquilonsecurity.dlp</string>
<key>ProgramArguments</key>
<array>
<string>/usr/local/bin/aquilon-dlp-enterprise</string>
<string>--config</string>
<string>/etc/aquilon-dlp/aquilon_dlp_config.toml</string>
</array>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
<key>StandardOutPath</key>
<string>/var/log/aquilon-dlp/stdout.log</string>
<key>StandardErrorPath</key>
<string>/var/log/aquilon-dlp/stderr.log</string>
</dict>
</plist>
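If the plist is produced by provisioning scripts rather than written by hand, Python's plistlib can generate an equivalent file. This is a sketch only: `build_launchdaemon_plist` is a hypothetical helper, and the keys simply mirror the plist above.

```python
import plistlib


def build_launchdaemon_plist(binary, config_path, log_dir):
    """Return XML plist bytes equivalent to the LaunchDaemon definition above."""
    payload = {
        "Label": "com.aquilonsecurity.dlp",
        "ProgramArguments": [binary, "--config", config_path],
        "RunAtLoad": True,
        "KeepAlive": True,
        "StandardOutPath": f"{log_dir}/stdout.log",
        "StandardErrorPath": f"{log_dir}/stderr.log",
    }
    return plistlib.dumps(payload, fmt=plistlib.FMT_XML)


plist_bytes = build_launchdaemon_plist(
    "/usr/local/bin/aquilon-dlp-enterprise",
    "/etc/aquilon-dlp/aquilon_dlp_config.toml",
    "/var/log/aquilon-dlp",
)
```

Writing plist_bytes to /Library/LaunchDaemons/com.aquilonsecurity.dlp.plist requires root privileges.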
Load and start:
# Create log directory
sudo mkdir -p /var/log/aquilon-dlp
# Load daemon
sudo launchctl load /Library/LaunchDaemons/com.aquilonsecurity.dlp.plist
# Check status
sudo launchctl list | grep aquilon
# View logs
tail -f /var/log/aquilon-dlp/stderr.log
Verification
# Check if running
sudo launchctl list | grep aquilon
# Expected log output (in /var/log/aquilon-dlp/stderr.log):
# Attempting to initialize Endpoint Security monitoring...
# Full Disk Access verified
# Endpoint Security client created successfully
# Endpoint Security monitoring active
# Query OSQuery tables (if OSQuery installed)
osqueryi "SELECT * FROM aquilon_dlp_alerts LIMIT 10;"
OSQuery Integration
Both editions integrate with OSQuery for monitoring and alerting.
Install OSQuery
Linux (Ubuntu/Debian):
curl -L https://pkg.osquery.io/deb/osquery_5.x_1.0.0_amd64.deb -o osquery.deb
sudo dpkg -i osquery.deb
Linux (RHEL/CentOS):
sudo yum install https://pkg.osquery.io/rpm/osquery-5.x-1.0.0.x86_64.rpm
macOS:
# Using Homebrew
brew install --cask osquery
# Or download PKG
curl -L https://pkg.osquery.io/darwin/osquery-5.x.pkg -o osquery.pkg
sudo installer -pkg osquery.pkg -target /
Configure Extension
Add to /etc/osquery/extensions.load:
/usr/local/bin/aquilon-dlp-basic --socket /var/osquery/osquery.em
(Replace aquilon-dlp-basic with aquilon-dlp-enterprise for macOS)
Query DLP Tables
-- Query alerts by severity
SELECT severity, COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY severity;
Troubleshooting
Linux: Service Won’t Start
Check logs:
sudo journalctl -u aquilon-dlp -n 50
Common causes:
- Invalid configuration file (run --validate-config)
- Missing permissions on watch directories
- Database lock (only one instance can run)
macOS: Full Disk Access Not Working
Symptoms: “Operation not permitted” errors
Solutions:
- Verify FDA in System Settings > Privacy & Security > Full Disk Access
- Remove and re-add the binary
- Restart the LaunchDaemon:
  sudo launchctl unload /Library/LaunchDaemons/com.aquilonsecurity.dlp.plist
  sudo launchctl load /Library/LaunchDaemons/com.aquilonsecurity.dlp.plist
Policy Not Available (Basic Edition)
Symptom: “Unknown policy ‘hipaa’, skipping”
Cause: Basic Edition only includes GDPR and CCPA
Solution: Remove enterprise policies from configuration:
[policies]
enabled_policies = ["gdpr", "ccpa"] # Only these available in Basic Edition
For enterprise policies (HIPAA, PCI DSS, SOX, ISO 27001), upgrade to Enterprise Edition.
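A pre-flight check in deployment tooling can catch this before the service starts. The Python sketch below flags policy names that Basic Edition would skip; the allowed set comes from this guide, while `unsupported_policies` itself is a hypothetical helper and not part of the product.

```python
# Basic Edition ships only these frameworks, per the Edition Differences table.
BASIC_EDITION_POLICIES = {"gdpr", "ccpa"}


def unsupported_policies(enabled_policies, edition="basic"):
    """Return the policies that the given edition would not recognise."""
    if edition != "basic":
        return []
    return [p for p in enabled_policies if p.lower() not in BASIC_EDITION_POLICIES]


print(unsupported_policies(["gdpr", "ccpa", "hipaa", "pci_dss"]))
```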
High Resource Usage
Symptoms: High CPU or memory consumption
Solutions:
- Add exclusions for high-churn directories
- Exclude large binary files (.app, .dmg, .iso)
- Reduce num_workers in configuration
- Adjust max_scan_size_mb to skip large files
Next Steps
- Scale up: Use MDM Deployment for macOS fleets
- Enterprise features: See Enterprise Deployment for fleet management
- Monitoring: Review Monitoring for alerting setup
MDM Deployment
Note: MDM deployment requires macOS Enterprise Edition.
Automated deployment of Aquilon DLP via Mobile Device Management (MDM) for enterprise macOS fleets.
Overview
MDM deployment enables:
- Zero-touch provisioning of Full Disk Access permissions
- Automated app installation across hundreds/thousands of Macs
- Centralized configuration and compliance enforcement
- Silent deployment without user interaction
Why MDM?
Aquilon DLP uses macOS Endpoint Security framework, which requires Full Disk Access (FDA). In enterprise environments:
- Manual FDA grants don’t scale
- Users may skip or misconfigure permissions
- Compliance requires consistent deployment
MDM solves this by deploying PPPC (Privacy Preferences Policy Control) profiles that automatically grant FDA before app installation.
Prerequisites
- MDM Platform: Jamf Pro, Microsoft Intune, Kandji, SimpleMDM, or compatible
- macOS Version: 11.0 (Big Sur) or later
- Signed App Bundle: Code-signed with Endpoint Security entitlement
- Admin Access: MDM console with profile deployment permissions
- Enrolled Devices: Target Macs enrolled in your MDM
Before You Begin
- Verify your signed app bundle has the correct code requirement:
  ./scripts/extract_code_requirement.sh target/debug/aquilon-dlp.app
- Create a pilot group (10-50 devices) for initial testing
- Document your rollback plan in case of issues
Deployment Process
The deployment follows three phases, always in this order:
- Deploy PPPC Profile - Grants Full Disk Access permission
- Wait for Confirmation - Verify profile installation
- Deploy App - Install after FDA is granted
Critical: Deploy profile BEFORE app. macOS only applies PPPC grants during app installation.
Jamf Pro
Step 1: Upload PPPC Profile
- Navigate to: Computers > Configuration Profiles > + New
- Configure:
  - Display Name: Aquilon DLP - Full Disk Access
  - Category: Security
  - Distribution Method: Install Automatically
- Click Privacy Preferences Policy Control payload
- Click Upload and select deployment/mdm/pppc-jamf.mobileconfig
- Verify imported settings:
  - Identifier: dev.aquilon.dlp-plugin
  - System Policy All Files: Checked
Step 2: Scope and Deploy
- Click Scope tab
- Add target computer groups (start with pilot group)
- Click Save
Profile deploys on next check-in (typically 15-30 minutes).
Step 3: Verify Installation
On target Mac:
sudo profiles list | grep -i aquilon
# Expected: com.aquilonsecurity.dlp.pppc.jamf
Step 4: Package and Deploy App
- Create PKG installer:
  pkgbuild --root /path/to/aquilon-dlp.app \
    --identifier dev.aquilon.dlp-plugin \
    --version 0.1.0 \
    --install-location /Library/Application\ Support/aquilon-dlp.app \
    aquilon-dlp-0.1.0.pkg
- Upload to Jamf:
  - Settings > Computer Management > Packages > + New
  - Upload signed package
- Create installation policy:
  - Computers > Policies > + New
  - Add package with Install action
  - Scope to same groups as PPPC profile
Timeline
| Event | Timing |
|---|---|
| Profile propagates | 15-30 minutes |
| App installs | 15-30 minutes after profile |
| Total | ~30-60 minutes |
Microsoft Intune
Step 1: Upload PPPC Profile
- Navigate to: Devices > macOS > Configuration profiles > + Create profile
- Select:
  - Platform: macOS
  - Profile type: Templates > Custom
- Configure:
  - Name: Aquilon DLP - Full Disk Access
  - Upload deployment/mdm/pppc-intune.mobileconfig
  - Deployment channel: Device channel
Step 2: Assign to Devices
- Click Assignments tab
- Add target Azure AD device groups
- Optionally add filter for macOS 11.0+
Step 3: Package App for Intune
Intune requires .intunemac format:
# Download Intune App Wrapping Tool from:
# https://github.com/msintuneappsdk/intune-app-wrapping-tool-mac
./IntuneAppUtil -c /path/to/aquilon-dlp.app \
-o aquilon-dlp.intunemac \
-n "0.1.0" \
-v "0.1.0"
Step 4: Deploy App
- Navigate to: Apps > macOS > + Add
- App type: Line-of-business app
- Upload .intunemac file
- Configure app information
- Assign to same device groups as profile
Note: Wait 24 hours after profile deployment before deploying app, or use dynamic groups.
Timeline
| Event | Timing |
|---|---|
| Profile propagates | 1-8 hours |
| App installs | 1-8 hours after profile |
| Total | ~2-16 hours |
Tip: Force sync via Company Portal > Settings > Sync to speed up check-ins.
Kandji
Step 1: Create Custom Profile
- Navigate to: Library > Custom Profiles > + Add Profile
- Configure:
  - Name: Aquilon DLP - Full Disk Access
  - Upload deployment/mdm/pppc-kandji.mobileconfig
  - Enforcement: Deploy Always
- Assign to target blueprints
Step 2: Create Custom App
- Navigate to: Library > Custom Apps > + Add App
- Upload PKG installer
- Configure:
  - Install Type: Package
  - Run as: System
- Set PPPC profile as dependency (optional but recommended)
- Assign to same blueprints
Timeline
| Event | Timing |
|---|---|
| Profile propagates | 15-60 minutes |
| App installs | 15-60 minutes after profile |
| Total | ~30-120 minutes |
Generic MDM
For SimpleMDM, FileWave, Mosyle, or other platforms:
Profile Deployment
- Download deployment/mdm/pppc-generic.mobileconfig
- Upload to your MDM’s configuration profile section
- Assign to target devices/groups
App Deployment
- Package app as .pkg installer
- Upload to your MDM’s app distribution
- Deploy after confirming profile installation
Key Configuration
The profile must contain:
- Bundle ID: dev.aquilon.dlp-plugin
- Service: SystemPolicyAllFiles (Full Disk Access)
- Code Requirement: Match your signed app
Verification
After deployment, verify on target Mac:
Check Profile Installation
sudo profiles list | grep -i aquilon
# Expected: com.aquilonsecurity.dlp.pppc.<mdm>
# Where <mdm> is: jamf, intune, or kandji
Check FDA Grant
sudo sqlite3 /Library/Application\ Support/com.apple.TCC/TCC.db \
"SELECT auth_value FROM access
WHERE service = 'kTCCServiceSystemPolicyAllFiles'
AND client = 'dev.aquilon.dlp-plugin';"
# Expected: 2
Check App Function
sudo /Library/Application\ Support/aquilon-dlp.app/Contents/MacOS/aquilon-dlp \
--socket /tmp/osquery.sock
Expected output:
Attempting to initialize Endpoint Security monitoring...
Full Disk Access verified
Endpoint Security client created successfully
Endpoint Security monitoring active
Troubleshooting
FDA Not Granted After Installation
Cause: App installed before PPPC profile
Solution:
# 1. Verify profile is installed
sudo profiles list | grep aquilon
# 2. Remove app
sudo rm -rf /Library/Application\ Support/aquilon-dlp.app
# 3. Reinstall via MDM (triggers on next check-in)
System Settings Shows FDA Unchecked
Cause: Known macOS UI bug - checkbox doesn’t reflect TCC database
Solution: Trust the TCC database query. If auth_value = 2, FDA IS granted.
Warning: Do NOT manually toggle the checkbox - it may revoke the PPPC grant.
“Failed to create ES client” Error
Causes and solutions:
- FDA not granted: Check TCC database (see above)
- Not running as root: Use sudo
- ES entitlement missing: Check code signing
codesign -d --entitlements - /Library/Application\ Support/aquilon-dlp.app
Code Requirement Mismatch
Symptom: Profile installed but TCC has no entry
Solution:
- Extract app’s actual code requirement:
codesign -dr - /Library/Application\ Support/aquilon-dlp.app
- Update profile to match
- Redeploy profile and reinstall app
Profile Won’t Install
Solutions:
- Validate profile:
plutil -lint deployment/mdm/pppc-*.mobileconfig
- Check device enrollment status
- Remove conflicting profiles:
# Replace <mdm> with: jamf, intune, or kandji
sudo profiles remove -identifier com.aquilonsecurity.dlp.pppc.<mdm>
Diagnostic Script
Save this script and run it with sudo on the target Mac (reading the system TCC database requires root):
#!/bin/bash
# FDA Troubleshooting Diagnostic
echo "=== Aquilon DLP FDA Diagnostic ==="
echo
echo "1. Profile Installation:"
profiles list | grep -q "com.aquilonsecurity.dlp.pppc" && \
echo "✓ Profile installed" || echo "✗ Profile NOT installed (check for .jamf/.intune/.kandji suffix)"
echo "2. TCC Database Entry:"
AUTH=$(sqlite3 /Library/Application\ Support/com.apple.TCC/TCC.db \
"SELECT auth_value FROM access WHERE service = 'kTCCServiceSystemPolicyAllFiles'
AND client = 'dev.aquilon.dlp-plugin';" 2>/dev/null)
[ "$AUTH" = "2" ] && echo "✓ FDA granted" || echo "✗ FDA NOT granted"
echo "3. App Bundle:"
[ -d "/Library/Application Support/aquilon-dlp.app" ] && \
echo "✓ App installed" || echo "✗ App NOT installed"
echo "4. Code Signature:"
codesign --verify /Library/Application\ Support/aquilon-dlp.app 2>/dev/null && \
echo "✓ Valid signature" || echo "✗ Invalid signature"
echo "5. ES Entitlement:"
codesign -d --entitlements - /Library/Application\ Support/aquilon-dlp.app 2>&1 | \
grep -q "endpoint-security" && \
echo "✓ ES entitlement present" || echo "✗ ES entitlement missing"
echo "=== End Diagnostic ==="
Best Practices
Staged Rollout
- Pilot (Week 1): Deploy to IT/security team (10-50 devices)
- Early Adopters (Week 2): Expand to 100-500 devices
- Production (Week 3+): Roll out to all devices
Smart Groups
Create groups to track deployment status:
- Profile Installed: Devices with PPPC profile
- App Installed: Devices with app bundle
- Needs Remediation: App installed but FDA not granted
Remediation Policy
Create automated remediation for FDA issues:
- Detect: App installed but FDA not in TCC
- Action: Remove app, trigger reinstall
- Monitor: Alert on repeated failures
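The detect and action steps above can be sketched as a small decision function. This is an illustration only: the function name and the action words it prints are hypothetical, not part of Aquilon DLP or any MDM. It assumes the two inputs used elsewhere in this guide: whether the app bundle exists, and the auth_value from the TCC query (2 means granted).

```shell
#!/bin/bash
# Hypothetical sketch: choose a remediation action from two observations.
#   $1 - "yes" if the app bundle exists, anything else otherwise
#   $2 - auth_value from the TCC query (empty if the query returned no row)
remediation_action() {
    local app_installed="$1" auth_value="$2"
    if [ "$app_installed" != "yes" ]; then
        echo "install"      # app missing: the normal MDM install applies
    elif [ "$auth_value" = "2" ]; then
        echo "none"         # app present and FDA granted
    else
        echo "reinstall"    # app present but FDA not granted
    fi
}
```

A remediation script could call `remediation_action yes "$AUTH"` and trigger the reinstall policy only when the result is `reinstall`.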
Next Steps
- Fleet monitoring: See Enterprise Deployment for large-scale management
- Troubleshooting: Refer to Troubleshooting for detailed solutions
- Support: Contact support@aquilonsecurity.com for deployment assistance
Enterprise Deployment
Large-scale deployment planning and fleet management for Aquilon DLP across enterprise environments.
Overview
Enterprise deployment addresses:
- Scaling to hundreds or thousands of endpoints
- Multi-platform environments (macOS and Linux)
- Centralized configuration management
- Compliance reporting and monitoring
- Fleet health and remediation
Planning
Deployment Scope
Before deploying, define your scope:
| Factor | Considerations |
|---|---|
| Endpoints | Total count, platform mix, geographic distribution |
| Compliance | Required frameworks (HIPAA, PCI DSS, SOX, ISO 27001) |
| Policies | Standard vs custom, per-department variations |
| Monitoring | Alert routing, SIEM integration, dashboards |
| Support | Help desk preparation, escalation paths |
Rollout Strategy
Recommended: Staged rollout
| Phase | Scope | Duration | Goals |
|---|---|---|---|
| Pilot | IT/Security (10-50) | 1 week | Validate deployment, catch issues |
| Early Adopter | Willing teams (100-500) | 1 week | Broader testing, refine process |
| General | All remaining | 2-4 weeks | Full production rollout |
For each phase:
- Deploy configuration and profiles
- Monitor for issues (24-48 hours)
- Address any problems
- Proceed to next phase
Success Criteria
Define metrics before deployment:
- Installation success rate > 99%
- FDA grant rate (macOS) > 99%
- Service running rate > 99%
- Alert generation within 24 hours
- No critical issues in pilot
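As a rough sketch, the percentage criteria above can be checked from raw endpoint counts, for example counts taken from an MDM inventory report. The helper name `rate_exceeds` is hypothetical; it uses integer arithmetic only, so it needs no external tools.

```shell
#!/bin/bash
# Hypothetical helper: succeed when passed/total is strictly greater than
# min_percent.
#   $1 - endpoints that passed the check
#   $2 - endpoints targeted
#   $3 - threshold percentage (for example 99)
rate_exceeds() {
    local passed="$1" total="$2" min_percent="$3"
    [ "$total" -gt 0 ] || return 1      # no data counts as a failure
    # passed/total > min/100  is equivalent to  passed*100 > min*total
    [ $(( passed * 100 )) -gt $(( min_percent * total )) ]
}
```

For example, `rate_exceeds 497 500 99 && echo "FDA grant rate OK"` passes, while 495 of 500 (exactly 99%) does not, because the criteria require a rate above 99%.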
Configuration Management
Centralized Configuration
For consistent deployment across endpoints, centralize configuration:
Option A: MDM-deployed configuration file
- Deploy /etc/aquilon-dlp/aquilon_dlp_config.toml via MDM
- Update by redeploying profile
Option B: Configuration management (Ansible, Chef, Puppet)
# Ansible example
- name: Deploy Aquilon DLP config
  template:
    src: aquilon_dlp_config.toml.j2
    dest: /etc/aquilon-dlp/aquilon_dlp_config.toml
    mode: '0644'
  notify: restart aquilon-dlp
Department-Specific Policies
Different departments may need different policies:
# Example: Finance department config
[policies]
enabled_policies = ["gdpr", "ccpa", "sox", "pci_dss"]
# Other departments would use different policies:
# - Healthcare: ["gdpr", "hipaa"]
# - Engineering: ["gdpr", "ccpa"]
Deploy department-specific configs via:
- MDM smart groups/blueprints
- Configuration management role assignments
- AD group membership
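One way to keep per-department configs consistent is to generate the [policies] section from a single mapping. The sketch below is a minimal illustration: the function name is hypothetical, and the three department lists are simply the examples from this page. Unknown departments are rejected instead of guessed.

```shell
#!/bin/bash
# Hypothetical sketch: print the [policies] section for a department.
# The finance, healthcare, and engineering lists are the examples from
# this page; any other department name is treated as an error.
render_policies() {
    local department="$1" policies
    case "$department" in
        finance)     policies='"gdpr", "ccpa", "sox", "pci_dss"' ;;
        healthcare)  policies='"gdpr", "hipaa"' ;;
        engineering) policies='"gdpr", "ccpa"' ;;
        *) echo "unknown department: $department" >&2; return 1 ;;
    esac
    printf '[policies]\nenabled_policies = [%s]\n' "$policies"
}
```

The output can be merged into the config template that your MDM or configuration management tool deploys for that group.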
Tracking Deployment
Track active installations:
- Use MDM inventory reports
- Query OSQuery fleet
- Monitor Prometheus endpoint count
Monitoring and Alerting
OSQuery Fleet Queries
Schedule queries across your fleet:
-- Daily: Deployment health
SELECT
hostname,
(SELECT COUNT(*) FROM aquilon_dlp_alerts) AS total_alerts,
(SELECT COUNT(*) FROM aquilon_dlp_alerts WHERE severity = 'critical') AS critical_alerts
FROM system_info;
-- Hourly: Alert summary
SELECT
policy,
severity,
COUNT(*) AS count
FROM aquilon_dlp_alerts
WHERE timestamp > (strftime('%s', 'now') - 3600)
GROUP BY policy, severity;
Prometheus Metrics
Configure Prometheus scraping:
# prometheus.yml
scrape_configs:
  - job_name: 'aquilon-dlp'
    static_configs:
      - targets: ['host1:9090', 'host2:9090', ...]
    # Or use service discovery
    file_sd_configs:
      - files:
          - 'targets/aquilon-dlp/*.json'
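For the file-based service discovery option, Prometheus reads JSON files containing a list of target groups. The sketch below prints one such group for a list of hostnames. The function name is hypothetical, and port 9090 is taken from the static_configs example above; adjust it if your endpoints expose metrics elsewhere.

```shell
#!/bin/bash
# Hypothetical sketch: print a Prometheus file_sd target group for the
# hostnames given as arguments. Port 9090 follows the example above.
render_file_sd() {
    local port=9090 first=1 host
    printf '[\n  {\n    "targets": ['
    for host in "$@"; do
        [ "$first" -eq 1 ] || printf ', '   # comma between entries
        printf '"%s:%s"' "$host" "$port"
        first=0
    done
    printf ']\n  }\n]\n'
}
```

For example, `render_file_sd host1 host2 > targets/aquilon-dlp/fleet.json` writes a file matched by the glob above (the file name fleet.json is illustrative).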
Key metrics to monitor:
- aquilon_dlp_scans_total - Scan volume by policy
- aquilon_dlp_alerts_total - Alert count by severity
- aquilon_dlp_cache_hits_total - Cache efficiency
- aquilon_dlp_scan_duration_seconds - Performance
Grafana Dashboards
Enterprise customers receive pre-built dashboards:
- Compliance Overview: Policy coverage across fleet
- Performance: Scan rates, latency, resource usage
- Alerts: Real-time alert visualization
Contact support@aquilonsecurity.com for dashboard templates.
SIEM Integration
Forward alerts to your SIEM via:
Structured logging:
# Configure logging via environment variable
export RUST_LOG=info
# Logs are output to stdout in structured JSON format
# Configure your SIEM to ingest from osquery results or log files
Note: Direct syslog forwarding is a planned feature. Currently, integrate via OSQuery scheduled queries.
OSQuery scheduled queries: Configure OSQuery to forward aquilon_dlp_alerts to SIEM.
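A scheduled query is defined in the "schedule" section of the osquery configuration. The sketch below prints a minimal entry that re-runs the recent-alerts query at a fixed interval. The function name and the query name aquilon_dlp_recent_alerts are illustrative choices; the table and timestamp column are the ones used throughout this guide. How results reach your SIEM depends on the osquery logger you have configured.

```shell
#!/bin/bash
# Hypothetical sketch: print an osquery "schedule" entry that selects
# alerts newer than the interval and runs every $1 seconds (default 3600).
render_alert_schedule() {
    local interval="${1:-3600}"
    cat <<EOF
{
  "schedule": {
    "aquilon_dlp_recent_alerts": {
      "query": "SELECT * FROM aquilon_dlp_alerts WHERE timestamp > (strftime('%s', 'now') - ${interval});",
      "interval": ${interval}
    }
  }
}
EOF
}
```

Matching the query window to the interval, as done here, is a simple way to avoid large overlaps between runs.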
Fleet Health
Health Checks
Monitor endpoint health:
Service running:
# macOS
sudo launchctl list | grep -q "com.aquilonsecurity.dlp" && echo "Running" || echo "Stopped"
# Linux
systemctl is-active aquilon-dlp
Recent alerts:
SELECT * FROM aquilon_dlp_alerts
WHERE timestamp > (strftime('%s', 'now') - 86400);
FDA status (macOS):
sudo sqlite3 /Library/Application\ Support/com.apple.TCC/TCC.db \
"SELECT auth_value FROM access
WHERE service = 'kTCCServiceSystemPolicyAllFiles'
AND client = 'dev.aquilon.dlp-plugin';"
Common Issues
Service Not Running
Diagnosis:
# macOS
sudo launchctl list | grep aquilon
tail -100 /var/log/aquilon-dlp/stderr.log
# Linux
systemctl status aquilon-dlp
journalctl -u aquilon-dlp -n 100
Causes:
- Configuration error (run --validate-config)
- Database lock (another instance running)
- Missing permissions
Remediation:
- Fix configuration issue
- Kill duplicate processes
- Restart service
FDA Not Granted (macOS)
Diagnosis:
# Check profile
sudo profiles list | grep aquilon
# Check TCC database
sudo sqlite3 /Library/Application\ Support/com.apple.TCC/TCC.db \
"SELECT auth_value FROM access
WHERE service = 'kTCCServiceSystemPolicyAllFiles'
AND client = 'dev.aquilon.dlp-plugin';"
Remediation:
- Verify PPPC profile installed
- Remove app bundle
- Reinstall via MDM
- Verify TCC entry shows auth_value = 2
No Alerts Generated
Diagnosis:
-- Check for recent alerts
SELECT COUNT(*) as alert_count, policy
FROM aquilon_dlp_alerts
WHERE timestamp > (strftime('%s', 'now') - 86400)
GROUP BY policy;
Causes:
- No sensitive data in monitored paths
- Policies not enabled in configuration
- Exclusions too broad
Remediation:
- Review enabled policies
- Check watch_paths include relevant directories
- Review exclude_paths for over-exclusion
- Test with known sensitive data
High Resource Usage
Diagnosis:
# Check CPU/memory (use aquilon-dlp-enterprise or aquilon-dlp-basic based on edition)
top -pid $(pgrep -f aquilon)
# Check alert count
osqueryi "SELECT COUNT(*) FROM aquilon_dlp_alerts;"
Causes:
- Monitoring high-churn directories
- Large files without size limits
- Too many workers
Remediation:
# Add exclusions
exclude_paths = [
"/Users/*/.cache/%%",
"/home/*/.npm/%%",
"**/*.iso",
"**/*.dmg"
]
# Limit file size
[scan]
max_scan_size_mb = 100
# Reduce workers
[worker]
num_workers = 2 # Default is 4
Automated Remediation
MDM Remediation Policies
Jamf Pro - Extension Attribute for FDA status:
#!/bin/bash
AUTH=$(sqlite3 /Library/Application\ Support/com.apple.TCC/TCC.db \
"SELECT auth_value FROM access
WHERE service = 'kTCCServiceSystemPolicyAllFiles'
AND client = 'dev.aquilon.dlp-plugin';" 2>/dev/null)
if [ "$AUTH" = "2" ]; then
    echo "<result>Granted</result>"
else
    echo "<result>Not Granted</result>"
fi
Smart Group for remediation:
- Criteria: Extension Attribute “FDA Status” is “Not Granted”
- Policy: Reinstall Aquilon DLP package
Ansible Remediation Playbook
---
- name: Remediate Aquilon DLP issues
  hosts: dlp_endpoints
  tasks:
    - name: Check service status
      service:
        name: aquilon-dlp
        state: started
        enabled: yes
    - name: Validate configuration
      command: aquilon-dlp --validate-config /etc/aquilon-dlp/aquilon_dlp_config.toml
      register: config_check
      failed_when: config_check.rc != 0
    - name: Restart if config changed
      service:
        name: aquilon-dlp
        state: restarted
      when: config_changed | default(false)
Compliance Reporting
Generating Reports
Use OSQuery to generate compliance reports:
-- HIPAA compliance summary
SELECT
date(timestamp, 'unixepoch') AS date,
COUNT(*) AS total_findings,
SUM(CASE WHEN severity = 'critical' THEN 1 ELSE 0 END) AS critical,
SUM(CASE WHEN severity = 'high' THEN 1 ELSE 0 END) AS high
FROM aquilon_dlp_alerts
WHERE policy = 'HIPAA'
GROUP BY date(timestamp, 'unixepoch')
ORDER BY date DESC;
-- PCI DSS cardholder data exposure
SELECT
path,
timestamp,
scanner,
severity
FROM aquilon_dlp_alerts
WHERE policy = 'PCI_DSS'
AND scanner IN ('credit_card', 'cvv')
ORDER BY timestamp DESC;
Audit Trail
Maintain audit trails for compliance:
- Findings: All alerts with timestamps
- Remediation: Actions taken on findings
- Coverage: Endpoints monitored
Export from OSQuery or configure SIEM to retain.
Disaster Recovery
Backup
Back up critical data:
- Configuration files (/etc/aquilon-dlp/)
- SQLite database (cache)
- MDM profiles and packages
Recovery
Single endpoint recovery:
- Reinstall via MDM or manual deployment
- Deploy configuration
- Verify service running
Fleet-wide recovery:
- Verify MDM profiles and packages available
- Trigger reinstall via MDM policy
- Monitor deployment dashboard
Version Rollback
To roll back a problematic update:
- Upload previous version to MDM
- Deploy to affected endpoints
- Monitor for issues
Support
Enterprise Support Channels
- Email: support@aquilonsecurity.com
- Portal: https://portal.aquilonsecurity.com
- Emergency: Per your license agreement
Support Response Times
| Priority | Response Time |
|---|---|
| Critical (P1) | 4 hours |
| High (P2) | 8 hours |
| Normal (P3) | 24 hours |
Providing Logs
When contacting support, include:
macOS:
# Collect logs
tail -n 500 /var/log/aquilon-dlp/stderr.log > dlp-logs.txt
# System info
system_profiler SPSoftwareDataType >> dlp-logs.txt
# FDA status
sudo sqlite3 /Library/Application\ Support/com.apple.TCC/TCC.db \
"SELECT * FROM access WHERE client LIKE '%aquilon%';" >> dlp-logs.txt
Linux:
# Collect logs
sudo journalctl -u aquilon-dlp -n 500 > dlp-logs.txt
# System info
uname -a >> dlp-logs.txt
cat /etc/os-release >> dlp-logs.txt
# Service status
systemctl status aquilon-dlp >> dlp-logs.txt
Next Steps
- Configure policies: See Policy Frameworks
- Set up monitoring: See Monitoring
- Review compliance: See Compliance
Admin Guide
This section covers system administration tasks for Aquilon DLP, including daily operations, maintenance, backup procedures, and disaster recovery.
Quick Links
- Operations - Service management, logs, performance
- Backup & Restore - Data protection and recovery
- Disaster Recovery - Recovery planning and procedures
Administrative Overview
Daily Operations
| Task | Frequency | Guide |
|---|---|---|
| Check service status | Daily | Operations |
| Review critical alerts | Daily | Monitoring |
| Check disk usage | Daily | Operations |
Weekly Maintenance
| Task | Frequency | Guide |
|---|---|---|
| Review scan statistics | Weekly | Operations |
| Check cache efficiency | Weekly | Operations |
| Verify backups | Weekly | Backup & Restore |
Monthly Tasks
| Task | Frequency | Guide |
|---|---|---|
| Database vacuum | Monthly | Operations |
| Log rotation review | Monthly | Operations |
| Performance audit | Monthly | Operations |
Prerequisites
Administrative tasks require:
- Root/Administrator access to the system
- OSQuery installed for monitoring queries
- SSH access for remote administration
Key File Locations
Linux
| Purpose | Location |
|---|---|
| Configuration | /etc/aquilon-dlp/aquilon_dlp_config.toml |
| Database | /var/lib/aquilon-dlp/aquilon_dlp.db |
| Logs | /var/log/aquilon-dlp/ or systemd journal |
| Service file | /etc/systemd/system/aquilon-dlp.service |
macOS
| Purpose | Location |
|---|---|
| Configuration | /etc/aquilon-dlp/aquilon_dlp_config.toml |
| Database | /var/lib/aquilon-dlp/aquilon_dlp.db |
| Logs | /var/log/aquilon-dlp/ |
| LaunchDaemon | /Library/LaunchDaemons/com.aquilonsecurity.dlp.plist |
Next Steps
- Start with Operations for daily management tasks
- Set up Backup & Restore procedures
- Document Disaster Recovery plans
Operations
Day-to-day operational tasks for managing Aquilon DLP.
Service Management
Linux (systemd)
# Check status
sudo systemctl status aquilon-dlp
# Start/stop/restart
sudo systemctl start aquilon-dlp
sudo systemctl stop aquilon-dlp
sudo systemctl restart aquilon-dlp
# Enable/disable at boot
sudo systemctl enable aquilon-dlp
sudo systemctl disable aquilon-dlp
# View recent logs
sudo journalctl -u aquilon-dlp -n 100
sudo journalctl -u aquilon-dlp -f # Follow
macOS (launchd)
# Check status
sudo launchctl list | grep aquilon
# Load/unload (start/stop)
sudo launchctl load /Library/LaunchDaemons/com.aquilonsecurity.dlp.plist
sudo launchctl unload /Library/LaunchDaemons/com.aquilonsecurity.dlp.plist
# View logs
tail -f /var/log/aquilon-dlp/stderr.log
Configuration Reload
After configuration changes:
# Linux
sudo systemctl restart aquilon-dlp
# macOS
sudo launchctl unload /Library/LaunchDaemons/com.aquilonsecurity.dlp.plist
sudo launchctl load /Library/LaunchDaemons/com.aquilonsecurity.dlp.plist
Health Checks
Service Running
# Linux
systemctl is-active aquilon-dlp
# macOS
sudo launchctl list | grep -q "com.aquilonsecurity.dlp" && echo "Running" || echo "Stopped"
Recent Alerts
-- Alerts in last 24 hours
SELECT COUNT(*) as alerts_24h
FROM aquilon_dlp_alerts
WHERE timestamp > (strftime('%s', 'now') - 86400);
Alert Generation
-- Alerts in last hour
SELECT COUNT(*) as recent_alerts
FROM aquilon_dlp_alerts
WHERE timestamp > (strftime('%s', 'now') - 3600);
Log Management
Log Locations
- Linux (systemd): journalctl -u aquilon-dlp
- Linux (syslog): /var/log/syslog or /var/log/messages
- macOS: /var/log/aquilon-dlp/*.log
Log Rotation (Linux)
Create /etc/logrotate.d/aquilon-dlp:
/var/log/aquilon-dlp/*.log {
    daily
    rotate 14
    compress
    delaycompress
    missingok
    notifempty
    create 0640 root root
    postrotate
        systemctl reload aquilon-dlp 2>/dev/null || true
    endscript
}
Log Levels
Adjust log verbosity via environment variable:
# Set in service environment (Linux)
# /etc/systemd/system/aquilon-dlp.service.d/override.conf
[Service]
# Levels: debug, info, warn, error (systemd does not allow trailing comments)
Environment="RUST_LOG=aquilon_dlp=info"
Resource Monitoring
Disk Usage
# Database size
du -sh /var/lib/aquilon-dlp/aquilon_dlp.db
# Log directory
du -sh /var/log/aquilon-dlp/
Process Resources
# CPU and memory
ps aux | grep aquilon-dlp
# Detailed (Linux)
top -p $(pgrep aquilon-dlp)
OSQuery Metrics
-- Alert statistics by severity
SELECT severity, COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY severity;
-- Alerts by policy
SELECT policy, COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY policy;
Database Maintenance
Vacuum
Reclaim space and optimize performance:
# Manual vacuum
sqlite3 /var/lib/aquilon-dlp/aquilon_dlp.db "VACUUM;"
# Check size before/after
ls -lh /var/lib/aquilon-dlp/aquilon_dlp.db
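To record how much space a vacuum recovered, the before and after sizes can be compared in bytes. The helper below is a minimal sketch with a hypothetical name; it expects the two sizes as arguments, for example from `wc -c < /var/lib/aquilon-dlp/aquilon_dlp.db` taken before and after the vacuum.

```shell
#!/bin/bash
# Hypothetical helper: percentage of space reclaimed by VACUUM, given the
# database size in bytes before and after (integer result, truncated).
percent_reclaimed() {
    local before="$1" after="$2"
    [ "$before" -gt 0 ] || { echo 0; return 0; }   # avoid division by zero
    echo $(( (before - after) * 100 / before ))
}
```

Tracking this figure over the monthly maintenance runs gives a simple signal for whether vacuuming more or less often is worthwhile.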
Integrity Check
sqlite3 /var/lib/aquilon-dlp/aquilon_dlp.db "PRAGMA integrity_check;"
# Expected: ok
Query Performance
Refresh the query planner statistics so SQLite can choose efficient query plans:
sqlite3 /var/lib/aquilon-dlp/aquilon_dlp.db "PRAGMA analysis_limit=1000; ANALYZE;"
Cache Management
Alert Statistics
-- Count alerts by severity
SELECT severity, COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY severity;
Target: Review and triage critical/high severity alerts promptly
Clear Cache
To force re-scanning (use cautiously):
# Stop service first
sudo systemctl stop aquilon-dlp
# Clear cache table
sqlite3 /var/lib/aquilon-dlp/aquilon_dlp.db "DELETE FROM scan_cache;"
# Restart
sudo systemctl start aquilon-dlp
Cache Configuration
Tune cache settings:
[cache]
enabled = true
ttl_secs = 86400 # Cache TTL in memory (24 hours)
scan_cache_ttl_days = 7 # Database cache TTL
Alert Statistics
Current Status
-- Alert overview by triage status
SELECT
triage_status,
COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY triage_status;
-- Alert by scanner type
SELECT
scanner,
COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY scanner
ORDER BY count DESC;
Alert Trend
-- Alert counts ordered by severity
SELECT
severity,
COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY severity
ORDER BY
CASE severity
WHEN 'critical' THEN 1
WHEN 'high' THEN 2
WHEN 'medium' THEN 3
ELSE 4
END;
Performance Tuning
Worker Configuration
Adjust based on CPU cores:
[work_queue]
max_queue_size = 10000 # Work queue size
submit_timeout_secs = 5 # Timeout for queue submissions
[worker]
num_workers = 4 # Match CPU cores
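As a sketch of "match CPU cores", the [worker] section can be generated from the detected core count. The function name is hypothetical. `getconf _NPROCESSORS_ONLN` is used because it is available on both Linux and macOS; if detection fails, the sketch falls back to 4, the default noted in this guide.

```shell
#!/bin/bash
# Hypothetical sketch: print a [worker] section sized to the CPU count.
render_worker_config() {
    local cores
    cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || cores=4
    case "$cores" in
        ''|*[!0-9]*) cores=4 ;;   # empty or non-numeric answer: use fallback
    esac
    printf '[worker]\nnum_workers = %s\n' "$cores"
}
```

On endpoints that users work on interactively, a value below the core count may be preferable; treat the output as a starting point.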
Reduce I/O Load
[scan]
max_scan_size_mb = 100 # Skip large files
[resource_limits]
enabled = true
nice_level = 10 # Lower CPU priority (0-19)
High-Churn Directory Handling
Exclude directories that change frequently:
exclude_paths = [
"/tmp/%%",
"/var/cache/%%",
"/home/*/.cache/%%"
]
Troubleshooting Operations
Service Won’t Start
- Check configuration:
aquilon-dlp --validate-config /etc/aquilon-dlp/aquilon_dlp_config.toml
- Check logs for errors:
sudo journalctl -u aquilon-dlp -n 50
- Check database lock:
lsof /var/lib/aquilon-dlp/aquilon_dlp.db
High CPU Usage
- Check scan rate in logs
- Add exclusions for high-churn directories
- Increase nice level
- Reduce worker count
High Memory Usage
- Reduce max_entries in cache config
- Reduce queue_size in worker config
- Restart service to clear memory
Database Corruption
- Stop service
- Run integrity check
- If failed, restore from backup (see Backup & Restore)
Backup & Restore
Procedures for backing up and restoring Aquilon DLP data and configuration.
What to Back Up
| Component | Location | Priority | Notes |
|---|---|---|---|
| Configuration | /etc/aquilon-dlp/aquilon_dlp_config.toml | Critical | Application settings |
| Database | /var/lib/aquilon-dlp/aquilon_dlp.db | High | Findings and cache |
| Custom policies | /etc/aquilon-dlp/policies/ | High | If using custom policies |
| Retention config | /etc/aquilon-dlp/retention_config.toml | Medium | Compliance retention settings |
Backup Procedures
Configuration Backup
# Create backup directory
mkdir -p /backup/aquilon-dlp/$(date +%Y%m%d)
# Backup configuration
cp /etc/aquilon-dlp/aquilon_dlp_config.toml /backup/aquilon-dlp/$(date +%Y%m%d)/
# Backup custom policies (if any)
cp -r /etc/aquilon-dlp/policies/ /backup/aquilon-dlp/$(date +%Y%m%d)/ 2>/dev/null || true
# Backup retention config (if any)
cp /etc/aquilon-dlp/retention_config.toml /backup/aquilon-dlp/$(date +%Y%m%d)/ 2>/dev/null || true
Database Backup
Hot backup (service running):
# SQLite hot backup
sqlite3 /var/lib/aquilon-dlp/aquilon_dlp.db ".backup /backup/aquilon-dlp/$(date +%Y%m%d)/aquilon_dlp.db"
# Verify backup
sqlite3 /backup/aquilon-dlp/$(date +%Y%m%d)/aquilon_dlp.db "PRAGMA integrity_check;"
Cold backup (service stopped):
# Stop service
sudo systemctl stop aquilon-dlp
# Copy database
cp /var/lib/aquilon-dlp/aquilon_dlp.db /backup/aquilon-dlp/$(date +%Y%m%d)/
# Restart service
sudo systemctl start aquilon-dlp
Complete Backup Script
Create /usr/local/bin/aquilon-dlp-backup.sh:
#!/bin/bash
set -e
BACKUP_DIR="/backup/aquilon-dlp/$(date +%Y%m%d_%H%M%S)"
mkdir -p "$BACKUP_DIR"
echo "Backing up Aquilon DLP to $BACKUP_DIR"
# Configuration
cp /etc/aquilon-dlp/aquilon_dlp_config.toml "$BACKUP_DIR/"
cp /etc/aquilon-dlp/retention_config.toml "$BACKUP_DIR/" 2>/dev/null || true
cp -r /etc/aquilon-dlp/policies/ "$BACKUP_DIR/" 2>/dev/null || true
# Database (hot backup)
sqlite3 /var/lib/aquilon-dlp/aquilon_dlp.db ".backup $BACKUP_DIR/aquilon_dlp.db"
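# Verify (with set -e, any result other than "ok" aborts the script here)
sqlite3 "$BACKUP_DIR/aquilon_dlp.db" "PRAGMA integrity_check;" > "$BACKUP_DIR/integrity.txt"
grep -qx "ok" "$BACKUP_DIR/integrity.txt"
# Compress
tar -czf "$BACKUP_DIR.tar.gz" -C "$(dirname "$BACKUP_DIR")" "$(basename "$BACKUP_DIR")"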
rm -rf "$BACKUP_DIR"
echo "Backup complete: $BACKUP_DIR.tar.gz"
Automated Backups
Add to crontab:
# Daily backup at 2 AM
0 2 * * * /usr/local/bin/aquilon-dlp-backup.sh >> /var/log/aquilon-dlp-backup.log 2>&1
Restore Procedures
Configuration Restore
# Stop service
sudo systemctl stop aquilon-dlp
# Restore configuration
cp /backup/aquilon-dlp/20240115/aquilon_dlp_config.toml /etc/aquilon-dlp/
# Restore custom policies (if any)
cp -r /backup/aquilon-dlp/20240115/policies/ /etc/aquilon-dlp/ 2>/dev/null || true
# Validate configuration
aquilon-dlp --validate-config /etc/aquilon-dlp/aquilon_dlp_config.toml
# Restart service
sudo systemctl start aquilon-dlp
Database Restore
# Stop service
sudo systemctl stop aquilon-dlp
# Backup current database (in case restore fails)
cp /var/lib/aquilon-dlp/aquilon_dlp.db /var/lib/aquilon-dlp/aquilon_dlp.db.bak
# Restore from backup
cp /backup/aquilon-dlp/20240115/aquilon_dlp.db /var/lib/aquilon-dlp/
# Verify restored database
sqlite3 /var/lib/aquilon-dlp/aquilon_dlp.db "PRAGMA integrity_check;"
# Restart service
sudo systemctl start aquilon-dlp
# Verify service health
sleep 5
systemctl status aquilon-dlp
Complete Restore Script
#!/bin/bash
set -e
BACKUP_FILE="$1"
if [ -z "$BACKUP_FILE" ]; then
echo "Usage: $0 /path/to/backup.tar.gz"
exit 1
fi
echo "Restoring from $BACKUP_FILE"
# Extract backup
TEMP_DIR=$(mktemp -d)
tar -xzf "$BACKUP_FILE" -C "$TEMP_DIR"
BACKUP_DIR=$(ls "$TEMP_DIR")
# Stop service
sudo systemctl stop aquilon-dlp
# Backup current state
mkdir -p /backup/aquilon-dlp/pre-restore
cp /etc/aquilon-dlp/aquilon_dlp_config.toml /backup/aquilon-dlp/pre-restore/
cp /var/lib/aquilon-dlp/aquilon_dlp.db /backup/aquilon-dlp/pre-restore/
# Restore configuration
cp "$TEMP_DIR/$BACKUP_DIR/aquilon_dlp_config.toml" /etc/aquilon-dlp/
# Restore database
cp "$TEMP_DIR/$BACKUP_DIR/aquilon_dlp.db" /var/lib/aquilon-dlp/
# Verify
sqlite3 /var/lib/aquilon-dlp/aquilon_dlp.db "PRAGMA integrity_check;"
aquilon-dlp --validate-config /etc/aquilon-dlp/aquilon_dlp_config.toml
# Cleanup
rm -rf "$TEMP_DIR"
# Restart service
sudo systemctl start aquilon-dlp
echo "Restore complete"
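Because the restore script stops the service before copying files, it can help to confirm first that the archive actually contains what the script expects. The pre-flight check below is a sketch with a hypothetical function name. It assumes the layout produced by the backup script on this page: a single timestamped directory holding aquilon_dlp_config.toml and aquilon_dlp.db.

```shell
#!/bin/bash
# Hypothetical pre-flight check: succeed only if the archive is readable
# and contains both files that the restore script copies.
archive_is_restorable() {
    local archive="$1" listing
    listing=$(tar -tzf "$archive" 2>/dev/null) || return 1   # unreadable archive
    echo "$listing" | grep -q '/aquilon_dlp_config\.toml$' || return 1
    echo "$listing" | grep -q '/aquilon_dlp\.db$' || return 1
}
```

Calling `archive_is_restorable "$BACKUP_FILE" || exit 1` before the service is stopped avoids leaving an endpoint unmonitored because of a bad archive.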
Verification
Post-Restore Checklist
- Service starts successfully
- Configuration validates without errors
- Database integrity check passes
- OSQuery tables return data
- New findings are being generated
Verification Queries
-- Check database has data
SELECT COUNT(*) as total_alerts FROM aquilon_dlp_alerts;
-- Check recent activity
SELECT MAX(timestamp) as last_alert, COUNT(*) as total
FROM aquilon_dlp_alerts;
Log Review
After restore, check logs for errors:
# Linux
sudo journalctl -u aquilon-dlp -n 50 --no-pager
# macOS
tail -50 /var/log/aquilon-dlp/stderr.log
Retention Policy
Backup Retention
Recommended retention schedule:
| Backup Type | Retention |
|---|---|
| Daily | 7 days |
| Weekly | 4 weeks |
| Monthly | 12 months |
Cleanup Script
#!/bin/bash
BACKUP_DIR="/backup/aquilon-dlp"
# Remove backups older than 7 days
find "$BACKUP_DIR" -name "*.tar.gz" -mtime +7 -delete
echo "Cleaned up old backups"
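The cleanup script above applies only the 7-day rule. A sketch of the full retention table as a keep-or-delete decision follows. The function name is hypothetical, and treating Sunday backups as "weekly" and first-of-month backups as "monthly" is an assumption made for illustration, not an Aquilon DLP convention.

```shell
#!/bin/bash
# Hypothetical sketch of the retention table: succeed if a backup should
# be kept.
#   $1 - age of the backup in days
#   $2 - day of week it was taken (1=Monday .. 7=Sunday, as in `date +%u`)
#   $3 - day of month it was taken (1..31)
keep_backup() {
    local age_days="$1" day_of_week="$2" day_of_month="$3"
    [ "$age_days" -le 7 ] && return 0                                  # daily: 7 days
    [ "$day_of_week" -eq 7 ] && [ "$age_days" -le 28 ] && return 0     # weekly: 4 weeks
    [ "$day_of_month" -eq 1 ] && [ "$age_days" -le 365 ] && return 0   # monthly: 12 months
    return 1
}
```

A cleanup job would derive the three values from each archive's timestamped name and delete only those for which `keep_backup` fails.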
Cloud Backup
AWS S3
# Upload to S3
aws s3 cp /backup/aquilon-dlp/20240115.tar.gz s3://my-bucket/aquilon-dlp/
# Restore from S3
aws s3 cp s3://my-bucket/aquilon-dlp/20240115.tar.gz /tmp/
./restore.sh /tmp/20240115.tar.gz
Azure Blob
# Upload to Azure
az storage blob upload \
--container-name backups \
--file /backup/aquilon-dlp/20240115.tar.gz \
--name aquilon-dlp/20240115.tar.gz
Disaster Recovery
Planning and procedures for recovering Aquilon DLP in disaster scenarios.
Recovery Planning
Recovery Objectives
| Metric | Target | Description |
|---|---|---|
| RTO (Recovery Time Objective) | 1 hour | Time to restore service |
| RPO (Recovery Point Objective) | 24 hours | Maximum data loss acceptable |
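The RPO target can be monitored by comparing the age of the newest backup with the allowed window. The helper below is a minimal sketch; its name is hypothetical, and it takes Unix epoch timestamps so that it does not depend on platform-specific date options.

```shell
#!/bin/bash
# Hypothetical helper: succeed when the newest backup is recent enough to
# satisfy the RPO. Times are Unix epoch seconds.
#   $1 - timestamp of the newest backup
#   $2 - current time
#   $3 - RPO in hours (24 in the table above)
within_rpo() {
    local last_backup="$1" now="$2" rpo_hours="$3"
    [ $(( now - last_backup )) -le $(( rpo_hours * 3600 )) ]
}
```

For example, `within_rpo "$last_backup_epoch" "$(date +%s)" 24 || echo "RPO exceeded"` could run from the same cron schedule as the backup job, where last_backup_epoch is a variable you populate from your backup inventory.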
Critical Components
| Component | Recovery Priority | Notes |
|---|---|---|
| Configuration | P1 | Required for service start |
| Service binary | P1 | Application itself |
| Database | P2 | Historical findings |
| Cache | P3 | Can be rebuilt |
Disaster Scenarios
Scenario 1: Single Endpoint Failure
Symptoms: Service down on one machine
Recovery:
- Restore from backup (see Backup & Restore)
- Or reinstall and reconfigure
# Restore configuration
cp /backup/aquilon-dlp/latest/aquilon_dlp_config.toml /etc/aquilon-dlp/
# Start service
sudo systemctl start aquilon-dlp
# Verify
sudo systemctl status aquilon-dlp
Scenario 2: Database Corruption
Symptoms: Service fails to start with database errors
Recovery:
# Stop service
sudo systemctl stop aquilon-dlp
# Check corruption
sqlite3 /var/lib/aquilon-dlp/aquilon_dlp.db "PRAGMA integrity_check;"
# If corrupted, restore from backup
cp /backup/aquilon-dlp/latest/aquilon_dlp.db /var/lib/aquilon-dlp/
# If no backup, recreate (loses history)
rm /var/lib/aquilon-dlp/aquilon_dlp.db
sudo systemctl start aquilon-dlp # Creates new database
Scenario 3: Configuration Loss
Symptoms: Invalid or missing configuration
Recovery:
# Restore from backup
cp /backup/aquilon-dlp/latest/aquilon_dlp_config.toml /etc/aquilon-dlp/
# Or download default
curl -o /etc/aquilon-dlp/aquilon_dlp_config.toml \
https://raw.githubusercontent.com/aquilonsecurity/aquilon-dlp/main/docs/config-examples/aquilon_dlp_config_enterprise.toml
# Validate
aquilon-dlp --validate-config /etc/aquilon-dlp/aquilon_dlp_config.toml
# Restart
sudo systemctl start aquilon-dlp
Scenario 4: Fleet-Wide Outage
Symptoms: Multiple endpoints affected
Recovery:
- Identify root cause (bad update, configuration push, etc.)
- Prepare fix (rollback version, configuration fix)
- Deploy fix via MDM or configuration management
- Monitor recovery
Version Rollback
Download Previous Version
Download the previous version from the Aquilon Security portal and save to /tmp/aquilon-dlp-previous.
Rollback Procedure
# Stop current service
sudo systemctl stop aquilon-dlp
# Backup current binary
cp /usr/local/bin/aquilon-dlp-enterprise /usr/local/bin/aquilon-dlp-enterprise.bak
# Install previous version
cp /tmp/aquilon-dlp-previous /usr/local/bin/aquilon-dlp-enterprise
chmod +x /usr/local/bin/aquilon-dlp-enterprise
# Restart
sudo systemctl start aquilon-dlp
# Verify version
aquilon-dlp-enterprise --version
MDM Rollback
- Upload previous version to MDM
- Deploy to affected endpoints
- Monitor deployment status
Recovery Procedures
Minimal Recovery (Configuration Only)
Fastest recovery - loses historical data but restores monitoring:
- Download fresh binary from the Aquilon Security portal
- Install to /usr/local/bin/aquilon-dlp-enterprise
- Restore configuration from backup or use default
- Restart aquilon-dlp service
# Restore configuration from backup
cp /backup/aquilon-dlp/latest/aquilon_dlp_config.toml /etc/aquilon-dlp/
# Restart service
sudo systemctl restart aquilon-dlp
Full Recovery (With History)
Complete recovery with all historical data:
# 1. Install binary from Aquilon Security portal
# Save to: /usr/local/bin/aquilon-dlp-enterprise
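# 2. Restore from backup
# Archives made by the backup script extract to a directory named after the
# backup timestamp; replace /tmp/backup below with that directory.
tar -xzf /backup/aquilon-dlp/latest.tar.gz -C /tmp/
cp /tmp/backup/aquilon_dlp_config.toml /etc/aquilon-dlp/
cp /tmp/backup/aquilon_dlp.db /var/lib/aquilon-dlp/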
# 3. Verify integrity
sqlite3 /var/lib/aquilon-dlp/aquilon_dlp.db "PRAGMA integrity_check;"
# 4. Restart service
sudo systemctl restart aquilon-dlp
macOS Recovery
FDA Re-grant After Recovery
After recovery on macOS, FDA may need re-granting:
- Check profile:
sudo profiles list | grep aquilon
- If missing, redeploy PPPC profile via MDM
- Reinstall app:
sudo rm -rf /Library/Application\ Support/aquilon-dlp.app
# MDM will reinstall on next check-in
- Verify FDA:
sudo sqlite3 /Library/Application\ Support/com.apple.TCC/TCC.db \
"SELECT auth_value FROM access WHERE client = 'dev.aquilon.dlp-plugin';"
Verification
Post-Recovery Checklist
- Service running: systemctl status aquilon-dlp
- Configuration valid: --validate-config
- Database accessible: OSQuery tables return data
- Findings generating: New alerts appearing
- Monitoring active: Prometheus metrics available
- macOS: FDA granted (if applicable)
Recovery Test Queries
-- Service health - verify table exists
SELECT COUNT(*) as total_alerts FROM aquilon_dlp_alerts;
-- Recent activity
SELECT COUNT(*) as alerts_24h
FROM aquilon_dlp_alerts
WHERE timestamp > (strftime('%s', 'now') - 86400);
-- Alert breakdown
SELECT severity, COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY severity;
Automated Recovery
Systemd Auto-Restart
Configure in service file:
[Unit]
# Start-rate limiting is configured in [Unit] on systemd 230 and later
StartLimitBurst=5
StartLimitIntervalSec=60s
[Service]
Restart=on-failure
RestartSec=10s
Health Check Script
#!/bin/bash
# /usr/local/bin/aquilon-dlp-healthcheck.sh
if ! systemctl is-active --quiet aquilon-dlp; then
    echo "Service down, attempting restart"
    systemctl start aquilon-dlp
    sleep 10
    if ! systemctl is-active --quiet aquilon-dlp; then
        echo "CRITICAL: Service failed to start"
        # Send alert to monitoring system
        exit 1
    fi
fi
exit 0
Add to crontab:
*/5 * * * * /usr/local/bin/aquilon-dlp-healthcheck.sh
Communication Plan
During Outage
- Notify security team of reduced DLP coverage
- Update incident ticket
- Monitor recovery progress
Post-Recovery
- Verify all endpoints recovered
- Check for data gaps in findings
- Document root cause
- Update runbooks if needed
Prevention
Regular Testing
- Monthly: Test restore from backup
- Quarterly: Full DR drill
- Annually: Review and update DR plan
Monitoring
Set up alerts for:
- Service down
- Database corruption
- Configuration validation failures
- Scan rate drops
Architecture
This page provides an architectural overview of Aquilon DLP, including system context, component architecture, and deployment topologies.
System Context
Aquilon DLP operates within an enterprise security ecosystem, integrating with OSQuery for system monitoring and exposing findings to SIEM systems for alerting and compliance reporting.
graph TB
subgraph "Enterprise Environment"
SA[Security Analysts]
SysAdmin[System Administrators]
SIEM[SIEM/Alerting System]
subgraph "Monitored System"
FS[File System]
OSQ[OSQuery]
AquilonDLP[Aquilon DLP]
end
end
SA -->|Query Alerts| OSQ
SA -->|Review Findings| SIEM
SysAdmin -->|Configure| AquilonDLP
FS -->|File Events| AquilonDLP
AquilonDLP -->|Scan Results| OSQ
OSQ -->|Export| SIEM
style AquilonDLP fill:#4a90e2,stroke:#2e5c8a,stroke-width:3px,color:#fff
style OSQ fill:#7cb342,stroke:#558b2f,stroke-width:2px,color:#fff
style SIEM fill:#f57c00,stroke:#e65100,stroke-width:2px,color:#fff
Key Interactions:
- File System Monitoring: Aquilon DLP monitors directories for new/modified files
- Scan and Detect: Files are parsed, decompressed (if needed), and scanned for sensitive data
- OSQuery Integration: Findings exposed via aquilon_dlp_alerts and related tables
- SIEM Export: OSQuery exports alerts to enterprise SIEM systems
- Analyst Queries: Security analysts query findings via OSQuery or SIEM dashboards
Component Architecture
Aquilon DLP uses a plugin-based architecture with three primary layers: Scanner Engine, File Handler Layer, and Policy Engine.
graph TB
subgraph "Aquilon DLP Core"
FW[File Watcher]
FH[File Handler Layer]
SE[Scanner Engine]
PE[Policy Engine]
DB[(SQLite Cache)]
OSE[OSQuery Extension Interface]
end
subgraph "File Handlers (9 Formats)"
ZIP[ZIP Handler]
TAR[TAR Handler]
GZIP[GZIP Handler]
PDF[PDF Handler]
DOCX[DOCX Handler]
XLSX[XLSX Handler]
SEVEN[7-Zip Handler]
RAR[RAR Handler]
TEXT[Text Handler]
end
subgraph "Scanner Plugins (50+ Scanners)"
SSN[SSN Scanner]
CC[Credit Card Scanner]
EMAIL[Email Scanner]
PHONE[Phone Scanner]
PASSPORT[Passport Scanner]
NATID[National ID Scanners]
MORE[... 40+ more scanners]
end
subgraph "Policy Frameworks"
HIPAA[HIPAA Framework]
PCI[PCI DSS Framework]
GDPR[GDPR Framework]
CCPA[CCPA Framework]
SOX[SOX Framework]
ISO[ISO 27001 Framework]
end
FW -->|New/Modified Files| FH
FH --> ZIP
FH --> TAR
FH --> GZIP
FH --> PDF
FH --> DOCX
FH --> XLSX
FH --> SEVEN
FH --> RAR
FH --> TEXT
ZIP -->|Extracted Text| SE
TAR -->|Extracted Text| SE
GZIP -->|Decompressed Text| SE
PDF -->|Extracted Text| SE
DOCX -->|Extracted Text| SE
XLSX -->|Extracted Text| SE
SEVEN -->|Extracted Text| SE
RAR -->|Extracted Text| SE
TEXT -->|Raw Text| SE
SE --> SSN
SE --> CC
SE --> EMAIL
SE --> PHONE
SE --> PASSPORT
SE --> NATID
SE --> MORE
SSN -->|Findings| PE
CC -->|Findings| PE
EMAIL -->|Findings| PE
PHONE -->|Findings| PE
PASSPORT -->|Findings| PE
NATID -->|Findings| PE
MORE -->|Findings| PE
PE --> HIPAA
PE --> PCI
PE --> GDPR
PE --> CCPA
PE --> SOX
PE --> ISO
HIPAA -->|Violations| DB
PCI -->|Violations| DB
GDPR -->|Violations| DB
CCPA -->|Violations| DB
SOX -->|Violations| DB
ISO -->|Violations| DB
DB -->|Query Interface| OSE
style FW fill:#4a90e2,stroke:#2e5c8a,stroke-width:2px,color:#fff
style SE fill:#7cb342,stroke:#558b2f,stroke-width:2px,color:#fff
style PE fill:#f57c00,stroke:#e65100,stroke-width:2px,color:#fff
style DB fill:#9c27b0,stroke:#6a1b9a,stroke-width:2px,color:#fff
Layer Descriptions:
File Handler Layer (9 Handlers)
Processes various file formats and containers:
- Archive Handlers: ZIP, TAR, GZIP, 7-Zip, RAR (recursive extraction)
- Document Handlers: PDF, DOCX, XLSX (text extraction)
- Text Handler: Plain text, source code, config files
Key Feature: Recursive descent into nested archives (e.g., ZIP inside TAR inside GZIP)
Scanner Engine (50+ Plugins)
Detects sensitive data patterns:
- National ID Scanners: 28 country-specific national IDs (EU, Americas, Asia-Pacific, Middle East)
- Identity Scanners: SSN, passport, driver’s license
- Financial Scanners: Credit cards, bank accounts, IBAN
- Healthcare Scanners: Medical record numbers, NPI, MBI
- Contact Scanners: Emails, phone numbers, physical addresses
- Credential Scanners: API keys, tokens, crypto keys, database connections
Key Feature: Stream-based scanning with O(1) memory usage (constant memory regardless of file size)
Policy Engine (6 Frameworks)
Maps findings to compliance requirements:
- 🏢 HIPAA: Healthcare PHI detection (Enterprise only)
- 🏢 PCI DSS: Payment card data (Enterprise only)
- 🏢 SOX: Financial data (Enterprise only)
- 🏢 ISO 27001: Information security (Enterprise only)
- GDPR: EU personal data (All editions)
- CCPA: California consumer data (All editions)
Key Feature: Multi-framework evaluation (single file can trigger multiple policy violations)
Database Cache (SQLite)
Stores scan results with:
- Hash-based deduplication: Skip rescanning unchanged files
- Metadata indexing: Fast lookups by path, policy, severity
- Retention policies: Configurable cleanup of old findings
Performance: 5.4M operations/sec query throughput
Deployment Topology
Aquilon DLP supports multiple deployment models depending on organizational needs.
graph TB
subgraph "Single-Node Deployment"
subgraph "Host System"
FS1[File System]
OSQ1[OSQuery]
AQD1[Aquilon DLP]
DB1[(SQLite Cache)]
end
FS1 --> AQD1
AQD1 --> DB1
DB1 --> OSQ1
OSQ1 -->|Export| SIEM1[SIEM/Alerting]
end
subgraph "Enterprise Deployment (Distributed)"
subgraph "Fleet (100s-1000s of hosts)"
subgraph "Host 1"
FS2[File System]
OSQ2[OSQuery]
AQD2[Aquilon DLP]
DB2[(Cache)]
end
subgraph "Host 2"
FS3[File System]
OSQ3[OSQuery]
AQD3[Aquilon DLP]
DB3[(Cache)]
end
subgraph "Host N"
FS4[File System]
OSQ4[OSQuery]
AQD4[Aquilon DLP]
DB4[(Cache)]
end
end
subgraph "Central Infrastructure"
FLEET[OSQuery Fleet Manager]
CENTRAL_SIEM[Central SIEM]
DASHBOARD[Compliance Dashboard]
end
OSQ2 --> FLEET
OSQ3 --> FLEET
OSQ4 --> FLEET
FLEET --> CENTRAL_SIEM
CENTRAL_SIEM --> DASHBOARD
end
subgraph "🍎 MDM Deployment (macOS)"
subgraph "MDM System"
JAMF[Jamf Pro / Intune / Kandji]
CONFIG[Configuration Profiles]
PKG[Aquilon DLP PKG]
end
subgraph "macOS Fleet"
MAC1[MacBook 1]
MAC2[MacBook 2]
MACN[MacBook N]
end
JAMF -->|Deploy PKG| MAC1
JAMF -->|Deploy PKG| MAC2
JAMF -->|Deploy PKG| MACN
CONFIG -->|Full Disk Access| MAC1
CONFIG -->|Full Disk Access| MAC2
CONFIG -->|Full Disk Access| MACN
end
style AQD1 fill:#4a90e2,stroke:#2e5c8a,stroke-width:2px,color:#fff
style AQD2 fill:#4a90e2,stroke:#2e5c8a,stroke-width:2px,color:#fff
style AQD3 fill:#4a90e2,stroke:#2e5c8a,stroke-width:2px,color:#fff
style AQD4 fill:#4a90e2,stroke:#2e5c8a,stroke-width:2px,color:#fff
Deployment Models:
Single-Node Deployment
Best for:
- Small teams (up to 5 servers for Basic Edition)
- Development/staging environments
- Proof-of-concept deployments
Architecture:
- Aquilon DLP runs on each monitored system
- Local SQLite cache stores findings
- OSQuery exposes findings locally
- Optional SIEM export for centralized alerting
Setup Time: ~5 minutes per host
Enterprise Deployment (Distributed)
Best for:
- Large organizations (100s-1000s of hosts)
- Multi-site deployments
- Compliance-driven environments (healthcare, finance)
Architecture:
- Aquilon DLP deployed on every monitored host
- OSQuery Fleet Manager aggregates findings across fleet
- Central SIEM processes alerts and generates compliance reports
- Compliance Dashboard provides executive visibility
Key Features:
- Unlimited server licensing (Enterprise Edition)
- All policy frameworks (HIPAA, PCI DSS, SOX, ISO 27001, GDPR, CCPA)
- Enterprise support with 4-hour SLA for critical issues
🍎 MDM Deployment (macOS)
Best for:
- macOS fleet management (Enterprise Edition only)
- Organizations using Jamf Pro, Microsoft Intune, or Kandji
- Zero-touch deployment for new devices
Architecture:
- PKG installer deployed via MDM system
- Configuration profiles grant Full Disk Access
- Launch Daemon ensures Aquilon DLP starts on boot
- Integration with OSQuery for monitoring
Key Features:
- Automated deployment to 100s-1000s of Macs
- Centralized configuration management
- Native Endpoint Security integration
- User-transparent operation
Data Flow
Understanding how data flows through Aquilon DLP:
1. File Monitoring
macOS (Enterprise Edition):
- Native Endpoint Security API monitors file system events
- Events filtered by watch paths and exclusions
- New/modified files queued for scanning
Linux (All Editions):
- inotify-based file system monitoring
- Recursive directory watching with pattern matching
- Event deduplication to prevent scan storms
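The event deduplication described above can be pictured as a per-path debounce. The following is a minimal sketch, not Aquilon DLP's actual implementation: the `EventDeduplicator` type, the `should_scan` method, and the idea of a fixed suppression window are all assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::time::{Duration, Instant};

/// Hypothetical deduplicator: drops repeat events for a path that arrive
/// within `window` of the last event that was accepted for that path.
pub struct EventDeduplicator {
    window: Duration,
    last_accepted: HashMap<PathBuf, Instant>,
}

impl EventDeduplicator {
    pub fn new(window: Duration) -> Self {
        Self { window, last_accepted: HashMap::new() }
    }

    /// Returns true when the file should be queued for scanning.
    pub fn should_scan(&mut self, path: &Path, now: Instant) -> bool {
        if let Some(last) = self.last_accepted.get(path) {
            if now.saturating_duration_since(*last) < self.window {
                // Part of a burst for the same file: skip it.
                return false;
            }
        }
        self.last_accepted.insert(path.to_path_buf(), now);
        true
    }
}
```

Passing `now` in explicitly keeps the logic easy to test; a caller would normally supply `Instant::now()` for each file system event.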
2. File Processing
File Detected → File Handler Selection → Format Processing → Text Extraction
Handler Selection:
- Based on file extension and magic number detection
- Archive handlers recursively process nested containers
- Document handlers extract text from structured formats
- Text handler processes plain text files directly
Example Flow (nested archive):
report.zip → ZIP Handler
├─ data.tar → TAR Handler
│ ├─ records.txt → Text Handler → Scanner Engine
│ └─ patient.pdf → PDF Handler → Scanner Engine
└─ summary.docx → DOCX Handler → Scanner Engine
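Handler selection by magic number and extension could look roughly like the sketch below. The `Format` enum and `detect_format` function are hypothetical names; the byte signatures are the published ones for each format, but the order of checks and the extension fallback are assumptions, not the product's documented behavior.

```rust
/// The nine formats listed in the File Handler Layer.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Format { Zip, Tar, Gzip, Pdf, Docx, Xlsx, SevenZip, Rar, Text }

/// Check leading bytes first, then fall back to the file extension.
pub fn detect_format(file_name: &str, header: &[u8]) -> Format {
    let ext = file_name
        .rsplit_once('.')
        .map(|(_, e)| e.to_ascii_lowercase())
        .unwrap_or_default();

    if header.starts_with(b"PK\x03\x04") {
        // DOCX and XLSX are ZIP containers, so the extension disambiguates.
        return match ext.as_str() {
            "docx" => Format::Docx,
            "xlsx" => Format::Xlsx,
            _ => Format::Zip,
        };
    }
    if header.starts_with(&[0x1f, 0x8b]) { return Format::Gzip; }
    if header.starts_with(b"%PDF-") { return Format::Pdf; }
    if header.starts_with(b"7z\xBC\xAF\x27\x1C") { return Format::SevenZip; }
    if header.starts_with(b"Rar!\x1A\x07") { return Format::Rar; }
    // POSIX tar stores "ustar" at byte offset 257 of the first header block.
    if header.len() >= 262 && header[257..].starts_with(b"ustar") {
        return Format::Tar;
    }

    // No signature matched: trust the extension, defaulting to plain text.
    match ext.as_str() {
        "zip" => Format::Zip,
        "tar" => Format::Tar,
        "gz" => Format::Gzip,
        "pdf" => Format::Pdf,
        "docx" => Format::Docx,
        "xlsx" => Format::Xlsx,
        "7z" => Format::SevenZip,
        "rar" => Format::Rar,
        _ => Format::Text,
    }
}
```

Checking the signature before the extension means a renamed PDF is still routed to the PDF handler, while unknown content falls through to the text handler.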
3. Scanning
Stream-Based Processing:
- Text streamed to scanner plugins (not loaded into memory)
- All 50+ scanners run concurrently on same stream
- Constant O(1) memory usage regardless of file size
- 5.4M operations/sec throughput
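To illustrate how scanning can stay within constant memory, here is a small sketch that counts occurrences of a literal byte pattern while reading a stream in fixed-size chunks. It is an illustration under stated assumptions, not the product's scanner: `count_matches` is a hypothetical function, and a literal pattern stands in for the real detection logic. The key idea is carrying over the last `pattern.len() - 1` bytes so matches that straddle a chunk boundary are still found.

```rust
use std::io::{self, Read};

/// Count (possibly overlapping) occurrences of `pattern` in `reader`,
/// holding at most `chunk_size + pattern.len() - 1` bytes at a time.
pub fn count_matches<R: Read>(
    mut reader: R,
    pattern: &[u8],
    chunk_size: usize,
) -> io::Result<usize> {
    if pattern.is_empty() || chunk_size == 0 {
        return Ok(0);
    }
    let keep = pattern.len() - 1; // tail bytes carried into the next round
    let mut chunk = vec![0u8; chunk_size];
    let mut window: Vec<u8> = Vec::with_capacity(chunk_size + keep);
    let mut total = 0;

    loop {
        let n = reader.read(&mut chunk)?;
        if n == 0 {
            break; // end of stream
        }
        window.extend_from_slice(&chunk[..n]);
        total += window
            .windows(pattern.len())
            .filter(|w| *w == pattern)
            .count();
        // Drop everything except the tail. The tail is shorter than the
        // pattern, so no match can be counted twice.
        let drop_len = window.len().saturating_sub(keep);
        window.drain(..drop_len);
    }
    Ok(total)
}
```

Memory use depends only on the chunk size and pattern length, never on the size of the file being read.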
Finding Generation:
- Each scanner reports matches with details (line number, surrounding text)
- Metadata captured: file path, scanner type, confidence score
- Findings passed to Policy Engine for evaluation
4. Policy Evaluation
Framework Matching:
- Each finding evaluated against enabled policy frameworks
- HIPAA: Checks for PHI patterns (SSN + medical details)
- PCI DSS: Validates credit card numbers with checksums
- GDPR/CCPA: Identifies EU/CA personal data
- SOX: Detects financial records requiring retention
- ISO 27001: Flags sensitive information assets
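Checksum validation of card numbers is commonly done with the Luhn algorithm. This guide does not name the algorithm Aquilon DLP uses, so treat the following as an assumption: a minimal sketch with a hypothetical `luhn_valid` function and a 13 to 19 digit length limit chosen for the example.

```rust
/// Returns true if `candidate` passes the Luhn check.
/// Spaces and hyphens are ignored; any other non-digit fails the check.
pub fn luhn_valid(candidate: &str) -> bool {
    let mut sum = 0u32;
    let mut digits = 0usize;

    // Walk right to left, doubling every second digit.
    for ch in candidate.chars().rev() {
        if ch == ' ' || ch == '-' {
            continue;
        }
        let d = match ch.to_digit(10) {
            Some(d) => d,
            None => return false,
        };
        let value = if digits % 2 == 1 {
            let doubled = d * 2;
            if doubled > 9 { doubled - 9 } else { doubled }
        } else {
            d
        };
        sum += value;
        digits += 1;
    }

    // Length bounds are an assumption for this sketch.
    (13..=19).contains(&digits) && sum % 10 == 0
}
```

A checksum filter like this reduces false positives from arbitrary 16-digit strings such as order numbers, which is why it pairs well with pattern matching.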
Severity Assignment:
- Critical: SSN, credit cards, passport numbers
- High: Email addresses, phone numbers (in sensitive contexts)
- Medium: Generic PII without strong identifiers
- Low: Informational findings (email domains)
5. Storage and Exposure
SQLite Cache:
- Hash-based deduplication (skip unchanged files)
- Indexed by path, policy, severity, timestamp
- Configurable retention (default 90 days)
- Vacuum and optimization on schedule
OSQuery Tables:
- aquilon_dlp_alerts: Findings with policy violations, triage status, and metadata
SIEM Export:
- OSQuery scheduled queries export to SIEM
- JSON format with full details
- Configurable alert thresholds and grouping
- Integration with Splunk, Elasticsearch, QRadar, etc.
Performance Characteristics
Aquilon DLP is optimized for production workloads with minimal system impact:
Memory Usage
- O(1) Memory: Constant memory regardless of file size
- Stream Processing: Files scanned incrementally (no full load)
- Typical Usage: 50-150MB per process (depending on plugin count)
- Archive Handling: Temporary extraction cleaned up immediately
Throughput
- Scanner Engine: 5.4M operations/sec (single-threaded)
- File Processing: Limited by disk I/O (linear scaling)
- Concurrent Scanning: Configurable worker pool (default 4 workers)
- Archive Decompression: Streamed (no disk spooling for small files)
Latency
- Small Files (< 1MB): Sub-millisecond scan time
- Medium Files (1-100MB): Milliseconds to seconds
- Large Files (> 100MB): Seconds (configurable skip threshold)
- Archives: Proportional to number of contained files
Optimization Strategies
Cache Hit Rate:
- Hash-based deduplication: ~85-95% cache hits in typical environments
- Skip rescanning unchanged files
- Invalidation on modification timestamp change
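The hash-based deduplication above can be sketched as a map from path to the last content hash seen. This is an illustration only: `ScanCache` and `needs_scan` are hypothetical names, the cache lives in memory rather than in SQLite, and `DefaultHasher` from the standard library stands in for whatever content hash the product actually uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical in-memory cache of the last content hash per path.
pub struct ScanCache {
    hashes: HashMap<String, u64>,
}

impl ScanCache {
    pub fn new() -> Self {
        Self { hashes: HashMap::new() }
    }

    /// Records the current hash and returns true if the file is new or
    /// its content changed since the previous call for the same path.
    pub fn needs_scan(&mut self, path: &str, content: &[u8]) -> bool {
        let mut hasher = DefaultHasher::new();
        content.hash(&mut hasher);
        let digest = hasher.finish();

        match self.hashes.insert(path.to_string(), digest) {
            Some(previous) if previous == digest => false, // cache hit
            _ => true, // first sighting or content changed
        }
    }
}
```

A production cache would more likely use a cryptographic hash and persist entries, but the hit and miss logic is the same.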
Exclusion Patterns:
- Exclude high-churn directories (caches, temp files)
- Skip binary-only files (executables, images)
- Configurable max file size (default 100MB)
Worker Tuning:
- Adjust num_workers based on CPU cores
- Default 4 workers balances throughput and system impact
- Increase for I/O-bound workloads, decrease for CPU-constrained systems
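A configurable worker pool of this kind can be sketched with standard-library threads and channels. The function below is a hypothetical illustration: `scan_with_workers` and its `scan` callback are assumed names, and the callback stands in for the real per-file scan.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Run `scan` over `paths` using `num_workers` threads and return
/// (path, result) pairs sorted by path.
pub fn scan_with_workers(
    paths: Vec<String>,
    num_workers: usize,
    scan: fn(&str) -> usize,
) -> Vec<(String, usize)> {
    let (job_tx, job_rx) = mpsc::channel::<String>();
    let job_rx = Arc::new(Mutex::new(job_rx));
    let (result_tx, result_rx) = mpsc::channel::<(String, usize)>();

    let mut handles = Vec::new();
    for _ in 0..num_workers.max(1) {
        let job_rx = Arc::clone(&job_rx);
        let result_tx = result_tx.clone();
        handles.push(thread::spawn(move || loop {
            // Hold the lock only while taking the next job off the queue.
            let next = job_rx.lock().unwrap().recv();
            match next {
                Ok(path) => {
                    let count = scan(&path);
                    result_tx.send((path, count)).unwrap();
                }
                Err(_) => break, // queue closed: no more work
            }
        }));
    }
    drop(result_tx); // workers hold the remaining senders

    for path in paths {
        job_tx.send(path).unwrap();
    }
    drop(job_tx); // signals workers to exit once the queue drains

    for handle in handles {
        handle.join().unwrap();
    }
    let mut results: Vec<(String, usize)> = result_rx.iter().collect();
    results.sort();
    results
}
```

Raising `num_workers` helps when workers spend most of their time waiting on disk; on CPU-constrained hosts extra workers mostly add contention.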
Plugin Architecture
Aquilon DLP’s extensibility comes from its plugin-based design:
Scanner Plugin Interface
All scanners implement the StreamScanner trait:
pub trait StreamScanner {
    fn scan(&self, content: &str) -> anyhow::Result<Vec<Finding>>;
    fn scanner_type(&self) -> &str;
}
Benefits:
- New scanners added without modifying core engine
- Independent testing and versioning
- Community contributions possible
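As an illustration of what a scanner plugin might look like, the sketch below implements a trait with the same two methods for a toy scanner that finds text shaped like NNN-NN-NNNN. Several parts are assumptions: the `Finding` struct fields are invented for the example, `Box<dyn Error>` replaces `anyhow::Result` so the code needs only the standard library, and `SsnShapeScanner` is a hypothetical name. A real SSN scanner would also validate ranges and word boundaries.

```rust
use std::error::Error;

/// Simplified stand-in for the real Finding type (fields are assumed).
#[derive(Debug, PartialEq)]
pub struct Finding {
    pub scanner: String,
    pub offset: usize,
    pub matched: String,
}

/// Same method names as the trait above, with a std-only error type.
pub trait StreamScanner {
    fn scan(&self, content: &str) -> Result<Vec<Finding>, Box<dyn Error>>;
    fn scanner_type(&self) -> &str;
}

pub struct SsnShapeScanner;

/// 'd' means "any ASCII digit"; every other byte must match literally.
const SHAPE: &[u8] = b"ddd-dd-dddd";

impl StreamScanner for SsnShapeScanner {
    fn scan(&self, content: &str) -> Result<Vec<Finding>, Box<dyn Error>> {
        let bytes = content.as_bytes();
        let mut findings = Vec::new();
        if bytes.len() < SHAPE.len() {
            return Ok(findings);
        }
        for start in 0..=bytes.len() - SHAPE.len() {
            let end = start + SHAPE.len();
            let fits = bytes[start..end].iter().zip(SHAPE).all(|(b, s)| {
                if *s == b'd' { b.is_ascii_digit() } else { b == s }
            });
            if fits {
                findings.push(Finding {
                    scanner: self.scanner_type().to_string(),
                    offset: start,
                    // Safe to slice: every matched byte is ASCII.
                    matched: content[start..end].to_string(),
                });
            }
        }
        Ok(findings)
    }

    fn scanner_type(&self) -> &str {
        "ssn"
    }
}
```

Because the engine only depends on the trait, a scanner like this can be tested in isolation by calling `scan` on sample strings.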
File Handler Plugin Interface
Handlers implement the FileHandler trait:
use std::path::Path;

pub trait FileHandler {
    fn can_handle(&self, path: &Path) -> bool;
    fn process(&self, path: &Path) -> anyhow::Result<Vec<String>>;
}
Benefits:
- Support new formats without core changes
- Recursive container handling (archives in archives)
- Fallback to text handler if format unknown
Policy Framework Interface
Frameworks implement the PolicyFramework trait:
pub trait PolicyFramework {
    fn evaluate(&self, findings: &[Finding]) -> Vec<PolicyViolation>;
    fn framework_name(&self) -> &str;
}
Benefits:
- Custom compliance frameworks
- Org-specific rules via TOML policies
- Combine multiple frameworks (e.g., HIPAA + PCI DSS)
Security Considerations
Data Handling
- No External Transmission: All scanning happens locally
- Local Cache Only: Findings stored in local SQLite database
- Configurable Retention: Auto-delete old findings (compliance requirement)
- Access Control: Cache file permissions restrict to root/admin
macOS Endpoint Security
🍎 Enterprise Edition:
- Native Endpoint Security framework (requires entitlements)
- System Extension approval required
- Full Disk Access permission for comprehensive monitoring
- Code signed and notarized for enterprise deployment
Linux Security
- inotify Limits: Configurable watch limits (sysctl tuning)
- File Permissions: Respects existing file ACLs
- Systemd Integration: Runs as systemd service with restart policies
- SELinux Support: Compatible with enforcing mode (policy module available)
Next Steps
- User Guide: Learn how to configure policies
- Deployment: Explore deployment options for your environment
- API Integration: Query findings via OSQuery tables
- Compliance: Review compliance frameworks for your industry
OSQuery Integration
Aquilon DLP exposes security findings through OSQuery virtual tables. This guide covers the available tables, column schemas, query examples, and alert triage workflows.
Overview
Aquilon DLP registers as an OSQuery extension, providing custom tables that can be queried using standard SQL. All interaction with Aquilon DLP data occurs through OSQuery queries.
Prerequisites
- OSQuery installed and running
- Aquilon DLP extension loaded (automatic with package installation)
aquilon_dlp_alerts Table
The primary table for accessing DLP findings and managing alert triage.
Column Reference
| Column | Type | Description |
|---|---|---|
| id | TEXT | UUID (finding_id) for row identification |
| timestamp | BIGINT | Unix timestamp of detection |
| path | TEXT | Full path to the file containing the finding |
| scanner | TEXT | Scanner that detected the data (e.g., ssn, credit_card, iban) |
| severity | TEXT | Alert severity: critical, high, medium, low, info |
| policy | TEXT | Policy that generated the violation (e.g., HIPAA, PCI, GDPR) |
| data_type | TEXT | Category of sensitive data detected |
| pattern | TEXT | Pattern or regex that matched |
| confidence | INTEGER | Scanner confidence (0-100) |
| match_count | INTEGER | Number of matches found in file |
| frameworks | TEXT | Applicable compliance frameworks |
| triage_status | TEXT | Triage state: new, acknowledged, resolved, ignored |
| triage | TEXT | JSON object with triage details (see below) |
| context | TEXT | JSON object with file metadata and context (see below) |
JSON Columns
The triage and context columns contain JSON data for flexible querying.
triage Column
Contains triage workflow information:
{
"owner": "analyst@company.com",
"comment": "False positive - test data file",
"timestamp": 1727794245
}
Empty fields are omitted. An alert with no triage data has an empty object: {}
context Column
Contains file metadata, text snippets, and container information:
{
"snippet": "...text around the match...",
"keywords": ["ssn", "pii"],
"file": {
"hash": "d7c4529ffe273e1dc...",
"size": 20666,
"container": {
"path": "archive.zip/inner.txt",
"depth": 1
}
},
"metadata": {"gdpr_article": "Article-4"}
}
The container object is only present when the finding is inside an archive (depth > 0).
Querying JSON Columns
Use SQLite JSON_EXTRACT to query JSON fields:
-- Extract file hash from context
SELECT path, JSON_EXTRACT(context, '$.file.hash') as file_hash
FROM aquilon_dlp_alerts
LIMIT 5;
-- Filter by file size
SELECT path, scanner, JSON_EXTRACT(context, '$.file.size') as size
FROM aquilon_dlp_alerts
WHERE CAST(JSON_EXTRACT(context, '$.file.size') AS INTEGER) > 10000;
-- Find findings inside containers
SELECT path, JSON_EXTRACT(context, '$.file.container.path') as container_path
FROM aquilon_dlp_alerts
WHERE JSON_EXTRACT(context, '$.file.container.depth') > 0;
-- Query triage owner
SELECT path, JSON_EXTRACT(triage, '$.owner') as owner
FROM aquilon_dlp_alerts
WHERE JSON_EXTRACT(triage, '$.owner') IS NOT NULL;
Basic Queries
View Recent Alerts
-- All alerts from last 24 hours
SELECT path, scanner, severity, policy, timestamp
FROM aquilon_dlp_alerts
WHERE timestamp > (strftime('%s', 'now') - 86400)
ORDER BY timestamp DESC;
Filter by Severity
-- Critical and high severity alerts
SELECT path, scanner, policy, confidence, match_count
FROM aquilon_dlp_alerts
WHERE severity IN ('critical', 'high')
ORDER BY severity, timestamp DESC;
Filter by Policy
-- HIPAA violations
SELECT path, scanner, severity, data_type
FROM aquilon_dlp_alerts
WHERE policy = 'HIPAA'
ORDER BY timestamp DESC;
-- PCI DSS violations
SELECT path, scanner, severity, match_count
FROM aquilon_dlp_alerts
WHERE policy = 'PCI_DSS'
ORDER BY timestamp DESC;
Group by Scanner Type
-- Count findings by scanner
SELECT scanner, COUNT(*) as count,
AVG(confidence) as avg_confidence
FROM aquilon_dlp_alerts
GROUP BY scanner
ORDER BY count DESC;
Files with Multiple Finding Types
-- Files containing multiple types of sensitive data
SELECT path,
GROUP_CONCAT(DISTINCT scanner) as scanners,
COUNT(*) as total_findings
FROM aquilon_dlp_alerts
GROUP BY path
HAVING COUNT(DISTINCT scanner) > 1
ORDER BY total_findings DESC
LIMIT 20;
Container/Archive Findings
-- Findings within archives (ZIP, TAR, etc.)
SELECT path,
JSON_EXTRACT(context, '$.file.container.path') as container_path,
JSON_EXTRACT(context, '$.file.container.depth') as container_depth,
scanner, severity
FROM aquilon_dlp_alerts
WHERE JSON_EXTRACT(context, '$.file.container.depth') > 0
ORDER BY container_depth DESC, timestamp DESC;
Compliance Reporting
HIPAA PHI Summary
SELECT
scanner,
severity,
COUNT(*) as count
FROM aquilon_dlp_alerts
WHERE policy = 'HIPAA'
GROUP BY scanner, severity
ORDER BY
CASE severity
WHEN 'critical' THEN 1
WHEN 'high' THEN 2
WHEN 'medium' THEN 3
ELSE 4
END;
PCI DSS Cardholder Data
SELECT path, scanner, match_count, timestamp
FROM aquilon_dlp_alerts
WHERE policy = 'PCI_DSS'
AND scanner IN ('credit_card', 'magnetic_stripe', 'cvv')
ORDER BY timestamp DESC;
GDPR Personal Data
SELECT
scanner,
COUNT(*) as exposures,
COUNT(DISTINCT path) as files_affected
FROM aquilon_dlp_alerts
WHERE policy = 'GDPR'
GROUP BY scanner
ORDER BY exposures DESC;
Alert Triage
The aquilon_dlp_alerts table supports UPDATE operations for managing alert lifecycle. This allows security analysts to acknowledge, investigate, and resolve alerts directly through OSQuery.
Triage Status Values
| Status | Description |
|---|---|
| new | Just detected, needs review (default) |
| acknowledged | Analyst is investigating |
| resolved | Issue has been handled |
| ignored | Intentionally skipped (false positive, acceptable risk) |
Updating Triage Status
Use OSQuery UPDATE statements to manage alert triage. The triage_status column is a flat column, while triage is a JSON column containing owner, comment, and timestamp.
Acknowledge an Alert
UPDATE aquilon_dlp_alerts
SET triage_status = 'acknowledged',
triage = JSON_OBJECT('owner', 'analyst@company.com', 'comment', 'Investigating potential data exposure')
WHERE path = '/data/reports/customer_export.csv'
AND scanner = 'ssn';
Resolve an Alert
UPDATE aquilon_dlp_alerts
SET triage_status = 'resolved',
triage = JSON_OBJECT('owner', JSON_EXTRACT(triage, '$.owner'), 'comment', 'File moved to secure location and access restricted')
WHERE path = '/data/reports/customer_export.csv'
AND scanner = 'ssn';
Mark as False Positive
UPDATE aquilon_dlp_alerts
SET triage_status = 'ignored',
triage = JSON_OBJECT('owner', 'security-team', 'comment', 'False positive - test data file with synthetic SSNs')
WHERE path = '/test/fixtures/sample_data.txt';
Bulk Triage by Policy
-- Acknowledge all new PCI alerts for investigation
UPDATE aquilon_dlp_alerts
SET triage_status = 'acknowledged',
triage = JSON_OBJECT('owner', 'pci-compliance-team')
WHERE policy = 'PCI_DSS'
AND triage_status = 'new';
Triage Workflow Queries
Alerts Needing Review
-- New alerts requiring triage
SELECT path, scanner, severity, policy, timestamp
FROM aquilon_dlp_alerts
WHERE triage_status = 'new'
ORDER BY
CASE severity
WHEN 'critical' THEN 1
WHEN 'high' THEN 2
ELSE 3
END,
timestamp DESC;
My Assigned Alerts
-- Alerts assigned to specific analyst
SELECT path, scanner, severity, triage_status,
JSON_EXTRACT(triage, '$.comment') as triage_comment
FROM aquilon_dlp_alerts
WHERE JSON_EXTRACT(triage, '$.owner') = 'analyst@company.com'
AND triage_status IN ('new', 'acknowledged')
ORDER BY timestamp DESC;
Triage Summary
-- Overview of triage status
SELECT
triage_status,
COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY triage_status
ORDER BY
CASE triage_status
WHEN 'new' THEN 1
WHEN 'acknowledged' THEN 2
WHEN 'resolved' THEN 3
WHEN 'ignored' THEN 4
END;
Recently Resolved
-- Alerts resolved in last 7 days
SELECT path, scanner,
JSON_EXTRACT(triage, '$.owner') as triage_owner,
JSON_EXTRACT(triage, '$.comment') as triage_comment,
JSON_EXTRACT(triage, '$.timestamp') as triage_timestamp
FROM aquilon_dlp_alerts
WHERE triage_status = 'resolved'
AND JSON_EXTRACT(triage, '$.timestamp') > (strftime('%s', 'now') - 604800)
ORDER BY triage_timestamp DESC;
Triage Notes
- The triage.timestamp is automatically set when you update triage fields
- INSERT and DELETE operations are not supported; alerts are generated only by the scanner
- Triage updates persist in the SQLite database
- Multiple alerts for the same file/scanner combination can be updated individually or in bulk
aquilon_config Table (Enterprise)
Enterprise Only: This table is only available in Aquilon DLP Enterprise edition.
The configuration table exposes all Aquilon DLP settings as queryable rows with a simple key-value schema. Scalar values are stored as strings, and arrays are stored as JSON arrays.
Column Reference
| Column | Type | Description |
|---|---|---|
| key | TEXT | Configuration key (dot-notation, e.g., scan.max_scan_size_mb) |
| value | TEXT | Current value (scalars as strings, arrays as JSON arrays) |
Basic Queries
View All Configuration
-- View all configuration keys and their current values
SELECT key, value
FROM aquilon_config
ORDER BY key;
Query Scan Settings
-- All scan-related configuration
SELECT key, value
FROM aquilon_config
WHERE key LIKE 'scan.%'
ORDER BY key;
View Array Configuration (as JSON)
Arrays are stored as JSON arrays. Use json_each() to expand them:
-- View array keys (value contains JSON array)
SELECT key, value
FROM aquilon_config
WHERE key IN ('watch_paths', 'exclude_paths', 'policies.enabled_policies')
ORDER BY key;
-- Expand array to individual rows
SELECT c.key, j.value AS item
FROM aquilon_config c, json_each(c.value) j
WHERE c.key = 'watch_paths';
Check Specific Setting
-- Get current value of a specific setting
SELECT key, value
FROM aquilon_config
WHERE key = 'scan.max_scan_size_mb';
Modifying Configuration
The aquilon_config table supports runtime modifications for mutable settings. Changes take effect immediately and persist to the configuration file.
Operation Rules:
- UPDATE: Set value for any mutable key (scalars as strings, arrays as JSON)
- DELETE: Reset mutable key to its compiled default value
- INSERT: Not supported (keys are predefined)
- Immutable keys: Cannot be modified at runtime
Update Settings
Use UPDATE to modify any mutable configuration value:
-- Update max scan size (integer setting)
UPDATE aquilon_config
SET value = '100'
WHERE key = 'scan.max_scan_size_mb';
-- Enable/disable cache (boolean setting)
UPDATE aquilon_config
SET value = 'false'
WHERE key = 'cache.enabled';
-- Update CPU limit (float setting)
UPDATE aquilon_config
SET value = '75.5'
WHERE key = 'resource_limits.max_cpu_percent';
Update Array Settings
Arrays are stored as JSON. Use UPDATE with a JSON array value:
-- Set exclusion paths (replaces entire array)
UPDATE aquilon_config
SET value = '["/var/log/*", "/tmp/*", "*.swp"]'
WHERE key = 'exclude_paths';
-- Set watch paths
UPDATE aquilon_config
SET value = '["/home/%%", "/data/sensitive/*"]'
WHERE key = 'watch_paths';
Reset to Default
Use DELETE to reset a mutable key to its compiled default value:
-- Reset exclusion paths to default
DELETE FROM aquilon_config
WHERE key = 'exclude_paths';
-- Reset cache TTL to default
DELETE FROM aquilon_config
WHERE key = 'cache.ttl_secs';
Configuration Operation Reference
| Operation | Mutable Keys | Immutable Keys |
|---|---|---|
| SELECT | ✓ | ✓ |
| UPDATE | ✓ (value as string or JSON array) | ✗ |
| DELETE | ✓ (resets to default) | ✗ |
| INSERT | ✗ | ✗ |
Common mutable scalar keys:
- scan.max_scan_size_mb - Maximum file size to scan (MB)
- scan.max_findings_per_scanner - Limit findings per scanner
- cache.enabled - Enable/disable file hash cache
- cache.ttl_secs - Cache time-to-live
- resource_limits.enabled - Enable CPU/memory limits
- resource_limits.max_cpu_percent - CPU usage limit
Common mutable array keys (use JSON arrays):
- watch_paths - Paths to monitor for changes
- exclude_paths - Glob patterns for paths to skip
- policies.enabled_policies - Active policy names
Error Cases
Understanding error messages helps diagnose configuration issues. The examples below show common mistakes and the errors they produce.
Immutable Key Error
Attempting to modify or delete an immutable key:
-- This will fail: database_path is immutable
UPDATE aquilon_config
SET value = '/new/path/aquilon.db'
WHERE key = 'database_path';
Error: Key 'database_path' is immutable
-- This will also fail
DELETE FROM aquilon_config
WHERE key = 'database_path';
Error: Key 'database_path' is immutable
INSERT Not Supported
INSERT operations are not supported (keys are predefined):
INSERT INTO aquilon_config (key, value)
VALUES ('custom_key', 'some_value');
Error: INSERT not supported, use UPDATE to modify values
Invalid Value Type
Providing wrong type for a key:
-- This will fail: expects integer
UPDATE aquilon_config
SET value = 'not_a_number'
WHERE key = 'scan.max_scan_size_mb';
Error: Invalid value for 'scan.max_scan_size_mb': expected Integer
Invalid JSON for Array
Providing invalid JSON for an array key:
-- This will fail: not valid JSON array
UPDATE aquilon_config
SET value = '/path1, /path2'
WHERE key = 'exclude_paths';
Error: Invalid value for 'exclude_paths': expected JSON array
Value Out of Range
Providing a value outside allowed range:
-- This will fail: max_scan_size_mb has limits
UPDATE aquilon_config
SET value = '999999'
WHERE key = 'scan.max_scan_size_mb';
Error: Value 999999 out of range for 'scan.max_scan_size_mb' (1-10000)
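The type and range checks behind the last two error cases can be sketched as follows. This is a hedged reconstruction from the error messages shown above, not the product's code: `validate_max_scan_size_mb` is a hypothetical function, and only the message text and the 1-10000 range come from this guide.

```rust
/// Validate a string value for scan.max_scan_size_mb.
/// Error strings mirror the messages documented above.
pub fn validate_max_scan_size_mb(value: &str) -> Result<u32, String> {
    const KEY: &str = "scan.max_scan_size_mb";

    // Values arrive as strings, so the first step is a type check.
    let parsed: i64 = value
        .trim()
        .parse()
        .map_err(|_| format!("Invalid value for '{}': expected Integer", KEY))?;

    // Then enforce the documented range.
    if !(1..=10_000).contains(&parsed) {
        return Err(format!(
            "Value {} out of range for '{}' (1-10000)",
            parsed, KEY
        ));
    }
    Ok(parsed as u32)
}
```

Running the type check before the range check explains why 'not_a_number' reports a type error rather than a range error.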
Troubleshooting with Config Table
The config table helps diagnose scanning issues by exposing current settings.
Why Isn’t My File Being Scanned?
Check if the file path matches an exclusion pattern:
-- Check current exclusion patterns (stored as JSON array)
SELECT key, value
FROM aquilon_config
WHERE key = 'exclude_paths';
-- Expand exclusion patterns for easier reading
SELECT j.value AS excluded_pattern
FROM aquilon_config c, json_each(c.value) j
WHERE c.key = 'exclude_paths';
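When checking a path against the returned patterns by hand, it can help to reason about how a wildcard match works. The sketch below is a simple matcher in which `*` matches any run of characters, including `/`. That semantics is an assumption; this guide does not specify Aquilon DLP's exact glob rules, and `glob_match` is a hypothetical helper.

```rust
/// Returns true if `text` matches `pattern`, where '*' matches any
/// sequence of characters (including none). No other wildcards.
pub fn glob_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0usize, 0usize);
    let mut star: Option<usize> = None; // position of the last '*' seen
    let mut mark = 0usize; // text position that '*' currently extends to

    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some(pi);
            mark = ti;
            pi += 1;
        } else if pi < p.len() && p[pi] == t[ti] {
            pi += 1;
            ti += 1;
        } else if let Some(s) = star {
            // Mismatch: let the last '*' absorb one more character.
            pi = s + 1;
            mark += 1;
            ti = mark;
        } else {
            return false;
        }
    }
    // Any trailing '*' in the pattern can match the empty string.
    while pi < p.len() && p[pi] == '*' {
        pi += 1;
    }
    pi == p.len()
}
```

For example, under these assumed rules `/var/log/*` excludes `/var/log/syslog`, and `*.swp` excludes any path ending in `.swp`.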
Check Active Policies
Verify which policies are enabled:
-- List enabled policies (stored as JSON array)
SELECT key, value
FROM aquilon_config
WHERE key = 'policies.enabled_policies';
-- Expand policies for easier reading
SELECT j.value AS policy_name
FROM aquilon_config c, json_each(c.value) j
WHERE c.key = 'policies.enabled_policies';
Diagnose Performance Issues
Check resource limits and scan settings:
-- Check resource and scan limits
SELECT key, value
FROM aquilon_config
WHERE key LIKE 'resource_limits.%'
OR key LIKE 'scan.%'
ORDER BY key;
Verify Cache Configuration
Check if caching is enabled and its settings:
-- Check cache settings
SELECT key, value
FROM aquilon_config
WHERE key LIKE 'cache.%'
ORDER BY key;
Command Line Usage
Interactive Queries
# Start OSQuery interactive shell
osqueryi
# Run a query
osqueryi "SELECT * FROM aquilon_dlp_alerts LIMIT 10;"
JSON Output
# Get results as JSON for scripting
osqueryi --json "SELECT path, scanner, severity FROM aquilon_dlp_alerts WHERE severity = 'critical';"
Scheduled Queries
Configure scheduled queries in /etc/osquery/osquery.conf:
{
"schedule": {
"dlp_critical_alerts": {
"query": "SELECT * FROM aquilon_dlp_alerts WHERE severity = 'critical' AND triage_status = 'new'",
"interval": 300,
"description": "Critical DLP alerts needing triage"
},
"dlp_daily_summary": {
"query": "SELECT scanner, severity, COUNT(*) as count FROM aquilon_dlp_alerts WHERE timestamp > (strftime('%s', 'now') - 86400) GROUP BY scanner, severity",
"interval": 86400,
"description": "Daily DLP finding summary"
}
}
}
Compliance Overview
Aquilon DLP includes built-in compliance policy frameworks that automatically classify findings and generate violations according to regulatory requirements.
Available Frameworks
| Framework | Description | Key Controls | Edition |
|---|---|---|---|
| GDPR | EU General Data Protection Regulation | Articles 5, 32, 33 | All |
| CCPA | California Consumer Privacy Act | §1798.100-199 | All |
| HIPAA | Health Insurance Portability and Accountability Act | §164.306, §164.312 | Enterprise |
| PCI DSS | Payment Card Industry Data Security Standard | Requirements 3, 4, 12 | Enterprise |
| SOX | Sarbanes-Oxley Act | Sections 302, 404, 409 | Enterprise |
| ISO 27001 | Information Security Management | Controls A.8.12, A.5.12, A.8.11 | Enterprise |
| CUI | Controlled Unclassified Information | NIST SP 800-171 | Enterprise |
| CMMC | Cybersecurity Maturity Model Certification | DFARS 252.204-7012 | Enterprise |
| FedRAMP | Federal Risk and Authorization Management | NIST SP 800-53 | Enterprise |
| FISMA | Federal Information Security Modernization Act | FIPS 199, NIST SP 800-53 | Enterprise |
How Policy Frameworks Work
Each policy framework:
- Evaluates scan findings from all 50+ scanner plugins
- Applies regulatory logic to determine violations
- Classifies severity based on data type and context
- Generates metadata for compliance reporting
Example Flow
File scanned → SSN detected → HIPAA evaluates → PHI violation (Critical)
→ PCI DSS evaluates → No violation (SSN not PAN)
→ GDPR evaluates → Personal data violation (High)
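The flow above can be sketched as a lookup across per-policy rule tables. This is a simplified illustration, not Aquilon DLP's actual engine: the Finding class, the POLICY_RULES table, and the evaluate function are hypothetical names, and the severities simply mirror the example flow.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    scanner: str
    confidence: float


# Hypothetical rule tables (scanner -> severity); values mirror the example flow.
POLICY_RULES = {
    "HIPAA": {"ssn": "critical", "medical_record_number": "critical"},
    "PCI_DSS": {"credit_card": "critical", "cvv": "critical"},
    "GDPR": {"ssn": "high", "email": "medium"},
}


def evaluate(finding: Finding, threshold: float = 0.7) -> dict:
    """Return {policy: severity} for every policy the finding violates."""
    if finding.confidence < threshold:
        return {}  # below the confidence threshold: no violations generated
    return {
        policy: rules[finding.scanner]
        for policy, rules in POLICY_RULES.items()
        if finding.scanner in rules
    }


violations = evaluate(Finding(scanner="ssn", confidence=0.9))
print(violations)  # HIPAA and GDPR match; PCI DSS has no rule for ssn
```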
Enabling Policies
Configure policies in aquilon_dlp_config.toml:
[policies]
enabled_policies = ["gdpr", "hipaa", "pci_dss", "sox", "iso27001", "cui", "cmmc", "fedramp", "fisma"]
# Optional: customize specific policies
# [policies.policy_configs.hipaa]
# settings = { covered_entity = "true" }
# [policies.policy_configs.pci_dss]
# settings = { merchant_level = "2" }
# [policies.policy_configs.cmmc]
# settings = { level = "2" }
Policy Configuration Options
Each policy supports configuration options:
| Option | Description | Default |
|---|---|---|
| enabled | Enable/disable the policy | true |
| confidence_threshold | Minimum scanner confidence to generate violation | 0.7 |
| sensitivity_level | Adjust severity calculation (range 1-3) | 2 |
Framework-Specific Settings
HIPAA:
- covered_entity: Whether the organization is a HIPAA covered entity
PCI DSS:
- merchant_level: PCI merchant level (1-4)
- version: PCI DSS version (3.2.1 or 4.0)
ISO 27001:
- enforce_data_masking: Require data masking for violations
- classification_level: Default classification (restricted/confidential/internal/public)
Violation Severity Levels
All frameworks use consistent severity levels:
| Level | Description | Typical Response |
|---|---|---|
| Critical | Immediate breach risk | Immediate investigation |
| High | Significant exposure | Investigate within 24 hours |
| Medium | Moderate risk | Investigate within 7 days |
| Low | Minor concern | Review during regular audit |
Compliance Reporting
OSQuery Queries
Query violations by policy:
-- All HIPAA critical findings
SELECT * FROM aquilon_dlp_alerts
WHERE policy = 'HIPAA' AND severity = 'critical';
-- Policy violation summary
SELECT policy, severity, COUNT(*) as count
FROM aquilon_dlp_alerts
GROUP BY policy, severity;
-- Recent violations by framework
SELECT policy, path, scanner, severity, timestamp
FROM aquilon_dlp_alerts
WHERE timestamp > (strftime('%s', 'now') - 86400)
ORDER BY timestamp DESC;
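To try the summary query without a live osquery instance, it can be run against an in-memory SQLite table. The sketch below assumes a cut-down aquilon_dlp_alerts schema with only four text columns; the rows are made-up sample data, and the real table exposes more columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, assumed schema for illustration only.
conn.execute(
    "CREATE TABLE aquilon_dlp_alerts "
    "(policy TEXT, path TEXT, scanner TEXT, severity TEXT)"
)
conn.executemany(
    "INSERT INTO aquilon_dlp_alerts VALUES (?, ?, ?, ?)",
    [
        ("HIPAA", "/srv/records/a.txt", "ssn", "critical"),
        ("HIPAA", "/srv/records/b.txt", "npi", "high"),
        ("GDPR", "/srv/export/c.csv", "email", "medium"),
        ("HIPAA", "/srv/records/d.txt", "ssn", "critical"),
    ],
)

# Same shape as the policy violation summary, with a stable sort order added.
summary = conn.execute(
    "SELECT policy, severity, COUNT(*) AS count "
    "FROM aquilon_dlp_alerts "
    "GROUP BY policy, severity "
    "ORDER BY policy, severity"
).fetchall()

for policy, severity, count in summary:
    print(policy, severity, count)
```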
Audit Trail
Each violation includes metadata for audit purposes:
- Policy: Framework that generated the violation
- Severity: Risk classification
- Scanner: Detection method
- Context: Surrounding text for validation
- Timestamp: Detection time
- File path: Location of finding
Custom Policies
Beyond built-in frameworks, create custom policies for:
- Company-specific identifiers
- Internal compliance requirements
- Industry-specific patterns
See Policy Frameworks for custom policy creation.
Next Steps
- HIPAA - Healthcare data protection
- PCI DSS - Payment card security
- SOX - Financial controls
- ISO 27001 - Information security management
- GDPR - EU data protection
- CCPA - California consumer privacy
- CUI - Controlled Unclassified Information (NIST SP 800-171)
- CMMC - DoD contractor certification
- FedRAMP - Federal cloud authorization
- FISMA - Federal agency security
HIPAA Compliance
Note: HIPAA policy framework requires Enterprise Edition.
The Health Insurance Portability and Accountability Act (HIPAA) policy framework detects Protected Health Information (PHI) exposure and generates violations according to HIPAA Security Rule requirements.
Overview
HIPAA establishes standards for protecting sensitive patient health information. Aquilon DLP’s HIPAA policy helps covered entities and business associates comply with:
- §164.306 - Security standards: General rules
- §164.312 - Technical safeguards
- §164.308 - Administrative safeguards
Protected Health Information (PHI)
The HIPAA policy detects the following PHI categories:
| PHI Category | Scanners | Severity |
|---|---|---|
| Social Security Numbers | ssn | Critical |
| Medical Record Numbers | medical_record_number | Critical |
| Health Plan IDs | health_plan_id | Critical |
| National Provider IDs (NPI) | npi | High |
| Date of Birth | date_of_birth | High |
| Email (patient contact) | email | Medium |
| Phone Numbers | phone | Medium |
| Addresses | address | Medium |
International Patient Populations
Healthcare organizations serving international patients may encounter national identification numbers from other countries. Aquilon DLP includes 28 country-specific national ID scanners for comprehensive coverage.
Common International IDs in Healthcare
| Region | Scanners | Use Case |
|---|---|---|
| Europe | france_nir, germany_steurid, uk_nino, + 11 more | EU/EEA patients, medical tourism |
| Americas | brazil_cpf, canada_sin, + 2 more | Cross-border healthcare |
| Asia-Pacific | india_aadhaar, japan_my_number, + 6 more | International patients |
Note: While SSN remains the primary identifier for US healthcare, organizations with international patient populations should enable additional national ID scanners. All scanners use country-specific checksum validation.
See Policy Frameworks for the complete list of all 28 national ID scanners.
Scanner Mappings
Critical Severity
These findings always generate Critical violations under HIPAA:
- SSN: Direct patient identifier
- Medical Record Number: Unique patient identifier
- Health Plan Beneficiary Number: Insurance identifier
High Severity
- NPI: Healthcare provider identifier (may indicate patient-provider relationship)
- Date of Birth: Combined with other data enables patient identification
- Biometric Data: Fingerprints, retinal scans, voice prints
Medium Severity
- Contact Information: Email, phone when found in a healthcare context
- Geographic Data: Address, ZIP codes (geographic units smaller than a state)
Configuration
Basic Configuration
[policies]
enabled_policies = ["hipaa"]
Advanced Configuration
[policies.policy_configs.hipaa]
settings = { covered_entity = "true", confidence_threshold = "0.8", sensitivity_level = "3" }
Configuration Options
| Option | Description | Default |
|---|---|---|
| covered_entity | Indicates organization is a HIPAA covered entity | false |
| confidence_threshold | Minimum scanner confidence (0.0-1.0) | 0.7 |
| sensitivity_level | Severity multiplier (1=low, 2=medium, 3=high) | 2 |
Context Detection
The HIPAA policy elevates severity when healthcare context is detected:
Healthcare Context Keywords
- Medical terms: patient, diagnosis, prescription, treatment
- Healthcare entities: hospital, clinic, pharmacy, physician
- Insurance terms: claim, coverage, beneficiary, EOB
Example
Finding: SSN "122-15-6289"
Context: "Patient record for treatment on 03/15/2024"
Result: Severity elevated from High → Critical due to healthcare context
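A keyword-driven elevation like the one in this example could look roughly as follows. This is a sketch under stated assumptions: the one-step elevation, the whitespace tokenization, and the names HEALTHCARE_KEYWORDS and elevate_if_healthcare are illustrative and do not describe the product's internal logic.

```python
# Keywords taken from the Healthcare Context Keywords list.
HEALTHCARE_KEYWORDS = {
    "patient", "diagnosis", "prescription", "treatment",
    "hospital", "clinic", "pharmacy", "physician",
    "claim", "coverage", "beneficiary", "eob",
}
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def elevate_if_healthcare(severity: str, context: str) -> str:
    """Raise severity by one level when any healthcare keyword appears."""
    words = {w.strip('.,;:()"').lower() for w in context.split()}
    if words & HEALTHCARE_KEYWORDS:
        idx = SEVERITY_ORDER.index(severity)
        return SEVERITY_ORDER[min(idx + 1, len(SEVERITY_ORDER) - 1)]
    return severity


result = elevate_if_healthcare("high", "Patient record for treatment on 03/15/2024")
print(result)
```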
Violation Metadata
Each HIPAA violation includes:
{
"policy": "HIPAA",
"severity": "critical",
"phi_category": "ssn",
"safeguard": "technical",
"requirement": "164.312(a)(1)",
"breach_notification_required": true
}
Compliance Reporting
Query PHI Exposures
-- All PHI exposures requiring breach notification
SELECT path, scanner, severity, timestamp
FROM aquilon_dlp_alerts
WHERE policy = 'HIPAA'
AND severity = 'critical';
-- PHI by category
SELECT scanner, COUNT(*) as count
FROM aquilon_dlp_alerts
WHERE policy = 'HIPAA'
GROUP BY scanner
ORDER BY count DESC;
Breach Risk Assessment
Under the HIPAA Breach Notification Rule, unauthorized access to PHI requires a risk assessment that considers:
- Nature and extent of PHI involved
- Unauthorized person who accessed PHI
- Whether PHI was actually viewed or acquired
- Extent to which risk has been mitigated
Aquilon DLP findings provide evidence for the first factor (nature and extent of PHI involved) and the fourth (extent of mitigation).
Best Practices
Monitoring Strategy
- Alert on Critical immediately: SSN, MRN, Health Plan IDs
- Daily review of High: NPI, DOB exposures
- Weekly audit of Medium: Contact information in healthcare contexts
Remediation Workflow
- Identify: Aquilon DLP detects PHI exposure
- Assess: Determine if breach occurred
- Contain: Remove or encrypt exposed data
- Document: Record incident for compliance
- Notify: Follow breach notification requirements if applicable
Integration with Incident Response
Forward HIPAA critical alerts to your incident response system:
-- Real-time HIPAA breach candidates
SELECT * FROM aquilon_dlp_alerts
WHERE policy = 'HIPAA'
AND severity = 'critical'
AND timestamp > datetime('now', '-1 hour');
PCI DSS Compliance
Note: PCI DSS policy framework requires Enterprise Edition.
The Payment Card Industry Data Security Standard (PCI DSS) policy framework detects cardholder data exposure and generates violations according to PCI DSS requirements.
Overview
PCI DSS protects cardholder data during payment card transactions. Aquilon DLP’s PCI DSS policy helps merchants and service providers comply with:
- Requirement 3: Protect stored cardholder data
- Requirement 4: Encrypt transmission of cardholder data
- Requirement 12: Maintain information security policy
Cardholder Data Elements
The PCI DSS policy detects:
| Data Element | Scanner | Severity | PCI Category |
|---|---|---|---|
| Primary Account Number (PAN) | credit_card | Critical | CHD |
| Cardholder Name | credit_card | High | CHD |
| Service Code | credit_card | High | CHD |
| Expiration Date | credit_card | Medium | CHD |
| CVV/CVC/CVV2 | cvv | Critical | SAD |
| PIN/PIN Block | pin | Critical | SAD |
| Magnetic Stripe Data | magnetic_stripe | Critical | SAD |
CHD = Cardholder Data (may be stored if protected)
SAD = Sensitive Authentication Data (must never be stored)
KYC and International Compliance
Payment processors and card issuers operating internationally often collect national identification numbers for Know Your Customer (KYC) verification. Aquilon DLP includes 28 country-specific national ID scanners to detect this data.
International Identity Verification
| Region | Scanners | KYC Use Case |
|---|---|---|
| Europe | germany_steurid, uk_nino, france_nir, + 11 more | EU PSD2 compliance, strong customer authentication |
| Americas | brazil_cpf, canada_sin, + 2 more | Cross-border merchant onboarding |
| Asia-Pacific | india_aadhaar, india_pan, + 6 more | Regional payment network compliance |
Note: While PCI DSS focuses on cardholder data, organizations subject to anti-money laundering (AML) and KYC regulations should monitor for national IDs collected during identity verification.
See Policy Frameworks for the complete list of all 28 national ID scanners.
Scanner Mappings
Critical Severity
Always Critical under PCI DSS:
- CVV/CVC: Sensitive authentication data - must never be stored
- Full PAN: Primary account number without masking
- Magnetic Stripe: Track data must never be stored
High Severity
- Masked PAN: Partial card numbers (first 6/last 4 may be stored)
- Cardholder Name: When associated with PAN
Medium Severity
- Expiration Date: Lower risk when isolated
- Partial Card Data: Fragments that may indicate CHD
Configuration
Basic Configuration
[policies]
enabled_policies = ["pci_dss"]
Advanced Configuration
[policies.policy_configs.pci_dss]
settings = { merchant_level = "2", version = "4.0", confidence_threshold = "0.85" }
Configuration Options
| Option | Description | Default |
|---|---|---|
| merchant_level | PCI merchant level (1-4) | 2 |
| version | PCI DSS version (3.2.1 or 4.0) | 4.0 |
| confidence_threshold | Minimum scanner confidence | 0.8 |
| detect_test_cards | Flag test card numbers | false |
PAN Detection
Supported Card Networks
- Visa (4xxx)
- Mastercard (51-55xx, 2221-2720)
- American Express (34xx, 37xx)
- Discover (6011, 644-649, 65xx)
- JCB (3528-3589)
- Diners Club (36xx, 38xx)
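The prefix ranges above translate directly into a lookup on the leading digits. The function below is an illustrative sketch only: card_network is a hypothetical name, it covers just the ranges listed in this section, and it does not check card length or checksum.

```python
def card_network(pan: str) -> str:
    """Guess the card network from the leading digits of a PAN."""
    digits = "".join(ch for ch in pan if ch in "0123456789")
    if len(digits) < 4:
        return "unknown"
    p2, p3, p4 = int(digits[:2]), int(digits[:3]), int(digits[:4])
    if digits[0] == "4":
        return "visa"
    if 51 <= p2 <= 55 or 2221 <= p4 <= 2720:
        return "mastercard"
    if p2 in (34, 37):
        return "amex"
    if p4 == 6011 or 644 <= p3 <= 649 or p2 == 65:
        return "discover"
    if 3528 <= p4 <= 3589:
        return "jcb"
    if p2 in (36, 38):
        return "diners"
    return "unknown"


print(card_network("4111 1111 1111 1111"))
```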
Luhn Validation
All detected PANs are validated using the Luhn algorithm to reduce false positives.
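For reference, the Luhn check doubles every second digit counting from the right, subtracts 9 from any doubled value above 9, and requires the total to be divisible by 10. The function below is a minimal standalone sketch of that algorithm; luhn_valid is an illustrative name, not the scanner's actual implementation.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch in "0123456789"]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


print(luhn_valid("4111111111111111"))
print(luhn_valid("4111111111111112"))
```

A 16-digit order number that fails this check can be discarded before any context analysis runs, which is where most of the false-positive reduction comes from.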
Context Analysis
The policy analyzes surrounding context to determine if numbers are actual PANs:
"Order #4111111111111111" → Likely PAN (Critical)
"Transaction ID: 4111111111111111" → Needs review (High)
Violation Metadata
Each PCI DSS violation includes:
{
"policy": "PCI_DSS",
"severity": "critical",
"data_element": "pan",
"card_network": "visa",
"requirement": "3.4",
"is_sad": false,
"masked_value": "411111******1111"
}
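The masked_value field keeps the first six and last four digits and hides the rest. A masking helper along those lines might look like this; mask_pan is a hypothetical name, and the 13-digit minimum is an assumption based on common PAN lengths.

```python
def mask_pan(pan: str) -> str:
    """Keep the first 6 and last 4 digits of a PAN and mask the middle."""
    digits = "".join(ch for ch in pan if ch in "0123456789")
    if len(digits) < 13:
        raise ValueError("expected at least 13 digits")
    return digits[:6] + "*" * (len(digits) - 10) + digits[-4:]


masked = mask_pan("4111111111111111")
print(masked)
```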
Compliance Reporting
Query Cardholder Data Exposures
-- All unmasked PANs (critical PCI violation)
SELECT path, scanner, timestamp
FROM aquilon_dlp_alerts
WHERE policy = 'PCI_DSS'
AND severity = 'critical';
-- SAD storage violations (immediate action required)
SELECT * FROM aquilon_dlp_alerts
WHERE policy = 'PCI_DSS'
AND scanner IN ('cvv', 'magnetic_stripe');
-- CHD exposure by file type
SELECT
SUBSTR(path, -4) as extension,
COUNT(*) as count
FROM aquilon_dlp_alerts
WHERE policy = 'PCI_DSS'
GROUP BY extension;
QSA Audit Support
Generate reports for Qualified Security Assessor (QSA) audits:
-- Cardholder Data Environment (CDE) scope
SELECT
rtrim(path, replace(path, '/', '')) as directory,
COUNT(*) as finding_count
FROM aquilon_dlp_alerts
WHERE policy = 'PCI_DSS'
GROUP BY directory;
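The rtrim(path, replace(path, '/', '')) expression strips every trailing character that is not a slash, which leaves the directory portion of the path. The sketch below demonstrates this against an in-memory SQLite table with an assumed two-column schema and made-up paths.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, assumed schema for illustration only.
conn.execute("CREATE TABLE aquilon_dlp_alerts (path TEXT, policy TEXT)")
conn.executemany(
    "INSERT INTO aquilon_dlp_alerts VALUES (?, ?)",
    [
        ("/data/payments/a.csv", "PCI_DSS"),
        ("/data/payments/b.csv", "PCI_DSS"),
        ("/home/ops/notes.txt", "PCI_DSS"),
        ("/data/hr/x.txt", "GDPR"),
    ],
)

# Trailing non-slash characters are trimmed, so the last '/' is kept.
scope = conn.execute(
    "SELECT rtrim(path, replace(path, '/', '')) AS directory, "
    "COUNT(*) AS finding_count "
    "FROM aquilon_dlp_alerts "
    "WHERE policy = 'PCI_DSS' "
    "GROUP BY directory "
    "ORDER BY directory"
).fetchall()

for directory, finding_count in scope:
    print(directory, finding_count)
```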
Best Practices
Monitoring Strategy
- Immediate alert: CVV, magnetic stripe, PIN data
- Same-day review: Full PAN exposures
- Weekly audit: Partial PAN, cardholder names
SAD Handling
Sensitive Authentication Data must never be stored:
CVV found → Immediate deletion required
Mag stripe found → Immediate deletion required
PIN found → Immediate deletion required
CDE Scope Reduction
Use findings to identify and reduce the scope of the Cardholder Data Environment (CDE):
- Locate all CHD storage
- Determine if storage is necessary
- Delete or encrypt as appropriate
- Update CDE documentation
SOX Compliance
Note: SOX policy framework requires Enterprise Edition.
The Sarbanes-Oxley Act (SOX) policy framework detects exposure of financial data and internal controls information that could impact financial reporting integrity.
Overview
SOX establishes requirements for public company financial reporting and internal controls. Aquilon DLP’s SOX policy helps organizations comply with:
- Section 302: Corporate responsibility for financial reports
- Section 404: Management assessment of internal controls
- Section 409: Real-time issuer disclosures
Protected Data Categories
The SOX policy detects:
| Data Category | Scanners | Severity | SOX Section |
|---|---|---|---|
| Financial Account Numbers | bank_account, iban, aba_routing | Critical | 302, 404 |
| Tax Identifiers | ein, ssn | Critical | 302 |
| Internal Financial Data | financial_keyword | High | 404 |
| Audit Documentation | audit_keyword | High | 404 |
| Executive Communications | exec_keyword | Medium | 302 |
International Subsidiaries
Multinational corporations with global subsidiaries must protect employee and financial data across jurisdictions. Aquilon DLP includes 28 country-specific national ID scanners for comprehensive coverage.
Global Employee and Tax Data
| Region | Scanners | SOX Relevance |
|---|---|---|
| Europe | germany_steurid, france_nir, uk_nino, + 11 more | EU subsidiary employee tax data |
| Americas | brazil_cpf, argentina_cuit, + 2 more | Latin American subsidiary payroll |
| Asia-Pacific | india_pan, japan_my_number, + 6 more | APAC subsidiary financial records |
Note: SOX Section 404 internal controls extend to material subsidiaries. Unauthorized exposure of subsidiary employee tax identifiers or financial data may indicate control deficiencies.
See Policy Frameworks for the complete list of all 28 national ID scanners.
Scanner Mappings
Critical Severity
Financial data requiring immediate protection:
- Bank Account Numbers: Direct access to company funds
- IBAN/SWIFT: International financial identifiers
- ABA Routing Numbers: US bank routing information
- EIN: Employer Identification Number
- Tax Documents: Tax returns, W-2s, 1099s
High Severity
Internal controls and audit information:
- Financial Statements: Balance sheets, P&L, cash flow
- Audit Working Papers: Internal audit documentation
- Control Documentation: SOX control matrices, test results
- Material Information: Pre-earnings, M&A data
Medium Severity
- Executive Communications: C-suite financial discussions
- Budget Data: Forecasts, projections
- Vendor Financial Data: AP/AR information
Configuration
Basic Configuration
[policies]
enabled_policies = ["sox"]
Advanced Configuration
[policies.policy_configs.sox]
settings = { confidence_threshold = "0.75", sensitivity_level = "3", detect_material_info = "true" }
Configuration Options
| Option | Description | Default |
|---|---|---|
| confidence_threshold | Minimum scanner confidence | 0.7 |
| sensitivity_level | Severity multiplier | 2 |
| detect_material_info | Flag material non-public information | true |
| quiet_period_days | Days before earnings (heightened sensitivity) | 14 |
Context Detection
The SOX policy elevates severity when financial context is detected. Enable the sox_financial context profile for automatic keyword detection:
[context]
enabled_profiles = ["sox_financial"] # Add to existing profiles
Financial Context Keywords
The sox_financial profile detects:
- Strong indicators: 10-K, 10-Q, 8-K, SEC filing, GAAP, IFRS, PCAOB, balance sheet, income statement, SOX 404, material weakness
- Weak indicators: revenue, earnings, quarterly, annual, profit, EBITDA, margin, budget, forecast, audit
Note: The SOX policy requires explicit financial context signals for financial_figures findings to avoid false positives on arbitrary currency amounts (e.g., $10 in shell scripts).
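One way to picture the strong/weak distinction is a two-tier keyword test. The sketch below is an assumption-laden illustration: the rule that one strong indicator or two weak indicators confirms financial context is invented for this example, matching is plain substring search, and has_financial_context is a hypothetical name. The sox_financial profile may weigh signals differently.

```python
STRONG = [
    "10-k", "10-q", "8-k", "sec filing", "gaap", "ifrs", "pcaob",
    "balance sheet", "income statement", "sox 404", "material weakness",
]
WEAK = [
    "revenue", "earnings", "quarterly", "annual", "profit",
    "ebitda", "margin", "budget", "forecast", "audit",
]


def has_financial_context(text: str, weak_needed: int = 2) -> bool:
    """Assumed rule: one strong indicator, or `weak_needed` weak ones."""
    lowered = text.lower()
    if any(term in lowered for term in STRONG):
        return True
    weak_hits = sum(1 for term in WEAK if term in lowered)
    return weak_hits >= weak_needed


print(has_financial_context("Draft 10-Q: revenue up 4%"))
print(has_financial_context("echo $10 > out.txt"))
```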
Quiet Period Detection
During earnings quiet periods, severity is elevated for:
- Financial projections
- Earnings estimates
- Material business changes
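The quiet-period window can be thought of as a date range ending at the earnings date. The following sketch assumes the window covers the quiet_period_days days immediately before the earnings date and excludes the earnings date itself. How Aquilon DLP learns the earnings date is not covered here, so it is passed in explicitly, and in_quiet_period is a hypothetical helper.

```python
from datetime import date, timedelta


def in_quiet_period(today: date, earnings_date: date,
                    quiet_period_days: int = 14) -> bool:
    """True if `today` falls in the assumed window before earnings."""
    start = earnings_date - timedelta(days=quiet_period_days)
    return start <= today < earnings_date


earnings = date(2024, 4, 25)
print(in_quiet_period(date(2024, 4, 20), earnings))
print(in_quiet_period(date(2024, 4, 1), earnings))
```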
Violation Metadata
Each SOX violation includes:
{
"policy": "SOX",
"severity": "critical",
"data_category": "financial_account",
"sox_section": "302",
"material_info": false,
"control_impact": "financial_reporting"
}
Compliance Reporting
Query Financial Data Exposures
-- All critical financial data exposures
SELECT path, scanner, severity, timestamp
FROM aquilon_dlp_alerts
WHERE policy = 'SOX'
AND severity = 'critical';
-- Internal controls documentation exposure
SELECT * FROM aquilon_dlp_alerts
WHERE policy = 'SOX'
AND scanner LIKE '%audit%';
-- Financial data by department (based on path)
SELECT
CASE
WHEN path LIKE '%/finance/%' THEN 'Finance'
WHEN path LIKE '%/accounting/%' THEN 'Accounting'
WHEN path LIKE '%/treasury/%' THEN 'Treasury'
ELSE 'Other'
END as department,
COUNT(*) as count
FROM aquilon_dlp_alerts
WHERE policy = 'SOX'
GROUP BY department;
Audit Committee Reporting
Generate reports for the audit committee:
-- SOX control deficiency indicators
SELECT
date(timestamp) as date,
severity,
COUNT(*) as findings
FROM aquilon_dlp_alerts
WHERE policy = 'SOX'
AND timestamp > datetime('now', '-30 days')
GROUP BY date, severity
ORDER BY date DESC;
Best Practices
Monitoring Strategy
- Immediate alert: Bank accounts, tax IDs, material info
- Daily review: Financial statements, audit documentation
- Weekly audit: Executive communications, budget data
Control Environment
Use findings to strengthen internal controls:
- Identify: Where financial data is stored
- Assess: Whether storage is appropriate
- Remediate: Move to secure locations
- Document: Update control documentation
Segregation of Duties
Monitor for inappropriate access patterns:
-- Financial data in non-finance directories
SELECT * FROM aquilon_dlp_alerts
WHERE policy = 'SOX'
AND severity = 'critical'
AND path NOT LIKE '%/finance/%'
AND path NOT LIKE '%/accounting/%';
ISO 27001 Compliance
Note: ISO 27001 policy framework requires Enterprise Edition.
The ISO 27001:2022 policy framework implements information security management controls with a focus on data leakage prevention.
Overview
ISO 27001:2022 is the international standard for information security management. Aquilon DLP’s ISO 27001 policy implements key controls:
- A.8.12: Data leakage prevention (NEW in 2022 revision)
- A.5.12: Classification of information
- A.8.11: Data masking
Note: Control A.8.12 explicitly addresses data leakage prevention, so organizations that include it in their Statement of Applicability need DLP measures in place for ISO 27001:2022 certification.
Data Classification Levels
The ISO 27001 policy uses a four-level classification system:
| Level | Description | Examples | Severity |
|---|---|---|---|
| Restricted | Highest sensitivity | Cryptographic keys, master passwords | Critical |
| Confidential | Business-critical | Financial data, PII, trade secrets | High |
| Internal | Internal use only | Employee data, internal policies | Medium |
| Public | No restrictions | Marketing materials, public docs | Low |
Scanner Classifications
All 50+ scanners are automatically classified:
Restricted (Critical)
- private_key, api_key, jwt, aws_access_key
- credit_card, cvv
- ssn (in certain contexts)
Confidential (High)
- ssn, passport, drivers_license
- bank_account, iban
- health_record, medical_record_number
Internal (Medium)
- email, phone, address
- date_of_birth
- employee_id
Public (Low)
- Generic patterns without sensitive context
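The scanner-to-classification mapping can be expressed as a simple lookup. The sketch below is illustrative only: CLASSIFICATION and classify are hypothetical names, ssn is placed under confidential and its context-dependent promotion to restricted is not modeled, and any unlisted scanner falls back to public.

```python
CLASSIFICATION = {
    "restricted": {"private_key", "api_key", "jwt", "aws_access_key",
                   "credit_card", "cvv"},
    "confidential": {"ssn", "passport", "drivers_license", "bank_account",
                     "iban", "health_record", "medical_record_number"},
    "internal": {"email", "phone", "address", "date_of_birth", "employee_id"},
}
SEVERITY = {
    "restricted": "critical",
    "confidential": "high",
    "internal": "medium",
    "public": "low",
}


def classify(scanner: str) -> tuple:
    """Return (classification level, severity) for a scanner name."""
    for level, scanners in CLASSIFICATION.items():
        if scanner in scanners:
            return level, SEVERITY[level]
    return "public", SEVERITY["public"]


print(classify("cvv"))
print(classify("iban"))
```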
Global PII Coverage
ISO 27001 is an international standard. Organizations operating across multiple jurisdictions need comprehensive national ID detection. Aquilon DLP includes 28 country-specific national ID scanners with checksum validation.
Europe (14 scanners)
| Country | Scanner | Format | Validation |
|---|---|---|---|
| France | france_nir | 15 digits (NIR) | Mod 97 |
| Germany | germany_steurid | 11 digits (Steuer-ID) | Format rules |
| Italy | italy_cf | 16 chars (Codice Fiscale) | Mod 26 |
| Spain | spain_dni | 8-9 chars (DNI/NIE) | Mod 23 |
| Poland | poland_pesel | 11 digits (PESEL) | Weighted mod 10 |
| Netherlands | netherlands_bsn | 9 digits (BSN) | 11-proof |
| Belgium | belgium_nrn | 11 digits (NRN) | Mod 97 |
| UK | uk_nino | 9 chars (NINO) | Format rules |
| Sweden | sweden_personnummer | 10-12 digits | Luhn |
| Norway | norway_fodselsnummer | 11 digits | Dual mod-11 |
| Finland | finland_hetu | 11 chars (HETU) | Mod 31 |
| Portugal | portugal_nif | 9 digits (NIF) | Weighted mod 11 |
| Romania | romania_cnp | 13 digits (CNP) | Weighted mod 11 |
| Czech/Slovakia | czech_rodne_cislo | 9-10 digits | Mod 11 |
Americas (4 scanners)
| Country | Scanner | Format | Validation |
|---|---|---|---|
| Brazil | brazil_cpf | 11 digits (CPF) | Dual mod 11 |
| Canada | canada_sin | 9 digits (SIN) | Luhn |
| Chile | chile_rut | 8-9 chars (RUT) | Mod 11 |
| Argentina | argentina_cuit | 11 digits (CUIT/CUIL) | Weighted mod 11 |
Asia-Pacific (8 scanners)
| Country | Scanner | Format | Validation |
|---|---|---|---|
| Australia | australia_tfn | 9 digits (TFN) | Weighted mod 11 |
| India | india_aadhaar | 12 digits (Aadhaar) | Verhoeff |
| India | india_pan | 10 chars (PAN) | Format rules |
| South Korea | south_korea_rrn | 13 digits (RRN) | Weighted mod 11 |
| Japan | japan_my_number | 12 digits | Government checksum |
| China | china_resident_id | 18 chars | ISO 7064 MOD 11-2 |
| Taiwan | taiwan_national_id | 10 chars | Weighted mod 10 |
| New Zealand | new_zealand_ird | 8-9 digits (IRD) | Mod 11 |
Middle East & Africa (2 scanners)
| Country | Scanner | Format | Validation |
|---|---|---|---|
| Israel | israel_teudat_zehut | 9 digits | Luhn variant |
| Turkey | turkey_tc_kimlik | 11 digits (TC Kimlik) | Two-step checksum |
Note: All national ID scanners use country-specific context keywords to increase detection confidence and reduce false positives.
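To make the Validation column concrete, here are standalone sketches of two of the listed checks: Spain's mod 23 control letter and the Netherlands 11-proof. They are not the scanners' actual code. The function names are illustrative, the Spanish check handles only the 8-digit DNI form (not NIE numbers with a leading X, Y, or Z), and neither function applies context keywords.

```python
DIGITS = set("0123456789")
DNI_LETTERS = "TRWAGMYFPDXBNJZSQVHLCKE"  # control letter indexed by number % 23


def spain_dni_valid(value: str) -> bool:
    """Validate an 8-digit DNI followed by its mod 23 control letter."""
    value = value.strip().upper()
    if len(value) != 9 or not set(value[:8]) <= DIGITS:
        return False
    return DNI_LETTERS[int(value[:8]) % 23] == value[8]


def netherlands_bsn_valid(value: str) -> bool:
    """Validate a 9-digit BSN with the 11-proof (weights 9..2, then -1)."""
    if len(value) != 9 or not set(value) <= DIGITS:
        return False
    weights = (9, 8, 7, 6, 5, 4, 3, 2, -1)
    total = sum(int(d) * w for d, w in zip(value, weights))
    return total % 11 == 0


print(spain_dni_valid("12345678Z"))
print(netherlands_bsn_valid("111222333"))
```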
See Policy Frameworks for detailed scanner documentation.
Configuration
Basic Configuration
[policies]
enabled_policies = ["iso27001"]
Advanced Configuration
[policies.policy_configs.iso27001]
settings = { confidence_threshold = "0.7", enforce_data_masking = "true", classification_level = "confidential" }
Configuration Options
| Option | Description | Default |
|---|---|---|
| confidence_threshold | Minimum scanner confidence | 0.7 |
| enforce_data_masking | Require data masking in output | false |
| classification_level | Default classification level | confidential |
| control_a812_strict | Strict A.8.12 enforcement | true |
Control Implementation
Control A.8.12 - Data Leakage Prevention
Aquilon DLP directly implements A.8.12 by:
- Monitoring data at rest: Scans file systems for sensitive data
- Classification: Automatically classifies detected data
- Alerting: Generates violations for inappropriate storage
- Reporting: Provides audit trails for compliance
Control A.5.12 - Classification of Information
Each finding includes classification metadata:
{
"classification_level": "confidential",
"classification_reason": "Contains SSN (direct identifier)",
"handling_requirements": ["encryption_at_rest", "access_logging"]
}
Control A.8.11 - Data Masking
When enforce_data_masking is enabled, detected values are masked:
Original: 122-45-6789
Masked: ***-**-6789
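The masking shown above keeps only the last four digits. A regular-expression sketch of the same transformation follows; it assumes the dashed NNN-NN-NNNN layout only, and mask_ssn is an illustrative name rather than a product function.

```python
import re

# Matches dashed SSN-style values and captures the last four digits.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-(\d{4})\b")


def mask_ssn(text: str) -> str:
    """Replace the first five digits of each dashed SSN with asterisks."""
    return SSN_RE.sub(r"***-**-\1", text)


masked = mask_ssn("Original: 122-45-6789")
print(masked)
```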
Violation Metadata
Each ISO 27001 violation includes:
{
"policy": "ISO27001",
"severity": "high",
"classification": "confidential",
"iso_control": "A.8.12",
"control_name": "Data leakage prevention",
"handling_requirements": [
"encrypt_at_rest",
"restrict_access",
"audit_logging"
]
}
Compliance Reporting
Query by Classification Level
-- All restricted data exposures (immediate action)
SELECT path, scanner, timestamp
FROM aquilon_dlp_alerts
WHERE policy = 'ISO27001'
AND severity = 'critical';
-- Classification distribution
SELECT severity as classification, COUNT(*) as count
FROM aquilon_dlp_alerts
WHERE policy = 'ISO27001'
GROUP BY severity
ORDER BY count DESC;
-- Control A.8.12 compliance status
SELECT
date(timestamp) as date,
COUNT(*) as findings
FROM aquilon_dlp_alerts
WHERE policy = 'ISO27001'
GROUP BY date
ORDER BY date DESC
LIMIT 30;
Certification Audit Support
Generate reports for ISO 27001 auditors:
-- Data leakage prevention evidence (Control A.8.12)
SELECT
'Files with Findings' as metric,
(SELECT COUNT(DISTINCT path) FROM aquilon_dlp_alerts WHERE policy = 'ISO27001') as value
UNION ALL
SELECT
'Total Findings',
(SELECT COUNT(*) FROM aquilon_dlp_alerts WHERE policy = 'ISO27001')
UNION ALL
SELECT
'Critical Findings',
(SELECT COUNT(*) FROM aquilon_dlp_alerts
WHERE policy = 'ISO27001' AND severity = 'critical');
Best Practices
Monitoring Strategy
- Immediate alert: Restricted classification findings
- Daily review: Confidential data exposures
- Weekly audit: Internal data, classification accuracy
Information Security Management System (ISMS)
Use Aquilon DLP findings to support ISMS:
- Risk Assessment: Identify data exposure risks
- Risk Treatment: Implement controls based on classification
- Monitoring: Continuous compliance monitoring
- Improvement: Refine policies based on findings
Statement of Applicability (SoA)
Document control implementation:
| Control | Implementation | Aquilon DLP Support |
|---|---|---|
| A.8.12 | DLP monitoring | Primary implementation |
| A.5.12 | Classification | Automatic classification |
| A.8.11 | Data masking | Optional enforcement |
Certification Support
Pre-Audit Checklist
- ISO 27001 policy enabled and configured
- All data locations included in watch_paths
- Classification levels match organization’s scheme
- Historical findings retained for audit period
- Remediation process documented
Evidence Collection
Collect evidence for auditors:
-- Export findings for audit period
SELECT * FROM aquilon_dlp_alerts
WHERE policy = 'ISO27001'
AND timestamp BETWEEN '2024-01-01' AND '2024-12-31'
ORDER BY timestamp;
GDPR Compliance
The General Data Protection Regulation (GDPR) policy framework detects personal data exposure and generates violations according to EU data protection requirements.
Availability: GDPR policy is included in all editions (Basic and Enterprise).
Overview
GDPR establishes requirements for protecting personal data of EU residents. Aquilon DLP’s GDPR policy helps data controllers and processors comply with:
- Article 5: Principles relating to processing of personal data
- Article 32: Security of processing
- Article 33: Notification of personal data breach
Personal Data Categories
The GDPR policy detects:
| Data Category | Scanners | Severity | GDPR Article |
|---|---|---|---|
| National ID Numbers | ssn, EU national IDs (see below) | Critical | 9 (Special) |
| Financial Data | iban, credit_card, bank_account | High | 9 |
| Health Data | health_record, medical_record_number | Critical | 9 |
| Biometric Data | biometric | Critical | 9 |
| Email | email | Medium | 4 |
| Phone | phone | Medium | 4 |
| Address | address | Medium | 4 |
| Date of Birth | date_of_birth | Medium | 4 |
| Passport | passport | High | 4 |
EU/EEA National ID Coverage
The GDPR policy includes 15 specialized scanners for national identification numbers across EU and EEA member states. Each scanner validates country-specific checksum algorithms to reduce false positives.
European National IDs
| Country | Scanner | Format | Validation |
|---|---|---|---|
| France | france_nir | 15 digits (NIR) | Mod 97 |
| Germany | germany_steurid | 11 digits (Steuer-ID) | Format rules |
| Italy | italy_cf | 16 chars (Codice Fiscale) | Mod 26 |
| Spain | spain_dni | 8-9 chars (DNI/NIE) | Mod 23 |
| Poland | poland_pesel | 11 digits (PESEL) | Weighted mod 10 |
| Netherlands | netherlands_bsn | 9 digits (BSN) | 11-proof |
| Belgium | belgium_nrn | 11 digits (NRN) | Mod 97 |
| UK | uk_nino | 9 chars (NINO) | Format rules |
| Sweden | sweden_personnummer | 10-12 digits | Luhn |
| Norway | norway_fodselsnummer | 11 digits | Dual mod-11 |
| Finland | finland_hetu | 11 chars (HETU) | Mod 31 |
| Portugal | portugal_nif | 9 digits (NIF) | Weighted mod 11 |
| Romania | romania_cnp | 13 digits (CNP) | Weighted mod 11 |
| Czech/Slovakia | czech_rodne_cislo | 9-10 digits | Mod 11 |
| Turkey | turkey_tc_kimlik | 11 digits (TC Kimlik) | Two-step checksum |
Note: Turkey’s KVKK (Kişisel Verilerin Korunması Kanunu) is modeled on GDPR. Turkish national IDs are included for organizations processing Turkish residents’ data under GDPR-equivalent requirements.
Context Detection
Each national ID scanner uses country-specific context keywords to increase detection confidence:
- Nordic: personnummer, fødselsnummer, henkilötunnus, Skatteverket, Folkeregisteret
- Western Europe: NIR, Steuer-ID, codice fiscale, DNI, BSN, NRN, NINO
- Eastern Europe: PESEL, CNP, rodné číslo
- Turkey: TC Kimlik, Kimlik No, Nüfus
See the Policy Frameworks guide for the complete list of all 28 national ID scanners across all regions.
Special Category Data
Article 9 special category data receives elevated severity:
- Racial or ethnic origin
- Political opinions
- Religious beliefs
- Trade union membership
- Genetic data
- Biometric data
- Health data
- Sex life or sexual orientation
Scanner Mappings
Critical Severity
Special category data under Article 9:
- Health Data: Medical records, health information
- Biometric Data: Fingerprints, facial recognition
- National IDs: SSN, government-issued identifiers (when combined with health context)
High Severity
Personal data enabling direct identification:
- Financial Identifiers: IBAN, credit cards, bank accounts
- Travel Documents: Passport numbers
- National IDs: In general contexts
Medium Severity
Personal data requiring context:
- Contact Information: Email, phone, address
- Dates: Date of birth
- Names: When combined with other data
Configuration
Basic Configuration
[policies]
enabled_policies = ["gdpr"]
Advanced Configuration
[policies.policy_configs.gdpr]
settings = { confidence_threshold = "0.7", sensitivity_level = "2", detect_special_categories = "true" }
Configuration Options
| Option | Description | Default |
|---|---|---|
| confidence_threshold | Minimum scanner confidence | 0.7 |
| sensitivity_level | Severity multiplier (1-3) | 2 |
| detect_special_categories | Elevate Article 9 data | true |
Context Detection
The GDPR policy analyzes context to determine severity. Enable the gdpr_phone context profile for phone number classification:
[context]
enabled_profiles = ["gdpr_phone"] # Add to existing profiles
Phone Number Context
The gdpr_phone profile distinguishes personal from business phone numbers:
- Personal indicators (triggers violation): mobile, cell, home phone, personal, private, emergency contact
- Business indicators (suppresses violation): office, fax, support, helpdesk, extension, toll-free, switchboard
Note: Phone numbers without personal context do NOT trigger GDPR violations. A bare phone number like 555-123-4567 requires nearby keywords like “mobile” or “cell” to be flagged.
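The personal/business distinction can be sketched as a keyword check on the text around a phone number. The keyword lists come from the bullets above. The function name is hypothetical, and letting a business indicator win when both kinds appear is an assumption, since the documentation does not say how ties are resolved.

```python
import re

PERSONAL = ("mobile", "cell", "home phone", "personal", "private", "emergency contact")
BUSINESS = ("office", "fax", "support", "helpdesk", "extension", "toll-free", "switchboard")

def _mentions(text: str, keywords) -> bool:
    # Whole-word matching so that "cell" does not match "cancelled".
    return any(re.search(r"\b" + re.escape(k) + r"\b", text) for k in keywords)

def phone_triggers_violation(context: str) -> bool:
    """Return True when the surrounding text marks a phone number as personal."""
    text = context.lower()
    if _mentions(text, BUSINESS):
        return False  # assumption: business context suppresses the finding
    return _mentions(text, PERSONAL)
```

A bare number with no surrounding keywords returns False, matching the note above.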
EU Context Keywords
- EU member states: Germany, France, Italy, Spain, etc.
- GDPR terms: data subject, controller, processor, consent
- Languages: Non-English European languages increase confidence
Employee vs Customer Context
Employee data in HR systems may have reduced severity (legitimate interest):
Finding: Email "employee@company.com"
Context: "HR records for performance review"
Result: Severity reduced from High → Medium (employee context)
Customer data maintains full severity.
Violation Metadata
Each GDPR violation includes:
{
"policy": "GDPR",
"severity": "high",
"data_category": "personal_data",
"special_category": false,
"gdpr_article": "5(1)(f)",
"lawful_basis_required": true,
"breach_notification_hours": 72
}
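A consumer of this metadata could use `breach_notification_hours` to derive a reporting deadline. The sketch below is illustrative only: `notification_deadline` is a hypothetical helper, and it assumes the caller supplies the time at which the organization became aware of a breach. A DLP finding alone does not establish that a breach occurred.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Optional

SAMPLE = """{
  "policy": "GDPR",
  "severity": "high",
  "data_category": "personal_data",
  "special_category": false,
  "gdpr_article": "5(1)(f)",
  "lawful_basis_required": true,
  "breach_notification_hours": 72
}"""

def notification_deadline(violation_json: str, aware_at: datetime) -> Optional[datetime]:
    """Return the latest notification time, or None if no window is given."""
    violation = json.loads(violation_json)
    hours = violation.get("breach_notification_hours")
    if hours is None:
        return None
    return aware_at + timedelta(hours=hours)

aware = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
deadline = notification_deadline(SAMPLE, aware)
```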
Compliance Reporting
Query Personal Data Exposures
-- All GDPR violations requiring attention
SELECT path, scanner, severity, timestamp
FROM aquilon_dlp_alerts
WHERE policy = 'GDPR'
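-- Rank severity explicitly; a plain text sort would order it alphabetically
ORDER BY CASE severity WHEN 'critical' THEN 0 WHEN 'high' THEN 1
                       WHEN 'medium' THEN 2 ELSE 3 END,
         timestamp DESC;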
Special Category Data (Article 9)
-- Special category data requiring elevated protection
SELECT * FROM aquilon_dlp_alerts
WHERE policy = 'GDPR'
AND severity = 'critical';
Personal Data by Type
-- Personal data grouped by scanner type
SELECT scanner, COUNT(*) as count
FROM aquilon_dlp_alerts
WHERE policy = 'GDPR'
GROUP BY scanner
ORDER BY count DESC;
Breach Notification Support
Under Article 33, personal data breaches must be reported to the supervisory authority within 72 hours of the controller becoming aware of them:
-- Recent critical findings (potential breach)
SELECT * FROM aquilon_dlp_alerts
WHERE policy = 'GDPR'
AND severity = 'critical'
AND timestamp > datetime('now', '-72 hours');
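To see how the 72-hour window behaves, the query can be tried against a stand-in table in an in-memory SQLite database. The schema below is a simplified assumption; the real `aquilon_dlp_alerts` table is exposed through osquery and may differ. The comparison also assumes `timestamp` holds SQLite-style datetime text. If it holds Unix epoch seconds instead, the filter would need adjusting.

```python
import sqlite3

def recent_critical(conn: sqlite3.Connection) -> list:
    """Return paths of critical GDPR findings from the last 72 hours."""
    return conn.execute(
        """
        SELECT path FROM aquilon_dlp_alerts
        WHERE policy = 'GDPR'
          AND severity = 'critical'
          AND timestamp > datetime('now', '-72 hours')
        ORDER BY path
        """
    ).fetchall()

# Stand-in table with a simplified, assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE aquilon_dlp_alerts "
    "(path TEXT, policy TEXT, severity TEXT, timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO aquilon_dlp_alerts VALUES (?, ?, ?, datetime('now', ?))",
    [
        ("/data/new.csv", "GDPR", "critical", "-1 hours"),    # inside the window
        ("/data/old.csv", "GDPR", "critical", "-100 hours"),  # outside the window
        ("/data/high.csv", "GDPR", "high", "-1 hours"),       # wrong severity
    ],
)
```

Only the finding that is both critical and less than 72 hours old is returned.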
Data Subject Rights
Aquilon DLP findings support data subject rights:
Article 15 - Right of Access
Locate all personal data for a data subject:
-- Find all data for specific identifier
SELECT path, scanner, JSON_EXTRACT(context, '$.snippet') as snippet
FROM aquilon_dlp_alerts
WHERE JSON_EXTRACT(context, '$.snippet') LIKE '%email@example.com%';
Article 17 - Right to Erasure
Verify deletion completeness:
-- Confirm no remaining data after erasure request
SELECT * FROM aquilon_dlp_alerts
WHERE JSON_EXTRACT(context, '$.snippet') LIKE '%data_subject_id%';
Article 20 - Right to Data Portability
Identify structured personal data:
-- Portable data formats
SELECT path, scanner
FROM aquilon_dlp_alerts
WHERE policy = 'GDPR'
AND (path LIKE '%.json'
OR path LIKE '%.csv'
OR path LIKE '%.xml');
Best Practices
Monitoring Strategy
- Immediate alert: Special category data (Article 9)
- Daily review: High severity personal data
- Weekly audit: Medium severity, context accuracy
Data Mapping
Use findings to maintain data mapping:
- Identify: Where personal data is stored
- Classify: By data category and lawful basis
- Document: In Records of Processing Activities
- Review: Regularly for accuracy
Privacy by Design
Integrate findings into development:
-- Personal data in development environments
SELECT * FROM aquilon_dlp_alerts
WHERE policy = 'GDPR'
AND (path LIKE '%/dev/%'
OR path LIKE '%/test/%'
OR path LIKE '%/staging/%');
Cross-Border Considerations
EU-Specific Context
The policy detects EU context to determine applicability:
- EU country names or codes
- EU-specific identifiers (IBAN, national IDs)
- EU languages
International Transfers
Monitor for personal data in locations outside the EU:
-- Potential international transfers
SELECT * FROM aquilon_dlp_alerts
WHERE policy = 'GDPR'
AND path LIKE '%/external/%';
Related Resources
CCPA Compliance
The California Consumer Privacy Act (CCPA) and California Privacy Rights Act (CPRA) policy framework detects California consumer personal information and generates violations according to California privacy requirements.
Overview
CCPA/CPRA establishes privacy rights for California consumers and obligations for businesses handling their personal information. Aquilon DLP’s CCPA policy helps organizations comply with:
- 1798.100 - Right to Know (consumer data collection disclosure)
- 1798.105 - Right to Delete
- 1798.120 - Right to Opt-Out (sale of personal information)
- CPRA 2023 - Enhanced sensitive personal information categories
Personal Information Categories
The CCPA policy detects the following personal information categories:
| Category | Scanners | Severity |
|---|---|---|
| Direct Identifiers | ssn, drivers_license | Critical |
| Contact Information | email, phone, address | High |
| Financial Information | credit_card, bank_account | Critical |
| Geolocation Data | ip_address, address | High |
| Biometric Information | biometric | Critical |
| Professional/Employment | Context-based detection | Medium |
CPRA Sensitive Personal Information
CPRA (effective 2023) added enhanced protections for sensitive personal information:
- Social Security numbers
- Driver’s license and state ID numbers
- Financial account credentials
- Precise geolocation
- Racial/ethnic origin
- Religious beliefs
- Biometric data
- Health information
- Sexual orientation
Configuration
Basic Configuration
[policies]
enabled_policies = ["ccpa"]
Advanced Configuration
[policies.policy_configs.ccpa]
settings = { california_business = "true", sensitivity_level = "2", detect_sensitive_pi = "true", confidence_threshold = "0.7" }
Configuration Options
| Option | Description | Default |
|---|---|---|
| california_business | Whether the organization does business in California | true |
| sensitivity_level | Compliance strictness (1=basic, 2=standard, 3=strict) | 2 |
| detect_sensitive_pi | Detect CPRA sensitive personal information | true |
| detect_consumer_data | Detect commercial/behavioral data | true |
| confidence_threshold | Minimum scanner confidence (0.0-1.0) | 0.7 |
Context Detection
The CCPA policy uses context signals to determine applicability and severity:
California Context Keywords
- Location terms: California, CA, Calif
- Regulation terms: CCPA, CPRA, consumer privacy
- Business terms: consumer, customer, resident
Consumer Context Keywords
- Consumer terms: consumer, customer, subscriber, member
- Commercial terms: purchase, transaction, order, account
- Marketing terms: profile, preference, behavioral, targeting
Violation Metadata
Each CCPA violation includes:
{
"policy": "CCPA",
"severity": "high",
"pi_category": "direct_identifier",
"cpra_sensitive": true,
"consumer_rights": ["right_to_know", "right_to_delete"],
"section": "1798.100"
}
Compliance Reporting
Query Consumer PI Exposures
-- All CCPA findings
SELECT path, scanner, severity, timestamp
FROM aquilon_dlp_alerts
WHERE policy = 'CCPA'
ORDER BY timestamp DESC;
Sensitive PI Detection
-- Critical findings (sensitive PI under CPRA)
SELECT path, scanner, severity, timestamp
FROM aquilon_dlp_alerts
WHERE policy = 'CCPA'
AND severity = 'critical';
Best Practices
Consumer Rights Support
CCPA grants consumers specific rights. Aquilon DLP findings help you:
Right to Know (1798.100):
- Identify what personal information you’ve collected
- Document categories of PI by data type
Right to Delete (1798.105):
- Locate all instances of a consumer’s data
- Verify deletion completeness
Right to Opt-Out (1798.120):
- Identify data used for sales/sharing
- Track third-party data exposure
Monitoring Strategy
- Alert on Critical immediately: SSN, financial data, biometrics
- Daily review of High: Contact information, geolocation
- Weekly audit of Medium: Professional/employment context
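The monitoring cadence above can be sketched as a routing step that sorts findings into review queues by severity. The finding shape (`path` and `severity` keys) and the choice to send any other severity to the weekly queue are assumptions for illustration.

```python
from collections import defaultdict

# Cadence per severity, following the monitoring strategy above.
CADENCE = {"critical": "immediate", "high": "daily", "medium": "weekly"}

def build_review_queues(findings: list) -> dict:
    """Group finding paths into immediate, daily, and weekly review queues."""
    queues = defaultdict(list)
    for finding in findings:
        # Assumption: severities not listed above fall back to the weekly audit.
        cadence = CADENCE.get(finding["severity"].lower(), "weekly")
        queues[cadence].append(finding["path"])
    return dict(queues)
```

The same routing applies to the other policy frameworks if their severity-to-cadence table is swapped in.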
Remediation Workflow
- Identify: Aquilon DLP detects PI exposure
- Classify: Determine PI category and CPRA sensitivity
- Assess: Evaluate consumer rights implications
- Remediate: Secure or delete exposed data
- Document: Record for compliance audit
CCPA vs GDPR
Both policies protect personal data but have different scopes:
| Aspect | CCPA | GDPR |
|---|---|---|
| Scope | California residents | EU residents |
| Threshold | Revenue/data volume based | Any processing |
| Consent | Opt-out model | Opt-in model |
| Penalties | Up to $7,500 per intentional violation | Up to 4% of global annual turnover or €20 million, whichever is higher |
Organizations serving both jurisdictions should enable both policies:
[policies]
enabled_policies = ["gdpr", "ccpa"]
Related Resources
- Compliance Overview
- GDPR - EU data protection
- Configuration Guide
CUI Compliance
Note: CUI policy framework requires Enterprise Edition.
The Controlled Unclassified Information (CUI) policy framework detects CUI exposure and generates violations according to NIST SP 800-171 requirements for federal contractors.
Overview
CUI is government-created or government-possessed information that requires safeguarding per 32 CFR Part 2002 and NIST SP 800-171. Aquilon DLP’s CUI policy helps federal contractors comply with:
- 3.1.1 - Limit system access to authorized users
- 3.1.2 - Limit system access to permitted transactions/functions
- 3.1.3 - Control CUI flow per authorizations
- 3.8.1 - Protect system media (physical and digital)
- 3.8.2 - Limit CUI access to authorized users
CUI Categories
The CUI policy detects multiple categories defined by the CUI Registry:
| Category | Description | Severity |
|---|---|---|
| Basic CUI | Standard CUI without specified handling | High |
| Specified CUI (SP-*) | CUI with additional safeguard requirements | Critical |
| FCI | Federal Contract Information | High |
| CDI | Covered Defense Information (DFARS 252.204-7012) | Critical |
| CTI | Controlled Technical Information (DoD 5230.24) | Critical |
Detection Methods
Government-Specific Scanners
| Scanner | Detects | Severity |
|---|---|---|
| cui_marking | CUI banners, markings (CUI, CUI//SP-*, CONTROLLED) | Critical |
| export_control | ITAR, EAR, ECCN markings | Critical |
| gov_identifier | DoD EDI-PI identifiers | High |
PII in Government Context
When PII appears with government context signals, it triggers CUI violations:
| Scanner | Government Context Required | Severity |
|---|---|---|
| ssn | Federal employee/contractor context | Critical |
| email | .gov/.mil domain or federal context | Medium |
| api_key | Government system credentials | Critical |
| database_connection | Federal database strings | Critical |
| crypto | Encryption keys in government context | Critical |
Configuration
Basic Configuration
[policies]
enabled_policies = ["cui"]
Advanced Configuration
[policies.policy_configs.cui]
settings = { detect_basic_cui = "true", detect_specified_cui = "true", detect_fci = "true", detect_cdi = "true", detect_cti = "true", confidence_threshold = "0.7" }
Configuration Options
| Option | Description | Default |
|---|---|---|
| detect_basic_cui | Detect standard CUI markings | true |
| detect_specified_cui | Detect CUI//SP-* specified markings | true |
| detect_fci | Detect Federal Contract Information | true |
| detect_cdi | Detect Covered Defense Information | true |
| detect_cti | Detect Controlled Technical Information | true |
| confidence_threshold | Minimum scanner confidence (0.0-1.0) | 0.7 |
Context Detection
The CUI policy uses context signals to determine CUI category and severity:
Government Context Keywords
- Federal terms: federal, government, agency, DoD, contractor, grantee
- Contract terms: contract, DFARS, FAR, solicitation, RFP, task order, IDIQ
- Classification: controlled, unclassified, FOUO, CUI, SBU, LES
- Document types: statement of work, SOW, PWS, CDRL, DD254
Defense Context Keywords
- Defense terms: defense, military, DoD, Pentagon, armed forces, warfighter
- Contractor terms: prime, subcontractor, DIB, defense industrial base, CAGE code
- Technical terms: technical data, specifications, engineering, schematics, drawings
- Programs: ACAT, PEO, PM, acquisition, milestone
CUI Marking Patterns
The policy detects standard CUI banner and footer markings:
- Banner formats: CUI, CONTROLLED, CUI//SP-*
- Specified markings: CUI//SP-EXPT, CUI//SP-CTI, CUI//SP-PRVCY
- Legacy markings: FOUO, SBU, LES (mapped to CUI categories)
- Distribution statements: Distribution A-F, EXPORT CONTROLLED
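The marking formats listed above can be approximated with a few regular expressions. This is a rough sketch, not the cui_marking scanner itself. The patterns and the `classify_marking` helper are assumptions, and a production scanner would need stricter rules; for example, the bare word CONTROLLED also appears in "EXPORT CONTROLLED".

```python
import re
from typing import Optional

SPECIFIED = re.compile(r"\bCUI//SP-[A-Z]+\b")   # e.g. CUI//SP-CTI
BASIC = re.compile(r"\b(?:CUI|CONTROLLED)\b")    # banner without SP-* suffix
LEGACY = re.compile(r"\b(?:FOUO|SBU|LES)\b")     # pre-CUI markings

def classify_marking(line: str) -> Optional[str]:
    """Return 'specified', 'basic', 'legacy', or None for a line of text."""
    # Check the most specific pattern first, since "CUI//SP-CTI" also contains "CUI".
    if SPECIFIED.search(line):
        return "specified"
    if BASIC.search(line):
        return "basic"
    if LEGACY.search(line):
        return "legacy"
    return None
```

Matching is case-sensitive on purpose, because banner markings are written in capitals.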
Source Code Context
CUI in development environments receives elevated severity:
- Repository indicators: .git, src/, lib/, include/
- Code file extensions: .c, .cpp, .h, .py, .java, .rs
- Build systems: Makefile, CMakeLists.txt, Cargo.toml
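The source-code indicators above can be combined into a simple path check. This is an illustrative sketch with a hypothetical function name; it treats any one indicator as sufficient, which is an assumption and will also match system directories such as /usr/lib.

```python
from pathlib import PurePosixPath

REPO_DIRS = {".git", "src", "lib", "include"}
CODE_EXTS = {".c", ".cpp", ".h", ".py", ".java", ".rs"}
BUILD_FILES = {"Makefile", "CMakeLists.txt", "Cargo.toml"}

def in_source_code_context(path: str) -> bool:
    """Return True if the path looks like part of a source tree."""
    p = PurePosixPath(path)
    return (
        p.suffix in CODE_EXTS
        or p.name in BUILD_FILES
        # Only directory components count as repository indicators.
        or any(part in REPO_DIRS for part in p.parts[:-1])
    )
```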
Example Context Flow
Finding: SSN "123-45-6789"
Context: "DFARS contractor employee records for contract W911NF-20-C-0001"
Result: Severity elevated to Critical (CDI context - DFARS contract number)
Finding: CUI marking "CUI//SP-CTI"
Context: Found in file.cpp within git repository
Result: Critical violation (CUI spillage into source code)
Violation Metadata
Each CUI violation includes:
{
"policy": "CUI",
"severity": "critical",
"cui_category": "cdi",
"nist_control": "3.8.1",
"regulation": "DFARS 252.204-7012"
}
Compliance Reporting
Query CUI Exposures
-- All CUI exposures
SELECT path, scanner, severity, timestamp
FROM aquilon_dlp_alerts
WHERE policy = 'CUI'
ORDER BY timestamp DESC;
Defense Contract Compliance
For DFARS compliance reporting:
-- CDI and CTI exposures (DFARS 252.204-7012)
SELECT path, scanner, severity, timestamp
FROM aquilon_dlp_alerts
WHERE policy = 'CUI'
AND severity = 'critical';
Best Practices
Monitoring Strategy
- Alert on Critical immediately: CUI markings, export control, CDI/CTI
- Daily review of High: FCI, government identifiers
- Weekly audit: PII in government context
CUI Category Prioritization
Different CUI categories require different response times:
Immediate Response (< 1 hour):
- CUI//SP-* (Specified CUI with additional safeguards)
- CDI (Covered Defense Information under DFARS)
- CTI (Controlled Technical Information)
- Export-controlled data (ITAR/EAR)
Same-Day Response:
- Basic CUI markings
- FCI (Federal Contract Information)
- Government credentials/API keys
Weekly Review:
- PII with government context (no explicit CUI marking)
- Legacy markings (FOUO, SBU) requiring reclassification
Spillage Response Procedures
When CUI is detected outside authorized boundaries:
- Contain: Immediately restrict access to the file/location
- Preserve: Do not delete - preserve for incident investigation
- Notify: Alert your Facility Security Officer (FSO) or ISSO
- Document: Record in incident tracking system
- Assess: Determine if spillage constitutes a reportable incident
- Remediate: Securely delete or move to authorized storage
- Report: Update SPRS score if control failure identified
NIST SP 800-171 Assessment Support
Use Aquilon DLP findings to support your NIST assessment:
- 3.1.1/3.1.2 (Access Control): CUI exposure that indicates unauthorized access
- 3.8.1 (Media Protection): CUI on unprotected storage locations
- 3.8.2 (Media Access): CUI accessible to unauthorized users
- 3.13.1 (Boundary Protection): CUI spillage outside authorization boundary
Remediation Workflow
- Identify: Aquilon DLP detects CUI exposure
- Classify: Determine CUI category (Basic, Specified, CDI, CTI)
- Assess: Evaluate spillage scope and potential impact
- Contain: Move to authorized storage or encrypt
- Document: Record for NIST SP 800-171 assessment
- Report: Include in SPRS score if applicable
- Prevent: Implement controls to prevent recurrence
Related Resources
- Compliance Overview
- CMMC - DoD certification requirements
- Configuration Guide
CMMC Compliance
Note: CMMC policy framework requires Enterprise Edition.
The Cybersecurity Maturity Model Certification (CMMC) policy framework helps 350,000+ Defense Industrial Base (DIB) contractors achieve CMMC compliance for DoD contract eligibility.
Overview
CMMC 2.0 establishes cybersecurity requirements for organizations handling Federal Contract Information (FCI) and Controlled Unclassified Information (CUI) in defense contracts. Aquilon DLP’s CMMC policy helps contractors comply with:
- FAR 52.204-21 - Basic Safeguarding of Covered Contractor Information Systems
- DFARS 252.204-7012 - Safeguarding Covered Defense Information
- DFARS 252.204-7019 - Notice of NIST SP 800-171 Assessment
- DFARS 252.204-7020 - NIST SP 800-171 DoD Assessment
CMMC Levels
| Level | Data Types | Controls | Assessment |
|---|---|---|---|
| Level 1 | FCI only | 17 practices | Self-assessment |
| Level 2 | FCI + CUI | 110 practices (NIST SP 800-171) | Self or third-party |
| Level 3 | FCI + CUI + Enhanced | 110+ practices (includes SP 800-172) | Government-led |
Detection Methods
Government-Specific Scanners
| Scanner | Detects | CMMC Level | Severity |
|---|---|---|---|
| cui_marking | CUI banners, markings | 2+ | Critical |
| export_control | ITAR, EAR, ECCN markings | 2+ | Critical |
| gov_identifier | DoD EDI-PI identifiers | All | High |
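The CMMC Level column in the table above can be read as a minimum level per scanner. The sketch below encodes that reading; `scanners_for_level` is a hypothetical helper, and treating "All" as a minimum of Level 1 is an interpretation of the table, not documented behavior.

```python
# Minimum CMMC level per scanner, read from the table above ("All" -> 1).
MIN_LEVEL = {"cui_marking": 2, "export_control": 2, "gov_identifier": 1}

def scanners_for_level(level: int) -> list:
    """Return the government-specific scanners relevant at a CMMC level."""
    if level not in (1, 2, 3):
        raise ValueError("CMMC level must be 1, 2, or 3")
    return sorted(name for name, minimum in MIN_LEVEL.items() if level >= minimum)
```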
PII Relevant to Defense Contracts
| Scanner | Relevance | Severity |
|---|---|---|
| ssn | Employee/subcontractor PII | Critical |
| email | Government communications | Medium |
| api_key | System credentials | Critical |
| crypto | Encryption keys | Critical |
| bank_account | Contract payment data | High |
Configuration
Basic Configuration (Level 2)
[policies]
enabled_policies = ["cmmc"]
Level-Specific Configuration
[policies.policy_configs.cmmc]
settings = { level = "2", confidence_threshold = "0.7" }
Configuration Options
| Option | Description | Default |
|---|---|---|
| level | CMMC level (1, 2, or 3) | 2 |
| detect_cui_markings | Detect CUI banners/markings | true |
| detect_export_control | Detect ITAR/EAR markings | true |
| detect_pii | Detect PII in defense context | true |
| detect_credentials | Detect API keys, database strings | true |
| confidence_threshold | Minimum scanner confidence (0.0-1.0) | 0.7 |
Context Detection
Defense Industrial Base Context
- Contract terms: prime contractor, subcontractor, DIB, defense contract, teaming agreement
- DoD terms: DoD, Department of Defense, Pentagon, armed forces, military branch names
- Program terms: CAGE code, DUNS, SAM registration, UEI, contract number (W/N prefixes)
- Roles: contracting officer, COR, COTR, program manager, DCMA
Technical Context
- Technical data: engineering drawings, specifications, schematics, BOMs, ICDs
- Export control: ITAR, EAR, ECCN, defense article, USML category
- System terms: CDS, cross-domain, classified system, enclave, authorization boundary
- Development: source code, firmware, software, algorithm, design document
Contract Vehicle Context
Different contract types affect CMMC applicability:
- Prime contracts: Direct DoD contracts requiring flow-down
- Subcontracts: DFARS flow-down requirements apply
- SBIR/STTR: Small business innovation research with CUI potential
- GSA Schedule: May include DoD task orders
- OTA: Other Transaction Agreements with DoD
Supply Chain Context
Multi-tier supply chain indicators:
- Tier references: Tier 1, Tier 2, subcontractor, supplier
- Flow-down terms: DFARS flow-down, 252.204-7012, prime requirements
- Assessment references: SPRS, NIST assessment, POA&M, SSP
Example Context Flow
Finding: Database connection string with credentials
Context: "DFARS contract W52P1J-21-C-0045 subcontractor portal"
Result: Critical violation (CMMC Level 2 - credentials in defense contract context)
Finding: Technical drawing (.dwg file)
Context: File metadata contains "CAGE: 1ABC2" and "ECCN: 9A515"
Result: Critical violation (export-controlled technical data)
Violation Metadata
Each CMMC violation includes:
{
"policy": "CMMC",
"severity": "critical",
"cmmc_level": 2,
"data_type": "cui",
"dfars_clause": "DFARS 252.204-7012",
"sprs_relevant": true
}
Compliance Reporting
SPRS Score Support
Query findings that may affect your Supplier Performance Risk System (SPRS) score:
-- All CMMC findings by severity
SELECT severity, COUNT(*) as count
FROM aquilon_dlp_alerts
WHERE policy = 'CMMC'
GROUP BY severity
ORDER BY count DESC;
Pre-Assessment Audit
Before a CMMC assessment:
-- Critical CUI exposures requiring remediation
SELECT path, scanner, timestamp
FROM aquilon_dlp_alerts
WHERE policy = 'CMMC'
AND severity = 'critical'
ORDER BY timestamp DESC;
Best Practices
By CMMC Level
Level 1 (FCI only):
- Focus on basic PII protection
- Monitor for accidental data spillage
- Self-assessment annually with affirmation
Level 2 (CUI):
- Enable all CUI detection settings
- Implement continuous monitoring
- Document findings for POA&M
- Prepare for third-party assessment (C3PAO)
Level 3 (Enhanced):
- Strict alerting on any detection
- Integration with SIEM
- Real-time incident response
- Government-led assessment preparation
SPRS Score Impact Assessment
Aquilon DLP findings can identify gaps affecting your SPRS score:
Score-Impacting Findings:
- Unencrypted CUI storage → impacts AC.L2-3.1.19 (-5 points)
- Credentials in plaintext → impacts IA.L2-3.5.10 (-5 points)
- Missing access controls → impacts AC.L2-3.1.1 (-5 points)
Using DLP for SPRS Improvement:
- Query critical findings to identify control gaps
- Map findings to NIST SP 800-171 controls
- Document remediation in POA&M
- Re-scan to verify remediation
- Update SPRS score with improved controls
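As a worked example of the score impact, the DoD assessment methodology starts from a maximum of 110 and subtracts a weighted value for each requirement that is not implemented. The sketch below applies that arithmetic to the three practices listed under Score-Impacting Findings. The point values are copied from that list and should be checked against the current DoD methodology; `estimate_sprs` is a hypothetical helper, not an Aquilon DLP feature.

```python
MAX_SCORE = 110

# Deductions as given in the list above; verify against the DoD methodology.
DEDUCTIONS = {
    "AC.L2-3.1.19": 5,
    "IA.L2-3.5.10": 5,
    "AC.L2-3.1.1": 5,
}

def estimate_sprs(failed_practices: list) -> int:
    """Estimate a score from the practices that findings show as unmet."""
    # Each practice is deducted once, however many findings map to it.
    unique = set(failed_practices)
    return MAX_SCORE - sum(DEDUCTIONS[p] for p in unique)
```

For example, plaintext credentials found in ten files still count as a single unmet practice (IA.L2-3.5.10), giving an estimate of 105.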
Level-Based Remediation Priorities
Level 1 Remediation Focus:
- Remove FCI from unauthorized locations
- Ensure basic access controls on FCI systems
- Document FCI boundaries
Level 2 Remediation Focus:
- Eliminate CUI spillage outside enclave
- Implement encryption for CUI at rest and in transit
- Remove hardcoded credentials from CUI systems
- Document in System Security Plan (SSP)
Level 3 Remediation Focus:
- Zero tolerance for any critical findings
- Implement advanced threat detection
- Enhanced logging and monitoring
- Prepare for government assessment evidence
Assessment Preparation
- Inventory: Use Aquilon to discover where CUI resides
- Categorize: Map findings to CMMC practice requirements
- Scope: Define assessment boundary using DLP data
- Remediate: Address critical exposures before assessment
- Document: Export findings for POA&M evidence
- Evidence: Generate compliance reports for assessors
- Monitor: Maintain continuous compliance post-assessment
Related Resources
- Compliance Overview
- CUI - NIST SP 800-171 details
- Configuration Guide
FedRAMP Compliance
Note: FedRAMP policy framework requires Enterprise Edition.
The Federal Risk and Authorization Management Program (FedRAMP) policy framework helps cloud service providers (CSPs) protect federal data and achieve FedRAMP authorization.
Overview
FedRAMP provides a standardized approach to security assessment for cloud products and services used by federal agencies. Aquilon DLP’s FedRAMP policy implements NIST SP 800-53 control families:
- AC - Access Control: System access and authorization
- AU - Audit and Accountability: Audit logging and review
- IA - Identification and Authentication: User/device identification
- MP - Media Protection: Digital and physical media safeguards
- SC - System and Communications Protection: Communication security
- SI - System and Information Integrity: Integrity protection
FedRAMP Baselines
| Baseline | Impact Level | Controls | Use Cases |
|---|---|---|---|
| Low | Low impact | ~125 | Public-facing sites, low-sensitivity data |
| Moderate | Moderate impact | ~325 | Most federal applications, PII |
| High | High impact | ~421 | Law enforcement, emergency services, financial |
Detection Methods
Cloud-Specific Scanners
| Scanner | Detects | Severity |
|---|---|---|
| cui_marking | CUI in cloud storage | Critical |
| api_key | Cloud service credentials | Critical |
| database_connection | Database connection strings | Critical |
| crypto | Encryption keys | Critical |
Federal Data in Cloud Context
| Scanner | Cloud Context Required | Severity |
|---|---|---|
| ssn | Multi-tenant cloud environment | Critical |
| email | .gov domain or federal agency | Medium |
| ip_address | Federal network ranges | High |
| gov_identifier | DoD EDI-PI in cloud systems | High |
Configuration
Basic Configuration (Moderate Baseline)
[policies]
enabled_policies = ["fedramp"]
Baseline-Specific Configuration
[policies.policy_configs.fedramp]
settings = { baseline = "moderate", confidence_threshold = "0.7" }
Configuration Options
| Option | Description | Default |
|---|---|---|
| baseline | FedRAMP baseline (low, moderate, high) | moderate |
| detect_cui | Detect CUI in cloud storage | true |
| detect_credentials | Detect API keys, connection strings | true |
| detect_pii | Detect PII in multi-tenant environments | true |
| confidence_threshold | Minimum scanner confidence (0.0-1.0) | 0.7 |
Context Detection
Cloud Context Keywords
- Cloud terms: cloud, SaaS, IaaS, PaaS, tenant, multi-tenant, serverless
- Provider terms: AWS, Azure, GCP, FedRAMP authorized, GovCloud, Azure Government
- Service terms: API, endpoint, webhook, microservice, Lambda, Functions
- Storage: S3, Blob, object storage, bucket, container registry
Federal Agency Context
- Agency terms: federal, agency, government, GSA, FedRAMP PMO
- Authorization terms: ATO, authorization, JAB, P-ATO, agency ATO
- Compliance terms: continuous monitoring, ConMon, POA&M, 3PAO, SSP
Cloud Infrastructure Context
Multi-tenant and infrastructure indicators:
- Tenant isolation: tenant ID, account ID, subscription, organization
- Network: VPC, VNET, security group, NSG, firewall rules
- Identity: IAM, RBAC, service principal, managed identity
- Secrets: Key Vault, Secrets Manager, Parameter Store
Authorization Boundary Context
FedRAMP authorization boundaries require clear data classification:
- Boundary terms: authorization boundary, system boundary, enclave
- Data flow: ingress, egress, data flow diagram, DFD
- Interconnection: ISA, MOU, interconnection security agreement
- External: external system, third-party, SaaS integration
Example Context Flow
Finding: API key "AKIA..." in configuration file
Context: "AWS GovCloud deployment for agency.gov"
Result: Critical violation (cloud credentials in federal context)
Finding: SSN in database export
Context: "Multi-tenant SaaS platform, FedRAMP Moderate ATO"
Result: Critical violation (PII in shared cloud environment)
Violation Metadata
Each FedRAMP violation includes:
{
"policy": "FedRAMP",
"severity": "critical",
"baseline": "moderate",
"nist_control": "SC-28",
"control_family": "System and Communications Protection"
}
Compliance Reporting
Authorization Boundary Monitoring
-- All FedRAMP findings
SELECT severity, scanner, COUNT(*) as count
FROM aquilon_dlp_alerts
WHERE policy = 'FedRAMP'
GROUP BY severity, scanner
ORDER BY count DESC;
Continuous Monitoring Support
FedRAMP requires continuous monitoring (ConMon). Query for recent issues:
-- Last 30 days of FedRAMP findings
SELECT path, scanner, severity, timestamp
FROM aquilon_dlp_alerts
WHERE policy = 'FedRAMP'
AND timestamp > datetime('now', '-30 days')
ORDER BY timestamp DESC;
Best Practices
By Baseline
Low Baseline:
- Monitor for basic data exposure
- Focus on API key and credential leaks
- Annual assessment with ConMon
Moderate Baseline:
- Enable full PII detection
- Monitor CUI in cloud storage
- Integrate with SIEM for ConMon
- Monthly vulnerability scanning integration
High Baseline:
- Strict alerting on any detection
- Real-time incident response integration
- Enhanced audit logging
- Weekly vulnerability correlation
Baseline-Specific Remediation
Low Baseline Remediation:
- Remove exposed credentials from repositories
- Rotate any detected API keys
- Document in POA&M if not immediately remediable
Moderate Baseline Remediation:
- Encrypt PII at rest and in transit
- Implement tenant isolation for sensitive data
- Remove CUI from unauthorized storage locations
- Enable audit logging for all access
- Update SSP with control implementations
High Baseline Remediation:
- Zero tolerance - immediate remediation required
- Incident response activation for any critical finding
- Document in 24-hour significant change report
- Review authorization boundary for spillage
ConMon Integration Patterns
Integrate DLP findings into your Continuous Monitoring program:
Daily Operations:
- Query critical findings for immediate response
- Correlate with vulnerability scan results
- Update incident tracking system
Monthly Reporting:
- Generate finding trends for ConMon report
- Map findings to NIST SP 800-53 controls
- Update POA&M with remediation progress
Annual Assessment:
- Export historical findings for 3PAO review
- Demonstrate control effectiveness
- Support reauthorization evidence
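For the monthly reporting step, finding trends can be derived by counting exported findings per month and severity. The sketch assumes findings have been exported as dictionaries with an ISO 8601 `timestamp` string and a `severity` field; both the export shape and the `monthly_trend` helper are hypothetical.

```python
from collections import Counter

def monthly_trend(findings: list) -> dict:
    """Count findings per (YYYY-MM, severity) pair, sorted by month."""
    counts = Counter(
        # The first seven characters of an ISO 8601 timestamp are YYYY-MM.
        (f["timestamp"][:7], f["severity"])
        for f in findings
    )
    return dict(sorted(counts.items()))
```

The resulting counts can be placed directly into a ConMon report table or compared month over month.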
Authorization Maintenance
- Discover: Use Aquilon to identify sensitive data in cloud boundaries
- Classify: Map findings to NIST SP 800-53 controls
- Scope: Validate authorization boundary accuracy
- Remediate: Address findings before assessment
- Report: Include DLP findings in ConMon reports
- Evidence: Generate assessment-ready reports
- Maintain: Continuous monitoring for authorization renewal
Related Resources
- Compliance Overview
- CUI - CUI detection details
- FISMA - Related federal requirements
- Configuration Guide
FISMA Compliance
Note: FISMA policy framework requires Enterprise Edition.
The Federal Information Security Modernization Act (FISMA) policy framework helps federal agencies and contractors protect federal information systems according to NIST guidelines.
Overview
FISMA requires federal agencies to develop, document, and implement information security programs. Aquilon DLP’s FISMA policy implements FIPS 199 categorization and NIST SP 800-53 controls:
- AC - Access Control: Limit system access to authorized users
- AU - Audit and Accountability: Create and retain audit records
- IA - Identification and Authentication: Identify users and devices
- MP - Media Protection: Protect digital and physical media
- SC - System and Communications Protection: Protect communications
- SI - System and Information Integrity: Protect information integrity
FIPS 199 Impact Levels
| Impact Level | Confidentiality | Description | Controls |
|---|---|---|---|
| Low | Limited adverse effect | Public-facing systems | ~127 |
| Moderate | Serious adverse effect | Most agency systems | ~325 |
| High | Severe/catastrophic effect | National security, financial | ~421 |
Detection Methods
Federal System Scanners
| Scanner | Detects | Severity |
|---|---|---|
| cui_marking | CUI in federal systems | Critical |
| gov_identifier | DoD EDI-PI, federal IDs | High |
| export_control | ITAR/EAR controlled data | Critical |
PII in Federal Context
| Scanner | Federal Context Required | Severity |
|---|---|---|
| ssn | Federal employee/citizen records | Critical |
| email | .gov/.mil communications | Medium |
| address | Federal facility addresses | Medium |
| date_of_birth | Personnel records | High |
| api_key | Federal system credentials | Critical |
Configuration
Basic Configuration (Moderate Impact)
[policies]
enabled_policies = ["fisma"]
Impact Level Configuration
[policies.policy_configs.fisma]
settings = { impact_level = "moderate", confidence_threshold = "0.7" }
Configuration Options
| Option | Description | Default |
|---|---|---|
| impact_level | FIPS 199 impact level (low, moderate, high) | moderate |
| detect_cui | Detect CUI markings | true |
| detect_pii | Detect PII in federal context | true |
| detect_credentials | Detect system credentials | true |
| confidence_threshold | Minimum scanner confidence (0.0-1.0) | 0.7 |
Context Detection
Federal Agency Context
- Agency terms: federal, agency, government, bureau, department, administration
- Specific agencies: DoD, VA, HHS, DHS, Treasury, DOJ, DOE, NASA, USDA
- System terms: FISMA, ATO, authorization boundary, system owner, ISSO, ISSM
- Roles: authorizing official, AO, system security officer, privacy officer
Contractor Context
- Contractor terms: contractor, grantee, subrecipient, awardee
- Contract terms: FAR, DFARS, task order, contract vehicle, BPA, IDIQ
- Compliance: NIST, RMF, POA&M, SSP, SAR, CAP
- Oversight: DCAA, OIG, GAO, inspector general
State/Local Government
- Terms: state, county, municipal, local government, tribal
- Programs: grants.gov, federal funding, pass-through, SLFRF
- Compliance: Single Audit, 2 CFR 200, Uniform Guidance
System Categorization Context
FIPS 199 categorization indicators:
- Impact terms: confidentiality, integrity, availability, CIA
- Levels: low impact, moderate impact, high impact
- Categories: national security, PII, financial, law enforcement
- Documents: system security plan, SSP, contingency plan, BIA
Personnel Context
Federal personnel data receives elevated severity:
- HR terms: SF-86, OPM, personnel file, background investigation
- Clearance: security clearance, TS, SCI, Q clearance, L clearance
- Benefits: FEHB, TSP, retirement, pension, FERS, CSRS
- Records: eOPF, employee record, personnel action, SF-50
Example Context Flow
Finding: SSN "123-45-6789"
Context: "OPM background investigation SF-86 supplemental"
Result: Critical violation (PII in federal personnel context - high sensitivity)
Finding: Email list with .gov addresses
Context: "DHS employee directory for FISMA-moderate system"
Result: High violation (federal employee PII requiring protection)
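The two flows above can be approximated in a few lines. This is an illustrative sketch only: the actual elevation rules are not documented here, so the "raise severity by one level when any federal term matches" rule, the `elevate` function, and the reduced term list are all assumptions chosen to reproduce the two examples.

```python
import re

SEVERITY_ORDER = ["low", "medium", "high", "critical"]

# A small subset of the context terms listed in this section.
FEDERAL_TERMS = ["DHS", "DoD", "OPM", "SF-86", "FISMA", "background investigation"]


def matched_terms(context):
    """Return the terms found in the context as whole words (case-insensitive)."""
    found = []
    for term in FEDERAL_TERMS:
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, context, re.IGNORECASE):
            found.append(term)
    return found


def elevate(base_severity, context):
    """Assumed rule: bump severity one level on any match, capped at critical."""
    index = SEVERITY_ORDER.index(base_severity)
    if matched_terms(context):
        index = min(index + 1, len(SEVERITY_ORDER) - 1)
    return SEVERITY_ORDER[index]


if __name__ == "__main__":
    print(elevate("critical", "OPM background investigation SF-86 supplemental"))
    print(elevate("medium", "DHS employee directory for FISMA-moderate system"))
```

Whole-word matching matters here: short terms such as VA or AO would otherwise match inside unrelated words.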
Violation Metadata
Each FISMA violation includes:
{
"policy": "FISMA",
"severity": "critical",
"fips_199_level": "moderate",
"nist_control": "AC-3",
"control_family": "Access Control",
"rmf_step": "assess"
}
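When post-processing violation metadata, the control family can be derived from the control identifier. The sketch below assumes the metadata arrives as a JSON string shaped like the sample above; `control_family` is a hypothetical helper, and the lookup table covers only the families named in this section.

```python
import json

# Family codes and names as used in this section (not the full NIST catalog).
FAMILY_NAMES = {
    "AC": "Access Control",
    "MP": "Media Protection",
    "SC": "System and Communications Protection",
    "SI": "System and Information Integrity",
}


def control_family(nist_control):
    """Split an identifier such as 'AC-3' into (family code, family name)."""
    code = nist_control.split("-", 1)[0].upper()
    return code, FAMILY_NAMES.get(code, "unknown")


sample = json.loads("""
{
  "policy": "FISMA",
  "severity": "critical",
  "fips_199_level": "moderate",
  "nist_control": "AC-3",
  "control_family": "Access Control",
  "rmf_step": "assess"
}
""")

code, name = control_family(sample["nist_control"])

if __name__ == "__main__":
    print(code, name, name == sample["control_family"])
```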
Compliance Reporting
FISMA Metrics
-- FISMA findings by severity for reporting
SELECT severity, COUNT(*) as count
FROM aquilon_dlp_alerts
WHERE policy = 'FISMA'
GROUP BY severity
ORDER BY count DESC;
POA&M Support
Query findings for Plan of Action and Milestones:
-- Critical findings for POA&M
SELECT path, scanner, severity, timestamp
FROM aquilon_dlp_alerts
WHERE policy = 'FISMA'
AND severity IN ('critical', 'high')
ORDER BY timestamp DESC;
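The two reporting queries can be tried offline against a mock table. In the sketch below, the in-memory SQLite table only imitates `aquilon_dlp_alerts`: the column types and the sample rows are invented for illustration, and the real table is served by the osquery extension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Mock schema: only the columns used by the queries; types are assumptions.
conn.execute(
    "CREATE TABLE aquilon_dlp_alerts ("
    "path TEXT, scanner TEXT, severity TEXT, policy TEXT, timestamp INTEGER)"
)
conn.executemany(
    "INSERT INTO aquilon_dlp_alerts VALUES (?, ?, ?, ?, ?)",
    [
        ("/data/hr/roster.csv", "ssn", "critical", "FISMA", 1700000300),
        ("/data/hr/contacts.txt", "email", "medium", "FISMA", 1700000200),
        ("/data/eng/notes.md", "api_key", "critical", "FISMA", 1700000100),
        ("/data/eu/customers.csv", "email", "medium", "GDPR", 1700000000),
    ],
)

# FISMA findings by severity.
by_severity = conn.execute(
    "SELECT severity, COUNT(*) AS count FROM aquilon_dlp_alerts "
    "WHERE policy = 'FISMA' GROUP BY severity ORDER BY count DESC"
).fetchall()

# Critical and high findings for POA&M, newest first.
poam = conn.execute(
    "SELECT path, scanner, severity, timestamp FROM aquilon_dlp_alerts "
    "WHERE policy = 'FISMA' AND severity IN ('critical', 'high') "
    "ORDER BY timestamp DESC"
).fetchall()

if __name__ == "__main__":
    print(by_severity)
    for row in poam:
        print(row)
```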
Best Practices
By Impact Level
Low Impact:
- Monitor for basic data exposure
- Focus on public-facing system boundaries
- Annual assessment cycle
Moderate Impact:
- Enable full PII detection
- Monitor for CUI spillage
- Regular compliance reporting
- Quarterly POA&M updates
High Impact:
- Strict alerting on any detection
- Real-time incident response
- Enhanced audit trail integration
- Weekly security status reporting
RMF Step-by-Step Integration
Step 1 - Categorize:
Use Aquilon DLP to support system categorization:
- Discover data types stored and processed
- Identify PII, CUI, and sensitive data locations
- Document information types for FIPS 199 assessment
- Generate evidence for categorization decision
Step 2 - Select:
Map DLP findings to control requirements:
- AC-3 (Access Enforcement): Unauthorized access detection
- MP-2 (Media Access): Sensitive data on removable media
- SC-28 (Protection of Information at Rest): Unencrypted sensitive data
- SI-4 (Information System Monitoring): DLP as monitoring control
Step 3 - Implement:
Deploy DLP as part of control implementation:
- Configure policies matching system impact level
- Integrate alerts with security operations
- Document DLP coverage in SSP
Step 4 - Assess:
Use findings for control assessment:
- Generate reports for Security Assessment Report (SAR)
- Provide evidence of control effectiveness
- Document findings requiring POA&M entries
Step 5 - Authorize:
Include DLP in authorization package:
- Control implementation evidence
- Monitoring capability documentation
- Risk acceptance for any open findings
Step 6 - Monitor:
Continuous monitoring with Aquilon:
- Ongoing detection of new exposures
- Trend analysis for ISCM reporting
- POA&M remediation verification
ATO Package Documentation
Generate DLP reports for authorization packages:
Required Documentation:
- System boundary sensitive data inventory
- Control implementation evidence (AC, MP, SC, SI families)
- Monitoring capability description
- Incident detection and response integration
Assessment Evidence:
- Historical finding trends
- Remediation timelines
- False positive rates and tuning
Annual FISMA Reporting
Aquilon findings support FISMA metrics including:
- Number of systems with sensitive PII
- Data spillage incidents
- Remediation timelines
- Control effectiveness measures
- CIO FISMA metrics support
Related Resources
- Compliance Overview
- CUI - CUI detection details
- FedRAMP - Cloud service authorization
- Configuration Guide
Troubleshooting
Common issues and solutions for Aquilon DLP.
Installation Issues
“osquery not found” during installation
Aquilon DLP requires osquery 5.0.1 or later. Install osquery first:
macOS:
# Using Homebrew
brew install --cask osquery
# Or download PKG from https://github.com/osquery/osquery/releases
Ubuntu/Debian:
wget https://pkg.osquery.io/deb/osquery_5.10.2-1.linux_amd64.deb
sudo apt install ./osquery_5.10.2-1.linux_amd64.deb
CentOS/RHEL:
wget https://pkg.osquery.io/rpm/osquery-5.10.2-1.linux.x86_64.rpm
sudo dnf install ./osquery-5.10.2-1.linux.x86_64.rpm
Verify installation:
osqueryd --version
# Expected: osqueryd version 5.0.1 or later
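If the version check is scripted, compare version components numerically rather than as strings; a plain string comparison would rank 5.10.2 below 5.9.0. The sketch below assumes the command output contains a dotted three-part version, as in the expected output above; `parse_version` and `is_supported` are hypothetical helper names.

```python
import re

MINIMUM = (5, 0, 1)  # minimum osquery version stated above


def parse_version(output):
    """Extract the first X.Y.Z version found in the text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"no version found in: {output!r}")
    return tuple(int(part) for part in match.groups())


def is_supported(output):
    """Tuples compare element by element, so (5, 10, 2) > (5, 9, 0)."""
    return parse_version(output) >= MINIMUM


if __name__ == "__main__":
    print(is_supported("osqueryd version 5.10.2"))
    print(is_supported("osqueryd version 4.9.0"))
```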
“Unsupported osquery version”
Upgrade osquery to 5.0.1 or later from osquery releases.
“Signature verification failed” (macOS)
The PKG may be corrupted. Re-download from the official source and verify:
spctl -a -v aquilon-dlp-enterprise.pkg
“Installation already in progress” (macOS)
Another installation is running. If a previous installation crashed:
sudo rm -rf /var/run/aquilon-install.lock
macOS Issues
Full Disk Access Not Granted
Aquilon DLP requires Full Disk Access for file monitoring.
Diagnosis:
# Check TCC database
sudo sqlite3 /Library/Application\ Support/com.apple.TCC/TCC.db \
"SELECT auth_value FROM access
WHERE service = 'kTCCServiceSystemPolicyAllFiles'
AND client = 'dev.aquilon.dlp-plugin';"
# Expected: 2 (granted)
Solution:
- Open System Settings > Privacy & Security > Full Disk Access
- Click the lock icon and authenticate
- Click + and navigate to /opt/aquilon/aquilon-dlp.app
- Ensure the toggle is enabled
- Restart the service
For MDM deployments: Deploy a PPPC (Privacy Preferences Policy Control) profile. See MDM Deployment.
Extension Not Loading in osquery
Diagnosis:
# Check extension registered
cat /var/osquery/extensions.load
# Check osquery sees extension
osqueryi --connect /var/osquery/osquery.sock 'SELECT * FROM osquery_extensions;'
Solutions:
- Verify Full Disk Access (see above)
- Restart osqueryd:
sudo launchctl unload /Library/LaunchDaemons/io.osquery.agent.plist
sudo launchctl load /Library/LaunchDaemons/io.osquery.agent.plist
Note: OSQuery 5.0.1+ uses io.osquery.agent.plist. Older versions use com.facebook.osqueryd.plist.
- Check logs:
tail -f /var/log/aquilon/aquilon-dlp.log
“Unsupported macOS version”
Aquilon DLP requires macOS 11.0 (Big Sur) or later:
sw_vers -productVersion
Linux Issues
Extension Not Loading
Diagnosis:
# Check extension registered
cat /etc/osquery/extensions.load
# Check osquery status
sudo systemctl status osqueryd
# Check logs
journalctl -u osqueryd -f
Solutions:
- Restart osqueryd:
sudo systemctl restart osqueryd
- Check extension permissions:
ls -la /usr/lib/osquery/extensions/aquilon-dlp-*.ext
# Should be: -rwxr-xr-x root root
SELinux Blocking Access (RHEL/CentOS)
Diagnosis:
# Check for SELinux denials
sudo ausearch -m avc -ts recent
# Check SELinux status
getenforce
Solution:
# Restore security contexts
sudo restorecon -Rv /usr/lib/osquery/extensions/
sudo restorecon -Rv /etc/aquilon/
Permission Denied Errors
Diagnosis:
ls -la /usr/lib/osquery/extensions/aquilon-dlp-*.ext
Solution:
sudo chmod 755 /usr/lib/osquery/extensions/aquilon-dlp-*.ext
sudo chown root:root /usr/lib/osquery/extensions/aquilon-dlp-*.ext
OSQuery Integration Issues
Table Not Found
If aquilon_dlp_alerts table is not available:
# Verify extension is loaded
osqueryi --connect /var/osquery/osquery.sock 'SELECT * FROM osquery_extensions WHERE name LIKE "%aquilon%";'
# Check table exists
osqueryi 'PRAGMA table_info(aquilon_dlp_alerts);'
If empty, the extension isn’t loaded. See platform-specific extension loading issues above.
SQL Query Errors
Common column name mistakes:
| Wrong | Correct |
|---|---|
| created_at | timestamp |
| policy_name | policy |
| pattern_matched | pattern |
| confidence_score | confidence |
| hash | JSON_EXTRACT(context, '$.file.hash') |
Correct query example:
SELECT path, scanner, severity, timestamp
FROM aquilon_dlp_alerts
ORDER BY timestamp DESC
LIMIT 10;
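Saved queries can be checked for the mistaken column names listed in the table above. This is a rough sketch using whole-word regex replacement; `suggest_fixes` and `apply_fixes` are hypothetical helpers. The `hash` entry is deliberately left out because a bare `hash` is too ambiguous to rewrite automatically.

```python
import re

# Wrong -> correct column names, taken from the table of common mistakes.
COLUMN_FIXES = {
    "created_at": "timestamp",
    "policy_name": "policy",
    "pattern_matched": "pattern",
    "confidence_score": "confidence",
}


def suggest_fixes(query):
    """List (wrong, correct) pairs for mistaken column names found in the query."""
    return [
        (wrong, right)
        for wrong, right in COLUMN_FIXES.items()
        if re.search(r"\b" + wrong + r"\b", query)
    ]


def apply_fixes(query):
    """Replace each mistaken column name with its correct form."""
    for wrong, right in COLUMN_FIXES.items():
        query = re.sub(r"\b" + wrong + r"\b", right, query)
    return query


if __name__ == "__main__":
    bad = "SELECT path, policy_name FROM aquilon_dlp_alerts ORDER BY created_at DESC;"
    print(suggest_fixes(bad))
    print(apply_fixes(bad))
```

The replacement is textual, so it would also rewrite these names inside string literals; review the result before running it.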
Configuration Issues
Configuration Not Loading
Diagnosis:
# Validate configuration
aquilon-dlp --validate-config /etc/aquilon/config.toml
Common errors:
- Unknown field: Check field names match exactly (e.g., max_scan_size_mb not max_file_size_mb)
- Unknown section: Use correct section names ([scan] not [scanner], [resource_limits] not [resources])
- Invalid TOML: Check syntax with a TOML validator
watch_paths Not Working
Ensure watch_paths is at the top level of your config, not under a section:
# CORRECT - at top level
watch_paths = ["/home/%%", "/var/data/%%"]
[policies]
enabled_policies = ["gdpr", "ccpa"]
# WRONG - under [policies] section
[policies]
enabled_policies = ["gdpr", "ccpa"]
watch_paths = ["/home/%%"] # This won't work!
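The placement rule follows from how TOML works: every key after a `[section]` header belongs to that section until the next header. The sketch below is a naive line scanner, not a TOML parser, that reports where `watch_paths` ends up. `find_watch_paths` is a hypothetical helper, and it assumes no `#` characters appear inside quoted strings.

```python
import re


def find_watch_paths(config_text):
    """Return 'top-level', the enclosing section name, or None if the key is absent."""
    current_section = None
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # naive comment stripping
        if not line:
            continue
        header = re.fullmatch(r"\[+\s*([^\]]+?)\s*\]+", line)
        if header:
            # Every key after this header belongs to this section.
            current_section = header.group(1)
            continue
        if re.match(r"watch_paths\s*=", line):
            return "top-level" if current_section is None else current_section
    return None


CORRECT = '''
watch_paths = ["/home/%%", "/var/data/%%"]
[policies]
enabled_policies = ["gdpr", "ccpa"]
'''

WRONG = '''
[policies]
enabled_policies = ["gdpr", "ccpa"]
watch_paths = ["/home/%%"]  # This won't work!
'''

if __name__ == "__main__":
    print(find_watch_paths(CORRECT))
    print(find_watch_paths(WRONG))
```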
Performance Issues
High CPU Usage
Diagnosis:
# Check process (macOS syntax; on Linux, use top -p instead of top -pid)
top -pid $(pgrep -f aquilon)
# Check alert volume
osqueryi "SELECT COUNT(*) FROM aquilon_dlp_alerts;"
Solutions:
- Add exclusions for high-churn directories:
exclude_paths = [
  "/home/*/.cache/%%",
  "/home/*/.npm/%%",
  "/var/log/%%",
  "**/*.iso",
  "**/*.dmg"
]
- Limit file size:
[scan]
max_scan_size_mb = 40
- Reduce workers:
[worker]
num_workers = 2
- Enable resource limits:
[resource_limits]
enabled = true
max_cpu_percent = 50.0
max_memory_mb = 512
High Memory Usage
Solutions:
- Reduce cache size:
[cache]
ttl_secs = 1800 # 30 minutes instead of longer
- Limit memory:
[resource_limits]
enabled = true
max_memory_mb = 256
No Alerts Generated
Diagnosis
-- Check for any alerts
SELECT COUNT(*) FROM aquilon_dlp_alerts;
-- Check recent alerts
SELECT * FROM aquilon_dlp_alerts
ORDER BY timestamp DESC
LIMIT 5;
Common Causes
- No sensitive data: Test with known sensitive data:
echo "SSN: 122-15-6289" > /tmp/test-sensitive.txt
- Policies not enabled: Check configuration:
[policies]
enabled_policies = ["gdpr", "ccpa", "hipaa"]
- Wrong watch paths: Ensure paths are monitored:
watch_paths = ["/home/%%", "/var/data/%%"]
- Exclusions too broad: Review exclude_paths
- Cache returning old results: Clear cache or wait for TTL expiry
Debugging Enrichment
When investigating false positives (legitimate data flagged as sensitive) or false negatives (sensitive data not detected), use context trace mode to understand enrichment decisions.
Enable Context Trace
Add to your configuration:
[context_trace]
enabled = true
Understanding Trace Output
With tracing enabled, the logs show each enrichment decision in JSON format:
{
"event": "context_enrichment",
"scanner": "ssn",
"original_confidence": 0.75,
"context_profiles_matched": ["personal_data", "employment"],
"adjustments": [
{"profile": "personal_data", "boost": 0.15, "reason": "SSN keyword in context"},
{"profile": "employment", "boost": 0.10, "reason": "W-2 form indicator"}
],
"final_confidence": 0.95
}
Key fields:
| Field | Description |
|---|---|
| original_confidence | Scanner’s base confidence before context analysis |
| context_profiles_matched | Which context profiles found relevant keywords |
| adjustments | Individual confidence boosts with reasoning |
| final_confidence | Result after all adjustments applied |
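A trace entry can be summarized with a short script when triaging many findings. The sketch below assumes each trace entry is a JSON object with the fields described above; `summarize` is a hypothetical helper. In the sample entry, the base confidence plus the listed boosts sums to 1.00 while the reported final_confidence is 0.95. The documentation does not explain the difference; a cap on the final value is one possible explanation, but that is an assumption, so the script reports both numbers rather than expecting them to match.

```python
import json

TRACE_ENTRY = """
{
  "event": "context_enrichment",
  "scanner": "ssn",
  "original_confidence": 0.75,
  "context_profiles_matched": ["personal_data", "employment"],
  "adjustments": [
    {"profile": "personal_data", "boost": 0.15, "reason": "SSN keyword in context"},
    {"profile": "employment", "boost": 0.10, "reason": "W-2 form indicator"}
  ],
  "final_confidence": 0.95
}
"""


def summarize(trace):
    """Report the raw sum of base confidence and boosts next to the logged final value."""
    adjustments = trace["adjustments"]
    raw_sum = trace["original_confidence"] + sum(a["boost"] for a in adjustments)
    largest = max(adjustments, key=lambda a: a["boost"]) if adjustments else None
    return {
        "scanner": trace["scanner"],
        "raw_sum": round(raw_sum, 2),
        "final": trace["final_confidence"],
        "largest_boost_profile": largest["profile"] if largest else None,
    }


summary = summarize(json.loads(TRACE_ENTRY))

if __name__ == "__main__":
    print(summary)
```

The profile with the largest boost is usually the first candidate to review when chasing a false positive.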
Common Debugging Scenarios
False Positive Investigation:
- Enable context trace
- Scan the file generating false positives
- Check which context profiles are boosting confidence
- Consider adjusting enabled_profiles in [context] config
False Negative Investigation:
- Enable context trace
- Scan the file that should generate alerts
- Look for low original_confidence (scanner issue) vs no adjustments (context issue)
- Consider enabling additional context profiles
Disable After Debugging
Context tracing generates significant log volume. Disable when done:
[context_trace]
enabled = false
Getting Help
Collecting Diagnostic Information
macOS:
# Collect logs
tail -n 500 /var/log/aquilon/aquilon-dlp.log > dlp-diagnostics.txt
# System info
sw_vers >> dlp-diagnostics.txt
# FDA status
sudo sqlite3 /Library/Application\ Support/com.apple.TCC/TCC.db \
"SELECT * FROM access WHERE client LIKE '%aquilon%';" >> dlp-diagnostics.txt
Linux:
# Collect logs
sudo journalctl -u osqueryd -n 500 > dlp-diagnostics.txt
# System info
uname -a >> dlp-diagnostics.txt
cat /etc/os-release >> dlp-diagnostics.txt
# Service status
systemctl status aquilon-dlp >> dlp-diagnostics.txt
Support Channels
- Basic Edition: GitHub Issues
- Enterprise Edition: support@aquilonsecurity.com
  - Critical (P1): 4-hour response
  - High (P2): 8-hour response
  - Normal (P3): 24-hour response
Changelog
[Unreleased]
Added
- National ID Scanners: Added 28 national ID scanners across 4 regions with country-specific checksum validation
  - Europe (14): France NIR, Germany Steuer-ID, Italy Codice Fiscale, Spain DNI/NIE, Poland PESEL, Netherlands BSN, Belgium NRN, UK NINO, Sweden Personnummer, Norway Fødselsnummer, Finland HETU, Portugal NIF, Romania CNP, Czech Rodné číslo
  - Americas (4): Brazil CPF, Canada SIN, Chile RUT, Argentina CUIT
  - Asia-Pacific (8): Australia TFN, India Aadhaar, India PAN, South Korea RRN, Japan My Number, China Resident ID, Taiwan National ID, New Zealand IRD
  - Middle East & Africa (2): Israel Teudat Zehut, Turkey TC Kimlik
- GDPR policy now automatically detects EU/EEA national IDs with specialized validation
- Turkey TC Kimlik included for KVKK (GDPR-equivalent) compliance