Data Encryption at Rest and In Transit: A Practical Guide
Encryption is the non-negotiable baseline for any production data platform handling sensitive information. Yet "we encrypt our data" is one of the most misunderstood statements in cloud security—it can mean anything from S3 default encryption (good) to a hand-rolled crypto library (terrifying).
This guide gives you the practitioner's view: what to encrypt, how to do it correctly, where the gotchas are, and how to verify you've actually done it.
The Threat Model
Before picking algorithms and key sizes, understand what you're protecting against:
| Threat | At-rest encryption | In-transit encryption | Both |
|---|---|---|---|
| Stolen disks / raw storage access | ✅ | ❌ | ✅ |
| Compromised network path | ❌ | ✅ | ✅ |
| Compromised host OS | ❌ | ❌ | ❌ |
| Rogue DBA | ❌ | ❌ | ❌ |
Key insight: Encryption protects data at storage and wire level. It does not protect against compromised compute or identity. Combine encryption with IAM, VPC boundaries, and audit logging.
Encryption at Rest
AWS KMS: The Right Way
Never manage your own key material in a cloud environment. Use a managed Key Management Service (KMS).
This pattern is envelope encryption: KMS issues a data encryption key (DEK), your service encrypts the data locally with the DEK, and KMS only ever wraps and unwraps the DEK. Your plaintext data never reaches KMS.
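The flow can be sketched end to end in a few lines of Python. Everything here is a toy: the XOR keystream stands in for AES-GCM, and a locally generated `kek` stands in for the key material KMS holds. The point is the shape of the flow, in which only the wrapped DEK and the ciphertext are ever stored.

```python
import hashlib
import os

def _keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy XOR stream cipher (SHA-256 in counter mode). Illustration only;
    real systems use AES-GCM with keys from KMS GenerateDataKey."""
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ k for b, k in zip(data, out))

# 1. "KMS" holds the key-encryption key (KEK); it never sees your data.
kek = os.urandom(32)

# 2. Generate a data-encryption key (DEK) and wrap it with the KEK.
dek = os.urandom(32)
wrapped_dek = _keystream_xor(kek, dek)   # stored alongside the data

# 3. Encrypt the payload locally with the plaintext DEK, then discard it.
ciphertext = _keystream_xor(dek, b"pii: user@example.com")

# 4. To decrypt: ask "KMS" to unwrap the DEK, then decrypt locally.
recovered_dek = _keystream_xor(kek, wrapped_dek)
plaintext = _keystream_xor(recovered_dek, ciphertext)
```

Note what gets persisted: `wrapped_dek` and `ciphertext` only. The KEK stays inside KMS, and the plaintext DEK exists just long enough to do the local crypto.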
Terraform: KMS Key with Rotation
resource "aws_kms_key" "data_platform" {
description = "Data platform encryption key"
deletion_window_in_days = 30
enable_key_rotation = true
multi_region = false
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Sid = "AllowKeyAdministration"
Effect = "Allow"
Principal = {
AWS = "arn:aws:iam::${var.account_id}:role/DataPlatformAdmin"
}
Action = ["kms:*"]
Resource = "*"
},
{
Sid = "AllowServiceUse"
Effect = "Allow"
Principal = {
Service = ["s3.amazonaws.com", "rds.amazonaws.com", "glue.amazonaws.com"]
}
Action = [
"kms:GenerateDataKey",
"kms:Decrypt",
"kms:DescribeKey"
]
Resource = "*"
}
]
})
tags = {
Purpose = "data-platform-encryption"
Rotation = "annual-automatic"
}
}
resource "aws_kms_alias" "data_platform" {
name = "alias/data-platform-${var.environment}"
target_key_id = aws_kms_key.data_platform.key_id
}
S3 Bucket Encryption
resource "aws_s3_bucket_server_side_encryption_configuration" "data_lake" {
bucket = aws_s3_bucket.data_lake["bronze"].id
rule {
apply_server_side_encryption_by_default {
sse_algorithm = "aws:kms"
kms_master_key_id = aws_kms_key.data_platform.arn
}
bucket_key_enabled = true # Reduce KMS API call costs by 99%
}
}
# Enforce encryption on upload (deny unencrypted PutObject)
resource "aws_s3_bucket_policy" "enforce_encryption" {
bucket = aws_s3_bucket.data_lake["bronze"].id
policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Sid = "DenyUnencryptedObjectUploads"
Effect = "Deny"
Principal = "*"
Action = "s3:PutObject"
Resource = "${aws_s3_bucket.data_lake["bronze"].arn}/*"
Condition = {
StringNotEquals = {
"s3:x-amz-server-side-encryption" = "aws:kms"
}
}
}]
})
}
Database Encryption: RDS and Redshift
resource "aws_db_instance" "data_warehouse" {
identifier = "data-warehouse-${var.environment}"
engine = "postgres"
engine_version = "15.4"
instance_class = "db.r6g.xlarge"
storage_encrypted = true
kms_key_id = aws_kms_key.data_platform.arn
# Enable Performance Insights with encryption
performance_insights_enabled = true
performance_insights_kms_key_id = aws_kms_key.data_platform.arn
performance_insights_retention_period = 7
# Automated backups (also encrypted with KMS)
backup_retention_period = 30
backup_window = "03:00-04:00"
}
Column-Level Encryption
For PII fields that must be encrypted even from database admins, use column-level encryption. This is where envelope encryption really shines.
# AWS CLI: encrypt a single field value with KMS (bash process substitution)
aws kms encrypt \
  --key-id alias/data-platform-prod \
  --plaintext fileb://<(echo -n "user@example.com") \
  --output text --query CiphertextBlob | base64 --decode > encrypted_email.bin

# Decrypt (symmetric ciphertext embeds the key ID, so --key-id is optional)
aws kms decrypt \
  --ciphertext-blob fileb://encrypted_email.bin \
  --output text --query Plaintext | base64 --decode
For at-scale column encryption in a data pipeline:
# Spark job config for PII column encryption
encryption:
  enabled: true
  pii_columns:
    - email
    - phone_number
    - national_id
    - credit_card_number
  strategy: deterministic   # vs random; deterministic preserves equality for lookup joins
  key_provider: aws_kms
  kms_key_id: "alias/data-platform-prod"
  cache_ttl_seconds: 300    # Cache DEKs to reduce KMS API calls
Encryption in Transit
TLS 1.3: The Baseline
TLS 1.3 drops the weak cipher suites and legacy key exchanges (static RSA, CBC modes) that plagued 1.2. Enforce it everywhere.
# Verify TLS version and cipher suite on an endpoint
openssl s_client -connect my-kafka-broker:9093 -tls1_3 </dev/null 2>/dev/null | grep -E "Protocol|Cipher"
# Expected output:
# Protocol : TLSv1.3
# Cipher : TLS_AES_256_GCM_SHA384
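The same floor can be enforced in application code. With Python's standard ssl module, for example, a client context can refuse anything below TLS 1.3:

```python
import ssl

# Build a client context that refuses anything below TLS 1.3.
ctx = ssl.create_default_context()
ctx.minimum_version = ssl.TLSVersion.TLSv1_3

# Any handshake via ctx.wrap_socket(...) now fails against an endpoint
# that cannot negotiate TLS 1.3; certificate and hostname verification
# stay enabled by default.
```

Setting the floor on the context means every connection made through it inherits the policy, instead of relying on each call site to remember it.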
Kafka TLS Configuration
# server.properties - Kafka broker
# The PLAINTEXT listener is for local tooling only; remove it in production.
# Kafka does not expand ${VAR} placeholders itself; inject them via a
# config provider or deploy-time templating.
listeners=PLAINTEXT://localhost:9092,SSL://0.0.0.0:9093
ssl.keystore.location=/etc/kafka/ssl/kafka.server.keystore.jks
ssl.keystore.password=${KEYSTORE_PASSWORD}
ssl.key.password=${KEY_PASSWORD}
ssl.truststore.location=/etc/kafka/ssl/kafka.server.truststore.jks
ssl.truststore.password=${TRUSTSTORE_PASSWORD}
ssl.client.auth=required
ssl.enabled.protocols=TLSv1.3
ssl.cipher.suites=TLS_AES_256_GCM_SHA384,TLS_CHACHA20_POLY1305_SHA256

# consumer.properties
security.protocol=SSL
ssl.truststore.location=/etc/kafka/ssl/client.truststore.jks
ssl.keystore.location=/etc/kafka/ssl/client.keystore.jks
mTLS for Service-to-Service
Mutual TLS authenticates both ends of a connection, not just the server. It is essential for internal service-to-service traffic that carries sensitive data.
# Generate CA, server cert, client cert with OpenSSL
# 1. Create CA
openssl genrsa -out ca.key 4096
openssl req -new -x509 -days 3650 -key ca.key -out ca.crt -subj "/CN=DataPlatform-CA/O=MyOrg"
# 2. Server cert
openssl genrsa -out server.key 2048
openssl req -new -key server.key -out server.csr -subj "/CN=kafka-broker-1.internal/O=MyOrg"
openssl x509 -req -days 365 -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out server.crt
# 3. Client cert (for ETL service)
openssl genrsa -out etl-client.key 2048
openssl req -new -key etl-client.key -out etl-client.csr -subj "/CN=etl-pipeline/O=MyOrg"
openssl x509 -req -days 365 -in etl-client.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out etl-client.crt
ALB + ACM: TLS for HTTP Endpoints
resource "aws_lb_listener" "https" {
load_balancer_arn = aws_lb.data_api.arn
port = "443"
protocol = "HTTPS"
ssl_policy = "ELBSecurityPolicy-TLS13-1-2-2021-06"
certificate_arn = aws_acm_certificate_validation.api.certificate_arn
default_action {
type = "forward"
target_group_arn = aws_lb_target_group.data_api.arn
}
}
# Redirect HTTP to HTTPS
resource "aws_lb_listener" "http_redirect" {
load_balancer_arn = aws_lb.data_api.arn
port = "80"
protocol = "HTTP"
default_action {
type = "redirect"
redirect {
port = "443"
protocol = "HTTPS"
status_code = "HTTP_301"
}
}
}
Key Management Best Practices
Key Rotation Schedule
| Key Type | Rotation Frequency | Method |
|---|---|---|
| KMS key (customer-managed) | Annual (automatic) | Enable enable_key_rotation |
| Data encryption keys (DEKs) | Per-job or per-day | Re-wrap with the current KMS key |
| TLS certificates | 90 days (Let's Encrypt) / Annual (ACM) | Auto-renew via ACM |
| Database master password | 90 days | Secrets Manager rotation |
| API keys | 30–90 days | Manual + CI/CD rotation |
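The DEK row deserves a sketch: on rotation you unwrap the DEK with the old key and re-wrap it with the new one, and the bulk ciphertext encrypted under the DEK is never touched. The XOR wrap below is a toy stand-in for kms:ReEncrypt or a proper AES key wrap:

```python
import hashlib
import os

def xor_wrap(kek: bytes, dek: bytes) -> bytes:
    """Toy symmetric wrap: XOR against a SHA-256-derived keystream.
    Applying it twice with the same KEK is the identity, so the same
    function both wraps and unwraps."""
    stream = hashlib.sha256(kek).digest()
    return bytes(b ^ k for b, k in zip(dek, stream))

dek = os.urandom(32)
old_kek, new_kek = os.urandom(32), os.urandom(32)

wrapped = xor_wrap(old_kek, dek)

# Rotation: unwrap with the old KEK, re-wrap with the new one.
# Terabytes of data encrypted under the DEK stay exactly as they are.
rewrapped = xor_wrap(new_kek, xor_wrap(old_kek, wrapped))
```

This is why envelope encryption makes key rotation cheap: rotating the KEK is a metadata operation, not a re-encryption of the data set.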
AWS Secrets Manager for Credential Rotation
resource "aws_secretsmanager_secret" "db_password" {
name = "data-platform/${var.environment}/db-master-password"
kms_key_id = aws_kms_key.data_platform.arn
recovery_window_in_days = 7
rotation_rules {
automatically_after_days = 90
}
}
resource "aws_secretsmanager_secret_rotation" "db_password" {
secret_id = aws_secretsmanager_secret.db_password.id
rotation_lambda_arn = aws_lambda_function.db_password_rotator.arn
rotation_rules {
automatically_after_days = 90
}
}
Compliance Mapping
| Requirement | Control | Implementation |
|---|---|---|
| GDPR Art. 32 | Encryption of personal data | KMS + column-level for PII |
| SOC 2 CC6.1 | Logical access controls | KMS key policies + IAM |
| PCI DSS 3.4 | Cardholder data encryption | Deterministic column encryption |
| HIPAA § 164.312 | PHI encryption at rest + transit | KMS + TLS 1.3 |
| ISO 27001 A.10 | Cryptographic controls | Key management policy |
Encryption Audit with AWS Config
resource "aws_config_config_rule" "s3_encryption" {
name = "s3-bucket-server-side-encryption-enabled"
source {
owner = "AWS"
source_identifier = "S3_BUCKET_SERVER_SIDE_ENCRYPTION_ENABLED"
}
}
resource "aws_config_config_rule" "rds_encryption" {
name = "rds-storage-encrypted"
source {
owner = "AWS"
source_identifier = "RDS_STORAGE_ENCRYPTED"
}
}
resource "aws_config_config_rule" "kms_rotation" {
name = "cmk-backing-key-rotation-enabled"
source {
owner = "AWS"
source_identifier = "CMK_BACKING_KEY_ROTATION_ENABLED"
}
}
Common Mistakes
- S3 default encryption (SSE-S3) ≠ customer-managed KMS keys — S3-managed keys give you no key usage logs, no key policy, and no rotation control
- Encrypting backups with the same key as prod — a single compromised or deleted key takes out both
- Ignoring KMS request quotas — without S3 Bucket Keys, high-throughput workloads hit the regional KMS request quota (roughly 5,500–50,000 req/s depending on region)
- Terminating TLS at the load balancer only — backend-to-backend traffic must also be encrypted
- Hardcoding encryption keys in application code — use Secrets Manager or environment injection
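On the last point, a minimal pattern for environment injection, assuming a hypothetical DB_PASSWORD variable set by the runtime (ECS task definition, Kubernetes Secret, or CI variable):

```python
import os

def get_db_password() -> str:
    """Read the credential injected by the runtime instead of
    hardcoding it. Failing loudly beats falling back to a default."""
    password = os.environ.get("DB_PASSWORD")
    if password is None:
        raise RuntimeError("DB_PASSWORD not injected; check task/pod config")
    return password

# Stand-in for the runtime injection, for demonstration only:
os.environ["DB_PASSWORD"] = "example-only"
```

The same function works unchanged whether the value came from Secrets Manager via an ECS secrets mapping, a mounted Kubernetes Secret, or a local .env file in development.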
Monitoring Encryption Health
Platforms like Harbinger Explorer can continuously audit your data platform's encryption posture—flagging unencrypted resources, expiring certificates, and key policy drift before they become compliance incidents.
# Quick CLI audit: list buckets with no server-side encryption configuration
for b in $(aws s3api list-buckets --query 'Buckets[].Name' --output text); do
  aws s3api get-bucket-encryption --bucket "$b" >/dev/null 2>&1 || echo "UNENCRYPTED: $b"
done
Summary
Encryption is a system, not a checkbox. Implement it correctly:
- Use KMS with customer-managed keys + automatic rotation
- Enable bucket-key to control costs
- Enforce TLS 1.3 everywhere, mTLS for internal services
- Apply column-level encryption for PII fields
- Audit continuously with AWS Config rules
- Never co-locate prod and backup keys
Try Harbinger Explorer free for 7 days and get automated encryption posture monitoring across your entire cloud data platform—no agents required.