Encrypted Transaction Caching Implementation Plan

Overview

Implement an encrypted on-disk cache for GoCardless transactions so that repeated syncs stay within the very low API rate limit (4 requests per day per account). The cache stores raw transaction data per account, merges overlapping or adjacent date ranges automatically, and deduplicates transactions on write.

Architecture

  • Location: banks2ff/src/adapters/gocardless/
  • Storage: data/cache/ directory
  • Encryption: AES-GCM for disk storage only
  • No API Client Changes: All caching logic in adapter layer

Components to Create

1. Transaction Cache Module

File: banks2ff/src/adapters/gocardless/transaction_cache.rs

Structures:

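// NaiveDate only derives Serialize/Deserialize when chrono's `serde` feature is enabled.
use chrono::NaiveDate;
use serde::{Deserialize, Serialize};

/// All cached transaction ranges for one GoCardless account.
#[derive(Serialize, Deserialize)]
pub struct AccountTransactionCache {
    account_id: String,
    ranges: Vec<CachedRange>,
}

/// Transactions fetched for one contiguous date range.
#[derive(Serialize, Deserialize)]
struct CachedRange {
    start_date: NaiveDate,
    end_date: NaiveDate,
    transactions: Vec<gocardless_client::models::Transaction>,
}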

Methods:

  • load(account_id: &str) -> Result<Self>
  • save(&self) -> Result<()>
  • get_cached_transactions(&self, start: NaiveDate, end: NaiveDate) -> Vec<gocardless_client::models::Transaction>
  • get_uncovered_ranges(&self, start: NaiveDate, end: NaiveDate) -> Vec<(NaiveDate, NaiveDate)>
  • store_transactions(&mut self, start: NaiveDate, end: NaiveDate, transactions: Vec<gocardless_client::models::Transaction>)
  • merge_ranges(&mut self, new_range: CachedRange)
Configuration

  • BANKS2FF_CACHE_KEY: Required encryption key
  • BANKS2FF_CACHE_DIR: Optional cache directory (default: data/cache)
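
A minimal sketch of how these two variables might be resolved, using only the standard library. The `CacheConfig` struct and both function names are hypothetical, and treating an empty value as unset is an assumption rather than something the plan specifies.

```rust
use std::env;
use std::path::PathBuf;

/// Hypothetical holder for the resolved cache settings.
#[derive(Debug, PartialEq)]
struct CacheConfig {
    key: String,
    dir: PathBuf,
}

/// Pure helper: takes the raw variable values so it can be tested
/// without touching the process environment.
fn cache_config_from(key: Option<String>, dir: Option<String>) -> Result<CacheConfig, String> {
    // Assumption: an empty key is treated the same as a missing one.
    let key = key
        .filter(|k| !k.is_empty())
        .ok_or_else(|| "BANKS2FF_CACHE_KEY must be set".to_string())?;

    // Fall back to the documented default directory.
    let dir = dir
        .filter(|d| !d.is_empty())
        .map(PathBuf::from)
        .unwrap_or_else(|| PathBuf::from("data/cache"));

    Ok(CacheConfig { key, dir })
}

/// Reads both variables from the environment.
fn cache_config_from_env() -> Result<CacheConfig, String> {
    cache_config_from(
        env::var("BANKS2FF_CACHE_KEY").ok(),
        env::var("BANKS2FF_CACHE_DIR").ok(),
    )
}
```

Splitting the logic from the environment lookup keeps the failure case (missing key) easy to exercise in unit tests.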

Testing

  • Tests run with automatic environment variable setup
  • Each test uses isolated cache directories in tmp/ for parallel execution
  • No manual environment variable configuration required
  • Test artifacts are automatically cleaned up
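
One way to get isolated, self-cleaning cache directories is a small guard type that creates a unique directory and removes it when dropped. The sketch below is an assumption about how such a helper could look, not the project's actual test harness; `TestCacheDir` and the directory naming scheme are hypothetical.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Process-wide counter so that parallel tests never share a directory.
static NEXT_ID: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical guard: owns a unique cache directory for one test.
struct TestCacheDir {
    path: PathBuf,
}

impl TestCacheDir {
    /// Creates `<base>/banks2ff-test-<pid>-<n>`.
    fn new(base: &Path) -> io::Result<Self> {
        let id = NEXT_ID.fetch_add(1, Ordering::Relaxed);
        let path = base.join(format!("banks2ff-test-{}-{}", std::process::id(), id));
        fs::create_dir_all(&path)?;
        Ok(Self { path })
    }

    fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for TestCacheDir {
    /// Best-effort cleanup; errors are ignored so a failed removal
    /// does not mask the real test result.
    fn drop(&mut self) {
        let _ = fs::remove_dir_all(&self.path);
    }
}
```

A test would create the guard with the tmp/ base directory and hand `path()` to the cache under test. How the path reaches the cache (for example through BANKS2FF_CACHE_DIR) is left out of the sketch.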

2. Encryption Module

File: banks2ff/src/adapters/gocardless/encryption.rs

Features:

  • AES-GCM encryption/decryption
  • PBKDF2 key derivation from BANKS2FF_CACHE_KEY env var
  • Encrypt/decrypt binary data for disk I/O

3. Range Merging Algorithm

Logic:

  1. Detect overlapping/adjacent ranges
  2. Merge transactions with deduplication by transaction_id
  3. Combine date ranges
  4. Remove redundant entries
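
The four steps above can be sketched as a sort-and-sweep pass. This is an illustrative version under stated assumptions: `Day` is a day-number stand-in for NaiveDate, `Tx` is a hypothetical reduced transaction type carrying only the fields the example needs, ranges are inclusive, and when two copies of a transaction share a transaction_id the first one encountered is kept.

```rust
use std::collections::HashSet;

/// Stand-in for `chrono::NaiveDate` (assumption): a day number.
type Day = i32;

/// Hypothetical, reduced transaction type for the sketch.
#[derive(Clone, Debug, PartialEq)]
struct Tx {
    transaction_id: String,
    amount_cents: i64,
}

#[derive(Clone, Debug, PartialEq)]
struct CachedRange {
    start: Day,
    end: Day,
    transactions: Vec<Tx>,
}

/// Inserts `new_range` and collapses every overlapping or adjacent range.
fn merge_range(ranges: &mut Vec<CachedRange>, new_range: CachedRange) {
    let mut all = std::mem::take(ranges);
    all.push(new_range);
    all.sort_by_key(|r| r.start);

    let mut merged: Vec<CachedRange> = Vec::new();
    for range in all {
        // Steps 1 and 3: inclusive ranges touch when the next one starts
        // no later than the day after the previous one ends.
        let touches_last = merged
            .last()
            .map_or(false, |last| range.start <= last.end + 1);
        if touches_last {
            let last = merged.last_mut().expect("checked above");
            last.end = last.end.max(range.end);
            last.transactions.extend(range.transactions);
        } else {
            merged.push(range);
        }
    }

    // Steps 2 and 4: drop repeated transaction_ids, keeping the first copy.
    for range in &mut merged {
        let mut seen = HashSet::new();
        range
            .transactions
            .retain(|tx| seen.insert(tx.transaction_id.clone()));
    }

    *ranges = merged;
}
```

Sorting first means the result does not depend on the order in which ranges were stored, at the cost of an O(n log n) pass per write. That fits the plan's choice to do merge work on write rather than on read.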

Modified Components

1. GoCardlessAdapter

File: banks2ff/src/adapters/gocardless/client.rs

Changes:

  • Add TransactionCache field
  • Modify get_transactions() to:
    1. Check cache for covered ranges
    2. Fetch missing ranges from API
    3. Store new data with merging
    4. Return combined results
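
The cache-first flow can be sketched with the API call abstracted behind a closure, which also makes it easy to substitute a mock in tests. Everything here is illustrative: `Cache`, `Tx`, the `booked_on` field, and the free function `get_transactions` are hypothetical simplifications (day numbers instead of NaiveDate, no encryption, no range merging), not the adapter's real types.

```rust
/// Stand-in for `chrono::NaiveDate` (assumption): a day number.
type Day = i32;

/// Hypothetical, reduced transaction type.
#[derive(Clone, Debug, PartialEq)]
struct Tx {
    transaction_id: String,
    booked_on: Day,
}

/// Simplified in-memory cache: covered ranges plus a flat transaction list.
#[derive(Default)]
struct Cache {
    ranges: Vec<(Day, Day)>,
    transactions: Vec<Tx>,
}

impl Cache {
    /// Inclusive gaps of `[start, end]` not covered by any stored range.
    fn uncovered(&self, start: Day, end: Day) -> Vec<(Day, Day)> {
        let mut sorted = self.ranges.clone();
        sorted.sort();
        let mut gaps = Vec::new();
        let mut cursor = start;
        for (range_start, range_end) in sorted {
            if range_end < cursor {
                continue;
            }
            if range_start > end {
                break;
            }
            if range_start > cursor {
                gaps.push((cursor, range_start - 1));
            }
            cursor = range_end + 1;
        }
        if cursor <= end {
            gaps.push((cursor, end));
        }
        gaps
    }

    /// Records the range as covered and adds unseen transactions.
    fn store(&mut self, start: Day, end: Day, fetched: Vec<Tx>) {
        self.ranges.push((start, end));
        for tx in fetched {
            if !self.transactions.iter().any(|t| t.transaction_id == tx.transaction_id) {
                self.transactions.push(tx);
            }
        }
    }

    fn cached(&self, start: Day, end: Day) -> Vec<Tx> {
        self.transactions
            .iter()
            .filter(|t| t.booked_on >= start && t.booked_on <= end)
            .cloned()
            .collect()
    }
}

/// Steps 1 to 4: check coverage, fetch only the gaps, store, return the combined view.
fn get_transactions<F>(cache: &mut Cache, start: Day, end: Day, mut fetch: F) -> Result<Vec<Tx>, String>
where
    F: FnMut(Day, Day) -> Result<Vec<Tx>, String>,
{
    for (gap_start, gap_end) in cache.uncovered(start, end) {
        let fetched = fetch(gap_start, gap_end)?;
        cache.store(gap_start, gap_end, fetched);
    }
    Ok(cache.cached(start, end))
}
```

With this shape, a request whose window is already covered never invokes `fetch`, and a partially covered window triggers one call per gap. A real implementation would likely also persist the cache after storing new data.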

2. Account Cache

File: banks2ff/src/adapters/gocardless/cache.rs

Changes:

  • Move storage to data/cache/accounts.enc
  • Add encryption for account mappings
  • Update file path and I/O methods

Actionable Implementation Steps

Phase 1: Core Infrastructure + Basic Testing (COMPLETED)

  1. Create data/cache/ directory
  2. Implement encryption module with AES-GCM
  3. Create transaction cache module with basic load/save
  4. Update account cache to use encryption and new location
  5. Add unit tests for encryption/decryption round-trip
  6. Add unit tests for basic cache load/save operations

Phase 2: Range Management + Range Testing

  1. Implement range overlap detection algorithms
  2. Add transaction deduplication logic
  3. Implement range merging for overlapping/adjacent ranges
  4. Add cache coverage checking
  5. Add unit tests for range overlap detection
  6. Add unit tests for transaction deduplication
  7. Add unit tests for range merging edge cases

Phase 3: Adapter Integration + Integration Testing

  1. Add TransactionCache to GoCardlessAdapter struct
  2. Modify get_transactions() to use cache-first approach
  3. Implement missing range fetching logic
  4. Add cache storage after API calls
  5. Add integration tests with mock API responses
  6. Test full cache workflow (hit/miss scenarios)

Phase 4: Migration & Full Testing

  1. Create migration script for existing .banks2ff-cache.json
  2. Add comprehensive unit tests for all cache operations
  3. Add performance benchmarks for cache operations
  4. Test migration preserves existing data

Key Design Decisions

Encryption Scope

  • In Memory: Plain structs (no encryption overhead on reads)
  • On Disk: Full AES-GCM encryption
  • Key Source: Environment variable BANKS2FF_CACHE_KEY

Range Merging Strategy

  • Overlap Detection: Check date range intersections
  • Transaction Deduplication: Use transaction_id as unique key
  • Adjacent Merging: Combine contiguous date ranges
  • Storage: Single file per account with multiple ranges

Cache Structure

  • Per Account: Separate encrypted files
  • Multiple Ranges: Gaps between ranges are allowed; overlapping or adjacent ranges are merged on write
  • JSON Format: Use serde_json for serialization (already available)

Dependencies to Add

  • aes-gcm: For encryption
  • pbkdf2: For key derivation
  • rand: For random nonces and salts

Security Considerations

  • Encryption: AES-GCM with 256-bit keys and PBKDF2 (200,000 iterations)
  • Salt Security: Random 16-byte salt per encryption (prepended to ciphertext)
  • Key Management: Environment variable BANKS2FF_CACHE_KEY required
  • Data Protection: Financial data encrypted at rest, no sensitive data in logs
  • Authentication: GCM provides integrity protection against tampering
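  • Salt and Nonce Uniqueness: A fresh random salt per encryption defeats precomputed (rainbow table) attacks on the key; a fresh nonce avoids nonce reuse under the same key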

Performance Expectations

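  • Cache Hit: Sub-millisecond retrieval once the cache is loaded in memory (loading from disk adds PBKDF2 key derivation and decryption cost)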
  • Cache Miss: API call + encryption overhead
  • Merge Operations: Minimal impact (done on write, not read)
  • Storage Growth: Linear with transaction volume

Testing Requirements

  • Unit tests for all cache operations
  • Encryption/decryption round-trip tests
  • Range merging edge cases
  • Mock API integration tests
  • Performance benchmarks

Rollback Plan

  • Cache files are additive; deleting them resets the cache
  • The API client is unchanged, so the cache feature can be disabled without touching it
  • Migration preserves the old cache during the transition

Phase 1 Implementation Status (COMPLETED)

Security Improvements Implemented

  1. PBKDF2 Iterations: Increased from 100,000 to 200,000 for better brute-force resistance
  2. Random Salt: Implemented random 16-byte salt per encryption operation (prepended to ciphertext)
  3. Module Documentation: Added comprehensive security documentation with performance characteristics
  4. Configurable Cache Directory: Added BANKS2FF_CACHE_DIR environment variable for test isolation

Technical Details

  • Ciphertext Format: [salt(16)][nonce(12)][ciphertext], with a fresh salt and nonce per encryption
  • Key Derivation: PBKDF2-SHA256 with 200,000 iterations
  • Error Handling: Proper validation of encrypted data format
  • Testing: All security features tested with round-trip validation
  • Test Isolation: Unique cache directories per test to prevent interference
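
To make the on-disk layout concrete, the sketch below assembles and splits a blob in the [salt(16)][nonce(12)][ciphertext] format and validates its length before use. It performs no cryptography. The function names are hypothetical, and the minimum-length check assumes the 16-byte GCM authentication tag is stored as part of the ciphertext, which the plan does not state explicitly.

```rust
const SALT_LEN: usize = 16;
const NONCE_LEN: usize = 12;
/// Assumption: the GCM tag (16 bytes) is appended to the ciphertext.
const TAG_LEN: usize = 16;

/// Borrowed view of the three parts of an encrypted file.
struct EncryptedBlob<'a> {
    salt: &'a [u8],
    nonce: &'a [u8],
    ciphertext: &'a [u8],
}

/// Concatenates the parts in on-disk order.
fn join_blob(salt: &[u8; SALT_LEN], nonce: &[u8; NONCE_LEN], ciphertext: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(SALT_LEN + NONCE_LEN + ciphertext.len());
    out.extend_from_slice(salt);
    out.extend_from_slice(nonce);
    out.extend_from_slice(ciphertext);
    out
}

/// Splits a blob read from disk, rejecting data too short to be valid.
fn split_blob(data: &[u8]) -> Result<EncryptedBlob<'_>, String> {
    let min_len = SALT_LEN + NONCE_LEN + TAG_LEN;
    if data.len() < min_len {
        return Err(format!(
            "encrypted data too short: {} bytes, expected at least {}",
            data.len(),
            min_len
        ));
    }
    let (salt, rest) = data.split_at(SALT_LEN);
    let (nonce, ciphertext) = rest.split_at(NONCE_LEN);
    Ok(EncryptedBlob { salt, nonce, ciphertext })
}
```

In a full implementation the salt would feed PBKDF2-SHA256 to derive the key, and the nonce and ciphertext would go to AES-GCM decryption. Checking the length up front turns a truncated file into a clear error instead of a panic on slicing.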

Security Audit Results

  • Encryption Strength: Excellent (AES-GCM + strengthened PBKDF2)
  • Salt/Nonce Uniqueness: Excellent (unique salt and nonce per encryption)
  • Key Security: Strong (200k iterations + random salt)
  • Data Integrity: Protected (GCM authentication)
  • Test Suite: 24/24 tests passing (parallel execution with isolated cache directories)