# Encrypted Transaction Caching Implementation Plan

## Overview

Implement encrypted caching for GoCardless transactions to minimize API calls against the extremely low rate limit (4 requests per day per account). The cache stores raw transaction data and automatically merges date ranges and deduplicates transactions.

## Architecture

- **Location**: `banks2ff/src/adapters/gocardless/`
- **Storage**: `data/cache/` directory
- **Encryption**: AES-GCM for disk storage only
- **No API Client Changes**: All caching logic in adapter layer

## Components to Create

### 1. Transaction Cache Module

**File**: `banks2ff/src/adapters/gocardless/transaction_cache.rs`

**Structures**:

```rust
use chrono::NaiveDate;
use serde::{Deserialize, Serialize};

/// All cached transaction data for a single GoCardless account.
#[derive(Serialize, Deserialize)]
pub struct AccountTransactionCache {
    account_id: String,
    ranges: Vec<CachedRange>,
}

/// Transactions fetched for one contiguous date range.
#[derive(Serialize, Deserialize)]
struct CachedRange {
    start_date: NaiveDate,
    end_date: NaiveDate,
    transactions: Vec<gocardless_client::models::Transaction>,
}
```

**Methods**:

- `load(account_id: &str) -> Result<Self>`
- `save(&self) -> Result<()>`
- `get_cached_transactions(&self, start: NaiveDate, end: NaiveDate) -> Vec<gocardless_client::models::Transaction>`
- `get_uncovered_ranges(&self, start: NaiveDate, end: NaiveDate) -> Vec<(NaiveDate, NaiveDate)>`
- `store_transactions(&mut self, start: NaiveDate, end: NaiveDate, transactions: Vec<gocardless_client::models::Transaction>)`
- `merge_ranges(&mut self, new_range: CachedRange)`

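The coverage check behind `get_uncovered_ranges` could be sketched as below. This is a minimal, standard-library-only illustration and not the planned implementation: `Day` is a hypothetical stand-in for `NaiveDate` (a plain day count), the free function `uncovered_ranges` is an invented name, and ranges are assumed to be inclusive on both ends.

```rust
/// Hypothetical stand-in for `NaiveDate`: a day count since some fixed epoch.
type Day = i64;

/// Returns the inclusive sub-ranges of `[start, end]` that no cached range covers.
/// `cached` holds inclusive `(start, end)` pairs in any order (assumption).
fn uncovered_ranges(cached: &[(Day, Day)], start: Day, end: Day) -> Vec<(Day, Day)> {
    let mut sorted: Vec<(Day, Day)> = cached.to_vec();
    sorted.sort();

    let mut gaps = Vec::new();
    let mut cursor = start; // first day not yet known to be covered
    for (s, e) in sorted {
        if cursor > end || s > end {
            break; // the request is fully handled, or later ranges start past it
        }
        if e < cursor {
            continue; // this range lies entirely before the cursor
        }
        if s > cursor {
            gaps.push((cursor, s - 1)); // days between cursor and this range are missing
        }
        cursor = e + 1;
    }
    if cursor <= end {
        gaps.push((cursor, end)); // tail of the request is not cached
    }
    gaps
}
```

With cached ranges `(10, 20)` and `(30, 40)`, a request for `(5, 35)` yields the gaps `(5, 9)` and `(21, 29)`, so only those two ranges would need to be fetched from the API.
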
## Configuration

- `BANKS2FF_CACHE_KEY`: Required encryption key
- `BANKS2FF_CACHE_DIR`: Optional cache directory (default: `data/cache`)

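One way to resolve these two variables is sketched below, using only the standard library. The `CacheConfig` struct and the `resolve` and `from_env` functions are hypothetical names introduced for illustration, and treating an empty key as missing is an assumption rather than something this plan specifies.

```rust
use std::env;
use std::path::PathBuf;

/// Resolved cache settings (hypothetical struct, for illustration only).
#[derive(Debug, PartialEq)]
struct CacheConfig {
    key: String,
    dir: PathBuf,
}

/// Builds the config from raw values. Kept separate from `from_env` so it can
/// be tested without touching the process environment.
fn resolve(key: Option<String>, dir: Option<String>) -> Result<CacheConfig, String> {
    // Assumption: an empty key is treated the same as a missing one.
    let key = key
        .filter(|k| !k.is_empty())
        .ok_or_else(|| "BANKS2FF_CACHE_KEY must be set".to_string())?;
    // Fall back to the documented default directory.
    let dir = dir
        .map(PathBuf::from)
        .unwrap_or_else(|| PathBuf::from("data/cache"));
    Ok(CacheConfig { key, dir })
}

/// Reads both variables from the process environment.
fn from_env() -> Result<CacheConfig, String> {
    resolve(
        env::var("BANKS2FF_CACHE_KEY").ok(),
        env::var("BANKS2FF_CACHE_DIR").ok(),
    )
}
```

Failing early when the key is absent keeps the adapter from silently writing unencrypted or unreadable cache files.
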
## Testing

- Tests run with automatic environment variable setup
- Each test uses isolated cache directories in `tmp/` for parallel execution
- No manual environment variable configuration required
- Test artifacts are automatically cleaned up

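The isolation and cleanup described above could be approximated with a small guard type like the following. It is a sketch under assumptions: `TestCacheDir` is a hypothetical helper, and the directory naming scheme (test name, process ID, counter) is invented here to keep parallel tests apart.

```rust
use std::fs;
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counter shared by all tests in the process, used to make names unique.
static NEXT_ID: AtomicUsize = AtomicUsize::new(0);

/// A per-test cache directory that is deleted when the guard is dropped.
/// (Hypothetical helper; name and layout are assumptions.)
struct TestCacheDir {
    path: PathBuf,
}

impl TestCacheDir {
    /// Creates `<base>/<test_name>-<pid>-<counter>`.
    fn new(base: &Path, test_name: &str) -> std::io::Result<Self> {
        let id = NEXT_ID.fetch_add(1, Ordering::SeqCst);
        let path = base.join(format!("{}-{}-{}", test_name, std::process::id(), id));
        fs::create_dir_all(&path)?;
        Ok(Self { path })
    }

    fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for TestCacheDir {
    fn drop(&mut self) {
        // Best-effort cleanup; errors are ignored so drop never panics.
        let _ = fs::remove_dir_all(&self.path);
    }
}
```

A test would call `TestCacheDir::new(Path::new("tmp"), "load_save")` and hand `path()` to the cache under test. Because environment variables are process-wide and Rust runs tests in parallel threads by default, passing the directory explicitly is less race-prone than setting `BANKS2FF_CACHE_DIR` from inside each test.
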
### 2. Encryption Module

**File**: `banks2ff/src/adapters/gocardless/encryption.rs`

**Features**:

- AES-GCM encryption/decryption
- PBKDF2 key derivation from `BANKS2FF_CACHE_KEY` env var
- Encrypt/decrypt binary data for disk I/O

### 3. Range Merging Algorithm

**Logic**:

1. Detect overlapping/adjacent ranges
2. Merge transactions with deduplication by `transaction_id`
3. Combine date ranges
4. Remove redundant entries

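The four steps above can be sketched as follows. This is an illustrative, standard-library-only version with simplified types: `Day` stands in for `NaiveDate`, `Tx` and `Range` stand in for the real transaction and `CachedRange` types, ranges are assumed inclusive, and every transaction is assumed to carry a non-optional `transaction_id`.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for `NaiveDate`: a day count since some fixed epoch.
type Day = i64;

/// Simplified stand-in for the GoCardless transaction model.
#[derive(Debug, Clone, PartialEq)]
struct Tx {
    transaction_id: String,
}

/// Simplified stand-in for `CachedRange` (inclusive on both ends).
#[derive(Debug, Clone, PartialEq)]
struct Range {
    start: Day,
    end: Day,
    transactions: Vec<Tx>,
}

/// Adds `new_range` to `ranges`, merging overlapping or adjacent ranges and
/// dropping duplicate transactions (first occurrence wins).
fn merge_ranges(mut ranges: Vec<Range>, new_range: Range) -> Vec<Range> {
    ranges.push(new_range);
    ranges.sort_by_key(|r| r.start);

    let mut merged: Vec<Range> = Vec::new();
    for range in ranges {
        // Overlapping (start <= end) or adjacent (start == end + 1).
        let extends_last = merged
            .last()
            .map_or(false, |last| range.start <= last.end + 1);
        if extends_last {
            let last = merged.last_mut().expect("checked non-empty above");
            last.end = last.end.max(range.end);
            last.transactions.extend(range.transactions);
        } else {
            merged.push(range);
        }
    }

    // Deduplicate by transaction_id within each merged range.
    for range in &mut merged {
        let mut seen = HashSet::new();
        range
            .transactions
            .retain(|tx| seen.insert(tx.transaction_id.clone()));
    }
    merged
}
```

Sorting first means a single pass is enough: each range either extends the last merged range or starts a new one, so redundant entries never survive the merge.
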
## Modified Components

### 1. GoCardlessAdapter

**File**: `banks2ff/src/adapters/gocardless/client.rs`

**Changes**:

- Add `TransactionCache` field
- Modify `get_transactions()` to:
  1. Check cache for covered ranges
  2. Fetch missing ranges from API
  3. Store new data with merging
  4. Return combined results

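The cache-first flow in `get_transactions()` might look roughly like the sketch below. Everything here is a simplification for illustration: `DayCache` is a toy per-day in-memory cache rather than the range-based `AccountTransactionCache`, `Day` and `Tx` are stand-in types, and the API call is modelled as a closure so that the number of requests can be observed. Deduplication is omitted.

```rust
use std::collections::BTreeMap;

type Day = i64; // hypothetical stand-in for NaiveDate
type Tx = (Day, String); // stand-in: (booking day, transaction_id)

/// Toy in-memory cache: every covered day maps to its transactions.
#[derive(Default)]
struct DayCache {
    days: BTreeMap<Day, Vec<Tx>>,
}

impl DayCache {
    /// Groups consecutive uncovered days into inclusive ranges.
    fn uncovered(&self, start: Day, end: Day) -> Vec<(Day, Day)> {
        let mut gaps: Vec<(Day, Day)> = Vec::new();
        for day in start..=end {
            if self.days.contains_key(&day) {
                continue;
            }
            let extends_last = gaps.last().map_or(false, |&(_, e)| e + 1 == day);
            if extends_last {
                gaps.last_mut().expect("checked non-empty above").1 = day;
            } else {
                gaps.push((day, day));
            }
        }
        gaps
    }

    fn store(&mut self, start: Day, end: Day, txs: Vec<Tx>) {
        for day in start..=end {
            self.days.entry(day).or_default(); // mark as covered, even if empty
        }
        for tx in txs {
            self.days.entry(tx.0).or_default().push(tx);
        }
    }

    fn cached(&self, start: Day, end: Day) -> Vec<Tx> {
        self.days
            .range(start..=end)
            .flat_map(|(_, txs)| txs.iter().cloned())
            .collect()
    }
}

/// Cache-first lookup: fetch only the gaps, store them, then answer from cache.
fn get_transactions<F>(cache: &mut DayCache, mut fetch: F, start: Day, end: Day) -> Result<Vec<Tx>, String>
where
    F: FnMut(Day, Day) -> Result<Vec<Tx>, String>,
{
    if start > end {
        return Ok(Vec::new()); // empty request; also avoids an invalid map range
    }
    for (gap_start, gap_end) in cache.uncovered(start, end) {
        let fetched = fetch(gap_start, gap_end)?; // one API request per gap
        cache.store(gap_start, gap_end, fetched);
    }
    Ok(cache.cached(start, end))
}
```

Each gap costs one API request in this sketch, so under a budget of 4 requests per day it may be worth fetching one range that spans several small gaps instead; that trade-off is left open here.
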
### 2. Account Cache

**File**: `banks2ff/src/adapters/gocardless/cache.rs`

**Changes**:

- Move storage to `data/cache/accounts.enc`
- Add encryption for account mappings
- Update file path and I/O methods

## Actionable Implementation Steps

### Phase 1: Core Infrastructure + Basic Testing ✅ COMPLETED

1. ✅ Create `data/cache/` directory
2. ✅ Implement encryption module with AES-GCM
3. ✅ Create transaction cache module with basic load/save
4. ✅ Update account cache to use encryption and new location
5. ✅ Add unit tests for encryption/decryption round-trip
6. ✅ Add unit tests for basic cache load/save operations

### Phase 2: Range Management + Range Testing

7. Implement range overlap detection algorithms
8. Add transaction deduplication logic
9. Implement range merging for overlapping/adjacent ranges
10. Add cache coverage checking
11. Add unit tests for range overlap detection
12. Add unit tests for transaction deduplication
13. Add unit tests for range merging edge cases

### Phase 3: Adapter Integration + Integration Testing

14. Add `TransactionCache` to `GoCardlessAdapter` struct
15. Modify `get_transactions()` to use cache-first approach
16. Implement missing range fetching logic
17. Add cache storage after API calls
18. Add integration tests with mock API responses
19. Test full cache workflow (hit/miss scenarios)

### Phase 4: Migration & Full Testing

20. Create migration script for existing `.banks2ff-cache.json`
21. Add comprehensive unit tests for all cache operations
22. Add performance benchmarks for cache operations
23. Test migration preserves existing data

## Key Design Decisions

### Encryption Scope

- **In Memory**: Plain structs (no encryption overhead on reads)
- **On Disk**: Full AES-GCM encryption
- **Key Source**: Environment variable `BANKS2FF_CACHE_KEY`

### Range Merging Strategy

- **Overlap Detection**: Check date range intersections
- **Transaction Deduplication**: Use `transaction_id` as unique key
- **Adjacent Merging**: Combine contiguous date ranges
- **Storage**: Single file per account with multiple ranges

### Cache Structure

- **Per Account**: Separate encrypted files
- **Multiple Ranges**: Gaps between ranges are allowed; overlapping or adjacent ranges are merged on write
- **JSON Format**: Use `serde_json` for serialization (already available)

## Dependencies to Add

- `aes-gcm`: For encryption
- `pbkdf2`: For key derivation
- `rand`: For random salts and encryption nonces

## Security Considerations

- **Encryption**: AES-GCM with 256-bit keys derived via PBKDF2 (200,000 iterations)
- **Salt Security**: Random 16-byte salt per encryption (prepended to ciphertext)
- **Key Management**: Environment variable `BANKS2FF_CACHE_KEY` required
- **Data Protection**: Financial data encrypted at rest, no sensitive data in logs
- **Authentication**: GCM provides integrity protection against tampering
- **Salt and Nonce Uniqueness**: A unique salt per encryption defeats precomputed (rainbow table) attacks on the key; a fresh nonce per encryption avoids the nonce reuse that would break AES-GCM

## Performance Expectations

- **Cache Hit**: Sub-millisecond retrieval from the in-memory cache (loading a cache file from disk additionally pays for key derivation and decryption)
- **Cache Miss**: API call + encryption overhead
- **Merge Operations**: Minimal impact (done on write, not read)
- **Storage Growth**: Linear with transaction volume

## Testing Requirements

- Unit tests for all cache operations
- Encryption/decryption round-trip tests
- Range merging edge cases
- Mock API integration tests
- Performance benchmarks

## Rollback Plan

- Cache files are additive: delete them to reset
- API client is unchanged: the cache feature can be disabled
- Migration preserves the old cache during the transition

## Phase 1 Implementation Status ✅ COMPLETED

### Security Improvements Implemented

1. ✅ **PBKDF2 Iterations**: Increased from 100,000 to 200,000 for better brute-force resistance
2. ✅ **Random Salt**: Implemented random 16-byte salt per encryption operation (prepended to ciphertext)
3. ✅ **Module Documentation**: Added comprehensive security documentation with performance characteristics
4. ✅ **Configurable Cache Directory**: Added `BANKS2FF_CACHE_DIR` environment variable for test isolation

### Technical Details

- **Ciphertext Format**: `[salt(16)][nonce(12)][ciphertext]`, so each file carries the salt and nonce needed to decrypt it
- **Key Derivation**: PBKDF2-SHA256 with 200,000 iterations
- **Error Handling**: Encrypted data format is validated before decryption
- **Testing**: All security features tested with round-trip validation
- **Test Isolation**: Unique cache directories per test to prevent interference

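The `[salt(16)][nonce(12)][ciphertext]` layout and its format validation can be illustrated with the framing code below. It deliberately leaves out the actual PBKDF2 and AES-GCM calls and only shows how the bytes are packed and split; `Envelope`, `pack`, and `unpack` are hypothetical names, not the module's real API.

```rust
const SALT_LEN: usize = 16;
const NONCE_LEN: usize = 12;

/// Borrowed view of the three parts of an encrypted cache file.
#[derive(Debug, PartialEq)]
struct Envelope<'a> {
    salt: &'a [u8],
    nonce: &'a [u8],
    ciphertext: &'a [u8],
}

/// Concatenates salt, nonce, and ciphertext into the on-disk layout.
fn pack(salt: &[u8; SALT_LEN], nonce: &[u8; NONCE_LEN], ciphertext: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(SALT_LEN + NONCE_LEN + ciphertext.len());
    out.extend_from_slice(salt);
    out.extend_from_slice(nonce);
    out.extend_from_slice(ciphertext);
    out
}

/// Splits on-disk bytes back into their parts, rejecting input that is too
/// short to contain the fixed-size header.
fn unpack(data: &[u8]) -> Result<Envelope<'_>, String> {
    if data.len() < SALT_LEN + NONCE_LEN {
        return Err(format!("encrypted data too short: {} bytes", data.len()));
    }
    let (salt, rest) = data.split_at(SALT_LEN);
    let (nonce, ciphertext) = rest.split_at(NONCE_LEN);
    Ok(Envelope { salt, nonce, ciphertext })
}
```

On read, the salt would be fed to PBKDF2 together with `BANKS2FF_CACHE_KEY` to re-derive the key, and the nonce and ciphertext passed to AES-GCM, whose authentication tag check then rejects tampered files.
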
### Security Audit Results

- **Encryption Strength**: Excellent (AES-GCM + strengthened PBKDF2)
- **Salt and Nonce Uniqueness**: Excellent (unique salt and nonce per encryption)
- **Key Security**: Strong (200k iterations + random salt)
- **Data Integrity**: Protected (GCM authentication)
- **Test Suite**: 24/24 tests passing (parallel execution with isolated cache directories)