# CLI Refactor Plan: Decoupling for Multi-Source Financial Sync
## Overview
This document outlines a phased plan to refactor the `banks2ff` CLI from a tightly coupled, single-purpose sync tool into a modular, multi-source financial synchronization application. The refactor keeps the existing hexagonal architecture while adding three things: inspection of accounts, transactions, and sync status; support for multiple data sources (GoCardless, CSV, CAMT.053, MT940); and groundwork for exposing the core logic through a web API.
## Goals
- **Decouple CLI Architecture**: Separate CLI logic from core business logic to enable multiple entry points (CLI, web API)
- **Retain Sync Functionality**: Keep existing sync as primary subcommand with backward compatibility
- **Add Financial Entity Management**: Enable viewing/managing accounts, transactions, and sync status
- **Support Multiple Sources/Destinations**: Implement pluggable adapters for different data sources and destinations
- **Prepare for Web API**: Ensure core logic returns serializable data structures
- **Maintain Security**: Preserve financial data masking and compliance protocols
- **Follow Best Practices**: Adhere to Rust idioms, error handling, testing, and project guidelines
## Revised CLI Structure
```bash
banks2ff [OPTIONS] <COMMAND>

OPTIONS:
  --config <FILE>   Path to config file
  --dry-run         Preview changes without applying
  --debug           Enable debug logging (advanced users)

COMMANDS:
  sync <SOURCE> <DESTINATION> [OPTIONS]
      Synchronize transactions between source and destination
      --start <DATE>   Start date (YYYY-MM-DD)
      --end <DATE>     End date (YYYY-MM-DD)
  sources        List all available source types
  destinations   List all available destination types
  help           Show help
```
## Implementation Phases
### Phase 1: CLI Structure Refactor ✅ COMPLETED
**Objective**: Establish new subcommand architecture while preserving existing sync functionality.
**Steps:**
1. ✅ Refactor `main.rs` to use `clap::Subcommand` with nested enums for commands and subcommands
2. ✅ Extract environment loading and client initialization into a `cli::setup` module
3. ✅ Update argument parsing to handle source/destination as positional arguments
4. ✅ Implement basic command dispatch logic with placeholder handlers
5. ✅ Ensure backward compatibility for existing sync usage
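
The dispatch shape described in these steps can be sketched without any dependencies. The project itself uses `clap::Subcommand`; the hand-written parser below only stands in for it so the example compiles on its own. `Command`, `parse_command`, and `dispatch` are illustrative names and are not taken from the codebase.

```rust
/// One variant per subcommand. In the real CLI this enum would derive
/// `clap::Subcommand`; here it is a plain enum (assumption for illustration).
#[derive(Debug, PartialEq)]
enum Command {
    Sync { source: String, destination: String },
    Sources,
    Destinations,
}

/// Parses the arguments that follow the binary name.
fn parse_command(args: &[&str]) -> Result<Command, String> {
    match args {
        // Source and destination are positional arguments.
        ["sync", source, destination] => Ok(Command::Sync {
            source: source.to_string(),
            destination: destination.to_string(),
        }),
        ["sync", ..] => Err("usage: banks2ff sync <SOURCE> <DESTINATION>".to_string()),
        ["sources"] => Ok(Command::Sources),
        ["destinations"] => Ok(Command::Destinations),
        [other, ..] => Err(format!("unknown command: {other}")),
        [] => Err("no command given".to_string()),
    }
}

/// Placeholder handlers: each arm returns a message instead of doing real work.
fn dispatch(command: &Command) -> String {
    match command {
        Command::Sync { source, destination } => format!("syncing {source} -> {destination}"),
        Command::Sources => "listing sources".to_string(),
        Command::Destinations => "listing destinations".to_string(),
    }
}
```

Keeping parsing and dispatch as separate functions lets the argument handling be unit-tested without running any handler.
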
**Testing:**
- ✅ Unit tests for new CLI argument parsing
- ✅ Integration tests verifying existing sync command works unchanged
- ✅ Mock tests for new subcommand structure
**Implementation Details:**
- Created `cli/` module with `setup.rs` containing `AppContext` for client initialization
- Implemented subcommand structure: `sync`, `accounts`, `transactions`, `status`, `sources`, `destinations`
- Added dynamic adapter registry in `core::adapters.rs` for discoverability and validation
- Implemented comprehensive input validation with helpful error messages
- Added conditional logging (INFO for sync, WARN for interactive commands)
- All placeholder commands log appropriate messages for future implementation
- Maintained all existing sync functionality and flags
### Phase 2: Core Port Extensions ✅ COMPLETED
**Objective**: Extend ports and adapters to support inspection capabilities.
**Steps:**
1. ✅ Add inspection methods to `TransactionSource` and `TransactionDestination` traits:
   - `list_accounts()`: Return account summaries
   - `get_account_status()`: Return sync status for accounts
   - `get_transaction_info()`: Return transaction metadata
   - `get_cache_info()`: Return caching status
2. ✅ Update existing adapters (GoCardless, Firefly) to implement new methods
3. ✅ Define serializable response structs in `core::models` for inspection data
4. ✅ Ensure all new methods handle errors gracefully with `anyhow`
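
A minimal sketch of what two of these inspection methods might look like on a source port. The struct fields, the in-memory adapter, and the use of `String` as the error type are assumptions made to keep the example dependency-free; the plan calls for `anyhow` errors and `Serialize` derives, and the real traits are likely async.

```rust
/// Hypothetical fields; the real `AccountSummary` also derives `Serialize`.
#[derive(Debug, Clone, PartialEq)]
struct AccountSummary {
    id: String,
    name: String,
    iban: String,
}

/// Hypothetical fields chosen for illustration.
#[derive(Debug, Clone, PartialEq)]
struct AccountStatus {
    account_id: String,
    latest_transaction_date: Option<String>,
    transaction_count: usize,
}

/// Inspection half of the source port (synchronous here for brevity).
trait TransactionSource {
    fn list_accounts(&self) -> Result<Vec<AccountSummary>, String>;
    fn get_account_status(&self) -> Result<Vec<AccountStatus>, String>;
}

/// Stand-in adapter backed by in-memory data instead of an API client.
struct InMemorySource {
    accounts: Vec<AccountSummary>,
    /// (account id, ISO booking date) pairs.
    transactions: Vec<(String, String)>,
}

impl TransactionSource for InMemorySource {
    fn list_accounts(&self) -> Result<Vec<AccountSummary>, String> {
        Ok(self.accounts.clone())
    }

    fn get_account_status(&self) -> Result<Vec<AccountStatus>, String> {
        Ok(self
            .accounts
            .iter()
            .map(|account| {
                let dates: Vec<&String> = self
                    .transactions
                    .iter()
                    .filter(|(account_id, _)| account_id == &account.id)
                    .map(|(_, date)| date)
                    .collect();
                AccountStatus {
                    account_id: account.id.clone(),
                    // ISO dates sort lexicographically, so `max` is the latest.
                    latest_transaction_date: dates.iter().max().map(|d| d.to_string()),
                    transaction_count: dates.len(),
                }
            })
            .collect())
    }
}
```
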
**Testing:**
- Unit tests for trait implementations on existing adapters
- Mock tests for new inspection methods
- Integration tests verifying data serialization
**Implementation Details:**
- Added `AccountSummary`, `AccountStatus`, `TransactionInfo`, and `CacheInfo` structs with `Serialize` and `Debug` traits
- Extended both `TransactionSource` and `TransactionDestination` traits with inspection methods
- Implemented methods in `GoCardlessAdapter` using existing client calls and cache data
- Implemented methods in `FireflyAdapter` using existing client calls
- All code formatted with `cargo fmt` and linted with `cargo clippy`
- Existing tests pass; the new methods compile but are not yet covered by tests because the CLI commands that call them are not implemented
### Phase 3: Account Linking and Management ✅ COMPLETED
**Objective**: Implement comprehensive account linking between sources and destinations to enable reliable sync, with auto-linking where possible and manual overrides.
**Steps:**
1. ✅ Create `core::linking` module with data structures:
   - `AccountLink`: Links source account ID to destination account ID with metadata
   - `LinkStore`: Persistent storage for links, aliases, and account registries
   - Auto-linking logic (IBAN/name similarity scoring)
2. ✅ Extend adapters with account discovery:
   - `TransactionSource::discover_accounts()`: Full account list without filtering
   - `TransactionDestination::discover_accounts()`: Full account list
3. ✅ Implement linking management:
   - Auto-link on sync/account discovery (IBAN/name matches)
   - CLI commands: `banks2ff accounts link list`, `banks2ff accounts link create <source_account> <dest_account>`, `banks2ff accounts link delete <link_id>`
   - Alias support: `banks2ff accounts alias set <link_id> <alias>`, `banks2ff accounts alias update <link_id> <new_alias>`
4. ✅ Integrate with sync:
   - Always discover accounts during sync and update stores
   - Use links in `run_sync()` instead of IBAN-only matching
   - Handle unlinked accounts (skip with warning or prompt for manual linking)
5. ✅ Update CLI help text:
   - Explain linking process in `banks2ff accounts --help`
   - Note that sync auto-discovers and attempts linking
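
The plan does not spell out how IBAN/name similarity scoring works, so the following is only one plausible sketch. It assumes that a normalized IBAN match is decisive in both directions (two differing IBANs never link) and that names are compared by word overlap (Jaccard similarity). `Account`, `link_score`, and `best_match` are hypothetical names.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone)]
struct Account {
    id: String,
    name: String,
    iban: Option<String>,
}

/// Strips whitespace and upper-cases so formatting differences do not matter.
fn normalize_iban(iban: &str) -> String {
    iban.chars()
        .filter(|c| !c.is_whitespace())
        .collect::<String>()
        .to_uppercase()
}

/// Jaccard similarity over lower-cased words: |A ∩ B| / |A ∪ B|.
fn name_similarity(a: &str, b: &str) -> f64 {
    let words = |s: &str| -> HashSet<String> {
        s.split_whitespace().map(|w| w.to_lowercase()).collect()
    };
    let (wa, wb) = (words(a), words(b));
    let union = wa.union(&wb).count();
    if union == 0 {
        return 0.0;
    }
    wa.intersection(&wb).count() as f64 / union as f64
}

/// Score in [0, 1]. When both IBANs are known they decide the outcome alone;
/// otherwise the score falls back to name similarity.
fn link_score(source: &Account, dest: &Account) -> f64 {
    match (&source.iban, &dest.iban) {
        (Some(a), Some(b)) => {
            if normalize_iban(a) == normalize_iban(b) { 1.0 } else { 0.0 }
        }
        _ => name_similarity(&source.name, &dest.name),
    }
}

/// Returns the highest-scoring candidate at or above `threshold`, if any.
fn best_match<'a>(source: &Account, candidates: &'a [Account], threshold: f64) -> Option<&'a Account> {
    candidates
        .iter()
        .map(|candidate| (link_score(source, candidate), candidate))
        .filter(|(score, _)| *score >= threshold)
        .max_by(|a, b| a.0.total_cmp(&b.0))
        .map(|(_, candidate)| candidate)
}
```

Accounts for which `best_match` returns `None` would be the "unlinked" case that sync skips with a warning or hands over to manual linking.
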
**Testing:**
- Unit tests for auto-linking algorithms
- Integration tests for various account scenarios (IBAN matches, name matches, no matches)
- Persistence tests for link store
- CLI tests for link management commands
**Implementation Details:**
- Created `core::linking` with `LinkStore` using nested `HashMap`s for organized storage by adapter type
- Extended traits with `discover_accounts()` and implemented in GoCardless/Firefly adapters
- Integrated account discovery and auto-linking into `run_sync()` with persistent storage
- Added CLI commands under `banks2ff accounts link` with full CRUD operations and alias support
- Updated README with new account linking feature, examples, and troubleshooting
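
As a rough illustration of the nested-`HashMap` layout mentioned above, the sketch below keys links first by adapter type and then by source account ID. Persistence is left out, and the field names, the `link_N` ID format, and the method names are assumptions rather than the actual `LinkStore` API.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
struct AccountLink {
    id: String,
    source_account_id: String,
    dest_account_id: String,
    alias: Option<String>,
}

/// In-memory only; the real store is persisted to disk.
#[derive(Default)]
struct LinkStore {
    /// adapter type -> source account id -> link
    links: HashMap<String, HashMap<String, AccountLink>>,
    next_id: u32,
}

impl LinkStore {
    /// Creates (or overwrites) the link for a source account and returns its ID.
    fn create(&mut self, adapter: &str, source_id: &str, dest_id: &str) -> String {
        self.next_id += 1;
        let id = format!("link_{}", self.next_id);
        let link = AccountLink {
            id: id.clone(),
            source_account_id: source_id.to_string(),
            dest_account_id: dest_id.to_string(),
            alias: None,
        };
        self.links
            .entry(adapter.to_string())
            .or_default()
            .insert(source_id.to_string(), link);
        id
    }

    fn find(&self, adapter: &str, source_id: &str) -> Option<&AccountLink> {
        self.links.get(adapter)?.get(source_id)
    }

    /// Returns `true` if a link with this ID existed and was updated.
    fn set_alias(&mut self, link_id: &str, alias: &str) -> bool {
        for link in self.links.values_mut().flat_map(|by_source| by_source.values_mut()) {
            if link.id == link_id {
                link.alias = Some(alias.to_string());
                return true;
            }
        }
        false
    }

    /// Returns `true` if a link with this ID existed and was removed.
    fn delete(&mut self, link_id: &str) -> bool {
        let mut removed = false;
        for by_source in self.links.values_mut() {
            let before = by_source.len();
            by_source.retain(|_, link| link.id != link_id);
            removed |= by_source.len() != before;
        }
        removed
    }
}
```
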
### Phase 4: CLI Output and Formatting ✅ COMPLETED
**Objective**: Implement user-friendly output for inspection commands.
**Steps:**
1. ✅ Create `cli::formatters` module for consistent output formatting
2. ✅ Implement table-based display for accounts and transactions
3. ✅ Add JSON output option for programmatic use
4. ✅ Ensure sensitive data masking in all outputs
5. Add progress indicators for long-running operations (pending)
6. ✅ Implement `accounts` command with `list` and `status` subcommands
7. ✅ Implement `transactions` command with `list`, `cache-status`, and `clear-cache` subcommands
8. ✅ Add account and transaction inspection methods to adapter traits
**Testing:**
- Unit tests for formatter functions
- Integration tests for CLI output with sample data
- Accessibility tests for output readability
- Unit tests for new command implementations
- Integration tests for account/transaction inspection
**Implementation Details:**
- Created `cli::formatters` module with `Formattable` trait and table formatting using `comfy-table`
- Implemented table display for `AccountSummary`, `AccountStatus`, `TransactionInfo`, and `CacheInfo` structs
- Added IBAN masking (showing only last 4 characters) for privacy
- Updated CLI structure with new `accounts` and `transactions` commands
- Added `print_list_output` function for displaying collections of data
- All code formatted with `cargo fmt` and linted with `cargo clippy`
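
The IBAN masking rule (only the last 4 characters visible) could look roughly like this. The function name `mask_iban` and the choice to fully mask inputs of four characters or fewer are assumptions for this sketch.

```rust
/// Masks an IBAN so only its last four characters stay readable.
/// Whitespace is removed first so grouped and compact forms mask identically.
fn mask_iban(iban: &str) -> String {
    const VISIBLE: usize = 4;
    let compact: Vec<char> = iban.chars().filter(|c| !c.is_whitespace()).collect();
    if compact.len() <= VISIBLE {
        // Too short to reveal anything safely: mask every character.
        return "*".repeat(compact.len());
    }
    let masked = compact.len() - VISIBLE;
    let tail: String = compact[masked..].iter().collect();
    format!("{}{}", "*".repeat(masked), tail)
}
```

Applying the mask inside the formatter layer, rather than in each command, keeps every table and JSON output path covered by the same rule.
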
### Phase 5: Status and Cache Management
**Objective**: Implement status overview and cache management commands.
**Steps:**
1. Implement `status` command aggregating data from all adapters
2. Add cache inspection and clearing functionality to `transactions cache-status` and `transactions clear-cache`
3. Create status models for sync health metrics
4. Integrate with existing debug logging infrastructure
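
One way the `status` aggregation might work is to fold the per-adapter `AccountStatus` lists into a single overview. Phase 5 is not implemented yet, so everything here (the `StatusOverview` struct, its fields, and `aggregate_status`) is a hypothetical sketch rather than the planned design.

```rust
#[derive(Debug, Clone)]
struct AccountStatus {
    account_id: String,
    /// ISO date (YYYY-MM-DD) of the last successful sync, if any.
    last_sync_date: Option<String>,
    transaction_count: usize,
}

#[derive(Debug, PartialEq)]
struct StatusOverview {
    total_accounts: usize,
    never_synced: usize,
    total_transactions: usize,
    most_recent_sync: Option<String>,
}

/// Combines the status lists returned by each adapter into one summary.
fn aggregate_status(per_adapter: &[Vec<AccountStatus>]) -> StatusOverview {
    let mut overview = StatusOverview {
        total_accounts: 0,
        never_synced: 0,
        total_transactions: 0,
        most_recent_sync: None,
    };
    for status in per_adapter.iter().flatten() {
        overview.total_accounts += 1;
        overview.total_transactions += status.transaction_count;
        match &status.last_sync_date {
            None => overview.never_synced += 1,
            Some(date) => {
                // ISO dates compare correctly as plain strings.
                let is_newer = overview
                    .most_recent_sync
                    .as_ref()
                    .map_or(true, |latest| date > latest);
                if is_newer {
                    overview.most_recent_sync = Some(date.clone());
                }
            }
        }
    }
    overview
}
```
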
**Testing:**
- Unit tests for status aggregation logic
- Integration tests for cache operations
- Mock tests for status data collection
### Phase 6: Sync Logic Updates
**Objective**: Make sync logic adapter-agnostic and reusable.
**Steps:**
1. Modify `core::sync::run_sync()` to accept source/destination traits instead of concrete types
2. Update sync result structures to include inspection data
3. Refactor account processing to work with any `TransactionSource`
4. Ensure dry-run mode works with all adapter types
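
A simplified sketch of an adapter-agnostic `run_sync()` that takes trait objects and honors dry-run. The trait methods shown (`fetch_transactions`, `exists`, `create`), the `SyncResult` fields, and the two in-memory adapters are assumptions for illustration; the real ports are richer and likely async.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Transaction {
    id: String,
    amount_cents: i64,
}

trait TransactionSource {
    fn fetch_transactions(&self) -> Result<Vec<Transaction>, String>;
}

trait TransactionDestination {
    fn exists(&self, transaction_id: &str) -> bool;
    fn create(&mut self, transaction: &Transaction) -> Result<(), String>;
}

#[derive(Debug, Default, PartialEq)]
struct SyncResult {
    /// Transactions created, or that would be created in dry-run mode.
    created: usize,
    skipped_duplicates: usize,
}

/// Depends only on the two ports, so any source/destination pair can be passed in.
fn run_sync(
    source: &dyn TransactionSource,
    destination: &mut dyn TransactionDestination,
    dry_run: bool,
) -> Result<SyncResult, String> {
    let mut result = SyncResult::default();
    for transaction in source.fetch_transactions()? {
        if destination.exists(&transaction.id) {
            result.skipped_duplicates += 1;
            continue;
        }
        // Dry-run still counts the transaction but never writes it.
        if !dry_run {
            destination.create(&transaction)?;
        }
        result.created += 1;
    }
    Ok(result)
}

/// Test double: a source that returns a fixed list.
struct FixedSource(Vec<Transaction>);

impl TransactionSource for FixedSource {
    fn fetch_transactions(&self) -> Result<Vec<Transaction>, String> {
        Ok(self.0.clone())
    }
}

/// Test double: a destination that stores transactions in memory.
#[derive(Default)]
struct MemoryDestination {
    stored: Vec<Transaction>,
}

impl TransactionDestination for MemoryDestination {
    fn exists(&self, transaction_id: &str) -> bool {
        self.stored.iter().any(|t| t.id == transaction_id)
    }

    fn create(&mut self, transaction: &Transaction) -> Result<(), String> {
        self.stored.push(transaction.clone());
        Ok(())
    }
}
```

Because the dry-run check sits in the core function and not in an adapter, every adapter type gets dry-run behavior without implementing anything extra.
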
**Testing:**
- Unit tests for sync logic with mock adapters
- Integration tests with different source/destination combinations
- Regression tests ensuring existing functionality unchanged
### Phase 7: Adapter Factory Implementation
**Objective**: Enable dynamic adapter instantiation for multiple sources/destinations.
**Steps:**
1. Create `core::adapter_factory` module with factory functions
2. Implement source factory supporting "gocardless", "csv", "camt053", "mt940"
3. Implement destination factory supporting "firefly" (extensible for others)
4. Add configuration structs for adapter-specific settings
5. Integrate factory into CLI setup logic
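
The source factory could be sketched as a function that maps the type string from the command line to a boxed trait object. The trait methods, the adapter structs, and the `create_source` signature below are hypothetical; only the four source names come from this plan.

```rust
use std::path::PathBuf;

trait TransactionSource {
    fn kind(&self) -> &'static str;
    fn describe(&self) -> String;
}

/// Stand-in for the API-backed adapter (no client or credentials here).
struct GoCardlessSource;

/// Stand-in for the file-based adapters, which all need a path.
struct FileSource {
    kind: &'static str,
    path: PathBuf,
}

impl TransactionSource for GoCardlessSource {
    fn kind(&self) -> &'static str {
        "gocardless"
    }
    fn describe(&self) -> String {
        "gocardless".to_string()
    }
}

impl TransactionSource for FileSource {
    fn kind(&self) -> &'static str {
        self.kind
    }
    fn describe(&self) -> String {
        format!("{} ({})", self.kind, self.path.display())
    }
}

const SOURCE_KINDS: [&str; 4] = ["gocardless", "csv", "camt053", "mt940"];

/// Builds a source adapter from its type name; file-based kinds require a path.
fn create_source(kind: &str, file_path: Option<&str>) -> Result<Box<dyn TransactionSource>, String> {
    let file_kind: &'static str = match kind {
        "gocardless" => return Ok(Box::new(GoCardlessSource)),
        "csv" => "csv",
        "camt053" => "camt053",
        "mt940" => "mt940",
        other => {
            return Err(format!(
                "unknown source '{other}'; expected one of: {}",
                SOURCE_KINDS.join(", ")
            ))
        }
    };
    let path = file_path.ok_or_else(|| format!("source '{file_kind}' requires a file path"))?;
    Ok(Box::new(FileSource { kind: file_kind, path: PathBuf::from(path) }))
}
```

Returning `Box<dyn TransactionSource>` means the caller never names a concrete adapter type, which is what lets the CLI setup stay unchanged as new sources are added.
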
**Testing:**
- Unit tests for factory functions with valid/invalid inputs
- Mock tests for adapter creation
- Integration tests with real configurations
### Phase 8: Integration and Validation
**Objective**: Ensure all components work together and prepare for web API.
**Steps:**
1. Full integration testing across all source/destination combinations
2. Performance testing with realistic data volumes
3. Documentation updates in `docs/architecture.md`
4. Code review against project guidelines
5. Update `AGENTS.md` with new development patterns
**Testing:**
- End-to-end tests for complete workflows
- Load tests for sync operations
- Security audits for data handling
- Compatibility tests with existing configurations
### Phase 9.5: Command Handler Extraction ✅ COMPLETED
**Objective**: Extract command handling logic from `main.rs` into dedicated modules for better maintainability and separation of concerns.
**Steps:**
1. ✅ Create `commands/` module structure with submodules for each command group
2. ✅ Extract table printing utilities to `cli/tables.rs`
3. ✅ Move command handlers to appropriate modules:
   - `commands/sync.rs`: Sync command logic
   - `commands/accounts/`: Account management (link, list, status)
   - `commands/transactions/`: Transaction operations (list, cache, clear)
   - `commands/list.rs`: Source/destination listing
4. ✅ Update `main.rs` to dispatch to new command modules
5. ✅ Remove extracted functions from `main.rs`, reducing it from 1049 to ~150 lines
6. ✅ Update documentation to reflect new structure
**Implementation Details:**
- Created hierarchical module structure with focused responsibilities
- Maintained all existing functionality and CLI interface
- Improved code organization and testability
- Updated architecture documentation with new module structure
**Testing:**
- All existing tests pass
- CLI functionality preserved
- Code formatting and linting applied
### Phase 10: File-Based Source Adapters
**Objective**: Implement adapters for file-based transaction sources.
**Steps:**
1. Create `adapters::csv` module implementing `TransactionSource`
   - Parse CSV files with configurable column mappings
   - Implement caching similar to GoCardless adapter
   - Add inspection methods for file status and transaction counts
2. Create `adapters::camt053` and `adapters::mt940` modules
   - Parse respective financial file formats
   - Implement transaction mapping and validation
   - Add format-specific caching and inspection
3. Update `adapter_factory` to instantiate file adapters with file paths
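
To make "configurable column mappings" concrete, here is a deliberately naive sketch of the CSV parsing step. It assumes a header row, no quoted fields, and amounts written with `.` as the decimal separator; a real adapter would more likely rely on a dedicated CSV crate. `ColumnMapping`, `CsvTransaction`, `parse_amount_cents`, and `parse_csv` are hypothetical names.

```rust
/// Zero-based column positions plus the field delimiter.
struct ColumnMapping {
    date: usize,
    description: usize,
    amount: usize,
    delimiter: char,
}

#[derive(Debug, PartialEq)]
struct CsvTransaction {
    date: String,
    amount_cents: i64,
    description: String,
}

/// Parses "12.34", "-5", or "0.5" into integer cents without using floats.
fn parse_amount_cents(raw: &str) -> Option<i64> {
    let raw = raw.trim();
    let (negative, digits) = match raw.strip_prefix('-') {
        Some(rest) => (true, rest),
        None => (false, raw),
    };
    let (whole, fraction) = digits.split_once('.').unwrap_or((digits, ""));
    let all_digits = whole.chars().chain(fraction.chars()).all(|c| c.is_ascii_digit());
    if whole.is_empty() || fraction.len() > 2 || !all_digits {
        return None;
    }
    // Right-pad the fraction so "5" means 50 cents and "" means 0 cents.
    let fraction_cents = format!("{fraction:0<2}").parse::<i64>().ok()?;
    let cents = whole.parse::<i64>().ok()?.checked_mul(100)?.checked_add(fraction_cents)?;
    Some(if negative { -cents } else { cents })
}

/// Parses every non-empty line after the header into a transaction.
fn parse_csv(content: &str, mapping: &ColumnMapping) -> Result<Vec<CsvTransaction>, String> {
    let mut transactions = Vec::new();
    for (index, line) in content.lines().enumerate().skip(1) {
        if line.trim().is_empty() {
            continue;
        }
        let line_number = index + 1;
        let fields: Vec<&str> = line.split(mapping.delimiter).map(str::trim).collect();
        let field = |column: usize| {
            fields
                .get(column)
                .copied()
                .ok_or_else(|| format!("line {line_number}: missing column {column}"))
        };
        // The error names the line only, never the raw value, to avoid leaking data.
        let amount_cents = parse_amount_cents(field(mapping.amount)?)
            .ok_or_else(|| format!("line {line_number}: invalid amount"))?;
        transactions.push(CsvTransaction {
            date: field(mapping.date)?.to_string(),
            amount_cents,
            description: field(mapping.description)?.to_string(),
        });
    }
    Ok(transactions)
}
```
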
**Testing:**
- Unit tests for file parsing with sample data
- Mock tests for adapter implementations
- Integration tests with fixture files from `tests/fixtures/`
- Performance tests for large file handling
## Architecture Considerations
- **Hexagonal Architecture**: Maintain separation between core business logic, ports, and adapters
- **Error Handling**: Use `thiserror` for domain errors, `anyhow` for application errors
- **Async Programming**: Leverage `tokio` for concurrent operations where beneficial
- **Testing Strategy**: Combine unit tests, integration tests, and mocks using `mockall`
- **Dependencies**: Add new crates only if necessary, preferring workspace dependencies
- **Code Organization**: Keep modules focused and single-responsibility
- **Performance**: Implement caching and batching for file-based sources
## Security and Compliance Notes
- **Financial Data Masking**: Never expose amounts, IBANs, or personal data in logs/outputs
- **Input Validation**: Validate all external data before processing
- **Error Messages**: Avoid sensitive information in error responses
- **Audit Trail**: Maintain structured logging for operations
- **Compliance**: Ensure GDPR/privacy compliance for financial data handling
## Success Criteria
- All existing sync functionality preserved
- New commands work with all supported sources/destinations
- Core logic remains adapter-agnostic
- Comprehensive test coverage maintained
- Performance meets or exceeds current benchmarks
- Architecture supports future web API development