DHCP Recovery Handler
The DHCP Recovery Handler provides persistence and recovery for the DHCP Client Manager. It ensures that DHCP client state and lease information survive system restarts, crashes, and other service interruptions. To do this, it monitors client processes, stores lease data on the filesystem, and restores operational state on startup.
Architecture
The recovery handler implements a multi-layered persistence and monitoring architecture:
┌─────────────────────┐     ┌─────────────────────┐     ┌─────────────────────┐
│ Process Monitor     │────►│ Recovery Handler    │────►│ Persistence Layer   │
│                     │     │                     │     │                     │
│ • PID Tracking      │     │ • State Management  │     │ • File Storage      │
│ • Health Checks     │     │ • Recovery Logic    │     │ • Data Integrity    │
│ • Event Detection   │     │ • Restart Control   │     │ • Atomic Operations │
└─────────────────────┘     └─────────────────────┘     └─────────────────────┘
                                       │
                                       ▼
┌─────────────────────┐     ┌─────────────────────┐     ┌─────────────────────┐
│ System Startup      │◄────│ Recovery Process    │────►│ Client Restoration  │
│                     │     │                     │     │                     │
│ • Service Init      │     │ • Data Validation   │     │ • State Restoration │
│ • File System Check │     │ • Lease Recovery    │     │ • Process Restart   │
│ • Resource Alloc    │     │ • Configuration     │     │ • TR-181 Update     │
└─────────────────────┘     └─────────────────────┘     └─────────────────────┘
Key Components
Files
- Source: source/DHCPMgrUtils/dhcpmgr_recovery_handler.c
- Header: source/DHCPMgrUtils/dhcpmgr_recovery_handler.h
Core Functions
DhcpMgr_Dhcp_Recovery_Start()
- Purpose: Initializes and starts the DHCP recovery system
- Returns: 0 on success, negative error code on failure
- Initialization: Sets up process monitoring and loads existing lease data
DHCPMgr_loadDhcpLeases()
- Purpose: Loads previously stored DHCP lease information during system startup
- Scope: Processes both DHCPv4 and DHCPv6 lease files
- Recovery: Restores TR-181 data model and client state
Persistence System
Storage Architecture
Directory Structure
/tmp/Dhcp_manager/
├── dhcpLease_1_v4    # DHCPv4 client instance 1
├── dhcpLease_1_v6    # DHCPv6 client instance 1
├── dhcpLease_2_v4    # DHCPv4 client instance 2
└── dhcpLease_2_v6    # DHCPv6 client instance 2
File Naming Convention
- Format: dhcpLease_{instanceNumber}_{version}
- Instance Number: TR-181 client instance identifier
- Version: v4 for DHCPv4, v6 for DHCPv6
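The naming convention can be illustrated with a small path builder. This is a minimal sketch: `build_lease_path` and its `is_v6` flag are hypothetical names introduced here, not functions from the handler's source; only the directory and file name pattern come from this document.

```c
#include <stdio.h>

#define TMP_DIR_PATH "/tmp/Dhcp_manager"

// Hypothetical helper: builds "<TMP_DIR_PATH>/dhcpLease_<instance>_<v4|v6>".
// Returns 0 on success, -1 if the buffer is too small.
static int build_lease_path(char *buf, size_t len, unsigned long instance, int is_v6)
{
    int n = snprintf(buf, len, "%s/dhcpLease_%lu_%s",
                     TMP_DIR_PATH, instance, is_v6 ? "v6" : "v4");
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Checking the `snprintf` return value against the buffer length catches truncation, which would otherwise silently point the caller at the wrong file.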
DHCPv4 Lease Storage
DHCPMgr_storeDhcpv4Lease(PCOSA_DML_DHCPC_FULL data)
Purpose: Persists DHCPv4 lease information to filesystem
Storage Process:
- Directory Creation: Ensure storage directory exists
- File Creation: Create instance-specific lease file
- Data Writing: Write complete client structure
- Lease Storage: Append current lease information
- Error Handling: Cleanup on write failures
Data Structure Storage:
// Main client structure
fwrite(data, sizeof(COSA_DML_DHCPC_FULL), 1, file);

// Current lease information (if available)
if (data->currentLease != NULL) {
    fwrite(data->currentLease, sizeof(DHCPv4_PLUGIN_MSG), 1, file);
}
Stored Information:
- Client Configuration: TR-181 client settings
- Operational State: Current client status
- Lease Data: Active lease information
- Timing Information: Lease acquisition and expiration times
- Network Configuration: IP address, gateway, DNS settings
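The storage process above, including cleanup on write failure, could look roughly like the following. This is a sketch under stated assumptions: `client_t` and `lease_msg_t` are simplified stand-ins for `COSA_DML_DHCPC_FULL` and `DHCPv4_PLUGIN_MSG`, and `store_lease` is a hypothetical name, not the actual `DHCPMgr_storeDhcpv4Lease()` implementation.

```c
#include <stdio.h>

// Stand-in types for illustration only; the real COSA_DML_DHCPC_FULL and
// DHCPv4_PLUGIN_MSG definitions live in the DHCP Manager headers.
typedef struct { char address[16]; unsigned int leaseTime; } lease_msg_t;
typedef struct { unsigned long instanceNumber; lease_msg_t *currentLease; } client_t;

// Writes the client record, then the lease record if one exists.
// On any failure the partial file is removed. Returns 0 or -1.
static int store_lease(const char *path, const client_t *client)
{
    FILE *file = fopen(path, "wb");
    if (file == NULL)
        return -1;

    int ok = fwrite(client, sizeof(*client), 1, file) == 1;
    if (ok && client->currentLease != NULL)
        ok = fwrite(client->currentLease, sizeof(lease_msg_t), 1, file) == 1;

    if (fclose(file) != 0)   // fclose flushes buffered data; failure means data was lost
        ok = 0;
    if (!ok)
        remove(path);        // cleanup on write failure
    return ok ? 0 : -1;
}
```

Because the client record is written as raw bytes, the `currentLease` pointer value ends up in the file as well. That value is meaningless in a later process, so a loader has to reset it rather than dereference it.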
DHCPv6 Lease Storage
DHCPMgr_storeDhcpv6Lease(PCOSA_DML_DHCPCV6_FULL data)
Purpose: Persists DHCPv6 lease information to filesystem
IPv6-Specific Storage:
- Client Structure: Complete DHCPv6 client configuration
- IANA Information: Non-temporary address data
- IAPD Information: Prefix delegation data
- Timing Parameters: T1, T2, preferred/valid lifetimes
- Server Information: DHCPv6 server details
Storage Format:
// Main client structure
fwrite(data, sizeof(COSA_DML_DHCPCV6_FULL), 1, file);

// Current lease information (if available)
if (data->currentLease != NULL) {
    fwrite(data->currentLease, sizeof(DHCPv6_PLUGIN_MSG), 1, file);
}
Recovery Process
System Startup Recovery
DHCPMgr_loadDhcpLeases()
Recovery Flow:
- Directory Scan: Check for existing lease files
- File Validation: Verify file integrity and format
- Data Loading: Read client and lease structures
- State Restoration: Restore TR-181 data model
- Client Restart: Restart DHCP clients with recovered state
Error Handling:
- Corrupted Files: Skip corrupted lease files
- Missing Data: Use default configurations
- Permission Issues: Log errors and continue
- Memory Allocation: Graceful degradation
DHCPv4 Recovery
load_v4dhcp_leases()
Recovery Steps:
- Client Enumeration: Iterate through TR-181 DHCPv4 clients
- File Location: Locate corresponding lease files
- Data Validation: Verify lease data integrity
- Structure Restoration: Rebuild client structures
- TR-181 Update: Update data model with recovered information
Validation Checks:
// File size validation
if (fileSize < sizeof(COSA_DML_DHCPC_FULL)) {
    DHCPMGR_LOG_ERROR("Invalid file size for client %lu", instanceNum);
    continue;
}

// Data integrity check
if (fread(&tempClient, sizeof(COSA_DML_DHCPC_FULL), 1, file) != 1) {
    DHCPMGR_LOG_ERROR("Failed to read client data");
    continue;
}
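Putting the size check and the read check together, a loader for a single lease file might look like this. It is a sketch, not the actual `load_v4dhcp_leases()` code: `client_t`, `lease_msg_t`, and `load_lease` are hypothetical stand-ins, and the decision to detect a trailing lease record by file size is an assumption made for the example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

// Stand-in types for illustration only.
typedef struct { char address[16]; unsigned int leaseTime; } lease_msg_t;
typedef struct { unsigned long instanceNumber; lease_msg_t *currentLease; } client_t;

// Loads one lease file. Returns 0 on success, -1 if the file is missing,
// too short, or unreadable. On success, out->currentLease is either NULL or
// a malloc'd lease that the caller must free.
static int load_lease(const char *path, client_t *out)
{
    struct stat st;
    if (stat(path, &st) != 0 || (size_t)st.st_size < sizeof(*out))
        return -1;                        // missing or truncated file

    FILE *file = fopen(path, "rb");
    if (file == NULL)
        return -1;

    int rc = -1;
    if (fread(out, sizeof(*out), 1, file) == 1) {
        out->currentLease = NULL;         // the pointer value saved in the file is stale
        rc = 0;
        // A lease record follows only if the file is large enough to hold one.
        if ((size_t)st.st_size >= sizeof(*out) + sizeof(lease_msg_t)) {
            lease_msg_t *lease = malloc(sizeof(*lease));
            if (lease != NULL && fread(lease, sizeof(*lease), 1, file) == 1) {
                out->currentLease = lease;
            } else {
                free(lease);
                rc = -1;
            }
        }
    }
    fclose(file);
    return rc;
}
```

A caller iterating over clients would treat a return of -1 the same way the excerpts above use `continue`: log the problem and move on to the next instance.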
DHCPv6 Recovery
load_v6dhcp_leases()
IPv6-Specific Recovery:
- Client Iteration: Process all DHCPv6 client instances
- File Processing: Load DHCPv6 lease files
- IANA/IAPD Recovery: Restore address and prefix information
- Timer Restoration: Recover lease timing parameters
- Configuration Apply: Apply recovered IPv6 configuration
IPv6 Validation:
- Address Format: Validate IPv6 address formats
- Prefix Length: Verify prefix length validity
- Timer Values: Check timer consistency
- Server Information: Validate server data
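The first three checks can be sketched with standard calls. The function names below are hypothetical, and the timer rules are an assumption based on RFC 8415 (T1 greater than T2 is invalid when both are non-zero; a preferred lifetime greater than the valid lifetime is invalid), not a description of what `load_v6dhcp_leases()` enforces.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <sys/socket.h>

// Returns 1 if addr parses as a textual IPv6 address, else 0.
static int ipv6_address_ok(const char *addr)
{
    struct in6_addr parsed;
    return inet_pton(AF_INET6, addr, &parsed) == 1;
}

// Returns 1 if len is a possible IPv6 prefix length (0..128).
static int prefix_length_ok(int len)
{
    return len >= 0 && len <= 128;
}

// Returns 1 if the timers are mutually consistent.
// A T1 or T2 of 0 leaves the timing to the client, so only non-zero pairs are compared.
static int timers_ok(uint32_t t1, uint32_t t2, uint32_t preferred, uint32_t valid)
{
    int t1_t2_ok = (t1 == 0 || t2 == 0 || t1 <= t2);
    return t1_t2_ok && preferred <= valid;
}
```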
Process Monitoring
PID Tracking System
Process Registration
int pids[MAX_PIDS];   // Array of monitored PIDs
int pid_count = 0;    // Number of active PIDs
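Registration into this fixed-size table needs a bounds check against MAX_PIDS. The sketch below assumes a helper named `register_pid`, which is hypothetical; the document does not show how the handler actually adds entries.

```c
#define MAX_PIDS 20

static int pids[MAX_PIDS];
static int pid_count = 0;

// Hypothetical helper: adds pid to the monitored set.
// Returns 0 if the pid is tracked afterwards, -1 if it is invalid or the table is full.
static int register_pid(int pid)
{
    if (pid <= 0)
        return -1;
    for (int i = 0; i < pid_count; i++) {
        if (pids[i] == pid)
            return 0;          // already tracked, nothing to do
    }
    if (pid_count >= MAX_PIDS)
        return -1;             // table full
    pids[pid_count++] = pid;
    return 0;
}
```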
dhcp_pid_mon(void *args)
Monitoring Process:
- PID Collection: Gather DHCP client process IDs
- Health Monitoring: Continuously check process status
- Failure Detection: Detect abnormal process termination
- Recovery Trigger: Initiate recovery on process failure
- Cleanup: Remove terminated processes from monitoring
Process Status Check:
for (int i = 0; i < pid_count; i++) {
    if (pids[i] > 0) {
        int status;
        // WNOHANG: return immediately if the process is still running
        pid_t result = waitpid(pids[i], &status, WNOHANG);
        if (result == pids[i]) {
            // Process terminated
            DHCPMGR_LOG_INFO("Process %d terminated", pids[i]);
            processKilled(pids[i]);
            pids[i] = -1;   // mark the slot as unused
            active_pids--;
        }
    }
}
Process Recovery
processKilled(pid_t pid)
Recovery Actions:
- Process Identification: Determine which client process died
- State Assessment: Evaluate impact of process termination
- Cleanup: Clean up resources associated with dead process
- Restart Decision: Determine if restart is appropriate
- Recovery Execution: Restart client with recovered state
Data Integrity
File System Operations
Atomic Operations
- Temporary Files: Write new data to a temporary file before replacing the existing lease file
- Backup Strategy: Maintain backup copies during updates
- Rollback Capability: Restore from backup on failure
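One common way to get this behavior on POSIX systems is to write to a temporary file and then `rename()` it over the target. The sketch below shows that technique; `write_file_atomic` and the `.tmp` suffix are assumptions for illustration, and the document does not confirm that the handler uses this exact approach.

```c
#include <stdio.h>

// Writes len bytes to "<path>.tmp", then renames it over path.
// rename() replaces the target atomically on POSIX filesystems, so a reader
// sees either the old file or the new one, never a partial write.
// Returns 0 on success, -1 on failure (the old file is left untouched).
static int write_file_atomic(const char *path, const void *data, size_t len)
{
    char tmp[512];
    int n = snprintf(tmp, sizeof(tmp), "%s.tmp", path);
    if (n < 0 || (size_t)n >= sizeof(tmp))
        return -1;

    FILE *file = fopen(tmp, "wb");
    if (file == NULL)
        return -1;

    int ok = fwrite(data, 1, len, file) == len;
    if (fclose(file) != 0)
        ok = 0;
    if (ok && rename(tmp, path) != 0)
        ok = 0;
    if (!ok)
        remove(tmp);           // leave no stray temporary file behind
    return ok ? 0 : -1;
}
```

With this pattern the previous lease file acts as the backup: it stays in place until the new contents are fully written. The sketch does not call `fsync()`, so it protects against a crash of the writing process but not necessarily against power loss.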
Create_Dir_ifnEx(const char *path)
Directory Management:
if (access(path, F_OK) == -1) {
    if (mkdir(path, 0755) == -1) {
        DHCPMGR_LOG_ERROR("Failed to create directory %s", path);
        return EXIT_FAIL;
    }
}
Data Validation
File Integrity Checks
- Size Validation: Verify file size matches expected structure size
- Magic Numbers: Use file headers for format validation
- Checksum Verification: Validate data integrity
- Version Compatibility: Check file format versions
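Magic number, version, and checksum checks are usually carried in a small header placed in front of the payload. The layout below is purely illustrative: `lease_header_t`, `LEASE_MAGIC`, `LEASE_VERSION`, and the hash function are hypothetical, and the storage excerpts in this document write the raw structures without such a header.

```c
#include <stddef.h>
#include <stdint.h>

#define LEASE_MAGIC   0x44484350u   // "DHCP" in ASCII; hypothetical value
#define LEASE_VERSION 1u            // hypothetical format version

typedef struct {
    uint32_t magic;        // identifies the file type
    uint32_t version;      // file format version
    uint32_t payload_len;  // number of payload bytes that follow
    uint32_t checksum;     // hash of the payload bytes
} lease_header_t;

// Simple multiplicative hash, chosen for brevity; a CRC32 would detect more errors.
static uint32_t payload_checksum(const unsigned char *data, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = sum * 31u + data[i];
    return sum;
}

// Returns 1 if the header matches the payload, else 0.
static int header_ok(const lease_header_t *h, const unsigned char *payload, size_t len)
{
    return h->magic == LEASE_MAGIC
        && h->version == LEASE_VERSION
        && h->payload_len == len
        && h->checksum == payload_checksum(payload, len);
}
```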
Structure Validation
- Pointer Validation: Ensure pointers are valid or NULL
- Range Checking: Validate numeric values within expected ranges
- String Validation: Check string lengths and null termination
- Consistency Checks: Verify related fields are consistent
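Two of these checks are easy to express as small predicates. The helper names are hypothetical; the sketch only shows the technique of bounding a string scan with `memchr()` and range-checking a numeric field after loading it from disk.

```c
#include <string.h>

// Returns 1 if buf contains a NUL terminator within its cap bytes, else 0.
// Safer than strlen() on data read from disk, which may not be terminated.
static int string_terminated(const char *buf, size_t cap)
{
    return memchr(buf, '\0', cap) != NULL;
}

// Returns 1 if lo <= value <= hi, else 0.
static int in_range(unsigned long value, unsigned long lo, unsigned long hi)
{
    return value >= lo && value <= hi;
}
```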
Error Handling and Resilience
Recovery Strategies
Graceful Degradation
- Partial Recovery: Recover what data is available
- Default Configuration: Use defaults for missing data
- Continue Operation: Don’t let recovery failures stop service
- Error Reporting: Log issues for later investigation
Failure Scenarios
Corrupted Lease Files
- Detection: File size or format validation fails
- Action: Skip corrupted file, use default configuration
- Logging: Record corruption for investigation
Missing Files
- Detection: Expected lease file not found
- Action: Start with clean client configuration
- Logging: Note the missing file (this can be normal, for example when a client has never obtained a lease)
Permission Issues
- Detection: File system access denied
- Action: Continue without persistence
- Logging: Record permission issues
Memory Allocation Failures
- Detection: malloc() returns NULL
- Action: Skip affected client, continue with others
- Logging: Record memory pressure
Cleanup and Resource Management
remove_dhcp_lease_file(int instanceNumber, int dhcpVersion)
Purpose: Removes lease files when clients are deleted
Parameters:
- instanceNumber: TR-181 client instance number
- dhcpVersion: DHCP_v4 (0) or DHCP_v6 (1)
Cleanup Process:
- File Path Construction: Build path to lease file
- Existence Check: Verify file exists before deletion
- File Removal: Delete lease file from filesystem
- Error Handling: Log but don’t fail on deletion errors
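A possible shape for this cleanup is sketched below. It is not the real `remove_dhcp_lease_file()`: the function name, return type, and `perror()` logging are assumptions, and the sketch swaps the separate existence check for an `unlink()` call that tolerates `ENOENT`, which avoids a race between checking and deleting.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

#define TMP_DIR_PATH "/tmp/Dhcp_manager"

enum { DHCP_v4 = 0, DHCP_v6 = 1 };

// Sketch only. Returns 0 if the lease file is gone afterwards, -1 otherwise.
static int remove_lease_file_sketch(int instanceNumber, int dhcpVersion)
{
    char path[128];
    int n = snprintf(path, sizeof(path), "%s/dhcpLease_%d_%s", TMP_DIR_PATH,
                     instanceNumber, dhcpVersion == DHCP_v6 ? "v6" : "v4");
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    if (unlink(path) == 0 || errno == ENOENT)
        return 0;              // deleted, or there was nothing to delete

    perror(path);              // log the error, but let the caller carry on
    return -1;
}
```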
Configuration and Customization
Storage Configuration
Directory Settings
#define TMP_DIR_PATH "/tmp/Dhcp_manager"
File Permissions
- Directory: 0755 (readable/executable by all, writable by owner)
- Files: Determined by the process umask (typically 0644)
Size Limits
#define MAX_PIDS 20 // Maximum monitored processes
Recovery Behavior
Startup Behavior
- Auto-Recovery: Automatically load existing lease data
- Validation: Verify data integrity before use
- Fallback: Use defaults if recovery fails
Runtime Behavior
- Continuous Monitoring: Monitor process health
- Immediate Storage: Store lease changes immediately
- Periodic Cleanup: Remove stale lease files
Debugging and Troubleshooting
Debug Features
Comprehensive Logging
- Recovery Events: Log all recovery operations
- File Operations: Log file system operations
- Process Events: Log process monitoring events
- Error Conditions: Detailed error reporting
Common Issues
Recovery Failures
- Corrupted Files: Check file integrity and format
- Permission Denied: Verify file system permissions
- Memory Issues: Check available memory
- Process Issues: Verify process monitoring
Performance Issues
- Slow Recovery: Check file system performance
- High CPU Usage: Check the overhead of the process monitoring loop
- Memory Leaks: Check resource cleanup
- File System Full: Monitor disk space usage
Diagnostic Tools
File Inspection
- File Size Check: Verify lease file sizes
- Content Validation: Manually inspect file contents
- Timestamp Analysis: Check file modification times
Process Monitoring
- PID Tracking: Monitor active process IDs
- Status Checking: Verify process health
- Resource Usage: Monitor process resource consumption

