Legacy String Parser Removal - Complete

Date: 2025-07-29 14:15

Summary

Successfully completed the removal of legacy string parser code from the XMD project. The system now exclusively uses the AST-based parser for all XMD processing.

Changes Made

1. Updated Test Files

  • Modified all test files to use AST functions instead of legacy string parser functions
  • Updated function declarations and imports in:
    • test/xmd_processor/*.c
    • test/integration/*.c
    • test/stress/*.c

2. Removed Legacy Parser Files (26 directories)

Deleted the following legacy string parser implementations:

Primary String Parsers:

  • process_xmd_content/ - Main legacy string parser
  • process_xmd_content_enhanced/ - Enhanced string parser (hybrid)
  • process_xmd_content_fixed/ - Fixed string parser variant

Directive Processors:

  • process_directive/ - Single directive processor
  • process_multiline_directive/ - Multiline directive handler
  • process_multiline_directive_enhanced/ - Enhanced multiline handler
  • process_multiline_block/ - Block processor

Utility Functions:

  • parse_array_literal/ - String-based array parser
  • substitute_variables/ - String-based variable substitution
  • process_xmd_directive/ - Directive wrapper

Individual Directive Handlers:

  • process_exec/, process_set/, process_if/, process_elif/
  • process_else/, process_endif/, process_for/
  • process_import/, process_loop_body/
  • find_matching_endfor/

Function Processors:

  • process_cmd_function/ - cmd() function handler
  • process_print_function/ - print() function handler
  • process_script_block/ - Script block handler

Dead Code:

  • parse_range/, parse_collection/ - Unused parsers
  • preprocess_variables/ - Unused preprocessor

3. Files Retained

The following files were kept as they are still actively used:

  • trim_whitespace/ - Shared utility
  • should_execute_block/ - Conditional execution utility
  • xmd_validate_syntax/ - Used by CLI
  • process_text_with_directives/ - Uses AST internally
  • xmd_process_string/ - Main API that calls AST

Results

  • Build succeeds with all legacy parsers removed
  • The project is now ~26 directories lighter
  • All production code paths use the AST parser
  • Clear separation between AST implementation and utilities

Known Issues

  1. AST Parser elif/else Bug: The AST parser has a bug where it outputs content from both the matching elif branch AND the else branch. This needs to be fixed in ast_process_xmd_content.c.

  2. Test Failures: Some tests fail due to the elif/else bug, not due to the migration itself.

Benefits

  1. Cleaner Codebase: Removed confusing hybrid implementations
  2. Single Parser: Only one parsing implementation to maintain
  3. Clear Architecture: No more confusion about which parser is being used
  4. Reduced Complexity: Eliminated self-referential legacy code

Next Steps

  1. Fix the AST parser elif/else bug
  2. Update CMakeLists.txt if needed (though GLOB_RECURSE handles it automatically)
  3. Verify all tests pass after bug fix