Releases: ctrlbreak/filematcher
v1.5.2
What's Changed
Fixed
- Auto-confirm display bug — when pressing 'a' to process all remaining groups, output was corrupted due to incorrect cursor positioning
Changed
- Simplified README — reduced from 571 to 214 lines, extracted JSON schema to separate file
- Improved documentation - clearer examples using master_dir/other_dir naming, expanded jq examples
Added
- Animated demo GIF — shows interactive hardlink execution with colored output
- Demo regeneration script - create_demo.sh for updating the GIF
- JSON schema reference - JSON_SCHEMA.md with full schema documentation
v1.5.1 - Code Quality Improvements
What's Changed
This patch release documents internal code quality improvements made since v1.5.0.
Code Quality Improvements
- Reduced public API surface from 89 to 18 exports in __init__.py - cleaner, more focused public interface
- Fixed exit code inconsistency - partial failures now correctly return exit code 2 (was inconsistent)
- Reduced main() complexity from 425 to 145 lines through extraction of 6 helper functions:
  - _validate_args() - argument validation
  - _setup_logging() - audit log initialization
  - _build_master_results() - master directory result assembly
  - _dispatch_compare_mode() - compare mode handling
  - _dispatch_execute_mode() - execute mode handling
  - _dispatch_preview_mode() - preview mode handling
Internal Changes
These changes improve maintainability without affecting user-facing behavior:
- Cleaner separation of concerns in CLI module
- Better testability through smaller functions
- Consistent exit codes across all code paths
Full Changelog: https://github.com/PatrickKoss/filematcher/compare/v1.5.0...v1.5.1
v1.5.0 - Interactive Execute
What's New
Interactive Execute Mode - Execute mode now prompts for confirmation on each file, giving you full control over which duplicates are processed.
Features
- Per-file confirmation (y/n/a/q) - Review each duplicate before action
  - y - Yes, process this file
  - n - No, skip this file
  - a - All, process remaining without prompts
  - q - Quit, stop and show summary
- JSON schema v2.0 - Restructured output with unified header object
  - Version, timestamp, mode, action, hash algorithm in header
  - Consistent master/duplicate directory naming
- Fail-fast validation - Invalid flag combinations caught before file scanning
- Enhanced summaries - Shows confirmed/skipped/failed counts with space savings
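The per-file prompt described above can be sketched as a small loop. This is a minimal illustration, not File Matcher's actual code: the function name confirm_each and the ask callback are hypothetical, and treating unrecognized answers as "skip" is an assumption of this sketch.

```python
from typing import Callable, Iterable, List, Tuple


def confirm_each(
    paths: Iterable[str],
    ask: Callable[[str], str],
) -> Tuple[List[str], List[str], bool]:
    """Return (confirmed, skipped, user_quit) for a y/n/a/q prompt loop.

    `ask` is a hypothetical callback that receives a path and returns the
    user's answer; in a real CLI it could wrap input().
    """
    confirmed: List[str] = []
    skipped: List[str] = []
    process_all = False

    for path in paths:
        if process_all:
            # A previous "a" answer accepted everything that remains.
            confirmed.append(path)
            continue

        answer = ask(path).strip().lower()
        if answer == "y":
            confirmed.append(path)
        elif answer == "a":
            # "All": accept this file and every remaining one without prompting.
            process_all = True
            confirmed.append(path)
        elif answer == "q":
            # "Quit": stop immediately; the caller can still print a summary.
            return confirmed, skipped, True
        else:
            # "n" (and, in this sketch, any unrecognized answer) skips the file.
            skipped.append(path)

    return confirmed, skipped, False
```

Passing the prompt in as a callback keeps the loop testable without a terminal; the returned lists are enough to build a confirmed/skipped summary.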
Breaking Changes
- JSON output schema changed to v2.0 (header is now an object)
- --json --execute now requires --yes flag
- --quiet --execute now requires --yes flag
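A fail-fast check for these flag combinations might look like the sketch below. The parser here is a stand-in that declares only the four flags named in this section; the function names and error messages are hypothetical and do not come from File Matcher's source.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    # Stand-in parser: only the flags mentioned in the release notes.
    parser = argparse.ArgumentParser(prog="filematcher-sketch")
    parser.add_argument("--json", action="store_true")
    parser.add_argument("--quiet", action="store_true")
    parser.add_argument("--execute", action="store_true")
    parser.add_argument("--yes", action="store_true")
    return parser


def validation_error(args: argparse.Namespace) -> Optional[str]:
    """Return an error message for an invalid flag combination, else None."""
    if args.execute and not args.yes:
        if args.json:
            return "--json --execute requires --yes"
        if args.quiet:
            return "--quiet --execute requires --yes"
    return None


def check(argv: List[str]) -> Optional[str]:
    """Parse argv and validate it before any file scanning would start."""
    return validation_error(build_parser().parse_args(argv))
```

Running the check immediately after argument parsing means an invalid invocation is rejected before any directory is read.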
Exit Codes
- 0 - Success
- 1 - Error
- 2 - Partial failure (some operations failed)
- 130 - User quit (Ctrl+C or q)
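For scripts that wrap the tool, the v1.5.0 exit codes can be mirrored in a small enum. The ExitCode class and describe helper are illustrative names, not part of the filematcher package.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    """Exit codes as listed for v1.5.0 (names are this sketch's own)."""

    SUCCESS = 0
    ERROR = 1
    PARTIAL_FAILURE = 2
    USER_QUIT = 130


def describe(code: int) -> str:
    """Translate a process return code into a readable label."""
    try:
        return ExitCode(code).name.replace("_", " ").lower()
    except ValueError:
        return f"unknown exit code {code}"
```

A wrapper could pass the returncode attribute from subprocess.run() to describe() to decide whether to retry, alert, or continue.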
Full Changelog
https://github.com/pchaganti/filematcher/compare/v1.4.0...v1.5.0
v1.4.0
File Matcher v1.4.0 Release Notes
Release Date: 2026-01-28
Overview
File Matcher v1.4.0 refactors the codebase from a single-file implementation to a proper Python package structure. This improves code navigation, IDE support, and maintainability while maintaining full backward compatibility.
What's New
Package Structure
The codebase has been reorganized into a filematcher/ package with focused modules:
filematcher/
├── __init__.py # Package exports
├── __main__.py # python -m filematcher support
├── cli.py # Command-line interface and main()
├── colors.py # TTY-aware color output
├── hashing.py # MD5/SHA-256 content hashing
├── filesystem.py # Filesystem helpers (hardlink detection, etc.)
├── actions.py # Action execution and audit logging
├── formatters.py # Text and JSON output formatters
└── directory.py # Directory indexing and matching
Installation Options
# Install via pip (recommended)
pip install .
filematcher dir1 dir2
# Or run directly (still works)
python file_matcher.py dir1 dir2
Full Backward Compatibility
All existing usage patterns continue to work:
# CLI usage unchanged
python file_matcher.py dir1 dir2
filematcher dir1 dir2
# Imports still work
from file_matcher import get_file_hash, find_matching_files
from filematcher import get_file_hash, find_matching_files # New preferred style
Technical Details
- 7 modules in filematcher/ package
- file_matcher.py is now a thin wrapper (re-exports from package)
- 218 unit tests (217 original + 1 circular import verification test)
- No circular imports (verified via subprocess test)
- Pure Python standard library (no external dependencies)
- Python 3.9+ required
Module Organization
| Module | Purpose |
|---|---|
| cli.py | Argument parsing, main() entry point, logging setup |
| colors.py | ColorConfig, ColorMode, ANSI helpers (green, red, etc.) |
| hashing.py | get_file_hash(), get_sparse_hash() for content hashing |
| filesystem.py | is_hardlink_to(), is_symlink_to(), check_cross_filesystem() |
| actions.py | execute_action(), safe_replace_with_link(), audit logging |
| formatters.py | ActionFormatter ABC, TextActionFormatter, JsonActionFormatter |
| directory.py | find_matching_files(), index_directory() |
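As an illustration of what a hardlink check such as is_hardlink_to() could involve, the sketch below compares device and inode numbers. The name same_inode is hypothetical, and this is an assumption about the technique, not the package's actual implementation.

```python
import os


def same_inode(path_a: str, path_b: str) -> bool:
    """True when both paths are directory entries for the same file.

    os.lstat is used so that a symbolic link is compared by its own inode
    and is not reported as a hard link to its target.
    """
    stat_a = os.lstat(path_a)
    stat_b = os.lstat(path_b)
    return (stat_a.st_dev, stat_a.st_ino) == (stat_b.st_dev, stat_b.st_ino)
```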
Upgrading from v1.3.0
v1.4.0 is fully backward compatible. No changes required to existing scripts or workflows.
Optional migration: Update imports from file_matcher to filematcher:
# Old (still works)
from file_matcher import get_file_hash, find_matching_files
# New (preferred)
from filematcher import get_file_hash, find_matching_files
Why Package Structure?
- Better IDE Support: Jump-to-definition works across modules
- Easier Navigation: Find code by module name instead of scrolling 2000+ lines
- AI Tooling: LLMs can read focused modules instead of entire codebase
- Maintainability: Changes to one module don't affect others
- Testing: Easier to test individual modules in isolation
v1.1.0 Deduplication
File Matcher v1.1.0 Release Notes
Release Date: 2026-01-20
Overview
File Matcher v1.1.0 adds full file deduplication capability. You can now replace duplicate files with hard links, symbolic links, or delete them entirely—all with preview-by-default safety and comprehensive audit logging.
New Features
Master Directory Protection
Designate one directory as "master" using --master. Files in the master directory are never modified or deleted.
filematcher dir1 dir2 --master dir1 --action hardlink
Deduplication Actions
Three ways to handle duplicates:
- hardlink — Replace duplicate with hard link to master (same inode, saves space)
- symlink — Replace duplicate with symbolic link to master
- delete — Remove duplicate file entirely (irreversible)
Preview-by-Default Safety
Running with --action shows a preview of what would happen. No files are modified until you add --execute.
# Preview only (safe)
filematcher dir1 dir2 --master dir1 --action hardlink
# Actually execute
filematcher dir1 dir2 --master dir1 --action hardlink --execute
Confirmation Prompt
Before execution, you'll see a confirmation prompt. Use --yes to skip for scripted use.
Audit Logging
Every modification is logged with timestamps. Use --log for a custom log path.
filematcher dir1 dir2 --master dir1 --action hardlink --execute --log changes.log
Cross-Filesystem Support
Hard links can't span filesystems. Use --fallback-symlink to automatically use symbolic links when hard links fail.
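The fallback behavior can be sketched with the standard library as follows. The function name link_with_fallback and the injectable hardlink parameter are hypothetical, added only so the fallback path can be exercised without a second filesystem. Catching every OSError, not only the cross-device error, is an assumption of this sketch.

```python
import os
from typing import Callable


def link_with_fallback(
    master: str,
    target: str,
    hardlink: Callable[[str, str], None] = os.link,
) -> str:
    """Create `target` as a link to `master`; return "hardlink" or "symlink"."""
    try:
        hardlink(master, target)
        return "hardlink"
    except OSError:
        # Includes EXDEV, the error raised for links across filesystems.
        # An absolute target keeps the symlink valid from any directory.
        os.symlink(os.path.abspath(master), target)
        return "symlink"
```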
New Command-Line Options
| Option | Description |
|---|---|
| --master, -m | Master directory (files never modified) |
| --action, -a | Action: hardlink, symlink, or delete |
| --execute | Execute changes (default: preview only) |
| --yes, -y | Skip confirmation prompt |
| --log, -l | Custom audit log path |
| --fallback-symlink | Use symlink if hardlink fails |
Exit Codes
| Code | Meaning |
|---|---|
| 0 | Success (or user aborted) |
| 1 | All operations failed |
| 2 | Invalid arguments |
| 3 | Partial success (some failed) |
Example Workflow
# 1. Find duplicates
filematcher photos_backup photos_main
# 2. Preview deduplication
filematcher photos_backup photos_main --master photos_main --action hardlink
# 3. Execute with logging
filematcher photos_backup photos_main --master photos_main --action hardlink --execute --log dedup.log
Technical Details
- Atomic operations using temp-rename pattern (no data loss on failure)
- 1,374 lines of Python
- 114 unit tests, all passing
- Pure Python standard library (no external dependencies)
- Python 3.9+ required
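The temp-rename pattern mentioned above can be illustrated as follows. This is a minimal sketch under stated assumptions, not the project's safe_replace_with_link(): the function name replace_with_hardlink and the ".link-" prefix are made up for the example, and both paths are assumed to be on the same filesystem.

```python
import os
import tempfile


def replace_with_hardlink(master: str, duplicate: str) -> None:
    """Atomically replace `duplicate` with a hard link to `master`."""
    directory = os.path.dirname(os.path.abspath(duplicate))
    # Reserve a unique name next to the duplicate, then free it again,
    # because os.link refuses to overwrite an existing path.
    fd, temp_path = tempfile.mkstemp(dir=directory, prefix=".link-")
    os.close(fd)
    os.unlink(temp_path)
    try:
        os.link(master, temp_path)
        # os.replace swaps the new link into place in one step, so
        # `duplicate` is always either the old file or the new link.
        os.replace(temp_path, duplicate)
    except OSError:
        # Clean up the temporary link if it was created, then re-raise.
        if os.path.lexists(temp_path):
            os.unlink(temp_path)
        raise
```

If linking fails, the original duplicate is untouched, which is the "no data loss on failure" property. The short window between freeing the temporary name and linking to it is acceptable for a sketch but would deserve more care in production code.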
Upgrading from v1.0.0
v1.1.0 is fully backward compatible. All v1.0.0 commands work unchanged. The new deduplication features are opt-in via --master and --action flags.
v1.0.0
File Matcher v1.0.0 Release Notes
🎉 First Stable Release!
This is the first stable release of File Matcher, a powerful tool for finding duplicate files across different directory structures.
✨ What's New
Core Features
- File Matching: Find files with identical content using content hashing
- Multiple Hash Algorithms: Support for both MD5 (fast) and SHA-256 (secure)
- Directory Traversal: Recursively scan nested directory structures
- Cross-Platform: Works on Windows, macOS, and Linux
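The core idea, matching files by content hash regardless of name or location, can be sketched with the standard library alone. The function names below (hash_file, index_tree, match_trees) are illustrative and are not the tool's API.

```python
import hashlib
import os
from collections import defaultdict
from typing import Dict, List, Tuple


def hash_file(path: str, algorithm: str = "md5") -> str:
    """Hash a file's content in small chunks to keep memory use flat."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(4096), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_tree(root: str, algorithm: str = "md5") -> Dict[str, List[str]]:
    """Map content hash -> list of file paths found under `root`."""
    index: Dict[str, List[str]] = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            index[hash_file(path, algorithm)].append(path)
    return index


def match_trees(
    dir1: str, dir2: str, algorithm: str = "md5"
) -> Dict[str, Tuple[List[str], List[str]]]:
    """Return hashes present in both trees with the paths from each side."""
    index1 = index_tree(dir1, algorithm)
    index2 = index_tree(dir2, algorithm)
    shared = index1.keys() & index2.keys()
    return {h: (sorted(index1[h]), sorted(index2[h])) for h in shared}
```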
Advanced Features
- Fast Mode: Efficient comparison of large files (>100MB) using sparse sampling
- Summary Mode: Quick overview showing match counts and unmatched files
- Detailed Mode: Full listing of all file paths and matches
- Large File Support: Efficient chunked reading (4KB chunks) for memory management
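The release notes do not say how fast mode chooses its samples, so the following is only one plausible form of sparse sampling: hash the file size together with fixed-size samples from the start, middle, and end. The function name sparse_hash, the sample positions, and the sample size are all assumptions.

```python
import hashlib
import os


def sparse_hash(path: str, sample_size: int = 4096, algorithm: str = "sha256") -> str:
    """Hash the file size plus samples from the start, middle, and end."""
    size = os.path.getsize(path)
    digest = hashlib.new(algorithm)
    # Including the size makes files of different lengths hash differently.
    digest.update(str(size).encode("ascii"))
    offsets = sorted(
        {0, max(0, size // 2 - sample_size // 2), max(0, size - sample_size)}
    )
    with open(path, "rb") as handle:
        for offset in offsets:
            handle.seek(offset)
            digest.update(handle.read(sample_size))
    return digest.hexdigest()
```

The trade-off is that two files of equal size that differ only outside the sampled regions would produce the same sparse hash, so a scheme like this trades certainty for speed.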
Developer Experience
- Comprehensive Test Suite: 16 unit tests covering all functionality
- No Dependencies: Uses only Python standard library
- Well Documented: Complete README with examples and usage instructions
🚀 Getting Started
Quick Install
# Download and extract the release package
# No installation required - just Python 3.6+
# Run tests to verify
python3 run_tests.py
# Basic usage
python3 file_matcher.py <directory1> <directory2>
Examples
# Compare two directories
python3 file_matcher.py test_dir1 test_dir2
# Use fast mode for large files
python3 file_matcher.py test_dir1 test_dir2 --fast
# Get summary only
python3 file_matcher.py test_dir1 test_dir2 --summary
# Use SHA-256 hashing
python3 file_matcher.py test_dir1 test_dir2 --hash sha256
🔧 Technical Details
- Python Version: 3.6 or higher
- Dependencies: None (standard library only)
- File Size Limit: No practical limit (handles multi-GB files efficiently)
- Memory Usage: Optimized with chunked reading and sparse sampling
📁 What's Included
- file_matcher.py - Main script
- README.md - Complete documentation
- CHANGELOG.md - Version history
- tests/ - Comprehensive test suite
- test_dir1/, test_dir2/ - Example test directories
- complex_test/ - Advanced test scenarios
🐛 Bug Reports & Feedback
If you find any issues or have suggestions, please open an issue on GitHub.
📄 License
This project is open source and available under the MIT License.
Download: Choose the appropriate archive for your platform:
- filematcher-1.0.0.zip - Windows/macOS users
- filematcher-1.0.0.tar.gz - Linux/Unix users