1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
dedup
120 changes: 94 additions & 26 deletions README.md
@@ -11,20 +11,43 @@ go install github.com/datumbrain/dedup@latest
## Usage

```bash
# Scan current directory
# Scan current directory (shows what would be deleted)
dedup

# Scan specific directory
dedup /path/to/folder
dedup "C:\Users\Username\Downloads"

# Delete duplicates with confirmation
dedup -d /path/to/folder

# Delete duplicates without confirmation (force)
dedup -d -f /path/to/folder

# Verbose output with detailed information
dedup -v /path/to/folder

# Combine flags
dedup -d -f -v ~/Downloads
```

### Flags

- **`-d`**: Delete duplicate files (asks for confirmation before deleting)
- **`-f`**: Force deletion without confirmation (**must be combined with `-d`**)
- **`-v`**: Verbose output showing detailed priority information and deletion commands

**Note**: The `-f` flag only works when combined with `-d`. Using `-f` alone will not delete files.
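
The flag rules above can be sketched as a small decision function. This is an illustrative sketch, not the tool's code: `parseArgs` and the mode strings are hypothetical names, and only the three flags (`-d`, `-f`, `-v`) and the default folder `.` come from this change. It also shows a property of Go's standard `flag` package: parsing stops at the first non-flag argument, so flags should come before the folder path.

```go
package main

import (
	"flag"
	"fmt"
)

// parseArgs is a hypothetical helper that mirrors the -d/-f/-v flags
// described above and reports which mode the combination selects.
func parseArgs(args []string) (string, string, error) {
	fs := flag.NewFlagSet("dedup", flag.ContinueOnError)
	del := fs.Bool("d", false, "Delete duplicate files")
	force := fs.Bool("f", false, "Force deletion without confirmation (use with -d)")
	fs.Bool("v", false, "Verbose output with detailed information")
	if err := fs.Parse(args); err != nil {
		return "", "", err
	}

	// The first remaining argument is the folder; default to the current directory.
	folder := "."
	if fs.NArg() > 0 {
		folder = fs.Arg(0)
	}

	mode := "scan only" // -f on its own ends up here
	switch {
	case *del && *force:
		mode = "delete without confirmation"
	case *del:
		mode = "delete with confirmation"
	}
	return mode, folder, nil
}

func main() {
	examples := [][]string{
		{"-f", "/tmp/a"},       // -f alone: nothing is deleted
		{"-d", "/tmp/a"},       // asks before deleting
		{"-d", "-f", "/tmp/a"}, // deletes without asking
		{"/tmp/a", "-d"},       // -d after the path is not parsed as a flag
	}
	for _, args := range examples {
		mode, folder, err := parseArgs(args)
		fmt.Println(args, "->", mode, folder, err)
	}
}
```

In the last example the trailing `-d` is treated as a positional argument, so the run stays in scan-only mode.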

## Features

- **Smart Detection**: Uses SHA-256 checksums for accurate duplicate detection
- **Intelligent Recommendations**: Automatically identifies which files to keep vs delete
- **Platform-Specific Commands**: Generates deletion commands for your OS (Windows/macOS/Linux)
- **Safe by Default**: Only scans files, never deletes anything automatically
- **Simple & Clean UI**: Minimal output by default, verbose mode available with `-v` flag
- **Safe Deletion**: Delete duplicates with `-d` flag, confirmation prompt by default
- **Force Mode**: Skip confirmation with `-f` flag for automated workflows
- **Platform-Specific Commands**: Generates deletion commands in verbose mode for your OS
- **Safe by Default**: Scan-only mode unless `-d` flag is specified
- **Non-Recursive**: Only scans the specified folder (doesn't go into subdirectories)

## How It Works
@@ -47,38 +70,69 @@ The tool prioritizes files based on common naming patterns:

## Example Output

```raw
============================================================
DUPLICATE FILES REPORT
============================================================
### Default Mode (Simple & Clean)

Duplicate Group #1 (Checksum: a665a45920422f9d...)
File Size: 1024 bytes
Files:
✓ KEEP: /Users/john/Downloads/invoice.pdf (Priority: 0)
✗ DELETE: /Users/john/Downloads/invoice (1).pdf (Priority: 1001)
```bash
$ dedup ~/Downloads

============================================================
DELETION RECOMMENDATIONS
============================================================
📁 Scanned 15 files in /Users/john/Downloads

🔍 Found 3 duplicate group(s)
💾 Can free 5.42 MB by deleting 8 file(s)

💡 Use -d to delete files (with confirmation)
💡 Use -d -f to delete without confirmation
💡 Use -v for detailed output
```

### Delete Mode with Confirmation

```bash
$ dedup -d ~/Downloads

Files recommended for deletion:
1. /Users/john/Downloads/invoice (1).pdf
📁 Scanned 15 files in /Users/john/Downloads

Summary:
- Total duplicate groups: 1
- Files recommended for deletion: 1
- Disk space that can be freed: 1.00 MB (1048576 bytes)
🔍 Found 3 duplicate group(s)
💾 Can free 5.42 MB by deleting 8 file(s)

Files to delete:
1. invoice (1).pdf
2. report-2.xlsx
3. image_copy.jpg
...

⚠️ About to delete 8 file(s). Continue? [y/N]: y

🗑️ Deleting files...

✅ Deleted 8 file(s), freed 5.42 MB
```

### Verbose Mode

```bash
$ dedup -v ~/Downloads

Scanning files in: /Users/john/Downloads
Calculating checksums...
Processing: invoice.pdf
Processing: invoice (1).pdf
...

============================================================
DELETION COMMANDS FOR DARWIN
DUPLICATE FILES REPORT
============================================================

# Terminal (recommended):
rm "/Users/john/Downloads/invoice (1).pdf"
Duplicate Group #1 (Checksum: a665a45920422f9d...)
File Size: 1024 bytes
Files with priorities:
Priority 0: invoice.pdf
Priority 1001: invoice (1).pdf
Decision:
✓ KEEP: /Users/john/Downloads/invoice.pdf (Priority: 0)
✗ DELETE: /Users/john/Downloads/invoice (1).pdf (Priority: 1001)

# Move to Trash (safer option):
osascript -e "tell application \"Finder\" to delete POSIX file \"/Users/john/Downloads/invoice (1).pdf\""
...
```

## Safety Features
@@ -103,6 +157,20 @@ cd dedup
go build -o dedup
```

## Testing

Run the unit tests:

```bash
go test -v
```

Run tests with coverage:

```bash
go test -v -cover
```

## License

MIT License - Feel free to use, modify, and distribute.
140 changes: 115 additions & 25 deletions main.go
@@ -1,7 +1,9 @@
package main

import (
"bufio"
"crypto/sha256"
"flag"
"fmt"
"io"
"os"
@@ -231,8 +233,25 @@ func generateDeletionCommands(filesToDelete []*FileInfo) {
fmt.Printf("\n• Consider creating a backup of important files first\n")
}

// deleteFile deletes a file and returns an error if it fails
func deleteFile(filePath string) error {
return os.Remove(filePath)
}

// confirmDeletion asks the user to confirm deletion
func confirmDeletion(filesToDelete []*FileInfo) bool {
fmt.Printf("\n⚠️ About to delete %d file(s). Continue? [y/N]: ", len(filesToDelete))
reader := bufio.NewReader(os.Stdin)
response, err := reader.ReadString('\n')
if err != nil {
return false
}
response = strings.TrimSpace(strings.ToLower(response))
return response == "y" || response == "yes"
}

// findDuplicates finds all duplicate files in the specified folder
func findDuplicates(folderPath string) error {
func findDuplicates(folderPath string, deleteMode bool, forceDelete bool, verbose bool) error {
// Map to store checksum -> list of files with that checksum
checksumMap := make(map[string][]*FileInfo)

@@ -242,8 +261,10 @@ func findDuplicates(folderPath string) error {
return fmt.Errorf("error reading directory: %v", err)
}

fmt.Printf("Scanning files in: %s\n", folderPath)
fmt.Println("Calculating checksums...")
if verbose {
fmt.Printf("Scanning files in: %s\n", folderPath)
fmt.Println("Calculating checksums...")
}

// Process each file (skip directories)
for _, entry := range entries {
@@ -253,7 +274,9 @@ func findDuplicates(folderPath string) error {

filePath := filepath.Join(folderPath, entry.Name())

fmt.Printf("Processing: %s\n", entry.Name())
if verbose {
fmt.Printf("Processing: %s\n", entry.Name())
}

fileInfo, err := getFileInfo(filePath)
if err != nil {
@@ -266,9 +289,13 @@ func findDuplicates(folderPath string) error {
}

// Find and display duplicates
fmt.Println("\n" + strings.Repeat("=", 60))
fmt.Println("DUPLICATE FILES REPORT")
fmt.Println(strings.Repeat("=", 60))
if !verbose {
fmt.Printf("\n📁 Scanned %d files in %s\n", len(checksumMap), folderPath)
} else {
fmt.Println("\n" + strings.Repeat("=", 60))
fmt.Println("DUPLICATE FILES REPORT")
fmt.Println(strings.Repeat("=", 60))
}

duplicateGroups := 0
var filesToDelete []*FileInfo
@@ -283,38 +310,55 @@ func findDuplicates(folderPath string) error {
return files[i].Priority < files[j].Priority
})

fmt.Printf("\nDuplicate Group #%d (Checksum: %s)\n", duplicateGroups, checksum[:16]+"...")
fmt.Printf("File Size: %d bytes\n", files[0].Size)
fmt.Println("Files with priorities:")
if verbose {
fmt.Printf("\nDuplicate Group #%d (Checksum: %s)\n", duplicateGroups, checksum[:16]+"...")
fmt.Printf("File Size: %d bytes\n", files[0].Size)
fmt.Println("Files with priorities:")

// Show all files with their priorities for debugging
for _, file := range files {
fmt.Printf(" Priority %d: %s\n", file.Priority, filepath.Base(file.Path))
}

// Show all files with their priorities for debugging
for _, file := range files {
fmt.Printf(" Priority %d: %s\n", file.Priority, filepath.Base(file.Path))
fmt.Println("Decision:")
}

fmt.Println("Decision:")
// First file (lowest priority number) should be kept
keepFile := files[0]
fmt.Printf(" ✓ KEEP: %s (Priority: %d)\n", keepFile.Path, keepFile.Priority)
if verbose {
fmt.Printf(" ✓ KEEP: %s (Priority: %d)\n", keepFile.Path, keepFile.Priority)
}

// Rest should be deleted
for i := 1; i < len(files); i++ {
file := files[i]
fmt.Printf(" ✗ DELETE: %s (Priority: %d)\n", file.Path, file.Priority)
if verbose {
fmt.Printf(" ✗ DELETE: %s (Priority: %d)\n", file.Path, file.Priority)
}
filesToDelete = append(filesToDelete, file)
totalSizeToSave += file.Size
}
}
}

if duplicateGroups == 0 {
fmt.Println("\nNo duplicate files found!")
} else {
fmt.Println("\n✅ No duplicate files found!")
return nil
}

// Summary output
if verbose {
fmt.Printf("\n" + strings.Repeat("=", 60))
fmt.Printf("\nDELETION RECOMMENDATIONS")
fmt.Printf("\n" + strings.Repeat("=", 60))
}

if len(filesToDelete) > 0 {
if len(filesToDelete) > 0 {
if !verbose {
fmt.Printf("\n🔍 Found %d duplicate group(s)\n", duplicateGroups)
fmt.Printf("💾 Can free %.2f MB by deleting %d file(s)\n\n",
float64(totalSizeToSave)/(1024*1024), len(filesToDelete))
} else {
fmt.Printf("\nFiles recommended for deletion:\n")
for i, file := range filesToDelete {
fmt.Printf("%d. %s\n", i+1, file.Path)
@@ -325,20 +369,66 @@ func findDuplicates(folderPath string) error {
fmt.Printf("- Files recommended for deletion: %d\n", len(filesToDelete))
fmt.Printf("- Disk space that can be freed: %.2f MB (%.0f bytes)\n",
float64(totalSizeToSave)/(1024*1024), float64(totalSizeToSave))
}

// Generate platform-specific deletion commands
generateDeletionCommands(filesToDelete)
// Delete mode
if deleteMode {
// Ask for confirmation unless force flag is set
if !forceDelete {
if !verbose {
fmt.Println("Files to delete:")
for i, file := range filesToDelete {
fmt.Printf(" %d. %s\n", i+1, filepath.Base(file.Path))
}
}
if !confirmDeletion(filesToDelete) {
fmt.Println("\n❌ Deletion cancelled.")
return nil
}
}

// Perform deletion
fmt.Println("\n🗑️ Deleting files...")
deleted := 0
var deletedSize int64
for _, file := range filesToDelete {
if err := deleteFile(file.Path); err != nil {
fmt.Printf(" ❌ Failed to delete %s: %v\n", filepath.Base(file.Path), err)
} else {
deleted++
deletedSize += file.Size
if verbose {
fmt.Printf(" ✓ Deleted: %s\n", file.Path)
}
}
}
fmt.Printf("\n✅ Deleted %d file(s), freed %.2f MB\n", deleted, float64(deletedSize)/(1024*1024))
} else {
// Show deletion commands only in verbose mode
if verbose {
generateDeletionCommands(filesToDelete)
} else {
fmt.Println("💡 Use -d to delete files (with confirmation)")
fmt.Println("💡 Use -d -f to delete without confirmation")
fmt.Println("💡 Use -v for detailed output")
}
}
}

return nil
}
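
One detail in `findDuplicates` worth checking: the non-verbose summary prints `len(checksumMap)` as the number of scanned files, but that map is keyed by checksum, so its length is the number of distinct checksums. Assuming every processed file is appended to its checksum's slice (the part of the loop that does this is not shown in the diff), the file count could be computed as in the sketch below. `countFiles` is a hypothetical helper, and the `FileInfo` here carries only the fields visible in this diff.

```go
package main

import "fmt"

// FileInfo is a trimmed stand-in for the struct in main.go, limited to
// the fields that appear in this diff.
type FileInfo struct {
	Path     string
	Size     int64
	Priority int
}

// countFiles is a hypothetical helper: it sums the group sizes, giving
// the number of files rather than the number of distinct checksums.
func countFiles(checksumMap map[string][]*FileInfo) int {
	total := 0
	for _, files := range checksumMap {
		total += len(files)
	}
	return total
}

func main() {
	checksumMap := map[string][]*FileInfo{
		"aaa": {
			{Path: "invoice.pdf", Size: 1024, Priority: 0},
			{Path: "invoice (1).pdf", Size: 1024, Priority: 1001},
		},
		"bbb": {
			{Path: "notes.txt", Size: 10, Priority: 0},
		},
	}
	// Two distinct checksums, three files.
	fmt.Println(len(checksumMap), countFiles(checksumMap)) // prints "2 3"
}
```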

func main() {
// Get folder path from command line argument or use current directory
// Define flags
deleteFlag := flag.Bool("d", false, "Delete duplicate files")
forceFlag := flag.Bool("f", false, "Force deletion without confirmation (use with -d)")
verboseFlag := flag.Bool("v", false, "Verbose output with detailed information")
flag.Parse()

// Get folder path from remaining arguments or use current directory
folderPath := "."
if len(os.Args) > 1 {
folderPath = os.Args[1]
if flag.NArg() > 0 {
folderPath = flag.Arg(0)
}

// Verify the path exists and is a directory
@@ -354,7 +444,7 @@ func main() {
}

// Find duplicates
if err := findDuplicates(folderPath); err != nil {
if err := findDuplicates(folderPath, *deleteFlag, *forceFlag, *verboseFlag); err != nil {
fmt.Printf("Error: %v\n", err)
os.Exit(1)
}