FOLLOW-UP: Make containsWord rune-aware for non-ASCII headings (from PR #207) #209
Description
Context
This follow-up task was identified during the review of PR #207.
Source PR: #207
PR Title: feat: auto-tables — infer interactive UI from markdown tables + YAML sources
Suggested by: @claude[bot]
Task Description
containsWord in auto_tables.go uses isAlphanumeric(b byte) for word boundary checks, which operates on raw bytes. For multi-byte UTF-8 characters at word boundaries (e.g., accented characters in headings like "café expenses"), the byte at text[idx-1] could be a trailing byte of a multi-byte sequence.
Currently this happens to work correctly, because UTF-8 continuation bytes fall in the 0x80-0xBF range, which isAlphanumeric treats as non-alphanumeric, but the behavior is fragile. Use utf8.DecodeLastRuneInString / utf8.DecodeRuneInString together with unicode.IsLetter / unicode.IsDigit for proper rune-level boundary checking.
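A minimal sketch of what the rune-level check could look like is given below. The real signature and matching rules of containsWord in auto_tables.go are not shown in this issue, so the names containsWordRunes and isWordRune, the (text, word string) bool signature, and the case-sensitive matching are all assumptions made for illustration.

```go
package main

import (
	"strings"
	"unicode"
	"unicode/utf8"
)

// isWordRune reports whether r counts as part of a word.
// Hypothetical helper; a rune-level stand-in for isAlphanumeric(b byte).
func isWordRune(r rune) bool {
	return unicode.IsLetter(r) || unicode.IsDigit(r)
}

// containsWordRunes reports whether word occurs in text with a non-word
// rune (or the string edge) on both sides. Hypothetical name and signature;
// matching is case-sensitive in this sketch.
func containsWordRunes(text, word string) bool {
	if word == "" {
		return false
	}
	for start := 0; start <= len(text)-len(word); {
		idx := strings.Index(text[start:], word)
		if idx < 0 {
			return false
		}
		idx += start
		end := idx + len(word)

		// Decode the whole rune that ends just before the match,
		// instead of inspecting the single byte at text[idx-1].
		leftOK := true
		if idx > 0 {
			r, _ := utf8.DecodeLastRuneInString(text[:idx])
			leftOK = !isWordRune(r)
		}

		// Decode the whole rune that starts just after the match.
		rightOK := true
		if end < len(text) {
			r, _ := utf8.DecodeRuneInString(text[end:])
			rightOK = !isWordRune(r)
		}

		if leftOK && rightOK {
			return true
		}

		// Skip past the first rune of this candidate and keep searching.
		_, size := utf8.DecodeRuneInString(text[idx:])
		start = idx + size
	}
	return false
}
```

In this sketch, containsWordRunes("café expenses", "expenses") reports a match, while containsWordRunes("café expenses", "caf") does not, because the following é is decoded as a letter rather than inspected as individual bytes. Invalid UTF-8 decodes to utf8.RuneError, which is neither a letter nor a digit, so malformed input is treated as a boundary.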
This issue was automatically created by prmonitor from PR review comments.