FOLLOW-UP: Make containsWord rune-aware for non-ASCII headings (from PR #207) #209

@adnaan

Context

This follow-up task was identified during the review of PR #207.

Source PR: #207
PR Title: feat: auto-tables — infer interactive UI from markdown tables + YAML sources
Suggested by: @claude[bot]

Task Description

containsWord in auto_tables.go checks word boundaries with isAlphanumeric(b byte), which operates on raw bytes. When a multi-byte UTF-8 character sits at a word boundary (e.g., the accented character in a heading like "café expenses"), the byte at text[idx-1] may be the trailing (continuation) byte of a multi-byte sequence rather than a complete character.

Currently this happens to work correctly (trailing UTF-8 bytes fall in the 0x80-0xBF range, which isAlphanumeric treats as non-alphanumeric), but it is fragile because the result depends on a byte-level coincidence rather than on the character itself. Use utf8.DecodeLastRuneInString / utf8.DecodeRuneInString together with unicode.IsLetter / unicode.IsDigit for proper rune-level boundary checking.
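
A minimal sketch of what the rune-level check could look like is shown below. The real signature and matching rules of containsWord are not given in this issue, so the name containsWordRuneAware, the helper isWordRune, the (text, word string) bool signature, and the case-sensitive matching are all assumptions made for illustration. Only the utf8 and unicode calls named above are taken from the task description.

```go
package main

import (
	"strings"
	"unicode"
	"unicode/utf8"
)

// isWordRune is a hypothetical replacement for isAlphanumeric(b byte):
// it classifies a whole rune instead of a single byte.
func isWordRune(r rune) bool {
	return unicode.IsLetter(r) || unicode.IsDigit(r)
}

// containsWordRuneAware is a hypothetical stand-in for containsWord.
// It reports whether word occurs in text with a non-word rune (or the
// start/end of the string) on both sides. Matching is case-sensitive
// here, which is an assumption of this sketch.
func containsWordRuneAware(text, word string) bool {
	if word == "" {
		return false
	}
	for start := 0; start <= len(text)-len(word); {
		i := strings.Index(text[start:], word)
		if i < 0 {
			return false
		}
		idx := start + i
		end := idx + len(word)

		// Left boundary: decode the full rune that ends just before idx.
		leftOK := true
		if idx > 0 {
			r, _ := utf8.DecodeLastRuneInString(text[:idx])
			leftOK = !isWordRune(r)
		}

		// Right boundary: decode the full rune that starts at end.
		rightOK := true
		if end < len(text) {
			r, _ := utf8.DecodeRuneInString(text[end:])
			rightOK = !isWordRune(r)
		}

		if leftOK && rightOK {
			return true
		}

		// Skip past the first rune of this match and keep searching.
		_, size := utf8.DecodeRuneInString(text[idx:])
		start = idx + size
	}
	return false
}
```

With this sketch, containsWordRuneAware("café expenses", "expenses") returns true, while containsWordRuneAware("caféexpenses", "expenses") returns false because the preceding é is decoded as a letter. Invalid UTF-8 decodes to utf8.RuneError, which is neither a letter nor a digit, so malformed bytes are treated as boundaries.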


This issue was automatically created by prmonitor from PR review comments.

Labels

P3-low (Low: extended features, operational docs)
follow-up (Follow-up task from PR review)
from-review (Issue originated from PR review)
