Skip to content

Fix ExifTool test crashes: substr warning, named sub return, bitwise cast#256

Merged
fglock merged 9 commits intomasterfrom
fix-exiftool-crashes
Mar 1, 2026
Merged

Fix ExifTool test crashes: substr warning, named sub return, bitwise cast#256
fglock merged 9 commits intomasterfrom
fix-exiftool-crashes

Conversation

@fglock
Copy link
Owner

@fglock fglock commented Mar 1, 2026

Summary

Fixes crashes and binary data handling issues encountered when running ExifTool tests:

Crash fixes

  • substr outside of string: Changed from fatal exception to warning + return undef, matching Perl 5 behavior. Fixed both the read path (Operator.substr()) and lvalue write path (RuntimeSubstrLvalue.set()).
  • Named sub "did not return a true value": handleNamedSubWithFilter now returns NumberNode("1") instead of an empty ListNode.
  • Interpreter bitwise ClassCastException: All 8 interpreter bitwise operator handlers now use .scalar() instead of direct casts.

Binary data (BYTE_STRING) preservation

  • Encode::is_utf8(): Fixed inverted result.
  • sysread for scalar refs: ScalarBackedIO now properly supports sysread.
  • STRING+BYTE_STRING concat: Returns BYTE_STRING when no wide chars present, matching Perl 5 utf8 flag semantics.
  • File read/sysread: CustomFileChannel and ScalarBackedIO return byte[] (BYTE_STRING) instead of StringBuilder (STRING).
  • Readline.read(): Preserves BYTE_STRING type through offset/buffer handling.

Results

  • Writer.t: went from crashing to 29/61 passing
  • RIFF.t: substr crash fixed
  • All unit tests pass (make succeeds)

Test plan

  • make passes (build + unit tests)
  • lvalue_substr.t updated and passes (16/16)
  • Writer.t runs to completion without crashes
  • Binary data round-trips correctly through read/concat/write

Generated with Devin

fglock and others added 9 commits March 1, 2026 19:50
…cast

- substr outside of string: emit warning + return undef instead of
  throwing, both read path (Operator.substr) and write path
  (RuntimeSubstrLvalue.set). Matches Perl 5 behavior.
- Named sub definitions now return NumberNode("1") instead of empty
  ListNode, fixing "did not return a true value" for modules ending
  with a sub definition (e.g. Protobuf.pm).
- Interpreter bitwise operators use .scalar() instead of direct
  RuntimeScalar casts, fixing ClassCastException when register holds
  RuntimeList (e.g. Pentax.pm via interpreter backend).

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <noreply@cognition.ai>
Operators like unpack, localtime, gmtime, caller return RuntimeList
even in scalar context. The JVM backend handles this with explicit
.scalar() calls, but the interpreter backend stored RuntimeList
directly into registers. Add scalar conversion in MiscOpcodeHandler
before storing results, mirroring the JVM backend behavior.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <noreply@cognition.ai>
When a forward-declared sub (e.g. sub foo;) had no implementation,
calling it would throw Undefined subroutine without trying AUTOLOAD.
This affected both instance apply() methods in RuntimeCode.

The fix adds explicit methodHandle null checks after compilerSupplier
runs, with AUTOLOAD fallback logic. Also fixes subroutine name resolution
to prefer the code objects own packageName::subName over the caller-provided
name (which could be tailcall from the JVM trampoline).

Fixes 19 ExifTool Lang.t test failures.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <noreply@cognition.ai>
Encode::is_utf8() was returning true for BYTE_STRING and false for
STRING - the exact opposite of Perl 5 semantics. This caused ExifTool
Sanitize() and WriteInfo() to double-encode UTF-8 data.

Also fixed sysread() on scalar-backed IO handles (open $var) which
was returning undef instead of delegating to ScalarBackedIO.sysread().

Fixes IPTC, XMP, and PNG test failures related to UTF-8 encoding.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <noreply@cognition.ai>
…nd sysread

- StringOperators: STRING+BYTE_STRING concat returns BYTE_STRING when no
  wide chars present, matching Perl 5 utf8 flag semantics
- CustomFileChannel: read() and sysread() return byte[] (BYTE_STRING)
  instead of StringBuilder (STRING) to preserve binary data type
- ScalarBackedIO: same byte[] return for read/sysread and write methods
- Readline.read(): preserve BYTE_STRING type through offset handling

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <noreply@cognition.ai>
MOVE was doing reference copy (aliasing) for all register transfers.
This caused my/our variable declarations to share RuntimeScalar objects
with source variables, so in-place mutations (+=, ++, etc.) on one
would corrupt the other.

- Rename MOVE to ALIAS to clarify its reference-sharing semantics
- Use MY_SCALAR (new RuntimeScalar + value copy) for my $x = expr
- Use SET_SCALAR for our ($a,...) = (...) to preserve global binding
- Use LOAD_UNDEF + SET_SCALAR for bare ident and list assignments
- Implement MY_SCALAR handler in BytecodeInterpreter
- Add MY_SCALAR to disassembler

ALIAS remains correct for temp register shuffling (block results,
ternary, ||/&&//) where the source is ephemeral.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <noreply@cognition.ai>
- If either operand is STRING type, always return STRING (prevents
  BYTE_STRING from downgrading wide-char strings)
- Add wide-char safety check in BYTE_STRING concat path to prevent
  ISO-8859-1 truncation of chars > 255
- Fixes: undef .= "\x{1ff}" producing 63 instead of 511
- Eliminates regressions in chomp.t (-50), sprintf.t (-9), utf.t (-48)

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <noreply@cognition.ai>
- Readline.read(): Do not force BYTE_STRING when existing buffer has
  wide chars (>255); add safety check before ISO-8859-1 conversion.
  Fixes 84 read.t regressions.

- SubroutineParser: Revert named sub return value from NumberNode(1)
  to ListNode, restoring correct module_true semantics.
  Fixes 56 require.t regressions.

- StringOperators: Re-read left operand after right operand evaluation
  in STRING and fallback concat paths, matching Perl 5 behavior where
  tied FETCH side effects are visible to earlier operands.
  Fixes 1 concat.t regression (RT #132595).

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <noreply@cognition.ai>
Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <noreply@cognition.ai>
@fglock fglock merged commit 54c559f into master Mar 1, 2026
2 checks passed
@fglock fglock deleted the fix-exiftool-crashes branch March 1, 2026 22:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant