String Unicode Fragment Issues

:wave: Hey folks! I saw the NUM project on Reddit and thought it was a great idea. Digging in, I found MODL, which made me even more excited, but I noticed it doesn't have many libraries yet, so I started hacking on one to see if I could get something working. It's still very much a work in progress that I'm hacking on in my free time, so no promises on quality.

https://github.com/bign8/modl.go

Anyway, I ran into an issue with my Unicode parsing logic. Based on the test added in https://github.com/MODLanguage/grammar/commit/d0668494130c14bd6b7989ffc4a0867e474e7aff, it appears MODL supports Unicode escapes that are not exactly 4 hex digits long, which doesn't match either the grammar linked below or the written specification: https://www.modl.uk/specification#hex-values.

https://github.com/MODLanguage/grammar/blob/3c788096bc6367ceb2955d51672e61ea317652fc/antlr4/MODLLexer.g4#L73-L78

However, the Java library appears to support this behavior, which is great. I just didn't see it documented anywhere besides the test case and the Java source.

https://github.com/MODLanguage/java-interpreter/blob/d9cc9d76f73687a03114d57fccc253c3c82fad71/src/main/java/uk/modl/utils/UnicodeEscapeReplacer.java#L104-L174

Given the complexity of `UnicodeEscapeReplacer`, I'm not sure of the best way to represent those nuances in the grammar. But a note somewhere saying that escapes with a digit count other than 4 are supported would be dope. Anyway, let me know what you think and I can put together a PR for ya.

Cheers :beers: 

	fragment UNICODE
	: 'u' HEX HEX HEX HEX
	;
	fragment HEX
	: [0-9a-fA-F]
	;
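
To make the mismatch concrete, here is a rough Go sketch of a decoder that accepts more than the four hex digits the `UNICODE` fragment above allows. This is not how `UnicodeEscapeReplacer` or modl.go actually work: the 4-to-6 digit range, the greedy matching, and the names `decodeUnicodeEscape`, `hexValue`, `minHexDigits`, and `maxHexDigits` are all assumptions made up for illustration.

```go
package main

import "unicode/utf8"

// ASSUMPTION: these limits are illustrative only and are not taken from
// the MODL specification or the Java interpreter.
const (
	minHexDigits = 4
	maxHexDigits = 6
)

// hexValue returns the numeric value of a single hex digit.
func hexValue(b byte) (int, bool) {
	switch {
	case b >= '0' && b <= '9':
		return int(b - '0'), true
	case b >= 'a' && b <= 'f':
		return int(b-'a') + 10, true
	case b >= 'A' && b <= 'F':
		return int(b-'A') + 10, true
	}
	return 0, false
}

// decodeUnicodeEscape decodes the hex digits that follow the 'u' of an
// escape. It returns the rune, the number of bytes consumed, and whether
// a valid code point was found.
func decodeUnicodeEscape(s string) (rune, int, bool) {
	// Remember the running value after each digit so we can back off.
	var values [maxHexDigits]rune
	var v rune
	n := 0
	for n < maxHexDigits && n < len(s) {
		d, ok := hexValue(s[n])
		if !ok {
			break
		}
		v = v<<4 | rune(d)
		values[n] = v
		n++
	}
	// Greedy: take the longest prefix (at least minHexDigits long) that is
	// a valid code point, i.e. in range and not a surrogate.
	for ; n >= minHexDigits; n-- {
		if utf8.ValidRune(values[n-1]) {
			return values[n-1], n, true
		}
	}
	return 0, 0, false
}
```

With these assumed rules, `decodeUnicodeEscape("1F600")` consumes five digits, while `decodeUnicodeEscape("0041BC")` greedily consumes all six rather than reading `0041` followed by the literal text `BC`. That ambiguity is the kind of nuance that seems hard to express in the grammar alone.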

String Unicode Fragment Issues #44