Fix handling of backslashes and Unicode escape sequences in CSS content #398
liulinboyi wants to merge 1 commit into servo:main
emilio left a comment:
This PR addresses an issue in the handling of backslashes and Unicode escape sequences in CSS content. The current implementation may incorrectly escape backslashes, which can break valid CSS syntax, especially when dealing with Unicode escape sequences like \00a0.
Do you have an example of the broken serialization? This should have a test at least. But I don't understand off-hand how this is correct.
In particular, you're changing the serialization of an already unescaped CSS string.
Ok, pretty sure this is incorrect. Here's a test:
diff --git a/src/tests.rs b/src/tests.rs
index 3c122f0..ca90faa 100644
--- a/src/tests.rs
+++ b/src/tests.rs
@@ -1351,3 +1351,20 @@ fn servo_define_css_keyword_enum() {
     assert_eq!(UserZoom::from_ident("fixed"), Ok(UserZoom::Fixed));
 }
+
+#[test]
+fn serialize_backslash_roundtrips() {
+    fn parse_string(css_string: &str) -> String {
+        let mut input = ParserInput::new(&css_string);
+        let mut input = Parser::new(&mut input);
+        input.expect_string().unwrap().as_ref().into()
+    }
+    let parsed_string = parse_string(r#""\00a0""#);
+    assert_eq!(parsed_string, "\u{a0}");
+
+    // Serialization round-trips.
+    let mut serialized_string = String::new();
+    super::serializer::serialize_string(&parsed_string, &mut serialized_string).unwrap();
+    assert_eq!(parse_string(&serialized_string), parsed_string);
+}
That test passes, and the serialization code generates a compatible string. It seems the issue is in the caller trying to serialize something that has already been escaped... or something along those lines.
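For readers without a cssparser checkout, the same argument can be reproduced with a small standalone sketch. The two functions below (`serialize_css_string` and `parse_css_string`) are hypothetical stand-ins written for this illustration, not cssparser's actual implementation; they follow a simplified reading of the CSSOM string serialization rules and the CSS Syntax escape rules, and assume well-formed input.

```rust
/// Illustrative stand-in (not cssparser's code): serialize an unescaped
/// value as a double-quoted CSS string. `"` and `\` get a backslash,
/// control characters become hex escapes followed by a space.
fn serialize_css_string(value: &str) -> String {
    let mut out = String::from("\"");
    for c in value.chars() {
        match c {
            '\0' => out.push('\u{FFFD}'),
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\u{1}'..='\u{1f}' | '\u{7f}' => out.push_str(&format!("\\{:x} ", c as u32)),
            _ => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Illustrative stand-in: parse a double-quoted CSS string token back into
/// its unescaped value. Simplified: assumes well-formed, properly quoted
/// input and ignores escaped newlines.
fn parse_css_string(token: &str) -> String {
    let inner = &token[1..token.len() - 1];
    let mut chars = inner.chars().peekable();
    let mut out = String::new();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        // Collect up to six hex digits after the backslash.
        let mut hex = String::new();
        while hex.len() < 6 {
            match chars.peek().copied() {
                Some(d) if d.is_ascii_hexdigit() => {
                    hex.push(d);
                    chars.next();
                }
                _ => break,
            }
        }
        if hex.is_empty() {
            // Not a hex escape: the next character stands for itself.
            if let Some(escaped) = chars.next() {
                out.push(escaped);
            }
        } else {
            // One whitespace character after a hex escape is consumed.
            if chars.peek().map_or(false, |d| d.is_whitespace()) {
                chars.next();
            }
            let code = u32::from_str_radix(&hex, 16).unwrap();
            let decoded = char::from_u32(code).filter(|&ch| ch != '\0');
            out.push(decoded.unwrap_or('\u{FFFD}'));
        }
    }
    out
}

fn main() {
    // U+00A0 needs no escaping, so it round-trips as a literal character.
    let nbsp = "\u{a0}";
    assert_eq!(parse_css_string(&serialize_css_string(nbsp)), nbsp);

    // Five literal characters: backslash, '0', '0', 'a', '0'.
    let literal = "\\00a0";
    let escaped = serialize_css_string(literal);
    assert_eq!(escaped, "\"\\\\00a0\"");
    assert_eq!(parse_css_string(&escaped), literal);

    // Leaving the backslash unescaped changes the value on re-parse.
    let preserved = format!("\"{}\"", literal);
    assert_eq!(parse_css_string(&preserved), "\u{a0}");
}
```

Under these assumptions, a serializer that takes an unescaped value has to escape every backslash: it cannot tell a literal backslash followed by `00a0` from an escape sequence, and writing the backslash through unchanged would turn those five characters into U+00A0 on the next parse.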
This PR addresses an issue in the handling of backslashes and Unicode escape sequences in CSS content. The current implementation may incorrectly escape backslashes, which can break valid CSS syntax, especially when dealing with Unicode escape sequences like \00a0.
Changes:
Improved Handling of Backslashes:
The code now correctly distinguishes between backslashes used for escaping specific characters (like " or \) and those that are part of Unicode escape sequences.
If a backslash is followed by a valid Unicode escape sequence (e.g., \00a0), it is preserved as-is.
If a backslash is followed by any other character, it is properly escaped to \\.
Added Validation for Unicode Escape Sequences:
A helper function is_valid_unicode_escape is introduced to check if a sequence of characters following a backslash forms a valid Unicode escape sequence.
This ensures that only valid sequences are preserved, while invalid sequences are treated as regular text.
Code Changes: