Thank you for this fork of css. It's very helpful to us.
While testing, I think I have found a regression in the parser, introduced with commit 24ed6e7:
|
/** |
|
* replace ',' by \u200C for data selector (div[data-lang="fr,de,us"]) |
|
* replace ',' by \u200C for nthChild and other selector (div:nth-child(2,3,4)) |
|
* |
|
* Examples: |
|
* div[data-lang="fr,\"de,us"] |
|
* div[data-lang='fr,\'de,us'] |
|
* div:matches(.toto, .titi:matches(.toto, .titi)) |
|
* |
|
* Regex logic: |
|
* ("|')(?:\\\1|.)*?,(?:\\\1|.)*?\1 => Handle the " and ' |
|
* \(.*?,.*?\) => Handle the () |
|
* |
|
* Optimization 0: |
|
* No greedy capture (see docs about the difference between .* and .*?) |
|
* |
|
* Optimization 1: |
|
* \(.*?,.*?\) instead of \(.*?\) to limit the number of replace (don't need to replace if , is not in the string) |
|
* |
|
* Optimization 2: |
|
* ("|')(?:\\\1|.)*?,(?:\\\1|.)*?\1 this use reference to capture group, it work faster. |
|
*/ |
|
.replace(/("|')(?:\\\1|.)*?,(?:\\\1|.)*?\1|\(.*?,.*?\)/g, m => |
|
m.replace(/,/g, '\u200C') |
|
) |
The regular expression /("|')(?:\\\1|.)*?,(?:\\\1|.)*?\1|\(.*?,.*?\)/g seems to be too greedy for cases like div[class='foo'],div[class='bar']. For this example, it captures 'foo'],div[class=', leading to an incorrect replacement of the comma that separates the two selectors and is not part of the data selector.
The original regular expression, used before this change (/"(?:\\"|[^"])*"|'(?:\\'|[^'])*'/g) handles this case correctly by matching 'foo' and 'bar'.
Ultimately, the current behavior leads to an incorrect AST, where instead of two selectors, only one (incorrect) selector is listed:
"type": "rule",
"selectors": [
"div[data-value='foo'],div[data-value='bar']",
],
Expected:
"type": "rule",
"selectors": [
"div[data-value='foo']",
"div[data-value='bar']"
],
I have attached a test case demonstrating this. You can extract the archive directly into test/cases/:
case - selectors-attributes.zip
Thank you for this fork of
css. It's very helpful to us.While testing, I think I have found a regression in the parser, introduced with commit 24ed6e7:
css-tools/src/parse/index.ts
Lines 215 to 239 in a0efaee
The regular expression
/("|')(?:\\\1|.)*?,(?:\\\1|.)*?\1|\(.*?,.*?\)/gseems to be too greedy for cases likediv[class='foo'],div[class='bar']. For this example, it captures'foo'],div[class=', leading to an incorrect replacement of the comma that separates the two selectors and is not part of the data selector.The original regular expression, used before this change (
/"(?:\\"|[^"])*"|'(?:\\'|[^'])*'/g) handles this case correctly by matching'foo'and'bar'.Ultimately, the current behavior leads to an incorrect AST, where instead of two selectors, only one (incorrect) selector is listed:
Expected:
I have attached a test case demonstrating this. You can extract the archive directly into
test/cases/:case - selectors-attributes.zip