Regression in selector parsing: Attribute selectors not parsed correctly

Thank you for this fork of `css`. It's very helpful to us.

While testing, I think I have found a regression in the parser, introduced with commit https://github.com/adobe/css-tools/commit/24ed6e753fee6f0874446953ed75ae35806ab98c:

https://github.com/adobe/css-tools/blob/a0efaee362ef472ab00c9b2e2786289808081ae3/src/parse/index.ts#L215-L239

The regular expression `/("|')(?:\\\1|.)*?,(?:\\\1|.)*?\1|\(.*?,.*?\)/g` seems to match too much for cases like `div[class='foo'],div[class='bar']`. Although its quantifiers are lazy, `.` also matches the quote character, so a match can run past the closing quote of the first attribute value. For this example, it captures `'foo'],div[class='`, so the comma that separates the two selectors, and is not part of any attribute selector, is incorrectly replaced.

The original regular expression, used before this change ([`/"(?:\\"|[^"])*"|'(?:\\'|[^'])*'/g`](https://github.com/adobe/css-tools/blob/434aa1733f275a67ea700311451b98a14f8cc21a/lib/parse/index.js#L202)) handles this case correctly by matching `'foo'` and `'bar'`.
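
The difference can be reproduced outside the parser. The following is a minimal sketch that only runs the two regular expressions quoted above against the example selector; the variable names are illustrative and not taken from the library.

```typescript
// Example selector list from this report.
const selector = "div[class='foo'],div[class='bar']";

// Regular expression currently used by the parser (quoted above).
const currentRegex = /("|')(?:\\\1|.)*?,(?:\\\1|.)*?\1|\(.*?,.*?\)/g;

// Regular expression used before the change (quoted above).
const originalRegex = /"(?:\\"|[^"])*"|'(?:\\'|[^'])*'/g;

const currentMatches: string[] = selector.match(currentRegex) ?? [];
const originalMatches: string[] = selector.match(originalRegex) ?? [];

console.log(currentMatches);  // [ "'foo'],div[class='" ]
console.log(originalMatches); // [ "'foo'", "'bar'" ]
```

The current expression produces a single match that spans from the opening quote of `'foo'` to the opening quote of `'bar'`, which includes the separating comma.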

Ultimately, the current behavior leads to an incorrect AST, where instead of two selectors, only one (incorrect) selector is listed:
```json
"type": "rule",
"selectors": [
    "div[data-value='foo'],div[data-value='bar']"
],
```

**Expected:**
```json
"type": "rule",
"selectors": [
    "div[data-value='foo']",
    "div[data-value='bar']"
],
```
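
To illustrate how the over-long match turns into a single selector, here is a rough sketch of the mask, split, and restore sequence. It assumes that the parser splits the selector text on `,` after the replacement and then turns `\u200C` back into `,`; that part is not shown in this report, and `splitSelectors` is a hypothetical helper, not a function of the library.

```typescript
// Hypothetical helper: mask commas inside protected spans, split on the
// remaining commas, then restore the masked commas in each piece.
function splitSelectors(input: string, protect: RegExp): string[] {
  return input
    .replace(protect, m => m.replace(/,/g, '\u200C'))
    .split(',')
    .map(s => s.replace(/\u200C/g, ',').trim());
}

const input = "div[data-value='foo'],div[data-value='bar']";

// Current and original regular expressions, as quoted in this report.
const current = /("|')(?:\\\1|.)*?,(?:\\\1|.)*?\1|\(.*?,.*?\)/g;
const original = /"(?:\\"|[^"])*"|'(?:\\'|[^'])*'/g;

const withCurrent = splitSelectors(input, current);
const withOriginal = splitSelectors(input, original);

console.log(withCurrent);  // [ "div[data-value='foo'],div[data-value='bar']" ]
console.log(withOriginal); // [ "div[data-value='foo']", "div[data-value='bar']" ]
```

Under these assumptions, the current expression masks the separating comma, so nothing is left to split on, while the original expression leaves it untouched.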

I have attached a test case demonstrating this. You can extract the archive directly into `test/cases/`:
[case - selectors-attributes.zip](https://github.com/adobe/css-tools/files/10156598/case.-.selectors-attributes.zip)


	/**
	* replace ',' by \u200C for data selector (div[data-lang="fr,de,us"])
	* replace ',' by \u200C for nthChild and other selector (div:nth-child(2,3,4))
	*
	* Examples:
	* div[data-lang="fr,\"de,us"]
	* div[data-lang='fr,\'de,us']
	* div:matches(.toto, .titi:matches(.toto, .titi))
	*
	* Regex logic:
	* ("|')(?:\\\1|.)*?,(?:\\\1|.)*?\1 => Handle the " and '
	* \(.*?,.*?\) => Handle the ()
	*
	* Optimization 0:
	* No greedy capture (see docs about the difference between .* and .*?)
	*
	* Optimization 1:
	* \(.*?,.*?\) instead of \(.*?\) to limit the number of replace (don't need to replace if , is not in the string)
	*
	* Optimization 2:
	* ("|')(?:\\\1|.)*?,(?:\\\1|.)*?\1 this use reference to capture group, it work faster.
	*/
	.replace(/("|')(?:\\\1|.)*?,(?:\\\1|.)*?\1|\(.*?,.*?\)/g, m =>
	m.replace(/,/g, '\u200C')
	)
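
For comparison, the cases named in the comment above do behave as intended. The sketch below applies only the replacement step to a selector list built from two of those examples; `maskCommas` is a hypothetical name for illustration, not part of the library.

```typescript
// Hypothetical wrapper around the replacement step shown above.
function maskCommas(input: string): string {
  return input.replace(
    /("|')(?:\\\1|.)*?,(?:\\\1|.)*?\1|\(.*?,.*?\)/g,
    m => m.replace(/,/g, '\u200C')
  );
}

// A quoted attribute value containing commas, and a functional pseudo-class.
const intended = 'div[data-lang="fr,de,us"],div:nth-child(2,3,4)';
const masked = maskCommas(intended);

// Only the comma between the two selectors survives the masking.
const parts = masked.split(',');
console.log(parts.length); // 2
```

Here each quoted value contains a comma of its own, so the match ends at the correct closing quote. The regression only shows up when the first comma after an opening quote lies outside the quoted value.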
