Description
Describe the Proposal
To make the new \DOM\HTMLDocument easy to work with, it would be good to introduce a dedicated entry, type, and cast for it. This would complement the existing XML type and make site scraping and data extraction from HTML documents much easier.
In theory, we could extend the existing XMLEntry, but that entry is specific to XML, so a dedicated HTML entry sounds like the better option.
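For context, here is a minimal sketch of the kind of extraction this proposal wants to make easier, written against plain PHP 8.4 with ext-dom and none of the proposed classes. The HTML fragment, the CSS selector, and the `$products` variable are made up for illustration only.

```php
<?php

declare(strict_types=1);

// Requires PHP 8.4+ with ext-dom enabled.
use Dom\HTMLDocument;

// Hypothetical input, used only for this illustration.
$html = <<<'HTML'
<!DOCTYPE html>
<html>
<body>
  <ul id="products">
    <li class="product" data-sku="A1">Keyboard</li>
    <li class="product" data-sku="B2">Mouse</li>
  </ul>
</body>
</html>
HTML;

// LIBXML_NOERROR suppresses the warnings the HTML5 parser would otherwise emit.
$document = HTMLDocument::createFromString($html, \LIBXML_NOERROR);

// querySelectorAll() returns an iterable node list of the matching elements.
$products = [];
foreach ($document->querySelectorAll('#products li.product') as $element) {
    $products[$element->getAttribute('data-sku')] = \trim($element->textContent);
}
```

With the proposed entry, type, and query function, the same parsing and selection would presumably happen inside the data frame instead of in hand-written loops like this one.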
API Adjustments
Entry:
/**
 * @implements Entry<?\DOM\HTMLDocument>
 */
final class HTMLEntry implements Entry
{
    use EntryRef;

    public function __construct(
        private readonly string $name,
        HTMLDocument|string|null $value,
    ) {
        if (\is_string($value)) {
            try {
                $doc = \DOM\HTMLDocument::createFromString($value, \LIBXML_COMPACT | \LIBXML_NOERROR);
            } catch (\ValueError $e) {
                throw new InvalidArgumentException(\sprintf('Given string "%s" is not valid HTML', $value), $e->getCode(), $e);
            }
        }
    }

    //...
}
Cast:
/**
 * @implements Type<HTMLDocument>
 */
final readonly class HTMLType implements Type
{
    // ...

    public function cast(mixed $value): HTMLDocument
    {
        if ($this->isValid($value)) {
            return $value;
        }

        if (\is_string($value)) {
            return HTMLDocument::createFromString($value, \LIBXML_COMPACT | \LIBXML_NOERROR);
        }

        try {
            $stringValue = type_string()->cast($value);

            return HTMLDocument::createFromString($stringValue, \LIBXML_COMPACT | \LIBXML_NOERROR);
        } catch (CastingException $e) {
            throw new CastingException($value, $this, $e);
        }
    }

    // ...
}
Query function:
final class DomQueryAll extends ScalarFunctionChain
{
    public function __construct(
        private readonly mixed $value,
        private readonly ScalarFunction|string $path,
    ) {
    }

    /**
     * @return null|array<Element>
     */
    public function eval(Row $row) : ?array
    {
        $value = (new Parameter($this->value))->asInstanceOf($row, \DOM\HTMLDocument::class);
        $path = (new Parameter($this->path))->asString($row);

        if ($value === null || $path === null) {
            return null;
        }

        $result = $value->querySelectorAll($path);

        if ($result->count() === 0) {
            return null;
        }

        // ...
    }
}
Are you intending to also work on proposed change?
Yes
Are you interested in sponsoring this change?
No
Integration & Dependencies
Requires PHP 8.4+ with ext-dom enabled.
norberttech