Can I use the DOM to echo all HTML tags in the W3C specification?

Question:

I'm using this simple PHP HTML parser: http://simplehtmldom.sourceforge.net. Is it possible to use it to echo all tags of the HTML specification?


Here you go:

// Load the XHTML 1.0 Transitional schema (an XSD, which is itself XML)
$dom = new DOMDocument;
$dom->load('http://www.w3.org/2002/08/xhtml/xhtml1-transitional.xsd');
$xsns = 'http://www.w3.org/2001/XMLSchema';
foreach ($dom->getElementsByTagNameNS($xsns, 'element') as $element) {
    // Only declarations carry a name; references (ref="...") are skipped
    if ($element->hasAttribute('name')) {
        echo $element->getAttribute('name');
        // Print any xs:documentation text nested in the declaration
        $docs = $element->getElementsByTagNameNS($xsns, 'documentation');
        foreach ($docs as $doc) {
            echo "\t", $doc->nodeValue;
        }
        echo PHP_EOL;
    }
}

The above code will output all the Element types in the Schema definition (not DTD) for XHTML1 Transitional (not HTML) plus any documentation, e.g.

pre
      content is "Inline" excluding
         "img|object|applet|big|small|sub|sup|font|basefont"

It uses PHP's native DOM extension to do that. The DOM extension uses libxml underneath and is superior to SimpleHtmlDom in terms of speed and the control it offers over the markup. The DOM interface is a language-agnostic W3C specification.
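If you want the names as data rather than echoed output, the same traversal can fill an array. The following is only a minimal sketch: the function name `listSchemaElementNames` is hypothetical, and a tiny inline schema stands in for the real XHTML XSD so the example runs without network access.

```php
<?php
// Hypothetical helper (not part of any library): returns the unique,
// sorted names of all xs:element declarations found in an XSD string.
function listSchemaElementNames(string $xsd): array
{
    $xsns = 'http://www.w3.org/2001/XMLSchema';
    $dom = new DOMDocument;
    $dom->loadXML($xsd);

    $names = [];
    foreach ($dom->getElementsByTagNameNS($xsns, 'element') as $element) {
        // References (ref="...") have no name attribute, so skip them.
        if ($element->hasAttribute('name')) {
            // Use the name as the array key to drop duplicates.
            $names[$element->getAttribute('name')] = true;
        }
    }

    $names = array_keys($names);
    sort($names);
    return $names;
}

// Assumption for illustration: a two-element schema instead of the
// real xhtml1-transitional.xsd.
$xsd = <<<'XSD'
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="p"/>
  <xs:element name="div">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="p"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD;

$elementNames = listSchemaElementNames($xsd);
echo implode(PHP_EOL, $elementNames), PHP_EOL;
```

To run it against the real schema, you could fetch the XSD yourself and pass its contents to the function; the result is still a list of XHTML 1.0 Transitional element names, not of every HTML tag.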

For alternatives to the DOM extension see

In the documentation it says

// Dumps the internal DOM tree back into string
$str = $html;

// Print it!
echo $html; 

I think the echo should use $str, not $html, but this is what the documentation says.


// Dumps the internal DOM tree back into string
$str = $html->save();

// Dumps the internal DOM tree back into a file
$html->save('result.htm');

Hope this helps.

Documentation: http://simplehtmldom.sourceforge.net/manual.htm

No. That parser is a simple HTML parser: it has no capability to parse a DTD, and its internal logic for handling HTML elements is not exposed (or even expressed in a way that would make presenting it in human-readable form even slightly convenient).