How do I clone distinct XML structures without their data in PHP?
I have an XML document that looks like this:
<root>
<node/>
<node>
<sub>more</sub>
</node>
<node>
<sub>another</sub>
</node>
<node>value</node>
</root>
Here's my pseudo-code:
import xml.
create empty-xml.
foreach child of imported-xml-root-node,
recursively clone node structure without data.
if clone does not match one already in empty-xml,
then add clone to empty-xml.
I'm trying to get a result that looks like this:
<root>
<node/>
<node>
<sub/>
</node>
</root>
Note that my piddly example data is only 3 nodes deep. In production, there will be an unknown number of descendants, so an acceptable answer needs to handle variable node depths.
Failed Approaches
I have reviewed the DOMNode class, which has a cloneNode method with a recursive option that I would like to use, although it would take some extra work to purge the data. But while the class contains a hasChildNodes function that returns a boolean, I can't find a way to actually return the collection of children.
$doc = new DOMDocument();
$doc->loadXML($xml);
$root_node = $doc->documentElement;
if ( $root_node->hasChildNodes() ) {
// looking for something like this:
// foreach ($root_node->children() as $child)
// $doppel = $child->cloneNode(true);
}
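For reference, DOMNode does expose its children through the childNodes property (a DOMNodeList that works with foreach). The sketch below assumes a small inline sample document and a hypothetical $names array used only to show the result; it is an illustration of the property, not a full solution.

```php
<?php
$doc = new DOMDocument();
$doc->loadXML('<root><node/><node><sub>more</sub></node></root>');

$names = [];
foreach ($doc->documentElement->childNodes as $child) {
    // childNodes also yields text and comment nodes, so keep elements only
    if ($child instanceof DOMElement) {
        // cloneNode(false) is a shallow clone: the element without its children
        $doppel = $child->cloneNode(false);
        $names[] = $doppel->nodeName;
    }
}
print_r($names);
```

Passing false to cloneNode leaves out the child nodes, which is where the text data lives, so nothing needs to be purged afterwards.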
Secondly, I have tried my hand with the SimpleXMLElement class, which does have an awesome children method. Although it's lacking the recursive option, I built a simple function to surmount that. But the class is missing a clone/copyNode method, and my function is bloating into something nasty to compensate. Now I'm considering combining the two classes so I've got access to both SimpleXMLElement::children and DOMDocument::cloneNode, but I can tell this is not going cleanly, and surely this problem can be solved better.
$sxe = new SimpleXMLElement($xml);
$indentation = 0;
function getNamesRecursive( $xml, &$indentation )
{
$indentation++;
foreach($xml->children() as $child) {
for($i=0;$i<$indentation;$i++)
echo "\t";
echo $child->getName() . "\n";
getNamesRecursive($child,$indentation);
}
$indentation--;
}
getNamesRecursive($sxe,$indentation);
Well, here's my stinky solution. Suggestions for improvements or completely new, better answers are still very welcome.
$xml = '
<root>
<node/>
<node>
<sub>more</sub>
</node>
<node>
<sub>another</sub>
</node>
<node>value</node>
</root>
';
$doc = new DOMDocument();
$doc->loadXML($xml);
// clone without data
$empty_xml = new DOMDocument();
$empty_xml->appendChild($empty_xml->importNode($doc->documentElement));
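// Recursively copy element nodes only; text and other node types are the data to drop.
// Objects are passed by handle in PHP, so reference parameters are not needed.
function clone_without_data(DOMNode $orig, DOMNode $clone, DOMDocument $clonedoc){
    foreach ($orig->childNodes as $child){
        if (!($child instanceof DOMElement)) {
            continue; // skip text, comments, etc.
        }
        // shallow import: the element without its children
        $new_node = $clone->appendChild($clonedoc->importNode($child, false));
        clone_without_data($child, $new_node, $clonedoc);
    }
}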
clone_without_data($doc->documentElement, $empty_xml->documentElement, $empty_xml);
// remove all duplicates
$distinct_structure = new DOMDocument();
$distinct_structure->appendChild($distinct_structure->importNode($doc->documentElement));
foreach ($empty_xml->documentElement->childNodes as $child){
$match = false;
foreach ($distinct_structure->documentElement->childNodes as $i => $element){
if ($distinct_structure->saveXML($element) === $empty_xml->saveXML($child)) {
$match = true;
break;
}
}
if (!$match)
$distinct_structure->documentElement->appendChild($distinct_structure->importNode($child,true));
}
$distinct_structure->formatOutput = true;
echo $distinct_structure->saveXML();
Which results in this output:
<?xml version="1.0"?>
<root>
<node/>
<node>
<sub/>
</node>
</root>
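One possible tightening of the approach above, offered as a sketch rather than a definitive version: build each child's data-free skeleton and check for duplicates in the same pass, using the serialized skeleton as an array key instead of a nested comparison loop. The function names skeletonOf and distinctStructure are hypothetical, and, like the code above, this only de-duplicates the direct children of the root.

```php
<?php
// Hypothetical helper: copy an element and its descendant elements, without text.
function skeletonOf(DOMElement $source, DOMDocument $into): DOMElement
{
    $copy = $into->importNode($source, false); // shallow: no child nodes
    foreach ($source->childNodes as $child) {
        if ($child instanceof DOMElement) {
            $copy->appendChild(skeletonOf($child, $into)); // any depth
        }
    }
    return $copy;
}

// Hypothetical helper: keep one skeleton per distinct structure under the root.
function distinctStructure(string $xml): DOMDocument
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);

    $out = new DOMDocument();
    $root = $out->appendChild($out->importNode($doc->documentElement, false));

    $seen = [];
    foreach ($doc->documentElement->childNodes as $child) {
        if (!($child instanceof DOMElement)) {
            continue; // skip whitespace text, comments, etc.
        }
        $clone = skeletonOf($child, $out);
        $key = $out->saveXML($clone); // serialized structure as lookup key
        if (!isset($seen[$key])) {
            $seen[$key] = true;
            $root->appendChild($clone);
        }
    }
    return $out;
}

$xml = '<root><node/><node><sub>more</sub></node>'
     . '<node><sub>another</sub></node><node>value</node></root>';
$out = distinctStructure($xml);
$out->formatOutput = true;
echo $out->saveXML();
```

For the sample input this should leave the root with one empty node and one node containing an empty sub, the same structure as the output above. Because the recursion follows childNodes until there are none left, the depth of the input does not matter.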
Consider XSLT, the special-purpose language designed to transform XML documents. PHP ships an XSLT 1.0 processor (the XSLTProcessor class, provided by the xsl extension). For this sample, you simply need to keep the first two child nodes and copy only their elements, not their text.
XSLT (save as an .xsl file to use below in PHP)
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output version="1.0" encoding="UTF-8" indent="yes" omit-xml-declaration="yes" />
<xsl:strip-space elements="*"/>
<!-- Identity Transform -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<!-- Remove any element whose position among its siblings is greater than 2 -->
<xsl:template match="*[position() > 2]"/>
<!-- Copy only tags -->
<xsl:template match="/*/*/*">
<xsl:copy/>
</xsl:template>
</xsl:transform>
PHP
// LOAD XML AND XSL FILES
$xml = new DOMDocument('1.0', 'UTF-8');
$xml->load('Input.xml');
$xslfile = new DOMDocument('1.0', 'UTF-8');
$xslfile->load('Script.xsl');
// TRANSFORM XML with XSLT
$proc = new XSLTProcessor();
$proc->importStylesheet($xslfile);
$newXml = $proc->transformToXML($xml);
// ECHO OUTPUT STRING
echo $newXml;
# <root>
# <node/>
# <node>
# <sub/>
# </node>
# </root>
// NEW DOM OBJECT
$final = new DOMDocument('1.0', 'UTF-8');
$final->loadXML($newXml);
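Note that the stylesheet above is tied to the sample: it matches elements exactly three levels deep and keeps the first two siblings by position. A depth-independent variant, sketched here under the assumption that the xsl extension is enabled, drops every text node with a single template. The function name stripText is hypothetical, and this sketch only removes the data; duplicate structures would still need to be filtered afterwards.

```php
<?php
// Hypothetical helper: return the XML with all text nodes removed, at any depth.
function stripText(string $xml): string
{
    $xsl = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="no" omit-xml-declaration="yes"/>
  <xsl:strip-space elements="*"/>
  <!-- Identity transform: copy attributes and nodes as they are -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- Explicit priority so this wins over the identity template for text -->
  <xsl:template match="text()" priority="1"/>
</xsl:stylesheet>
XSL;

    $source = new DOMDocument();
    $source->loadXML($xml);

    $style = new DOMDocument();
    $style->loadXML($xsl);

    $proc = new XSLTProcessor();
    $proc->importStylesheet($style);

    return trim($proc->transformToXML($source));
}

echo stripText('<root><node>value</node><node><sub>more</sub></node></root>'), "\n";
```

The explicit priority is there because the text() pattern and the node() pattern in the identity template otherwise have the same default priority, and XSLT 1.0 leaves that conflict to the processor.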