How to preserve namespaces when parsing XML with ElementTree in Python

Problem description:

Assume that I have the following XML, which I want to modify using Python's ElementTree:

<root xmlns:prefix="URI">
  <child prefix:name="***"/>
  ...
</root> 

I'm modifying the XML file like this:

import xml.etree.ElementTree as ET
tree = ET.parse('filename.xml')
# XML modification here
# save the modifications
tree.write('filename.xml')

Afterwards, the XML file looks like this:

<root xmlns:ns0="URI">
  <child ns0:name="***"/>
  ...
</root>

As you can see, the namespace prefix changed to ns0. I'm aware that ET.register_namespace() can be used to control the prefix.

The problems with ET.register_namespace() are:

  1. You need to know the prefix and the URI in advance.

  2. It cannot be used with the default namespace.

For example, if the XML looks like this:

<root xmlns="http://uri">
    <child name="name">
    ...
    </child>
</root>

It will be transformed into something like:

<ns0:root xmlns:ns0="http://uri">
    <ns0:child name="name">
    ...
    </ns0:child>
</ns0:root>

As you can see, the default namespace is changed to ns0.

Is there any way to solve this problem with ElementTree?

Here is a way to preserve the namespaces' prefixes and URIs:

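import xml.etree.ElementTree as ET

def register_all_namespaces(filename):
    # Each 'start-ns' event carries a (prefix, uri) pair declared in the file
    namespaces = dict(node for _, node in ET.iterparse(filename, events=['start-ns']))
    for prefix, uri in namespaces.items():
        ET.register_namespace(prefix, uri)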

This function should be called before calling the write() method (tree.write() in the question's code).
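
To make the order of calls concrete, here is a minimal round-trip sketch. The sample documents, the example.com URIs, and the round_trip helper are made up for illustration and are not from the question. It assumes CPython's standard xml.etree.ElementTree, where the default namespace appears to be reported by 'start-ns' with an empty prefix and an empty prefix is accepted by ET.register_namespace(), so the same helper may cover the default-namespace case as well.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def register_all_namespaces(filename):
    # Same idea as the helper above, repeated so this sketch runs on its own:
    # register every (prefix, uri) pair declared in the file.
    for _, (prefix, uri) in ET.iterparse(filename, events=['start-ns']):
        ET.register_namespace(prefix, uri)


def round_trip(xml_text):
    """Hypothetical helper: save xml_text, parse it, write it back, return the result."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'sample.xml')
        with open(path, 'w', encoding='utf-8') as f:
            f.write(xml_text)

        register_all_namespaces(path)  # must happen before write()
        tree = ET.parse(path)
        # XML modification would go here
        tree.write(path)

        with open(path, encoding='utf-8') as f:
            return f.read()


# Placeholder URIs, chosen only for this example.
prefixed = round_trip(
    '<root xmlns:prefix="http://example.com/ns">'
    '<child prefix:name="value"/></root>'
)
default = round_trip(
    '<root xmlns="http://example.com/default">'
    '<child name="name"/></root>'
)

print(prefixed)
print(default)
```

Note that ET.register_namespace() changes a module-wide registry, so the registered prefixes also affect any other tree serialized later in the same process.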