Why is the entity &eacute; invalid when the entity &lt; is valid?
I'm looking at what the XML resolver System.Xml.Resolvers.XmlPreloadedResolver brings to the table in terms of DTDs, and I'm stumped by the fact that the entity &lt; is recognized by the XML reader but the entity &eacute; is not.
    // Requires: using System.IO; using System.Xml; using System.Xml.Resolvers;
    private static void Main(string[] args)
    {
        // Neither document contains a DOCTYPE declaration.
        string invalidContent = "<?xml version=\"1.0\" encoding=\"utf-8\"?><key value=\"char &eacute; invalid\"/>";
        string validContent = "<?xml version=\"1.0\" encoding=\"utf-8\"?><key value=\"char &lt; valid\"/>";
        XmlDocument xmlDocument = new XmlDocument();
        var xmlReaderSettings = new XmlReaderSettings()
        {
            DtdProcessing = DtdProcessing.Parse,
            XmlResolver = new XmlPreloadedResolver(XmlKnownDtds.All),
            ProhibitDtd = false // obsolete; redundant once DtdProcessing is set
        };
        using (XmlReader reader = XmlReader.Create(new StringReader(invalidContent), xmlReaderSettings))
        {
            xmlDocument.Load(reader); // throws: reference to undeclared entity 'eacute'
        }
        using (XmlReader reader = XmlReader.Create(new StringReader(validContent), xmlReaderSettings))
        {
            xmlDocument.Load(reader); // loads without error
        }
    }
Checking inside the XmlPreloadedResolver, I can see that XmlKnownDtds.All should bring in the xhtml-lat1.ent file, which contains the eacute entity along with many others. Any idea why I'm seeing this behavior?
&lt; is a fundamental entity defined in the XML specification itself; &eacute; isn't. That's why you're seeing the difference in behaviour. (So I'd expect &amp;, &gt;, &apos; and &quot; to work too.) See http://www.w3.org/TR/REC-xml/#sec-references
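To make the distinction concrete, here is a minimal sketch showing that the five predefined entities, and numeric character references such as &#233;, decode without any DTD or resolver. The class name PredefinedEntities and the helper ReadValue are made up for this illustration and are not part of the question's code.

```csharp
using System;
using System.IO;
using System.Xml;

public static class PredefinedEntities
{
    // Hypothetical helper: wraps the given text in <key value="..."/>,
    // parses it with default reader settings (no DTD, no resolver),
    // and returns the decoded attribute value.
    public static string ReadValue(string attributeText)
    {
        string xml = "<key value=\"" + attributeText + "\"/>";
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent(); // position on the <key> element
            return reader.GetAttribute("value");
        }
    }

    public static void Main()
    {
        // The five entities predefined by the XML specification.
        Console.WriteLine(ReadValue("&lt; &gt; &amp; &apos; &quot;")); // prints: < > & ' "

        // A numeric character reference also needs no declaration.
        Console.WriteLine(ReadValue("&#233;")); // prints: é
    }
}
```

Passing "&eacute;" to the same helper should fail with the undeclared-entity error from the question, because nothing in the document declares it.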
I don't think the XmlResolver is particularly relevant here, as your XML doesn't refer to any other DTDs etc. I don't think it's meant to be used to automatically import entities without referring to anything at all within the document itself.
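If that reading is right, the resolver should come into play once the document itself references a DTD that it knows about. The following is a sketch under that assumption, not a confirmed fix: it prepends an XHTML 1.0 Strict DOCTYPE (my choice for illustration) to the question's document so that the preloaded DTD, and with it xhtml-lat1.ent, can be resolved locally. The names DoctypeDemo and ReadValue are hypothetical. No validation is requested, so the root element not being html should not matter here.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Resolvers;

public static class DoctypeDemo
{
    // Hypothetical helper: loads the XML with DTD parsing enabled and the
    // preloaded resolver attached, then returns the root's "value" attribute.
    public static string ReadValue(string xml)
    {
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Parse,
            XmlResolver = new XmlPreloadedResolver(XmlKnownDtds.All)
        };
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            var document = new XmlDocument();
            document.Load(reader);
            return document.DocumentElement.GetAttribute("value");
        }
    }

    public static void Main()
    {
        // Assumption: referencing the XHTML DTD is acceptable for this document.
        // The DOCTYPE is what gives the resolver something to resolve.
        string xml =
            "<?xml version=\"1.0\" encoding=\"utf-8\"?>" +
            "<!DOCTYPE key PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" " +
            "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">" +
            "<key value=\"char &eacute; valid\"/>";

        Console.WriteLine(ReadValue(xml));
    }
}
```

An alternative that avoids the resolver entirely would be to declare the entity in an internal subset, for example <!DOCTYPE key [<!ENTITY eacute "&#233;">]>, at the cost of repeating the declaration in every document that uses it.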