How do I handle different versions of XSD files in one Java application?

Problem description:

In my Java application I have to handle XML files with different schema versions (XSD files) simultaneously. The content of the XML files changed only slightly between versions, so I'd like to use mostly the same code to handle them and make only a few case distinctions depending on the version of the schema in use.

Right now I'm parsing the XML files with a SAX parser and my own ContentHandler ignoring the schema version and just checking if the tags I need for processing are present.

I'd really like to use JAXB to generate the classes for parsing the XML files. That way I could remove all the hardcoded strings (constants) from my Java code and work with the generated classes instead.


  • How can I handle the different schema versions in a uniform way using JAXB?

  • Is there a better solution?

I compiled the schema versions to different packages v1, v2 and v3. Now I can create an Unmarshaller this way:

JAXBContext jc = JAXBContext.newInstance( 
    v1.Root.class, v2.Root.class, v3.Root.class );
Unmarshaller u = jc.createUnmarshaller();

Now u.unmarshal( xmlInputStream ); gives me an instance of the Root class from the package that matches the schema of the XML file.

Next I'll try to define an interface to access the common parts of the schemas. If you have done something like this before, please let me know. In the meantime I'm reading through the JAXB specs...

First, you need some way to identify the schema appropriate for the particular instance document. You say that the documents have a schemaLocation attribute, so this is one solution. Note, however, that you have to specifically configure the parser to use this attribute, and a malicious document could specify a schema location that you don't control. Instead, I'd recommend getting the attribute value, and using it to find the appropriate schema in an internal table.
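A minimal sketch of that lookup, using only the StAX API that ships with the JDK. The schema URLs, the version labels, and the names SchemaVersionLookup, KNOWN_SCHEMAS and detectVersion are assumptions made up for illustration; the point is only that the attribute value is treated as a key into a table you control and is never fetched.

```java
import java.io.StringReader;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class SchemaVersionLookup {

    // Hypothetical internal table: trusted schema location -> version label.
    static final Map<String, String> KNOWN_SCHEMAS = Map.of(
            "http://example.com/schemas/root-v1.xsd", "v1",
            "http://example.com/schemas/root-v2.xsd", "v2",
            "http://example.com/schemas/root-v3.xsd", "v3");

    /** Reads xsi:schemaLocation from the root element and maps it to a known version. */
    static String detectVersion(String xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // The attribute is only read as text; nothing external is resolved.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    // The first START_ELEMENT event is the root element.
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        return lookup(reader.getAttributeValue(
                                XMLConstants.W3C_XML_SCHEMA_INSTANCE_NS_URI, "schemaLocation"));
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Document is not well-formed XML", e);
        }
        throw new IllegalArgumentException("No root element found");
    }

    /** xsi:schemaLocation holds "namespace location" pairs; check each location. */
    static String lookup(String schemaLocation) {
        if (schemaLocation == null) {
            throw new IllegalArgumentException("Root element has no xsi:schemaLocation");
        }
        String[] tokens = schemaLocation.trim().split("\\s+");
        for (int i = 1; i < tokens.length; i += 2) {
            String version = KNOWN_SCHEMAS.get(tokens[i]);
            if (version != null) {
                return version;
            }
        }
        throw new IllegalArgumentException("Unknown schema location: " + schemaLocation);
    }

    public static void main(String[] args) {
        String xml = "<root xmlns=\"urn:example:root\""
                + " xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\""
                + " xsi:schemaLocation=\"urn:example:root http://example.com/schemas/root-v2.xsd\"/>";
        System.out.println(detectVersion(xml));
    }
}
```

A document that names a location outside the table is rejected instead of being validated against a schema the document chose.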

Next is access to the data. You don't say why you're using three different schemas. The only rational reason is an evolving data spec (i.e., the schemas represent versions 1, 2, and 3 of the same data). If that's not your reason, then you need to rethink your design.

If you are trying to support an evolving data spec, then you need to answer the question "how do I deal with data that's missing." There are a couple of answers to this: one is to maintain multiple versions of the code. With refactoring of common functionality, this is not a bad idea, but it can easily become unmaintainable.

The alternative is to use a single codebase, and some sort of adapter object that incorporates your rules. And if you go down this path, JAXB is the wrong solution, since it is tied to a schema. You might be able to use a permissive XML->Java converter: I believe XStream will work, and I know that the 1.1 release of Practical XML will work (since I wrote it) -- although you'd have to build it yourself.
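To make the adapter idea concrete, here is a rough sketch. Everything in it is hypothetical: the fields name and currency, the assumption that currency only appears from v2 on, the default "EUR", and the Map<String, String> that stands in for whatever a permissive XML->Java converter would hand back. The shared code only sees RootView; each rule for missing data lives in exactly one adapter.

```java
import java.util.Map;

class VersionAdapterSketch {

    /** The view that the shared processing code works against. */
    interface RootView {
        String name();
        String currency(); // hypothetical field, assumed to exist only from v2 on
    }

    /** v1 has no currency element, so the rule for the missing value lives here. */
    static final class V1View implements RootView {
        private final Map<String, String> raw;
        V1View(Map<String, String> raw) { this.raw = raw; }
        public String name() { return raw.get("name"); }
        public String currency() { return "EUR"; } // assumed default for old documents
    }

    /** v2 carries the value itself, so the adapter just passes it through. */
    static final class V2View implements RootView {
        private final Map<String, String> raw;
        V2View(Map<String, String> raw) { this.raw = raw; }
        public String name() { return raw.get("name"); }
        public String currency() { return raw.get("currency"); }
    }

    /** Picks the adapter for a version label; the labels are assumptions. */
    static RootView adapt(String version, Map<String, String> raw) {
        switch (version) {
            case "v1": return new V1View(raw);
            case "v2": return new V2View(raw);
            default: throw new IllegalArgumentException("Unsupported version: " + version);
        }
    }

    /** Version-independent processing code. */
    static String describe(RootView view) {
        return view.name() + " (" + view.currency() + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe(adapt("v1", Map.of("name", "Ada"))));
        System.out.println(describe(adapt("v2", Map.of("name", "Ada", "currency", "USD"))));
    }
}
```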

Another, better alternative, depending on the complexity of the schema, is to develop a set of objects that use XPath to retrieve the data. I would probably implement this with a "master" object that contains the XPath expressions for every field, in every variant of the schema. Then create lightweight "wrapper" objects that hold a DOM version of your instance document, and use the XPath appropriate to the schema. Note, however, that this is limited to read-only access.
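A sketch of how the master table and the wrapper might fit together, using the JDK's javax.xml.xpath and DOM APIs. The element names, the customerName field, and the two layouts (an attribute in v1, a nested element in v2) are invented for the example, and namespaces are left out to keep it short.

```java
import java.io.StringReader;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XPathAccessSketch {

    // "Master" table: version -> (field name -> XPath). All entries are hypothetical.
    static final Map<String, Map<String, String>> PATHS = Map.of(
            "v1", Map.of("customerName", "/order/@customer"),
            "v2", Map.of("customerName", "/order/customer/name"));

    /** Lightweight read-only wrapper: a DOM document plus the paths for its version. */
    static final class OrderView {
        private final Document doc;
        private final Map<String, String> paths;
        private final XPath xpath = XPathFactory.newInstance().newXPath();

        OrderView(Document doc, Map<String, String> paths) {
            this.doc = doc;
            this.paths = paths;
        }

        /** Returns the string value of the field, or "" if the node is absent. */
        String get(String field) {
            String expression = paths.get(field);
            if (expression == null) {
                throw new IllegalArgumentException("Unknown field: " + field);
            }
            try {
                return xpath.evaluate(expression, doc);
            } catch (XPathExpressionException e) {
                throw new IllegalStateException("Bad XPath for field " + field, e);
            }
        }
    }

    /** Parses the document and wraps it with the paths for the given version. */
    static OrderView view(String version, String xml) {
        Map<String, String> paths = PATHS.get(version);
        if (paths == null) {
            throw new IllegalArgumentException("Unsupported version: " + version);
        }
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return new OrderView(doc, paths);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(view("v1", "<order customer=\"Ada\"/>").get("customerName"));
        System.out.println(view("v2",
                "<order><customer><name>Ada</name></customer></order>").get("customerName"));
    }
}
```

Supporting a new schema version then means adding one entry to PATHS; the calling code stays the same.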