Jackson JsonNode with an empty element key

Question:

I am using jackson-dataformat-xml (2.9) to parse XML into a JsonNode and then serialize it to JSON. The XML is very dynamic, which is why I am using JsonNode instead of binding to a POJO; for example, the 'elementName' and 'id' names may vary.

During the JSON serialization phase, one of the element keys turns out to be an empty string ("").

XML:

<elementName>
      <id type="pid">abcdef123</id>
</elementName>

Parsing logic:

private final ObjectMapper jsonMapper;
private final XmlMapper xmlMapper;

public Parser() {
    // Assign to fields so that parseXmlResponse can use the mappers.
    jsonMapper = new ObjectMapper();
    xmlMapper = new XmlMapper(new XmlFactory(new WstxInputFactory()));
}

public InputStream parseXmlResponse(InputStream xmlStream) {
    InputStream stream = null;

    try {
        JsonNode node = xmlMapper.readTree(xmlStream);
        stream = new ByteArrayInputStream(jsonMapper.writer().writeValueAsBytes(node));
    } catch (IOException e) {
        e.printStackTrace();
    }

    return stream;
}

JSON:

Result:

{
   "elementName": {
     "id": {
        "type": "pid",
        "": "abcdef123"
     }
   }
}

Expected:

{
   "elementName": {
     "id": {
        "type": "pid",
        "value": "abcdef123"
     }
   }
}

My idea is to find every occurrence of the empty key "" and replace it with "value", either during XML deserialization or during JSON serialization. I have tried using a default serializer and a filter, but haven't got it working in a clean and concise way.

Suggestions are much appreciated.

Thanks for your help.

Based on @shoek's suggestion, I decided to write a custom serializer to avoid creating an intermediate object (ObjectNode) during the process.

Edit: refactored based on the same solution proposed by @shoek.

public class CustomNode {
    private JsonNode jsonNode;

    public CustomNode(JsonNode jsonNode) {
        this.jsonNode = jsonNode;
    }

    public JsonNode getJsonNode() {
        return jsonNode;
    }
}

public class CustomObjectsResponseSerializer extends StdSerializer<CustomNode> {

    protected CustomObjectsResponseSerializer() {
        super(CustomNode.class);
    }

    @Override
    public void serialize(CustomNode node, JsonGenerator jgen, SerializerProvider provider) throws IOException {
        convertObjectNode(node.getJsonNode(), jgen, provider);
    }

    private void convertObjectNode(JsonNode node, JsonGenerator jgen, SerializerProvider provider) throws IOException {
        jgen.writeStartObject();
        for (Iterator<String> it = node.fieldNames(); it.hasNext(); ) {
            String childName = it.next();
            JsonNode childNode = node.get(childName);
            // XML parser returns an empty string as value name. Replacing it with "value"
            if (Objects.equals("", childName)) {
                childName = "value";
            }

            if (childNode instanceof ArrayNode) {
                jgen.writeFieldName(childName);
                convertArrayNode(childNode, jgen, provider);
            } else if (childNode instanceof ObjectNode) {
                jgen.writeFieldName(childName);
                convertObjectNode(childNode, jgen, provider);
            } else {
                provider.defaultSerializeField(childName, childNode, jgen);
            }
        }
        jgen.writeEndObject();

    }

    private void convertArrayNode(JsonNode node, JsonGenerator jgen, SerializerProvider provider) throws IOException {
        jgen.writeStartArray();
        for (Iterator<JsonNode> it = node.elements(); it.hasNext(); ) {
            JsonNode childNode = it.next();

            if (childNode instanceof ArrayNode) {
                convertArrayNode(childNode, jgen, provider);
            } else if (childNode instanceof ObjectNode) {
                convertObjectNode(childNode, jgen, provider);
            } else {
                provider.defaultSerializeValue(childNode, jgen);
            }
        }
        jgen.writeEndArray();
    }
}

Answer:

You could also simply post-process the JSON DOM: traverse all objects and rename any key that is an empty string to "value".

Edge case (key collision): a key named "value" may already exist, and must not be overwritten
(e.g. <id type="pid" value="existing">abcdef123</id>).

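Usage:
(Note: you should not silently suppress the exception and return null. Propagate it instead, so the caller can decide whether to catch it and apply failover logic.)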

public InputStream parseXmlResponse(InputStream xmlStream) throws IOException {
    JsonNode node = xmlMapper.readTree(xmlStream);
    postprocess(node);
    return new ByteArrayInputStream(jsonMapper.writer().writeValueAsBytes(node));
}

Post-processing:

private void postprocess(JsonNode jsonNode) {

    if (jsonNode.isArray()) {
        ArrayNode array = (ArrayNode) jsonNode;
        Iterable<JsonNode> elements = () -> array.elements();

        // recursive post-processing
        for (JsonNode element : elements) {
            postprocess(element);
        }
    }
    if (jsonNode.isObject()) {
        ObjectNode object = (ObjectNode) jsonNode;
        Iterable<String> fieldNames = () -> object.fieldNames();

        // recursive post-processing
        for (String fieldName : fieldNames) {
            postprocess(object.get(fieldName));
        }
        // check if an attribute with empty string key exists, and rename it to 'value',
        // unless there already exists another non-null attribute named 'value' which
        // would be overwritten.
        JsonNode emptyKeyValue = object.get("");
        JsonNode existing = object.get("value");
        if (emptyKeyValue != null) {
            if (existing == null || existing.isNull()) {
                object.set("value", emptyKeyValue);
                object.remove("");
            } else {
                System.err.println("Skipping empty key value as a key named 'value' already exists.");
            }
        }
    }
}

Output: as expected.

{
   "elementName": {
     "id": {
        "type": "pid",
        "value": "abcdef123"
     }
   }
}
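The collision rule used in the post-processing step can be tried out without Jackson on the classpath. The following is only a minimal sketch: it uses plain java.util.Map and java.util.List as a stand-in for ObjectNode and ArrayNode, and the class name EmptyKeyRename is hypothetical. It is not the Jackson-based code above, just the same rename-unless-taken logic.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class EmptyKeyRename {

    /**
     * Recursively renames the key "" to "value" in nested maps and lists.
     * The rename is skipped when a non-null "value" entry already exists.
     * Assumption: maps and lists here stand in for ObjectNode and ArrayNode.
     */
    @SuppressWarnings("unchecked")
    public static void postprocess(Object node) {
        if (node instanceof List) {
            for (Object element : (List<Object>) node) {
                postprocess(element);
            }
        } else if (node instanceof Map) {
            Map<String, Object> object = (Map<String, Object>) node;
            // Recurse first; the map itself is only modified after the loop.
            for (Object child : object.values()) {
                postprocess(child);
            }
            if (object.containsKey("") && object.get("value") == null) {
                object.put("value", object.remove(""));
            }
        }
    }

    public static void main(String[] args) {
        // Mirrors <id type="pid">abcdef123</id>
        Map<String, Object> id = new LinkedHashMap<>();
        id.put("type", "pid");
        id.put("", "abcdef123");

        // Mirrors <id type="pid" value="existing">abcdef123</id>
        Map<String, Object> clash = new LinkedHashMap<>();
        clash.put("type", "pid");
        clash.put("value", "existing");
        clash.put("", "abcdef123");

        postprocess(id);
        postprocess(clash);

        System.out.println(id);    // the "" key has been renamed to "value"
        System.out.println(clash); // "value" is kept, the "" key is left untouched
    }
}
```

In the collision case the empty key is deliberately left in place rather than dropped, so no data is lost; whether to log, rename to some other key, or fail is a choice for the caller.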

Note on performance:

I ran a test with a large XML file (enwikiquote-20200520-pages-articles-multistream.xml, the en.wikiquote XML dump, 498.4 MB) over 100 rounds, and measured the following times (using deltas of System.nanoTime()):

average read time (file, SSD): 2870.96 ms
(JsonNode node = xmlMapper.readTree(xmlStream);)
average post-processing time: 0.04 ms
(postprocess(node);)
average write time (memory): 0.31 ms
(new ByteArrayInputStream(jsonMapper.writer().writeValueAsBytes(node));)

That's a fraction of a millisecond for an object tree built from a ~500 MB file, so post-processing performance is excellent and not a concern.
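For readers who want to repeat this kind of measurement, the approach described (averaging System.nanoTime() deltas over a number of rounds) might look roughly like the sketch below. The class name TimingSketch and the helper averageMillis are hypothetical, and the workload in main is only a placeholder for calls such as xmlMapper.readTree(...) or postprocess(node). It is not the code that produced the numbers above.

```java
public class TimingSketch {

    /**
     * Runs the given step `rounds` times and returns the mean wall-clock
     * time per round in milliseconds, based on System.nanoTime() deltas.
     * Checked exceptions (e.g. IOException) would have to be wrapped
     * by the caller, since Runnable cannot throw them.
     */
    public static double averageMillis(int rounds, Runnable step) {
        long totalNanos = 0;
        for (int i = 0; i < rounds; i++) {
            long start = System.nanoTime();
            step.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / rounds;
    }

    public static void main(String[] args) {
        // Placeholder workload; substitute the read, post-process,
        // or write step that should be measured.
        StringBuilder sink = new StringBuilder();
        double avg = averageMillis(100, () -> sink.append('x'));
        System.out.printf("average time: %.4f ms%n", avg);
    }
}
```

A simple loop like this does no JIT warm-up and does not guard against dead-code elimination, so very small averages should be read with some caution; a dedicated harness such as JMH would give more reliable figures.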
