Efficiently parsing the first four elements of large JSON arrays
I am using Jackson to parse JSON from an input stream that looks like the following:
[
[ 36,
100,
"The 3n + 1 problem",
56717,
0,
1000000000,
0,
6316,
0,
0,
88834,
0,
45930,
0,
46527,
5209,
200860,
3597,
149256,
3000,
1
],
[
........
],
[
........
],
.....// and almost 5000 arrays like above
]
This is the original feed link: http://uhunt.felix-halim.net/api/p
I want to parse it and keep only the first 4 elements of every array, skipping the other 18 elements. For the array above, that means keeping:
36
100
The 3n + 1 problem
56717
Code structure I have tried so far:
while (jsonParser.nextToken() != JsonToken.END_ARRAY) {
jsonParser.nextToken(); // '['
while (jsonParser.nextToken() != JsonToken.END_ARRAY) {
// I tried many approaches here but not found appropriate one
}
}
As this feed is pretty big, I need to do this efficiently, with little overhead and memory. Also, there are three models for processing JSON: the Streaming API, Data Binding, and the Tree Model. Which one is appropriate for my purpose?
How can I parse this JSON efficiently with Jackson? How can I skip those 18 elements and jump to the next array for better performance?
(Solution)
Jackson and Gson both work in almost the same way (incremental mode, since content is read and written incrementally). I am switching to Gson because it has a skipValue() function (the name fits well). Although Jackson's nextToken() can be used much like skipValue() for scalar values such as these, Gson seems more flexible to me. Thanks to @Kowser for the recommendation; I had come across Gson before but somehow ignored it. This is my working code:
reader.beginArray();                      // outer '['
while (reader.hasNext()) {
    reader.beginArray();                  // inner '['
    int a = reader.nextInt();
    int b = reader.nextInt();
    String c = reader.nextString();
    int d = reader.nextInt();
    System.out.println(a + " " + b + " " + c + " " + d);
    while (reader.hasNext()) {
        reader.skipValue();               // skip the rest of the inner array
    }
    reader.endArray();                    // inner ']'
}
reader.endArray();                        // outer ']'
reader.close();
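The snippet above assumes a reader that already exists. A self-contained sketch of how it might be set up is shown here; the class name FirstFourGson, the method firstFour, and the in-memory sample string are made up for illustration, and it assumes Gson is on the classpath. Instead of printing, it collects the kept values into a list.

```java
import com.google.gson.stream.JsonReader;

import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class FirstFourGson {

    // Hypothetical helper: returns one "a b c d" line per inner array.
    public static List<String> firstFour(Reader in) throws IOException {
        List<String> rows = new ArrayList<>();
        // JsonReader is Closeable, so try-with-resources closes it for us.
        try (JsonReader reader = new JsonReader(in)) {
            reader.beginArray();                 // outer '['
            while (reader.hasNext()) {
                reader.beginArray();             // inner '['
                int a = reader.nextInt();
                int b = reader.nextInt();
                String c = reader.nextString();
                int d = reader.nextInt();
                rows.add(a + " " + b + " " + c + " " + d);
                while (reader.hasNext()) {
                    reader.skipValue();          // also skips nested arrays/objects
                }
                reader.endArray();               // inner ']'
            }
            reader.endArray();                   // outer ']'
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // Shortened stand-in for the real feed (assumption: same layout).
        String sample = "[[36,100,\"The 3n + 1 problem\",56717,0,1000000000,0]]";
        for (String row : firstFour(new StringReader(sample))) {
            System.out.println(row);
        }
    }
}
```

For the real feed, the StringReader could presumably be replaced by an InputStreamReader wrapped around the HTTP response stream, with the charset set explicitly (for example UTF-8).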
This is for Jackson.
Follow this tutorial.
Judicious use of jsonParser.nextToken() should help you.
while (jsonParser.nextToken() != JsonToken.END_ARRAY) { // might be JsonToken.START_ARRAY?
The pseudo-code is
- find next array
- read values
- skip other values
- skip next end token
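The steps above can be sketched with Jackson's streaming API roughly as follows. This is only a sketch under assumptions: the class FirstFourJackson, the Problem holder and its field names are made up for illustration (the feed does not say what the four values mean), the sample string stands in for the real feed, and jackson-core 2.x is assumed to be on the classpath.

```java
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class FirstFourJackson {

    // Hypothetical holder for the four kept values; the names are guesses.
    public static final class Problem {
        public final int first;
        public final int second;
        public final String title;
        public final int fourth;

        Problem(int first, int second, String title, int fourth) {
            this.first = first;
            this.second = second;
            this.title = title;
            this.fourth = fourth;
        }
    }

    public static List<Problem> parse(JsonParser parser) throws IOException {
        List<Problem> result = new ArrayList<>();
        if (parser.nextToken() != JsonToken.START_ARRAY) {
            throw new IOException("expected the outer '['");
        }
        // Find next array: each pass sees an inner '[' or the outer ']'.
        while (parser.nextToken() == JsonToken.START_ARRAY) {
            // Read values: advance to each token, then read it.
            parser.nextToken();
            int first = parser.getIntValue();
            parser.nextToken();
            int second = parser.getIntValue();
            parser.nextToken();
            String title = parser.getText();
            parser.nextToken();
            int fourth = parser.getIntValue();
            result.add(new Problem(first, second, title, fourth));
            // Skip other values until the inner ']' (the end token).
            while (parser.nextToken() != JsonToken.END_ARRAY) {
                parser.skipChildren(); // no-op on scalars, skips nested containers
            }
        }
        return result;
    }

    // Convenience wrapper for in-memory JSON (hypothetical helper).
    public static List<Problem> parseString(String json) throws IOException {
        try (JsonParser parser = new JsonFactory().createParser(json)) {
            return parse(parser);
        }
    }

    public static void main(String[] args) throws IOException {
        // Shortened stand-in for the real feed (assumption: same layout).
        String sample = "[[36,100,\"The 3n + 1 problem\",56717,0,1000000000,0]]";
        for (Problem p : parseString(sample)) {
            System.out.println(p.first + " " + p.second + " " + p.title + " " + p.fourth);
        }
    }
}
```

Only four small values per array are kept in memory; everything else is tokenized and discarded. For the real feed, JsonFactory.createParser also accepts an InputStream, so the response stream could be passed in directly instead of a string.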
This is for Gson.
Take a look at this tutorial. Consider following the second example from the tutorial.
Judicious use of reader.begin*, reader.end*, and reader.skipValue should do the job for you.
Here is JsonReader