What dump, dumps, load, and loads do in Python's json module

JSON

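First, a brief introduction to the JSON format. JSON stands for JavaScript Object Notation. As the name suggests, it derives from the object syntax of JavaScript. Today JSON is no longer tied to any one language; it is a general-purpose data interchange format that is more lightweight than XML.

A typical JSON document can be viewed as a nested dictionary: put simply, the value of an element in a dictionary can itself be a dictionary. We are all familiar with Python's dict type. Suppose we define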

json_dict = {'a': {'c': {'e': 404}, 'f': 233}, 'b': {'d': 1}}

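Here the value of 'a' is a dict with two keys, 'c' and 'f'. The value of 'c' is again a dict, while the value of 'f' is a number, and so on. A dictionary like this has a clear structure, so it can be used to transmit structured data.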
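As a minimal sketch of how this nested dict is navigated and what it looks like once written out as JSON text (the names inner and text are only illustrative):

```python
import json

json_dict = {'a': {'c': {'e': 404}, 'f': 233}, 'b': {'d': 1}}

# Walk down the nesting one key at a time.
inner = json_dict['a']['c']['e']
print(inner)  # 404

# The same structure as JSON text; JSON always uses double quotes.
text = json.dumps(json_dict)
print(text)  # {"a": {"c": {"e": 404}, "f": 233}, "b": {"d": 1}}
```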

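Next, let's look at the description on the official JSON website:

JSON is built from the following forms:

An object is an unordered set of name/value pairs. An object begins with { (left brace) and ends with } (right brace). Each name is followed by : (colon), and the name/value pairs are separated by , (comma).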

An array is an ordered collection of values. An array begins with [ (left bracket) and ends with ] (right bracket). Values are separated by , (comma).

A value can be a string in double quotes, a number, true, false, null, an object, or an array. These structures can be nested.

A string is a sequence of zero or more Unicode characters, wrapped in double quotes, using backslash escapes. A character is represented as a single-character string.

A JSON string is very much like a C or Java string.

A number is also very much like a C or Java number, except that JSON does not use the octal and hexadecimal formats.

Whitespace can be inserted between any pair of tokens.
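To make these forms concrete, here is a small sketch that parses a hand-written document containing every kind of value and prints the Python type each one becomes. The document contents are made up for illustration.

```python
import json

# A JSON document exercising every kind of value described above.
# The extra whitespace between tokens is allowed by the format.
text = '''
{
    "name": "example",
    "count": 3,
    "ratio": 0.5,
    "enabled": true,
    "disabled": false,
    "missing": null,
    "tags": ["a", "b"],
    "nested": {"k": "v"}
}
'''

data = json.loads(text)
for key, value in data.items():
    # Prints str, int, float, bool, bool, NoneType, list, dict in turn.
    print(key, type(value).__name__)
```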

The json module in Python

Python's json module is used to work with JSON-formatted data. Its main functions are:

  • json.dump()

  • json.dumps()

  • json.load()

  • json.loads()

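dump converts a Python object (typically a dict) into JSON, while load does the opposite: it reads JSON from a file or a string and parses it into a Python object such as a dict.

Put simply, the s stands for string: the functions with an s dump a dict to a str or load a dict from a str, while those without it write the dict to a file as JSON or read JSON back from a file.

Here are the signatures from the official documentation: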
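The four functions can be compared side by side in a short sketch. The file name data.json and the temporary directory are assumptions made only so the example cleans up after itself.

```python
import json
import os
import tempfile

data = {'a': {'c': {'e': 404}, 'f': 233}, 'b': {'d': 1}}

# dumps / loads: work with a str held in memory.
text = json.dumps(data)
from_text = json.loads(text)

# dump / load: work with an open file object.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'data.json')
    with open(path, 'w', encoding='utf-8') as fp:
        json.dump(data, fp)
    with open(path, 'r', encoding='utf-8') as fp:
        from_file = json.load(fp)

print(from_text == data, from_file == data)  # True True
```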

json.dump(obj, fp, *, skipkeys=False, ensure_ascii=True,
check_circular=True, allow_nan=True, cls=None, indent=None,
separators=None, default=None, sort_keys=False, **kw)

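As the signature shows, the main parameters of dump are obj and fp: the object to serialize (typically a dict) and the handle of the file to write to. For nicely indented output, set indent. separators is a 2-tuple (item_sep, key_sep), that is, the separator between items and the separator between a key and its value. The defaults are a comma and a colon: when indent is not set, both are followed by a space (', ' and ': '); when indent is set, the comma needs no trailing space (',' and ': ') because each item starts on its own indented line. You can also specify the separators yourself. Another useful option is sort_keys: if it is True, the output is sorted by key.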
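The effect of these parameters is easiest to see with dumps, which takes the same arguments. This is a small sketch with made-up data:

```python
import json

data = {'b': [1, 2], 'a': {'x': None}}

# Default separators without indent: ', ' and ': '.
default = json.dumps(data)
print(default)  # {"b": [1, 2], "a": {"x": null}}

# Custom separators with no spaces give the most compact output.
compact = json.dumps(data, separators=(',', ':'))
print(compact)  # {"b":[1,2],"a":{"x":null}}

# indent pretty-prints; sort_keys orders the keys alphabetically.
pretty = json.dumps(data, indent=2, sort_keys=True)
print(pretty)
```

With indent=2 and sort_keys=True, the key "a" is printed before "b" and every item sits on its own line.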

json.dumps(obj, *, skipkeys=False, ensure_ascii=True,
check_circular=True, allow_nan=True, cls=None, indent=None,
separators=None, default=None, sort_keys=False, **kw)

Same as above, except that the result is returned as a string instead of being written to a file.

Serialize obj to a JSON formatted str using this conversion table. The arguments have the same meaning as in dump().

Note: Keys in key/value pairs of JSON are always of the type str. When a dictionary is converted into JSON, all the keys of the dictionary are coerced to strings. As a result of this, if a dictionary is converted into JSON and then back into a dictionary, the dictionary may not equal the original one. That is, loads(dumps(x)) != x if x has non-string keys.
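The key coercion described in that note can be reproduced in a few lines. The final step, converting keys back with int, is just one possible workaround and assumes every key was originally an integer.

```python
import json

original = {1: 'one', 2: 'two'}

# Integer keys become strings on the way through JSON.
round_tripped = json.loads(json.dumps(original))
print(round_tripped)              # {'1': 'one', '2': 'two'}
print(round_tripped == original)  # False

# Assumption: all keys were ints, so they can be converted back.
restored = {int(k): v for k, v in round_tripped.items()}
print(restored == original)       # True
```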

json.load(fp, *, cls=None, object_hook=None, parse_float=None,
parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)

load and loads follow the same pattern.

Deserialize fp (a .read()-supporting text file or binary file containing a JSON document) to a Python object using this conversion table.

object_hook is an optional function that will be called with the result of any object literal decoded (a dict). The return value of object_hook will be used instead of the dict. This feature can be used to implement custom decoders (e.g. JSON-RPC class hinting).

object_pairs_hook is an optional function that will be called with the result of any object literal decoded with an ordered list of pairs. The return value of object_pairs_hook will be used instead of the dict. This feature can be used to implement custom decoders. If object_hook is also defined, the object_pairs_hook takes priority.

Changed in version 3.1: Added support for object_pairs_hook.

parse_float, if specified, will be called with the string of every JSON float to be decoded. By default, this is equivalent to float(num_str). This can be used to use another datatype or parser for JSON floats (e.g. decimal.Decimal).

parse_int, if specified, will be called with the string of every JSON int to be decoded. By default, this is equivalent to int(num_str). This can be used to use another datatype or parser for JSON integers (e.g. float).

parse_constant, if specified, will be called with one of the following strings: '-Infinity', 'Infinity', 'NaN'. This can be used to raise an exception if invalid JSON numbers are encountered.

Changed in version 3.1: parse_constant doesn't get called on 'null', 'true', 'false' anymore.

To use a custom JSONDecoder subclass, specify it with the cls kwarg; otherwise JSONDecoder is used. Additional keyword arguments will be passed to the constructor of the class.

If the data being deserialized is not a valid JSON document, a JSONDecodeError will be raised.

Changed in version 3.6: All optional parameters are now keyword-only.

Changed in version 3.6: fp can now be a binary file. The input encoding should be UTF-8, UTF-16 or UTF-32.
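The hooks described above can be tried out with an in-memory file. In this sketch, io.StringIO stands in for a real file, and tag_objects and reject_constant are hypothetical helper names chosen for illustration.

```python
import io
import json
from collections import OrderedDict
from decimal import Decimal

text = '{"price": 1.10, "qty": 3, "item": {"id": 7}}'

# parse_float: keep exact decimal values instead of binary floats.
exact = json.load(io.StringIO(text), parse_float=Decimal)
print(exact['price'])  # 1.10

# object_hook: called with each decoded object, innermost first.
def tag_objects(d):
    # Hypothetical hook: record how many keys the object had.
    d['_n_keys'] = len(d)
    return d

tagged = json.load(io.StringIO(text), object_hook=tag_objects)
print(tagged['item'])  # {'id': 7, '_n_keys': 1}

# object_pairs_hook: receives a list of (key, value) pairs instead.
pairs = json.load(io.StringIO(text), object_pairs_hook=OrderedDict)
print(list(pairs))  # ['price', 'qty', 'item']

# parse_constant: reject NaN and Infinity, which are not standard JSON.
def reject_constant(name):
    raise ValueError(f'non-standard JSON constant: {name}')

rejected = None
try:
    json.load(io.StringIO('{"x": NaN}'), parse_constant=reject_constant)
except ValueError as exc:
    rejected = str(exc)
print(rejected)  # non-standard JSON constant: NaN
```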

json.loads(s, *, encoding=None, cls=None, object_hook=None,
parse_float=None, parse_int=None, parse_constant=None,
object_pairs_hook=None, **kw)

Deserialize s (a str, bytes or bytearray instance containing a JSON document) to a Python object using this conversion table. The other arguments have the same meaning as in load(), except encoding which is ignored and deprecated. If the data being deserialized is not a valid JSON document, a JSONDecodeError will be raised. Changed in version 3.6: s can now be of type bytes or bytearray. The input encoding should be UTF-8, UTF-16 or UTF-32.
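A brief sketch of the two behaviours mentioned here: decoding from bytes and the error raised for invalid input. The sample strings are made up, and the bytes case assumes Python 3.6 or later.

```python
import json

# bytes input: the UTF-8 encoding is detected automatically.
raw = '{"city": "Zürich"}'.encode('utf-8')
decoded = json.loads(raw)
print(decoded['city'])  # Zürich

# Invalid JSON raises json.JSONDecodeError, which carries the position.
error_position = None
try:
    json.loads("{'city': 'x'}")  # single quotes are not valid JSON
except json.JSONDecodeError as exc:
    error_position = (exc.lineno, exc.colno)
    print(exc.msg, error_position)
```

JSONDecodeError is a subclass of ValueError, so code that already catches ValueError will catch it as well.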