Error when scraping JSON data with Python

Problem description:

Symptoms and background

General

Request URL: http://172.21.249.14:16509/api/Clinic/GetSamples
Request Method: POST
Status Code: 200 OK
Remote Address: 172.21.249.14:16509
Referrer Policy: strict-origin-when-cross-origin

Response Headers

Cache-Control: no-cache
Content-Length: 1721
Content-Type: application/json; charset=utf-8
Date: Fri, 26 Nov 2021 12:22:20 GMT
Expires: -1
Pragma: no-cache
Server: Microsoft-IIS/8.5
X-AspNet-Version: 4.0.30319
X-Powered-By: ASP.NET

Request Headers

Accept: application/json, text/javascript, */*; q=0.01
Accept-Encoding: gzip, deflate
Accept-Language: zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6
Connection: keep-alive
Content-Length: 1333
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
Host: 172.21.249.14:16509
Origin: http://172.21.249.14:16509
Referer: http://172.21.249.14:16509/?calltype=1&his_id=157638878||22
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.55 Safari/537.36 Edg/96.0.1054.34
X-Requested-With: XMLHttpRequest

Form Data

QueryField: [{"KEY":"PATIENT_TYPE","VALUE":"","OPERATER":"IN","TYPE":"1","TYPEKEY":""},{"KEY":"REQUISITION_ID","VALUE":"9400124600,9400117000,9400106600","OPERATER":"IN","TYPE":"1","TYPEKEY":""},{"KEY":"REQUISITION_ID","VALUE":null,"OPERATER":"=","TYPE":"1","TYPEKEY":""},{"KEY":"INSPECTION_STATE","VALUE":"","OPERATER":"IN","TYPE":"10","TYPEKEY":""},{"KEY":"PRINT_STATE","VALUE":"","OPERATER":"=","TYPE":"1","TYPEKEY":"FILTERDATE"},{"KEY":"VISIT_NUM","VALUE":null,"OPERATER":"=","TYPE":"1","TYPEKEY":""},{"KEY":"INSPECTION_DATES","VALUE":"20210828","OPERATER":">=","TYPE":"3","TYPEKEY":"FILTERDATE"},{"KEY":"INSPECTION_DATEE","VALUE":"20211126","OPERATER":"<=","TYPE":"3","TYPEKEY":"FILTERDATE"}]
Order:   ORDER BY FILTERDATE DESC
HospitalId: 
OutpatientIdField: OUTPATIENT_ID
DbType: OleDb
PageSize: 100
PageNum: 1

Response

{"rows":[{"INSPECTION_ID":"20211125G0014506","GROUP_ID":"G001","GROUP_NAME":"临床免疫室2","PGROUP_ID":"LIS3","INSPECTION_CLASS":"一般检验类","SAMPLING_POSITION_NAME":null,"INSPECTION_DATE":"20211125","SAMPLE_NUMBER":"4506","REQUISITION_ID":"9400124600","PATIENT_ID":null,"OUTPATIENT_ID":"0020214452","INPATIENT_ID":"159911369","VISIT_NUM":null,"PATIENT_NAME":"张三","PATIENT_SEX":"2","PATIENT_SEX_TEXT":"女","PATIENT_AGE":"73岁","PATIENT_TYPE":"1","PATIENT_DEPT":"LIS8404","PATIENT_DEPT_NAME":"A","PATIENT_WARD":"LIS6602","PATIENT_WARD_NAME":"B","PATIENT_BED":"+13床","SAMPLE_CLASS":"LIS126","SAMPLE_CLASS_NAME":"血清","TEST_ORDER":"LIS028061","TEST_ORDER_NAME":"脂溶性维生素谱","REQUISITION_PERSON":"C","REQUISITION_TIME":null,"PRINT_PERSON":null,"PRINT_TIME":"2021-11-19 15:02","SAMPLING_PERSON":"D","SAMPLING_TIME":"2021-11-20 06:04","SEND_PERSON":"分拣机-DBA","SENDOUT_TIME":null,"SENDOUT_PERSON":null,"SEND_TIME":"2021-11-20 08:39","RECEIVE_PERSON":"E","RECEIVE_TIME":"2021-11-20 08:58","INCEPT_PERSON":"E","INPUT_TIME":"2021-11-20 08:58","INSPECTION_PERSON":"F","INSPECTION_TIME":"2021-11-20 08:58","CHECK_PERSON":"G","CHECK_TIME":"2021-11-23 08:50","INSPECTION_STATE":"sent","INSPECTION_STATE_NAME":"已发送","PRINT_STATE":"0","READ_STATE":"1","READ_TIME":"2021-11-25 18:47","SAMPLING_POSITION":null,"REMARK":null,"REMARK_NAME":null,"CLINICAL_DIAGNOSES":"胰体恶性肿瘤","STATE_COLOR_IDX":8.0,"FILTERDATE":"2021-11-20T08:58:56","PDF_PATH":null,"DELIVER_HOSPITAL":"1","PDF_FLAG":"0","REPORT_PRINTTIME":null,"HOSPITAL_ID":"51A001","PRINT_PDF_PAGE":null,"PRINT_PDF_SIZE":null,"RN":1.0}],"total":1}

Relevant code (please do not paste screenshots)
import requests
import json
post_url = 'http://172.21.249.14:16509/api/Clinic/GetSamples'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36 Edg/95.0.1020.53'
}
word = input('enter a word')
data = {
    'QueryField': '[{"KEY":"","VALUE":"","OPERATER":"IN","TYPE":"1","TYPEKEY":""},{"KEY":"REQUISITION_ID","VALUE":word,"OPERATER":"IN","TYPE":"1","TYPEKEY":""},{"KEY":"REQUISITION_ID","VALUE":null,"OPERATER":"=","TYPE":"1","TYPEKEY":""},{"KEY":"INSPECTION_STATE","VALUE":"","OPERATER":"IN","TYPE":"10","TYPEKEY":""},{"KEY":"PRINT_STATE","VALUE":"","OPERATER":"=","TYPE":"1","TYPEKEY":"FILTERDATE"},{"KEY":"VISIT_NUM","VALUE":null,"OPERATER":"=","TYPE":"1","TYPEKEY":""},{"KEY":"INSPECTION_DATES","VALUE":"20210827","OPERATER":">=","TYPE":"3","TYPEKEY":"FILTERDATE"},{"KEY":"INSPECTION_DATEE","VALUE":"20211125","OPERATER":"<=","TYPE":"3","TYPEKEY":"FILTERDATE"}]',
    'Order': 'ORDER BY FILTERDATE DESC',
    'Hospitalld':
    'DbType' 'oleDbb',
    'PageSize': '100',
    'PageNum': '1'
}
response = requests.post(url=post_url, data=data, headers=headers)
print(response.text)
dic_obj = response.json()
fileName = word + ',json'
fp = open(fileName, 'w', encoding='utf-8')
json.dump(dic_obj, fp=fp, ensure_ascii=False)
print('over')
###### Output and error message
C:\Users\cc\AppData\Local\Programs\Python\Python39\python.exe D:/Pycharm/data/chaxue4.py
enter a word9388008600
{"Message":"An error has occurred."}
over

Process finished with exit code 0

My approach and what I have tried
Desired result

I want to extract "INSPECTION_ID":"20211125G0014506" from the response.
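Once the server returns the expected JSON, pulling out that value is a plain dictionary lookup. The following is a minimal sketch that assumes the response has the shape captured above (a top-level "rows" list of objects); the helper name extract_inspection_ids is hypothetical, and the sample string is a shortened copy of the captured response.

```python
import json


def extract_inspection_ids(payload):
    """Return the INSPECTION_ID of every row in a parsed response.

    An error payload such as {"Message": "An error has occurred."}
    has no "rows" key, so it yields an empty list instead of raising.
    """
    rows = payload.get("rows") or []
    return [row["INSPECTION_ID"] for row in rows if row.get("INSPECTION_ID")]


# Shortened copy of the response captured in the browser.
sample = json.loads(
    '{"rows":[{"INSPECTION_ID":"20211125G0014506",'
    '"REQUISITION_ID":"9400124600"}],"total":1}'
)
print(extract_inspection_ids(sample))
```

With the real request, the same function would be called on response.json() rather than on the sample.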

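Try adding the Referer and X-Requested-With request headers. The server is likely rejecting the request through an anti-scraping check, which is why no data comes back. If the browser's request carries a cookie, include that as well. Use the browser's developer tools to see the complete set of headers sent with the real request, especially any custom headers: these often carry validation information and must not be left out.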

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36 Edg/95.0.1020.53',
    'Referer': 'http://172.21.249.14:16509/?calltype=1&his_id=157638878||22',
    'X-Requested-With': 'XMLHttpRequest'
}
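Apart from the headers, the form data in the posted script also differs from what the browser sent. The name word sits inside the string literal and is never substituted, 'Hospitalld' is misspelled and has no value, 'DbType' is missing its colon (so Python concatenates the two adjacent strings into the value of 'Hospitalld'), the value is 'oleDbb' rather than 'OleDb', and OutpatientIdField is absent. The sketch below rebuilds the form fields to mirror the captured request, using json.dumps so the user's input actually lands in QueryField. The function names build_form_data and fetch_samples are hypothetical, and whether the server needs every one of these fields is an assumption, not something confirmed in this thread.

```python
import json

POST_URL = "http://172.21.249.14:16509/api/Clinic/GetSamples"

HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Referer": "http://172.21.249.14:16509/?calltype=1&his_id=157638878||22",
    "X-Requested-With": "XMLHttpRequest",
}


def _cond(key, value, operator, type_="1", type_key=""):
    """One filter entry, using the key names seen in the captured request."""
    return {"KEY": key, "VALUE": value, "OPERATER": operator,
            "TYPE": type_, "TYPEKEY": type_key}


def build_form_data(requisition_ids, date_from, date_to):
    """Build form fields that mirror the request captured in the browser."""
    query_field = [
        _cond("PATIENT_TYPE", "", "IN"),
        _cond("REQUISITION_ID", requisition_ids, "IN"),
        _cond("REQUISITION_ID", None, "="),  # None is serialized as null
        _cond("INSPECTION_STATE", "", "IN", "10"),
        _cond("PRINT_STATE", "", "=", "1", "FILTERDATE"),
        _cond("VISIT_NUM", None, "="),
        _cond("INSPECTION_DATES", date_from, ">=", "3", "FILTERDATE"),
        _cond("INSPECTION_DATEE", date_to, "<=", "3", "FILTERDATE"),
    ]
    return {
        "QueryField": json.dumps(query_field, separators=(",", ":")),
        "Order": "ORDER BY FILTERDATE DESC",
        "HospitalId": "",
        "OutpatientIdField": "OUTPATIENT_ID",
        "DbType": "OleDb",
        "PageSize": "100",
        "PageNum": "1",
    }


def fetch_samples(requisition_ids, date_from, date_to):
    """POST the form and return the parsed JSON body."""
    import requests  # imported here so build_form_data works without it

    response = requests.post(
        POST_URL,
        data=build_form_data(requisition_ids, date_from, date_to),
        headers=HEADERS,
        timeout=10,
    )
    response.raise_for_status()
    return response.json()
```

A call such as fetch_samples("9388008600", "20210827", "20211125") would then replace the hand-written data dictionary. Note also that the original script builds the file name with ',json'; '.json' was presumably intended.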