How to paginate Stack Exchange Data Explorer (SEDE) results?
Question:
I created this query using the Data Explorer:
SELECT P.id, creationdate,tags,owneruserid,answercount
--SELECT DISTINCT TAGNAME ,TAGID
FROM TAGS AS T
JOIN POSTTAGS AS PT
ON T.ID = PT.TAGID
JOIN POSTS AS P
ON PT.POSTID = P.ID
--WHERE CAST(P.TAGS AS VARCHAR) IN('JAVA')
WHERE PT.TAGID = 3143
How can I add pagination to the query, so that I get not only the first 50,000 results but can then run the query again to get the remaining results?
Answer:
There are a few ways to "page" through T-SQL results; see:
- How to return a page of results from SQL?
- SQL performance: WHERE vs WHERE(ROW_NUMBER)
Here I will use the CTE method because:
- It uses convenient row numbers to page through results, rather than trying to track less predictable factors such as creationdate.
- It reportedly performs faster than the OFFSET method.
So the question's query becomes this SEDE query:
-- StartRow: Starting row for paging
-- EndRow: Ending row for paging (Max 50K rows at a time)
WITH allData AS (
    SELECT
        ROW_NUMBER() OVER (ORDER BY P.creationdate) AS row
        , P.id
        , P.creationdate
        , P.tags
        , P.owneruserid
        , P.answercount
    FROM Posttags AS PT
    JOIN Posts AS P ON PT.postid = P.id
    WHERE PT.tagid = 3143 -- tag [scala]
)
SELECT *
FROM allData
WHERE row >= ##StartRow:INT?1##
    AND row <= ##EndRow:INT?50000##
ORDER BY row
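
To fetch the remaining results, rerun the query with the next pair of StartRow and EndRow values (50001 to 100000, and so on). As a rough sketch of that bookkeeping, the Python helper below lists the pairs to enter for each run. The name page_ranges is hypothetical, and total_rows is an assumed input, for example obtained from a separate SELECT COUNT(*) over the same join and filter; neither is part of SEDE.

```python
from typing import Iterator, Tuple


def page_ranges(total_rows: int, page_size: int = 50_000) -> Iterator[Tuple[int, int]]:
    """Yield inclusive, 1-based (start_row, end_row) pairs that cover total_rows.

    page_size defaults to 50,000, the per-run row limit mentioned in the query
    comments. total_rows is an assumption supplied by the caller.
    """
    if total_rows < 0:
        raise ValueError("total_rows must be non-negative")
    if page_size < 1:
        raise ValueError("page_size must be at least 1")

    start = 1
    while start <= total_rows:
        # The last page may be shorter than page_size.
        end = min(start + page_size - 1, total_rows)
        yield start, end
        start = end + 1


if __name__ == "__main__":
    # Hypothetical total of 120,000 matching rows.
    for start_row, end_row in page_ranges(120_000):
        print(f"StartRow={start_row}, EndRow={end_row}")
```

If the total is not known in advance, the same effect can be had by advancing both parameters by 50,000 on each run until a run returns fewer than 50,000 rows.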