Cumulative sum of values in SQLite
I am trying to compute a cumulative sum of values in SQLite. Initially I only needed to sum a single column, and used this code:
SELECT
t.MyColumn,
(SELECT Sum(r.KeyColumn1) FROM MyTable as r WHERE r.Date < t.Date)
FROM MyTable as t
Group By t.Date;
This worked fine.
Now I want to extend this to more columns, say KeyColumn2 and KeyColumn3. Instead of adding more SELECT statements, I thought it would be better to use a join, and wrote the following:
SELECT
t.MyColumn,
Sum(r.KeyColumn1),
Sum(r.KeyColumn2),
Sum(r.KeyColumn3)
FROM MyTable as t
Left Join MyTable as r On (r.Date < t.Date)
Group By t.Date;
However, this does not give me the correct answer (instead, it gives values that are much larger than expected). Why is this, and how could I correct the JOIN to give the correct answer?
You are likely getting what I would call mini-Cartesian products: your Date values are probably not unique, so the self-join produces a full set of matching r rows for every t row that shares the same date. After grouping by Date, each sum is therefore multiplied by the number of rows with that date.
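To illustrate the effect, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows, the single KeyColumn1 column, and the function name compare_sums are made up for illustration and are not taken from the question. It runs the original subquery form and the self-join form against a table where two dates appear twice.

```python
import sqlite3


def compare_sums():
    """Run the subquery form and the self-join form on hypothetical data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MyTable (Date TEXT, MyColumn TEXT, KeyColumn1 INTEGER)"
    )
    # Hypothetical sample data: two dates occur twice each.
    conn.executemany(
        "INSERT INTO MyTable VALUES (?, ?, ?)",
        [
            ("2024-01-01", "a", 1),
            ("2024-01-02", "b", 10),
            ("2024-01-02", "c", 20),
            ("2024-01-03", "d", 100),
            ("2024-01-03", "e", 200),
        ],
    )
    # Correlated subquery: evaluated once per group, so nothing is duplicated.
    subquery_rows = conn.execute(
        """
        SELECT t.Date,
               (SELECT Sum(r.KeyColumn1) FROM MyTable AS r WHERE r.Date < t.Date)
        FROM MyTable AS t
        GROUP BY t.Date
        ORDER BY t.Date
        """
    ).fetchall()
    # Self-join: every t row with a duplicated date repeats all matching r rows.
    join_rows = conn.execute(
        """
        SELECT t.Date, Sum(r.KeyColumn1)
        FROM MyTable AS t
        LEFT JOIN MyTable AS r ON (r.Date < t.Date)
        GROUP BY t.Date
        ORDER BY t.Date
        """
    ).fetchall()
    conn.close()
    return subquery_rows, join_rows


subquery_rows, join_rows = compare_sums()
print(subquery_rows)  # [('2024-01-01', None), ('2024-01-02', 1), ('2024-01-03', 31)]
print(join_rows)      # [('2024-01-01', None), ('2024-01-02', 2), ('2024-01-03', 62)]
```

With this data, each duplicated date has two rows on the left side of the join, so the joined sums come out exactly twice as large as the subquery sums.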
To solve this, the left side of the join must be free of duplicate dates. One way is to derive a table of unique dates from your table:
SELECT DISTINCT Date
FROM MyTable
and use it as the left side of the join:
SELECT
t.Date,
Sum(r.KeyColumn1),
Sum(r.KeyColumn2),
Sum(r.KeyColumn3)
FROM (SELECT DISTINCT Date FROM MyTable) as t
Left Join MyTable as r On (r.Date < t.Date)
Group By t.Date;
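As a quick check, the query above can be run against a small in-memory table with Python's built-in sqlite3 module. This is only a sketch: the sample rows and the function name cumulative_sums are hypothetical, chosen so that one date is duplicated.

```python
import sqlite3


def cumulative_sums():
    """Run the DISTINCT-date join on hypothetical data with a repeated date."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MyTable (Date TEXT, MyColumn TEXT, "
        "KeyColumn1 INTEGER, KeyColumn2 INTEGER, KeyColumn3 INTEGER)"
    )
    # Hypothetical sample data: 2024-01-02 appears twice.
    conn.executemany(
        "INSERT INTO MyTable VALUES (?, ?, ?, ?, ?)",
        [
            ("2024-01-01", "a", 1, 2, 3),
            ("2024-01-02", "b", 10, 20, 30),
            ("2024-01-02", "c", 10, 20, 30),
            ("2024-01-03", "d", 100, 200, 300),
        ],
    )
    # The left side holds one row per date, so each r row is counted once.
    rows = conn.execute(
        """
        SELECT t.Date,
               Sum(r.KeyColumn1),
               Sum(r.KeyColumn2),
               Sum(r.KeyColumn3)
        FROM (SELECT DISTINCT Date FROM MyTable) AS t
        LEFT JOIN MyTable AS r ON (r.Date < t.Date)
        GROUP BY t.Date
        ORDER BY t.Date
        """
    ).fetchall()
    conn.close()
    return rows


for row in cumulative_sums():
    print(row)
```

The first date has no earlier rows, so its sums are NULL (None in Python). As in the original query, the comparison r.Date < t.Date excludes rows from the current date itself.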
I noticed that you used t.MyColumn in the SELECT clause while grouping by t.Date. If that was intentional, you may be relying on undefined behaviour there, because the t.MyColumn value would probably be chosen arbitrarily from the (potentially) many rows in the same t.Date group.
For the purpose of this example, I assumed that you actually meant t.Date, so I replaced the column accordingly, as you can see above. If my assumption was incorrect, please clarify.