Get the top n records for each group in neo4j

Question:

I need to group the data from a neo4j database and then filter out everything except the top n records of every group.

Example:

I have two node types: Order and Article. Between them there is an "ADDED" relationship, which has a timestamp property. What I want to know, for every article, is how many times it was among the first two articles added to an order. What I tried is the following approach:

  1. Get all the Order-[ADDED]-Article matches.

  2. Sort the result from step 1 by order id as the first sorting key and by the timestamp of the ADDED relationship as the second sorting key.

  3. For every subgroup from step 2 representing one order, keep only the top 2 rows.

  4. Count the distinct article ids in the output of step 3.

My problem is that I got stuck at step 3. Is it possible to get the top 2 rows for every subgroup representing an order?

Thanks,

Tiberius

Try:

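// Sort all ADDED relationships by order, then by the time the article was added.
MATCH (o:Order)-[r:ADDED]->(a:Article)
WITH o, r, a
ORDER BY o.oid, r.t
// Per order, keep only the first two articles, then flatten back to rows.
WITH o, COLLECT(a)[..2] AS topArticlesByOrder
UNWIND topArticlesByOrder AS a
// Count how often each article appears among those first two.
RETURN a.aid AS articleId, COUNT(*) AS count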

The result looks like this (the sample data is random, so the exact counts vary from run to run):

articleId    count
   8           6
   2           2
   4           5
   7           2
   3           3
   6           5
   0           7

for a sample graph created with:

// Create 15 orders, each with 5 ADDED relationships to randomly chosen articles.
FOREACH (opar IN RANGE(1,15) |
    MERGE (o:Order {oid:opar})
    FOREACH (apar IN RANGE(1,5) |
        MERGE (a:Article {aid:toInteger(RAND()*10)})
        CREATE (o)-[:ADDED {t:timestamp() - toInteger(RAND()*1000)}]->(a)
    )
)
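
To check the logic outside Neo4j, the same four steps (collect, sort, keep the first n per order, count) can be sketched in plain Python. This is only an illustration, not part of the original answer: the function name `count_top_n`, the tuple layout `(order_id, timestamp, article_id)`, and the sample rows are all made up for the example.

```python
from collections import Counter, defaultdict


def count_top_n(rows, n=2):
    """Count how often each article is among the first n added to an order.

    rows: iterable of (order_id, timestamp, article_id) tuples, one per
    ADDED relationship (hypothetical layout, chosen for this sketch).
    """
    # Steps 1 and 2: group the rows by order.
    by_order = defaultdict(list)
    for order_id, ts, article_id in rows:
        by_order[order_id].append((ts, article_id))

    counts = Counter()
    for added in by_order.values():
        # Step 2: sort each order's rows by timestamp.
        added.sort(key=lambda pair: pair[0])
        # Step 3: keep only the first n rows of the order.
        for _, article_id in added[:n]:
            # Step 4: count per article id.
            counts[article_id] += 1
    return dict(counts)


# Made-up sample data: three orders with three, three, and one article.
rows = [
    (1, 10, "a"), (1, 20, "b"), (1, 30, "c"),
    (2, 5, "b"), (2, 7, "a"), (2, 9, "a"),
    (3, 1, "c"),
]
print(count_top_n(rows))
```

The slice `added[:n]` plays the same role as `COLLECT(a)[..2]` in the Cypher query: it keeps the leading elements of an already sorted per-order list.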