如何在 Flink 流中的空窗口上执行函数?

问题描述:

我编写了一个 Flink 程序,它从一个简单的 kafka 流中计算每个键控窗口的事件数.我工作得很好,快速&准确的.当源停止时,我希望 0 作为每个窗口的计算结果,但没有发送结果.该函数只是不执行.我认为这是因为 Flink 的惰性操作行为.

I wrote a Flink program that calculates the number of events per keyed window from a simple kafka stream. I works great, fast & accurate. When the source stops, I would like to have 0 as result of the calculation on each window, but no result is sent. The function just does not execute. I assume this is because of the lazy operation behavior of Flink.

有什么推荐吗?

我遇到了同样的情况.用另一个进程填充数据库中的漏洞是一种解决方案.

I encountered the same situation. Filling the holes in your database with another process is a solution.

但是,我发现将主流与自定义期刊源结合起来更容易,该源会发出假人,其唯一作用是触发窗口创建.执行此操作时,您必须确保在计算中忽略哑元.

However, I found it easier to union your main stream with a custom periodical source, that emits dummies, whose only roles are to trigger windows creation. When doing this, you have to make sure that dummies are ignored in computations.

这是编写期刊源代码的方法(但是您可能不需要RichParallelSourceFunction,一个SourceFunction就够了)

Here is how to code a periodical source (however you may not need a RichParallelSourceFunction, a SourceFunction can be enough)