Spring with Apache Beam
I want to use Spring with Apache Beam, running on the Google Cloud Dataflow runner. The Dataflow job should be able to use the Spring runtime application context while executing the pipeline steps, so that I can use Spring features such as DI in my Apache Beam pipeline. After hours of searching on Google, I couldn't find any post or documentation that shows Spring integration with Apache Beam. If anyone has tried Spring with Apache Beam, please let me know.
In the main class I have initialised the Spring application context, but it is not available during execution of the pipeline steps. I get a NullPointerException for autowired beans. I guess the problem is that the context is not available to worker threads at runtime.
public static void main(String[] args) {
    initSpringApplicationContext();
    GcmOptions options = PipelineOptionsFactory.fromArgs(args)
            .withValidation()
            .as(GcmOptions.class);
    Pipeline pipeline = Pipeline.create(options);
    // pipeline definition
}
I want to inject the Spring application context into each of the ParDo functions.
The problem here is that the ApplicationContext is not available on any worker, because the main method is only called when constructing the job, not on any worker machine. Therefore, initSpringApplicationContext is never called on any worker.
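To make the failure mode concrete, here is a small plain-Java simulation (no Beam or Spring on the classpath). It assumes that the pipeline functions are serialized on the launcher and deserialized on the workers, and uses a `transient` field as a stand-in for a bean that was only wired on the launcher. `GreetingService`, `NaiveFn`, and `LauncherVsWorkerDemo` are hypothetical names made up for this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical stand-in for a Spring-managed bean. */
final class GreetingService {
    String greet(String name) {
        return "hello, " + name;
    }
}

/** Plays the role of a pipeline function: serializable, with a dependency that is not. */
final class NaiveFn implements Serializable {
    private static final long serialVersionUID = 1L;

    // transient: not part of the serialized form, like a bean wired only on the launcher
    private transient GreetingService service;

    void wireOnLauncher() {
        service = new GreetingService();
    }

    boolean hasService() {
        return service != null;
    }

    String process(String element) {
        return service.greet(element);
    }
}

final class LauncherVsWorkerDemo {
    /** Simulates shipping a function to a worker via a serialization round trip. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T shipToWorker(T fn) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(fn);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        NaiveFn onLauncher = new NaiveFn();
        onLauncher.wireOnLauncher();
        System.out.println(onLauncher.process("beam")); // prints "hello, beam"

        NaiveFn onWorker = shipToWorker(onLauncher);
        System.out.println(onWorker.hasService()); // prints "false"
        // onWorker.process("beam") would now throw a NullPointerException
    }
}
```

The copy that arrives on the "worker" has no service, which mirrors the NullPointerException described in the question: whatever was wired up as a side effect of main does not travel with the function.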
I've never tried to use Spring within Apache Beam, but I guess moving initSpringApplicationContext into a static initializer block will give you the expected result.
public class ApplicationContextHolder {
    private static final ApplicationContext CTX;

    static {
        CTX = initApplicationContext();
    }

    public static ApplicationContext getContext() {
        return CTX;
    }
}
Please be aware that this alone shouldn't be considered a best practice for using Spring within Apache Beam, since it doesn't integrate well with Apache Beam's lifecycle. For example, if an error happens during initialization of the application context, it will surface at the first place where ApplicationContextHolder is used. Therefore, I'd recommend extracting initApplicationContext out of the static initializer block and calling it explicitly in line with Apache Beam's lifecycle. The setup phase would be a good place for this.
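A rough sketch of that idea in plain Java, without Beam or Spring on the classpath, might look like the following. Everything here is an assumption for illustration: `FakeContext` stands in for Spring's ApplicationContext, `LazyContextHolder` is a hypothetical variant of the holder with explicit, idempotent initialization, and `GreetFn.setup()` plays the role that a method annotated with Beam's `@Setup` would play on a DoFn.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for Spring's ApplicationContext. */
final class FakeContext {
    String greeting() {
        return "hello";
    }
}

/** Holder whose initialization is triggered explicitly and runs at most once per JVM. */
final class LazyContextHolder {
    static final AtomicInteger INIT_COUNT = new AtomicInteger();
    private static volatile FakeContext ctx;

    static FakeContext getOrInit() {
        FakeContext local = ctx;
        if (local == null) {
            synchronized (LazyContextHolder.class) {
                local = ctx;
                if (local == null) {
                    // A real implementation would build the Spring context here;
                    // a failure would then surface in the setup phase, not at first use.
                    INIT_COUNT.incrementAndGet();
                    local = new FakeContext();
                    ctx = local;
                }
            }
        }
        return local;
    }
}

/** Plays the role of a DoFn: serialized on the launcher, set up again on the worker. */
final class GreetFn implements Serializable {
    private static final long serialVersionUID = 1L;

    // Not serialized; re-acquired in setup() on whichever JVM runs the function.
    private transient FakeContext context;

    /** Stand-in for a DoFn method annotated with @Setup. */
    void setup() {
        context = LazyContextHolder.getOrInit();
    }

    String process(String element) {
        return context.greeting() + ", " + element;
    }
}

final class SetupPhaseDemo {
    /** Simulates shipping a function to a worker via a serialization round trip. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        GreetFn onWorker = roundTrip(new GreetFn());
        onWorker.setup(); // the runner would call this before processing elements
        System.out.println(onWorker.process("beam")); // prints "hello, beam"
    }
}
```

Because the function fetches the context in its setup step instead of relying on main, each deserialized copy gets a usable context, and multiple copies in the same JVM share a single initialization.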