Apache Flink将S3用于后端状态和检查点
问题描述:
- 我打算使用S3通过
FsStateBackend
存储Flink的检查点.但是不知何故我遇到了以下错误.
- I was planning to use S3 to store the Flink's checkpoints using the
FsStateBackend
. But somehow I was getting the following error.
错误
org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not find a file system implementation for scheme 's3'. The scheme is not directly supported by Flink and no Hadoop file system to support this scheme could be loaded.
Flink版本::我正在使用Flink 1.10.0版本.
Flink version: I am using Flink 1.10.0 version.
答
我已经找到了解决上述问题的方法,所以在这里,我将其按要求列出来.
I have found the solution for the above issue, so here I am listing it in steps that are required.
- 我们需要在下面列出的
flink-conf.yaml
文件中添加一些配置.
- We need to add some configs in the
flink-conf.yaml
file which I have listed below.
state.backend: filesystem
state.checkpoints.dir: s3://s3-bucket/checkpoints/ #"s3://<your-bucket>/<endpoint>"
state.backend.fs.checkpointdir: s3://s3-bucket/checkpoints/ #"s3://<your-bucket>/<endpoint>"
s3.access-key: XXXXXXXXXXXXXXXXXXX #your-access-key
s3.secret-key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx #your-secret-key
s3.endpoint: http://127.0.0.1:9000 #your-endpoint-hostname (I have used Minio)
-
完成第一步后,我们需要将opt目录中的相应
flink-s3-fs-hadoop-1.10.0.jar
和flink-s3-fs-presto-1.10.0.jar
)JAR文件复制到Flink的plugins目录中.
After completing the first step we need to copy the respective(
flink-s3-fs-hadoop-1.10.0.jar
andflink-s3-fs-presto-1.10.0.jar
) JAR files from the opt directory to the plugins directory of your Flink.
- 例如:-> 1.将
/flink-1.10.0/opt/flink-s3-fs-hadoop-1.10.0.jar
复制到/flink-1.10.0/plugins/s3-fs-hadoop/flink-s3-fs-hadoop-1.10.0.jar
//建议用于StreamingFileSink
2..将/flink-1.10.0/opt/flink-s3-fs-presto-1.10.0.jar
复制到/flink-1.10.0/plugins/s3-fs-presto/flink-s3-fs-presto-1.10.0.jar
//推荐用于检查点
-
E.g:--> 1. Copy
/flink-1.10.0/opt/flink-s3-fs-hadoop-1.10.0.jar
to/flink-1.10.0/plugins/s3-fs-hadoop/flink-s3-fs-hadoop-1.10.0.jar
// Recommended for StreamingFileSink
2. Copy/flink-1.10.0/opt/flink-s3-fs-presto-1.10.0.jar
to/flink-1.10.0/plugins/s3-fs-presto/flink-s3-fs-presto-1.10.0.jar
//Recommended for checkpointing
在检查点代码中添加它
env.setStateBackend(new FsStateBackend("s3://s3-bucket/checkpoints/"))
- 完成上述所有步骤后,如果Flink已在运行,请重新启动.
注意:
- 如果您在Flink中同时使用(
flink-s3-fs-hadoop
和flink-s3-fs-presto
),请专门对flink-s3-fs-presto
使用s3p://
,对flink-s3-fs-hadoop
使用s3a://
而不是s3://
. - 有关更多详细信息,请单击此处.
- If you are using both(
flink-s3-fs-hadoop
andflink-s3-fs-presto
) in Flink then please uses3p://
specificly forflink-s3-fs-presto
ands3a://
forflink-s3-fs-hadoop
instead ofs3://
. - For more details click here.