How to handle BigQuery insert errors in a Dataflow pipeline using Java?
Question:
I'm parsing XML and writing it to BigQuery using a Dataflow pipeline. How can errors be handled if an insert into BigQuery fails? I want to write custom code that writes the failed XML to an error bucket.
Answer:
The following code retrieves the rows that failed when writing to BigQuery:
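// Sample rows; row3 carries a non-numeric value for the INTEGER column, so its insert is expected to fail.
TableRow row1 = new TableRow().set("name", "a").set("number", "1");
TableRow row2 = new TableRow().set("name", "b").set("number", "2");
TableRow row3 = new TableRow().set("name", "c").set("number", "error");

// p is the Pipeline. getFailedInserts() applies to streaming inserts, and rows are
// only emitted as failed once the retry policy stops retrying them.
PCollection<TableRow> failedRows =
    p.apply(Create.of(row1, row2, row3))
        .apply(
            BigQueryIO.writeTableRows()
                .to("project-id:dataset-id.table-id")
                .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
                .withMethod(BigQueryIO.Write.Method.STREAMING_INSERTS)
                .withFailedInsertRetryPolicy(InsertRetryPolicy.retryTransientErrors())
                .withSchema(
                    new TableSchema()
                        .setFields(
                            ImmutableList.of(
                                new TableFieldSchema().setName("name").setType("STRING"),
                                new TableFieldSchema().setName("number").setType("INTEGER")))))
        .getFailedInserts();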
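
To get from failedRows to files in an error bucket, each failed row first has to be turned into a line of text. The following is a minimal sketch of that step in plain Java, assuming the row can be treated as a Map<String, Object> (TableRow is a Map<String, Object>). The class name FailedRowFormatter and the methods escape and toErrorLine are hypothetical names introduced for illustration, and writing every value as a JSON string is a simplification, not something the original answer prescribes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper: turns a failed row into one JSON line for an error file. */
class FailedRowFormatter {

    /** Escapes backslashes, quotes, and control characters for use inside a JSON string. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"': sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Serializes the row as a flat JSON object; every value is written as a string. */
    static String toErrorLine(Map<String, Object> row) {
        StringJoiner joiner = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, Object> e : row.entrySet()) {
            joiner.add("\"" + escape(e.getKey()) + "\":\""
                + escape(String.valueOf(e.getValue())) + "\"");
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // Stand-in for row3 from the answer above, built as a plain map.
        Map<String, Object> row3 = new LinkedHashMap<>();
        row3.put("name", "c");
        row3.put("number", "error");
        System.out.println(toErrorLine(row3)); // prints {"name":"c","number":"error"}
    }
}
```

In a pipeline, a function like toErrorLine could be applied to failedRows with MapElements, and the resulting strings written with TextIO.write().to(...) pointing at a gs:// path of your choosing. That wiring is left out here so the sketch runs without Beam on the classpath.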