How do I handle BigQuery insert errors in a Dataflow pipeline using Java?

Problem description:

I'm parsing XML and writing it to BigQuery using a Dataflow pipeline. How can errors be handled if an insert into BigQuery fails? I want to write custom code that writes the failed XML to an error bucket.

The following code gets the rows that failed when writing to BigQuery:

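import com.google.api.services.bigquery.model.TableFieldSchema;
import com.google.api.services.bigquery.model.TableRow;
import com.google.api.services.bigquery.model.TableSchema;
import com.google.common.collect.ImmutableList;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.InsertRetryPolicy;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.values.PCollection;

// p is an existing Pipeline. row3 is expected to fail: "error" is not a valid INTEGER.
TableRow row1 = new TableRow().set("name", "a").set("number", "1");
TableRow row2 = new TableRow().set("name", "b").set("number", "2");
TableRow row3 = new TableRow().set("name", "c").set("number", "error");
// getFailedInserts() applies to streaming inserts; a non-default retry policy lets bad rows surface.
PCollection<TableRow> failedRows =
        p.apply(Create.of(row1, row2, row3))
            .apply(
                BigQueryIO.writeTableRows()
                    .to("project-id:dataset-id.table-id")
                    .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
                    .withMethod(BigQueryIO.Write.Method.STREAMING_INSERTS)
                    .withFailedInsertRetryPolicy(InsertRetryPolicy.retryTransientErrors())
                    .withSchema(
                        new TableSchema()
                            .setFields(
                                ImmutableList.of(
                                    new TableFieldSchema().setName("name").setType("STRING"),
                                    new TableFieldSchema().setName("number").setType("INTEGER")))))
            .getFailedInserts();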
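
Once the failed rows are available, they still have to be turned into something that can be written to the error bucket. The following is only a minimal sketch of that formatting step in plain Java, not a definitive implementation. `ErrorRecords` and `toErrorLine` are hypothetical names that do not exist in Beam or in the code above. The helper takes a `Map<String, Object>`, which a `TableRow` can be passed as, and renders it as one tab-separated line.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper (assumption: not part of Beam or of the original post).
final class ErrorRecords {
    private ErrorRecords() {}

    // Renders a failed row as a single line of tab-separated "key=value"
    // pairs, with keys in sorted order so the output is deterministic.
    static String toErrorLine(Map<String, Object> failedRow) {
        return new TreeMap<>(failedRow).entrySet().stream()
            .map(e -> e.getKey() + "=" + sanitize(String.valueOf(e.getValue())))
            .collect(Collectors.joining("\t"));
    }

    // Replaces tabs and line breaks with a space so each record stays on one line.
    private static String sanitize(String value) {
        return value.replaceAll("[\\t\\r\\n]+", " ");
    }
}
```

In a pipeline, a helper like this could presumably be called from a mapping step on the failed rows, with the resulting strings written to a Cloud Storage path by a text sink. Note that the failed rows contain only the row fields, so recovering the original XML would likely require carrying it along with each row; that part is not shown here.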