Spring Cloud Stream Kafka on Azure - Unexpected error code 13 while fetching data

Problem description:

I'm developing a dockerized Spring Boot application. The Docker images are microservices, and one of them communicates with Azure Event Hub.

Some of my properties:

spring-boot -> 2.0.7.RELEASE

spring-cloud.version -> Finchley.SR2

I've created a topic in Azure (with Kafka enabled).

I followed a simple guide to set up my microservice, and everything worked fine.

@EnableBinding({Processor.class})
public class EventService {
    ...
    @Autowired private Processor ehProcessor;
    ...
    public void send(String event) {

        Message<String> message = MessageBuilder
                .withPayload(event)
                .setHeader(MessageHeaders.CONTENT_TYPE, MimeTypeUtils.APPLICATION_JSON)
                .build();

        boolean send = ehProcessor.output().send(message, 5000L);

        if (!send) {

            // The {} placeholder is required; without it SLF4J drops the event argument.
            log.error("Event NOT sent: {}", event);
        }
    }

    ...

    @StreamListener(target = Processor.INPUT)
    public void receive(String event) {

        handle(event);
    }
}

For an entire month everything went fine, but over the last two days the microservice got stuck because a continuous stack trace was filling my entire disk (a workaround was to set up Docker log rotation).

java.lang.IllegalStateException: Unexpected error code 13 while fetching data
        at org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:891) ~[kafka-clients-1.0.1.jar!/:na]
        at org.apache.kafka.clients.consumer.internals.Fetcher.fetchedRecords(Fetcher.java:528) ~[kafka-clients-1.0.1.jar!/:na]
        at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1154) ~[kafka-clients-1.0.1.jar!/:na]
        at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1111) ~[kafka-clients-1.0.1.jar!/:na]
        at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.run(KafkaMessageListenerContainer.java:699) ~[spring-kafka-2.1.7.RELEASE.jar!/:2.1.7.RELEASE]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_181]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_181]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]

I'm talking about 8 or 9 log messages per millisecond.

Looking inside the org.apache.kafka.common.protocol.Errors class, the error code maps to:

NETWORK_EXCEPTION(13, "The server disconnected before a response was received."

I'm not able to reproduce this error programmatically. I don't understand why, once the first error occurs, the logging starts and never stops, as if in an infinite loop. I have to stop the Docker container, and sometimes the container will not stop. The only solution is to remove the container and recreate it.
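As a general illustration of why the log floods: a poll loop that catches the exception, logs it, and polls again immediately will spin as fast as the CPU allows while the broker keeps returning the same error. The sketch below is not a Spring Kafka or Kafka client API. `BackoffPollSketch`, `backoffMillis`, and `pollWithBackoff` are hypothetical names, and the `Runnable` stands in for a real `poll()` call. It only shows the usual mitigation of capped exponential backoff with a bounded number of attempts.

```java
class BackoffPollSketch {

    /** Delay before retry number `attempt` (0-based): doubles from baseMs, capped at maxMs. */
    static long backoffMillis(int attempt, long baseMs, long maxMs) {
        if (attempt >= 30) {
            return maxMs; // avoid overflowing the shift for large attempt counts
        }
        return Math.min(baseMs << attempt, maxMs);
    }

    /**
     * Runs `poll` until it succeeds or maxAttempts is exhausted.
     * Returns the number of failed attempts that were observed.
     */
    static int pollWithBackoff(Runnable poll, int maxAttempts, long baseMs, long maxMs) {
        int failures = 0;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                poll.run();
                return failures; // success: stop retrying
            } catch (IllegalStateException e) {
                failures++;
                System.err.println("poll failed (" + failures + "): " + e.getMessage());
                try {
                    // Sleeping between attempts is what keeps the log from flooding.
                    Thread.sleep(backoffMillis(attempt, baseMs, maxMs));
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // preserve the interrupt and give up
                    return failures;
                }
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Simulated poll that always fails, like a broker that keeps answering with code 13.
        Runnable alwaysFailing = () -> {
            throw new IllegalStateException("Unexpected error code 13 while fetching data");
        };
        int failures = pollWithBackoff(alwaysFailing, 5, 10L, 100L);
        System.out.println("gave up after " + failures + " failed attempts");
    }
}
```

With a bounded attempt count the caller can decide what to do after giving up (for example, stop the consumer and alert), instead of logging the same stack trace indefinitely.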

UPDATE

I've opened an issue on GitHub here. I've already received a response, and they are starting to investigate it.

UPDATE

The issue has been solved.

When they changed an UnknownServerException to a NetworkException, Spring Boot started getting stuck in the retry loop.
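A simplified model may help show why that change mattered. It rests on an assumption about the Kafka consumer's fetch handling that the stack trace above only partly confirms: a handful of error codes are handled explicitly (an unknown server error, code -1, is logged as a warning and the fetch is retried), and any other code falls through to an IllegalStateException with the message seen in the log. `FetchErrorSketch`, `handleFetchError`, and the returned strings are hypothetical stand-ins, not the client's real code.

```java
class FetchErrorSketch {
    // Kafka protocol error codes (the constant names here are illustrative).
    static final short NONE = 0;
    static final short UNKNOWN_SERVER_ERROR = -1;
    static final short NETWORK_EXCEPTION = 13;

    /** Simplified stand-in for how a consumer might react to a fetch response's error code. */
    static String handleFetchError(short code) {
        if (code == NONE) {
            return "records"; // normal case: hand the fetched records to the listener
        }
        if (code == UNKNOWN_SERVER_ERROR) {
            return "warn-and-retry"; // assumed: tolerated, logged as a warning, fetched again later
        }
        // Any code without an explicit branch is treated as a programming error.
        throw new IllegalStateException("Unexpected error code " + code + " while fetching data");
    }

    public static void main(String[] args) {
        // Before the server-side change: the error is tolerated.
        System.out.println(handleFetchError(UNKNOWN_SERVER_ERROR));
        // After the change: the same situation surfaces as code 13 and throws.
        try {
            handleFetchError(NETWORK_EXCEPTION);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under this model, the same server-side failure went from a tolerated warning to an exception on every poll, which matches the continuous stack trace in the question.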

The change on the Azure side was confirmed at this link:

There was a recent change where an instance of UnknownServerException was changed to a NetworkException.

Issue details are here - https://github.com/Azure/azure-event-hubs-for-kafka/issues - with your namespace info. Thanks!