MissingFlowFileException during NiFi processing causes loss of information

Submitted by 自闭症网瘾萝莉.ら on 2019-12-24 21:59:55

Question


During an ETL process, we get a random exception that causes FlowFiles to be lost. NiFi is deployed on a 3-node Kubernetes cluster with its repositories on a shared file system (GlusterFS). We ran some stress tests: out of 2000 CSV files processed, almost 10% were lost with the exception reported below. We also tried scaling down to one node and setting the number of parallel threads to 1, in order to minimize parallelism problems on the affected processors (validatecsv and validatejsonpath). It seems that the processor tries to access the FlowFile content too late. The problem is not systematic and occurs randomly. It occurs on 1.8, and upgrading to the latest stable release, 1.9.2, does not help.

This is the exception that occurred. Any help is appreciated.

    2019-11-11 08:34:10,011 ERROR [Timer-Driven Process Thread-7] o.a.n.p.standard.CompressContent CompressContent[id=b634d291-6f29-389e-b481-3539828a2205] CompressContent[id=b634d291-6f29-389e-b481-3539828a2205] failed to process session due to org.apache.nifi.processor.exception.MissingFlowFileException: Unable to find content for FlowFile; Processor Administratively Yielded for 1 sec: org.apache.nifi.processor.exception.MissingFlowFileException: Unable to find content for FlowFile
org.apache.nifi.processor.exception.MissingFlowFileException: Unable to find content for FlowFile
    at org.apache.nifi.controller.repository.StandardProcessSession.handleContentNotFound(StandardProcessSession.java:3132)
    at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2926)
    at org.apache.nifi.processors.standard.CompressContent.onTrigger(CompressContent.java:236)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.nifi.controller.repository.ContentNotFoundException: Could not find content for StandardContentClaim [resourceClaim=StandardResourceClaim[id=1573461249850-433, container=default, section=433], offset=8002, length=4957]: Stream contained only 0 bytes but should have contained 4957
    at org.apache.nifi.controller.repository.io.FlowFileAccessInputStream.ensureAllContentRead(FlowFileAccessInputStream.java:49)
    at org.apache.nifi.controller.repository.io.FlowFileAccessInputStream.read(FlowFileAccessInputStream.java:84)
    at org.apache.nifi.controller.repository.io.TaskTerminationInputStream.read(TaskTerminationInputStream.java:68)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
    at java.io.FilterInputStream.read(FilterInputStream.java:107)
    at org.apache.nifi.processors.standard.CompressContent$1.process(CompressContent.java:312)
    at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2908)
    ... 12 common frames omitted
2019-11-11 08:34:10,013 WARN [Timer-Driven Process Thread-7] o.a.n.controller.tasks.ConnectableTask Administratively Yielding CompressContent[id=b634d291-6f29-389e-b481-3539828a2205] due to uncaught Exception: org.apache.nifi.processor.exception.MissingFlowFileException: Unable to find content for FlowFile
org.apache.nifi.processor.exception.MissingFlowFileException: Unable to find content for FlowFile
    at org.apache.nifi.controller.repository.StandardProcessSession.handleContentNotFound(StandardProcessSession.java:3132)
    at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2926)
    at org.apache.nifi.processors.standard.CompressContent.onTrigger(CompressContent.java:236)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.nifi.controller.repository.ContentNotFoundException: Could not find content for StandardContentClaim [resourceClaim=StandardResourceClaim[id=1573461249850-433, container=default, section=433], offset=8002, length=4957]: Stream contained only 0 bytes but should have contained 4957
    at org.apache.nifi.controller.repository.io.FlowFileAccessInputStream.ensureAllContentRead(FlowFileAccessInputStream.java:49)
    at org.apache.nifi.controller.repository.io.FlowFileAccessInputStream.read(FlowFileAccessInputStream.java:84)
    at org.apache.nifi.controller.repository.io.TaskTerminationInputStream.read(TaskTerminationInputStream.java:68)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
    at java.io.FilterInputStream.read(FilterInputStream.java:107)
    at org.apache.nifi.processors.standard.CompressContent$1.process(CompressContent.java:312)
    at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2908)
    ... 12 common frames omitted
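
To quantify how many content claims are affected and to check what is actually on disk, the "Caused by" lines in the log above can be parsed. The following is a minimal sketch, not part of NiFi: the function names `parse_missing_claims` and `check_claim` are hypothetical, and the on-disk layout `<container directory>/<section>/<claim id>` is an assumption about the file-based content repository that should be verified against the actual deployment.

```python
import re
from pathlib import Path

# Matches the content-claim description in a ContentNotFoundException message,
# in the format shown in the log above.
CLAIM_RE = re.compile(
    r"StandardContentClaim \[resourceClaim=StandardResourceClaim\["
    r"id=(?P<id>[^,\]]+), container=(?P<container>[^,\]]+), "
    r"section=(?P<section>[^,\]]+)\], "
    r"offset=(?P<offset>\d+), length=(?P<length>\d+)\]"
    r"(?:: Stream contained only (?P<found>\d+) bytes)?"
)


def parse_missing_claims(log_text):
    """Return one dict per distinct content claim reported as not found."""
    claims = {}
    for m in CLAIM_RE.finditer(log_text):
        # The same claim is logged several times (ERROR and WARN entries),
        # so de-duplicate on container, section, id and offset.
        key = (m["container"], m["section"], m["id"], int(m["offset"]))
        claims[key] = {
            "id": m["id"],
            "container": m["container"],
            "section": m["section"],
            "offset": int(m["offset"]),
            "length": int(m["length"]),
            "found": int(m["found"]) if m["found"] is not None else None,
        }
    return list(claims.values())


def check_claim(claim, container_dir):
    """Compare a claim with the file on disk.

    ASSUMPTION: the claim is stored at <container_dir>/<section>/<id>.
    Returns "missing", "truncated" or "ok".
    """
    path = Path(container_dir) / claim["section"] / claim["id"]
    if not path.is_file():
        return "missing"
    needed = claim["offset"] + claim["length"]
    return "ok" if path.stat().st_size >= needed else "truncated"


SAMPLE = (
    "Caused by: org.apache.nifi.controller.repository.ContentNotFoundException: "
    "Could not find content for StandardContentClaim "
    "[resourceClaim=StandardResourceClaim[id=1573461249850-433, "
    "container=default, section=433], offset=8002, length=4957]: "
    "Stream contained only 0 bytes but should have contained 4957"
)

if __name__ == "__main__":
    for claim in parse_missing_claims(SAMPLE):
        print(claim)
```

Running `check_claim` against the container directory on each node shortly after the error could help tell a file that is absent from a file that is shorter than `offset + length`. This is only a diagnostic aid and does not by itself identify the cause.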

Source: https://stackoverflow.com/questions/59028521/missing-flowfile-exception-on-nifi-processing-cause-loss-of-information
