Question
During an ETL process, we intermittently hit an exception that causes FlowFiles to be lost. NiFi is deployed on a 3-node Kubernetes cluster with its repositories on a shared file system (GlusterFS). In a stress test, almost 10% of the 2000 CSV files being processed were lost with the exception reported below. We also tried scaling down to a single node and setting the number of parallel threads to 1, in order to minimize parallelism problems in the affected processors (validatecsv and validatejsonpath). It seems that the processor tries to access the FlowFile content too late. The problem is not systematic and occurs randomly; it occurs on 1.8, and upgrading to the latest stable release, 1.9.2, does not help.
This is the exception that occurred. Any help is appreciated.
2019-11-11 08:34:10,011 ERROR [Timer-Driven Process Thread-7] o.a.n.p.standard.CompressContent CompressContent[id=b634d291-6f29-389e-b481-3539828a2205] CompressContent[id=b634d291-6f29-389e-b481-3539828a2205] failed to process session due to org.apache.nifi.processor.exception.MissingFlowFileException: Unable to find content for FlowFile; Processor Administratively Yielded for 1 sec: org.apache.nifi.processor.exception.MissingFlowFileException: Unable to find content for FlowFile
org.apache.nifi.processor.exception.MissingFlowFileException: Unable to find content for FlowFile
at org.apache.nifi.controller.repository.StandardProcessSession.handleContentNotFound(StandardProcessSession.java:3132)
at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2926)
at org.apache.nifi.processors.standard.CompressContent.onTrigger(CompressContent.java:236)
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.nifi.controller.repository.ContentNotFoundException: Could not find content for StandardContentClaim [resourceClaim=StandardResourceClaim[id=1573461249850-433, container=default, section=433], offset=8002, length=4957]: Stream contained only 0 bytes but should have contained 4957
at org.apache.nifi.controller.repository.io.FlowFileAccessInputStream.ensureAllContentRead(FlowFileAccessInputStream.java:49)
at org.apache.nifi.controller.repository.io.FlowFileAccessInputStream.read(FlowFileAccessInputStream.java:84)
at org.apache.nifi.controller.repository.io.TaskTerminationInputStream.read(TaskTerminationInputStream.java:68)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at java.io.FilterInputStream.read(FilterInputStream.java:107)
at org.apache.nifi.processors.standard.CompressContent$1.process(CompressContent.java:312)
at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2908)
... 12 common frames omitted
2019-11-11 08:34:10,013 WARN [Timer-Driven Process Thread-7] o.a.n.controller.tasks.ConnectableTask Administratively Yielding CompressContent[id=b634d291-6f29-389e-b481-3539828a2205] due to uncaught Exception: org.apache.nifi.processor.exception.MissingFlowFileException: Unable to find content for FlowFile
org.apache.nifi.processor.exception.MissingFlowFileException: Unable to find content for FlowFile
at org.apache.nifi.controller.repository.StandardProcessSession.handleContentNotFound(StandardProcessSession.java:3132)
at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2926)
at org.apache.nifi.processors.standard.CompressContent.onTrigger(CompressContent.java:236)
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.nifi.controller.repository.ContentNotFoundException: Could not find content for StandardContentClaim [resourceClaim=StandardResourceClaim[id=1573461249850-433, container=default, section=433], offset=8002, length=4957]: Stream contained only 0 bytes but should have contained 4957
at org.apache.nifi.controller.repository.io.FlowFileAccessInputStream.ensureAllContentRead(FlowFileAccessInputStream.java:49)
at org.apache.nifi.controller.repository.io.FlowFileAccessInputStream.read(FlowFileAccessInputStream.java:84)
at org.apache.nifi.controller.repository.io.TaskTerminationInputStream.read(TaskTerminationInputStream.java:68)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at java.io.FilterInputStream.read(FilterInputStream.java:107)
at org.apache.nifi.processors.standard.CompressContent$1.process(CompressContent.java:312)
at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2908)
... 12 common frames omitted
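To narrow this down, it may help to check whether the claim named in the "Caused by" line is really short on disk, as seen from the node that logged the error. The following is only a minimal sketch: `parse_claim` and `available_bytes` are hypothetical helper names, and it assumes the default file-based content repository stores each resource claim at `<container directory>/<section>/<id>`, which should be verified against the actual deployment.

```python
import os
import re
from typing import NamedTuple

# Matches the claim description as it appears in the "Caused by" log line.
CLAIM_RE = re.compile(
    r"StandardResourceClaim\[id=(?P<id>[^,\]]+), container=(?P<container>[^,\]]+), "
    r"section=(?P<section>[^,\]]+)\], offset=(?P<offset>\d+), length=(?P<length>\d+)"
)


class Claim(NamedTuple):
    id: str
    container: str
    section: str
    offset: int
    length: int


def parse_claim(log_line: str) -> Claim:
    """Extract the content claim coordinates from a log line."""
    m = CLAIM_RE.search(log_line)
    if m is None:
        raise ValueError("no content claim found in line")
    return Claim(m["id"], m["container"], m["section"],
                 int(m["offset"]), int(m["length"]))


def available_bytes(container_dir: str, claim: Claim) -> int:
    """Return how many of the claim's bytes are present in the file on disk.

    Assumption: the claim file lives at <container_dir>/<section>/<id>.
    """
    path = os.path.join(container_dir, claim.section, claim.id)
    if not os.path.isfile(path):
        return 0
    size = os.path.getsize(path)
    # The claim covers bytes [offset, offset + length) of the file.
    return max(0, min(claim.length, size - claim.offset))
```

Run against the directory configured for the `default` container on each node, a result smaller than the claim's `length` (4957 here) would suggest that the file on the shared volume is shorter than the node expects at read time.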
Source: https://stackoverflow.com/questions/59028521/missing-flowfile-exception-on-nifi-processing-cause-loss-of-information