Access Json element and write to a text file using python ExecuteScript processor

Posted by 我只是一个虾纸丫 on 2019-12-11 04:22:08

Question


I am new to Python and NiFi.

My flow is GetFile --> ExecuteScript.

In the script, for each JSON flow file, I want to access a particular element and write it to a text file, one line per element.

I tried the below:

import json
import java.io
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback

class ModJSON(StreamCallback):
  def __init__(self):
    pass
  def process(self, inputStream, outputStream):
  text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
  json_content = json.loads(text)
  try:
     body = json_content['id']['body']
     body_encoded = body.encode('utf-8')
  except (KeyError,TypeError,ValueError):
     body_encoded = ''

  text_file = open ('/tmp/test/testFile.txt', 'w')   
  text_file.write("%s"%body_encoded)
  text_file.close()
  outputStream.write(bytearray(json.dumps(body, indent=4).encode('utf-8')))

flowFile = session.get()
if (flowFile != None):
    flowFile = session.write(flowFile, ModJSON())
    flowFile = session.putAttribute(flowFile, "filename", flowFile.getAttribute('filename').split('.')[0]+'_translated.json')
session.transfer(flowFile, REL_SUCCESS)

But the accessed body is not being written to testFile.txt.

What am I missing here?


Answer 1:


The body of your Python class is not indented, and neither is the body of the process method. Indent one level from the def __init__ line through the outputStream.write line, then indent one more level from the text = IOUtils.toString line through the outputStream.write line. This should give you a working StreamCallback class and let the script run correctly.
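For reference, here is a minimal sketch of how the class might look with both indentation levels applied. It is not the only way to write it and it makes some assumptions. The try/except ImportError fallback and the read_text helper are hypothetical additions, there only so the class can be exercised in plain Python outside NiFi. The sketch also sets body to an empty string when the lookup fails, because in the question's version body is never assigned in that branch and the later json.dumps(body) would raise a NameError. The text_file code is left out here.

```python
import json

try:
    # These imports resolve only inside NiFi's Jython script engine.
    from org.apache.commons.io import IOUtils
    from java.nio.charset import StandardCharsets
    from org.apache.nifi.processor.io import StreamCallback

    def read_text(input_stream):
        # Read the whole flow file content as a UTF-8 string.
        return IOUtils.toString(input_stream, StandardCharsets.UTF_8)
except ImportError:
    # Stand-ins (an assumption, not part of NiFi) for running outside NiFi.
    StreamCallback = object

    def read_text(input_stream):
        return input_stream.read().decode('utf-8')


class ModJSON(StreamCallback):
    # Everything below is indented one level: it belongs to the class.
    def __init__(self):
        pass

    def process(self, inputStream, outputStream):
        # Everything below is indented two levels: it belongs to process().
        text = read_text(inputStream)
        json_content = json.loads(text)
        try:
            body = json_content['id']['body']
        except (KeyError, TypeError, ValueError):
            # Fall back to an empty string so body is always defined.
            body = ''
        outputStream.write(bytearray(json.dumps(body, indent=4).encode('utf-8')))
```

The rest of the script (session.get(), session.write(flowFile, ModJSON()), and so on) stays at the top level, outside the class.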

Also, you do not need to call session.commit(); that is done for you when the script completes.

EDIT (due to OP edit -- see comments): The script above is still not indented correctly; the body of the process() method needs to be indented. Are you getting errors or bulletins on the ExecuteScript processor? If incoming flow files are queuing up before ExecuteScript, then either "flowFile = session.get()" is not being executed, or the processor should be throwing an error and posting a bulletin (a red box in the upper right corner).
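A related detail in the question's script: session.transfer sits outside the "if (flowFile != None)" check, so it would also run when session.get() returns nothing. One way to avoid that is to keep every call that uses the flow file inside the check. The sketch below assumes a hypothetical wrapper function, handle_one, so the logic can be tried outside NiFi; inside ExecuteScript, session and REL_SUCCESS are the variables the processor provides to the script.

```python
def handle_one(session, rel_success, callback):
    """Process at most one flow file; return True if one was handled."""
    flowFile = session.get()
    if flowFile is None:
        # Nothing was queued: leave without touching the session again.
        return False
    # Rewrite the content through the callback, then rename the flow file.
    flowFile = session.write(flowFile, callback)
    base = flowFile.getAttribute('filename').split('.')[0]
    flowFile = session.putAttribute(flowFile, 'filename', base + '_translated.json')
    # Transfer only happens when a flow file actually exists.
    session.transfer(flowFile, rel_success)
    return True
```

In the script itself this would be called as handle_one(session, REL_SUCCESS, ModJSON()).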

Also, since you intend to send the same content out of the processor in a flow file, you shouldn't need the "text_file" code. I assume that's for debugging?
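If the side file is wanted after all (the question asks for one line per JSON), note that opening it with mode 'w' truncates the file on every call, so at most the last body would survive. A minimal sketch of an appending variant follows; append_line is a hypothetical helper name, and io.open is used on the assumption that the same code should behave the same under Jython 2.7 and Python 3.

```python
import io


def append_line(path, body):
    # Mode 'a' appends instead of truncating, so each call adds one line.
    with io.open(path, 'a', encoding='utf-8') as f:
        f.write(u'%s\n' % body)
```

Inside process() this could be called as append_line('/tmp/test/testFile.txt', body) once body has been extracted.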



Source: https://stackoverflow.com/questions/40630612/access-json-element-and-write-to-a-text-file-using-python-executescript-processo
