How to set up an HTTP source for testing a Flume setup?

逝去的感伤 2020-12-29 10:12

I am a newbie to Flume and Hadoop. We are developing a BI module where we can store all the logs from different servers in HDFS.

For this I am using Flume. I just st

3 answers
  •  执笔经年
    2020-12-29 11:00

    Hopefully this helps you get started. I'm having some problems testing this on my machine and don't have time to fully troubleshoot it right now, but I'll get to that...

    Assuming Flume is already up and running, your flume.conf should look like this to use an HTTP POST source and a local file sink (note: this writes to a local file, not HDFS):

    ########## NEW AGENT ########## 
    # flume-ng agent -f /etc/flume/conf/flume.httptest.conf -n httpagent
    # 
    
    # httpagent = HTTP test agent
    ###############################
    httpagent.sources = http-source
    httpagent.sinks = local-file-sink
    httpagent.channels = ch3
    
    # Define / Configure Source
    ###############################
    httpagent.sources.http-source.type = org.apache.flume.source.http.HTTPSource
    httpagent.sources.http-source.channels = ch3
    httpagent.sources.http-source.port = 81
    
    
    # Local File Sink
    ###############################
    httpagent.sinks.local-file-sink.type = file_roll
    httpagent.sinks.local-file-sink.channel = ch3
    httpagent.sinks.local-file-sink.sink.directory = /root/Desktop/http_test
    httpagent.sinks.local-file-sink.sink.rollInterval = 5
    
    # Channels
    ###############################
    httpagent.channels.ch3.type = memory
    httpagent.channels.ch3.capacity = 1000
    

    Start Flume with the command shown in the comment on the second line of the file. Tweak the config for your needs (the port, the sink directory, and the roll interval especially). This is a pretty bare-minimum config file; there are more options available, so check out the Flume User Guide. With this config, the agent starts and runs fine for me.

    Here's what I don't have time to test. The HTTP source, by default, accepts data in JSON format. You should be able to test this agent by sending a curl request along these lines:

    curl -X POST -H 'Content-Type: application/json; charset=UTF-8' -d '{"username":"xyz","password":"123"}' http://yourdomain.com:81/
    

    -X sets the request method to POST, -H adds a header, -d sends the request body (valid JSON), and the last argument is the host and port. The problem for me is that I get an error:

    WARN http.HTTPSource: Received bad request from client. org.apache.flume.source.http.HTTPBadRequestException: Request has invalid JSON Syntax.
    

    That shows up in my Flume agent's log. Invalid JSON? So something in the request is not what the source expects. The fact that an error appears at all, though, shows the Flume source is receiving data. Whatever you have doing the POSTing should work as long as it sends a format the source accepts.
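
    A possible explanation for the error, offered as a sketch rather than a tested fix: according to the Flume User Guide, the HTTP source's default handler (JSONHandler) expects a JSON array of events, each with a string-to-string "headers" map and a string "body", not an arbitrary JSON object. The Python snippet below builds a payload in that shape. The helper names, the "host" header, and the sample body are made up for illustration, and the URL in the usage note is the placeholder from the curl command.

```python
import json
import urllib.request


def build_flume_events(bodies, headers=None):
    """Wrap each string body in the event shape JSONHandler is documented to parse."""
    headers = headers or {}
    # Each event carries its own copy of the (string -> string) headers map.
    return [{"headers": dict(headers), "body": body} for body in bodies]


def post_events(url, events, timeout=5):
    """POST the events as one JSON array and return the HTTP status code."""
    data = json.dumps(events).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json; charset=UTF-8"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Build (but do not send) a one-event payload; "host" is an illustrative header.
events = build_flume_events(["hello flume"], headers={"host": "web01"})
payload = json.dumps(events)
print(payload)
```

    To actually send it, call post_events("http://yourdomain.com:81/", events) while the agent is running. The same payload should also work with curl by passing the printed JSON array to -d in place of the {"username": ...} object.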
