How to continuously feed sniffed packets to Kafka?

Submitted by 旧城冷巷雨未停 on 2019-12-01 22:41:59

Question


Currently I am sniffing packets from my local wlan interface like this:

sudo tshark > sampleData.pcap

However, I need to feed this data to Kafka.

Currently, I have a kafka producer script producer.sh:

../bin/kafka-console-producer.sh --broker-list localhost:9092 --topic 'spark-kafka'

and feed data to kafka like this:

producer.sh < sampleData.pcap

where sampleData.pcap contains pre-captured IP packet information.

However, I want to automate the process, so that it works something like this:

sudo tshark > http://localhost:9091
producer.sh < http://localhost:9091

This is obviously just pseudocode. What I want to do is send the sniffed data to a port and have Kafka continuously read from it. I don't want Kafka to continuously read from a file, because that would mean a tremendous number of read/write operations on a single file, which is inefficient.
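For illustration, the simplest streaming version of this (with no intermediate port at all) is to pipe tshark straight into the console producer; a minimal sketch, assuming the broker and topic names from above:

# -l line-buffers tshark's output so packets reach Kafka as they are captured
sudo tshark -l | ../bin/kafka-console-producer.sh --broker-list localhost:9092 --topic 'spark-kafka'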

I searched the internet and came across Kafka Connect, but I can't find any useful documentation for implementing something like this.

What's the best way to implement something like this?

Thanks!


Answer 1:


With netcat

No need to write a server; you can use netcat and let your script read from standard input:

shell1> nc -l 8888 | ./producer.sh
shell2> sudo tshark -l | nc 127.0.0.1 8888

The -l flag makes tshark flush its output after each packet instead of buffering it too long. (Depending on your netcat variant, listening may need to be nc -l -p 8888 rather than nc -l 8888.)


With a named pipe

You could also use a named pipe to transmit tshark output to your second process:

shell1> mkfifo /tmp/tsharkpipe
shell1> tail -f -c +0 /tmp/tsharkpipe | ./producer.sh
shell2> sudo tshark -l > /tmp/tsharkpipe



Answer 2:


I think you can either

  • create a tiny server that connects to Kafka and listens on a port, or
  • use the Kafka file source connector and append all your data to the file it watches (see the sketch below): http://kafka.apache.org/documentation.html#quickstart_kafkaconnect
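As a rough sketch of the second option: Kafka Connect ships with a FileStreamSource connector that tails a file and publishes each new line to a topic. The connector name and file path below are placeholders; the topic matches the one used in the question.

# connect-tshark-source.properties -- a minimal FileStreamSource config
name=tshark-source
connector.class=FileStreamSource
tasks.max=1
file=/tmp/tshark.out
topic=spark-kafka

# Run the connector in standalone mode from the Kafka directory,
# and point the capture at the watched file:
bin/connect-standalone.sh config/connect-standalone.properties connect-tshark-source.properties
sudo tshark -l >> /tmp/tshark.out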



Answer 3:


If you use Node, you can use child_process and kafka-node to do it, something like this:

var kafka = require('kafka-node');
var client = new kafka.Client('localhost:2181');
var producer = new kafka.Producer(client);

var spawn = require('child_process').spawn;

// Wait until the producer has connected before sending anything
producer.on('ready', () => {
  // -l makes tshark flush its output after each packet
  var tshark = spawn('sudo', ['/usr/sbin/tshark', '-l']);

  tshark.stdout.on('data', (data) => {
    // data is a Buffer, so convert it to a string before splitting into lines
    producer.send([
      {topic: 'spark-kafka', messages: data.toString().split('\n')}
    ], (err, result) => { console.log('sent to kafka'); });
  });
});



Answer 4:


Another option would be to use Apache NiFi. With NiFi you can execute commands and pass their output to other processors for further handling. Here you could have NiFi execute a tshark command on the local host and then pass the output to Kafka.

There is an example here which should demonstrate this type of approach in slightly more detail.



Source: https://stackoverflow.com/questions/35872663/how-to-continuously-feed-sniffed-packets-to-kafka
