Question
I'm trying to get Facebook page data using the Graph API. Each post is larger than 1 MB, while Kafka's default fetch.message.max.bytes is 1 MB. I have raised the Kafka limits from 1 MB to about 3 MB by adding the lines below to Kafka's consumer.properties and server.properties files.
fetch.message.max.bytes=3048576 (consumer.properties)
message.max.bytes=3048576 (server.properties)
replica.fetch.max.bytes=3048576 (server.properties)
After adding these lines, 3 MB messages are written to the Kafka data logs. However, Storm is unable to process that data; it can only read messages up to the default size of 1 MB. What parameters should I add to the Storm topology to read the 3 MB data from the Kafka topic? Do I need to increase buffer.size in Storm? I don't have a clear idea about it.
Here is my topology code.
String argument = args[0];
Config conf = new Config();
conf.put(JDBC_CONF, map);
conf.setDebug(true);
conf.put(Config.TOPOLOGY_MAX_SPOUT_PENDING, 1);
//set the number of workers
conf.setNumWorkers(3);
TopologyBuilder builder = new TopologyBuilder();
//Setup Kafka spout
BrokerHosts hosts = new ZkHosts("localhost:2181");
String topic = "year1234";
String zkRoot = "";
String consumerGroupId = "group1";
SpoutConfig spoutConfig = new SpoutConfig(hosts, topic, zkRoot, consumerGroupId);
spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());
KafkaSpout kafkaSpout = new KafkaSpout(spoutConfig);
builder.setSpout("KafkaSpout", kafkaSpout, 1);
builder.setBolt("user_details", new Parserspout(), 1).shuffleGrouping("KafkaSpout");
builder.setBolt("bolts_user", new bolts_user(cp), 1).shuffleGrouping("user_details");
Thanks in advance.
Answer 1:
The class SpoutConfig extends KafkaConfig, which has all of the following settings:
public int fetchSizeBytes = 1024 * 1024;
public int socketTimeoutMs = 10000;
public int fetchMaxWait = 10000;
public int bufferSizeBytes = 1024 * 1024;
public MultiScheme scheme = new RawMultiScheme();
public boolean ignoreZkOffsets = false;
public long startOffsetTime = kafka.api.OffsetRequest.EarliestTime();
public long maxOffsetBehind = Long.MAX_VALUE;
public boolean useStartOffsetTimeIfOffsetOutOfRange = true;
public int metricsTimeBucketSizeInSecs = 60;
Notice that they are public, so you can change them directly:
spoutConfig.fetchSizeBytes = 3048576;
spoutConfig.bufferSizeBytes = 3048576;
See here: http://grepcode.com/file/repo1.maven.org/maven2/org.apache.storm/storm-kafka/0.9.2-incubating/storm/kafka/KafkaConfig.java#KafkaConfig
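One detail worth checking when picking the numbers: 3048576 is slightly less than 3 MiB (3 * 1024 * 1024 = 3145728), so a post of a full 3 MiB would still exceed it. The sketch below is only an illustration of that arithmetic and of the assumption that the spout's fetch size must be at least as large as the broker's message.max.bytes. The class FetchSizeCheck and its methods are hypothetical helpers, not part of storm-kafka.

```java
// Hypothetical helper, not part of storm-kafka: sanity-checks the size limits
// before they are assigned to spoutConfig.fetchSizeBytes / bufferSizeBytes.
class FetchSizeCheck {

    /** Converts whole mebibytes to bytes (1 MiB = 1024 * 1024 bytes). */
    public static int mib(int n) {
        return n * 1024 * 1024;
    }

    /** True when a fetch of fetchSizeBytes can hold the largest message the broker accepts. */
    public static boolean canFetch(int brokerMessageMaxBytes, int fetchSizeBytes) {
        return fetchSizeBytes >= brokerMessageMaxBytes;
    }

    public static void main(String[] args) {
        int brokerMax = 3048576;    // message.max.bytes from server.properties in the question
        int defaultFetch = mib(1);  // KafkaConfig default: fetchSizeBytes = 1024 * 1024

        System.out.println(canFetch(brokerMax, defaultFetch)); // false
        System.out.println(canFetch(brokerMax, 3048576));      // true
        System.out.println(mib(3));                            // 3145728
    }
}
```

If the check passes for the chosen value, the same number can be assigned to both spoutConfig.fetchSizeBytes and spoutConfig.bufferSizeBytes as shown in the answer.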
Source: https://stackoverflow.com/questions/36159768/how-to-set-spoutconfig-from-default-setting