I am trying to use Flume NG to grab 90 seconds of log information and put it into a file in HDFS. I have Flume working to watch the log file via an exec source with tail; however, it i
According to the source code of org.apache.flume.sink.hdfs.BucketWriter:
/**
 * Internal API intended for HDFSSink use.
 * This class does file rolling and handles file formats and serialization.
 * Only the public methods in this class are thread safe.
 */
class BucketWriter {
  ...
  /**
   * open() is called by append()
   * @throws IOException
   * @throws InterruptedException
   */
  private void open() throws IOException, InterruptedException {
    ...
    // if time-based rolling is enabled, schedule the roll
    if (rollInterval > 0) {
      Callable<Void> action = new Callable<Void>() {
        public Void call() throws Exception {
          LOG.debug("Rolling file ({}): Roll scheduled after {} sec elapsed.",
              bucketPath, rollInterval);
          try {
            // Roll the file and remove reference from sfWriters map.
            close(true);
          } catch (Throwable t) {
            LOG.error("Unexpected error", t);
          }
          return null;
        }
      };
      timedRollFuture = timedRollerPool.schedule(action, rollInterval,
          TimeUnit.SECONDS);
    }
    ...
  }
  ...
  /**
   * check if time to rotate the file
   */
  private boolean shouldRotate() {
    boolean doRotate = false;
    if (writer.isUnderReplicated()) {
      this.isUnderReplicated = true;
      doRotate = true;
    } else {
      this.isUnderReplicated = false;
    }
    if ((rollCount > 0) && (rollCount <= eventCounter)) {
      LOG.debug("rolling: rollCount: {}, events: {}", rollCount, eventCounter);
      doRotate = true;
    }
    if ((rollSize > 0) && (rollSize <= processSize)) {
      LOG.debug("rolling: rollSize: {}, bytes: {}", rollSize, processSize);
      doRotate = true;
    }
    return doRotate;
  }
  ...
}
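The `rollInterval` mechanism above is just a `ScheduledExecutorService` timer: when the file is opened, a `Callable` that closes (rolls) it is scheduled `rollInterval` seconds out. A minimal, self-contained sketch of the same pattern (the class and method names here are illustrative, not Flume's):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class TimedRollSketch {

    // Schedule a "roll" the way BucketWriter.open() does, then wait for it.
    static String scheduleAndAwaitRoll(long rollIntervalSeconds) throws Exception {
        ScheduledExecutorService timedRollerPool = Executors.newScheduledThreadPool(1);
        Callable<String> action = new Callable<String>() {
            public String call() {
                // In Flume this would be close(true): flush and roll the HDFS file.
                return "rolled";
            }
        };
        ScheduledFuture<String> timedRollFuture =
                timedRollerPool.schedule(action, rollIntervalSeconds, TimeUnit.SECONDS);
        try {
            return timedRollFuture.get(); // blocks until the interval elapses
        } finally {
            timedRollerPool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(scheduleAndAwaitRoll(1));
    }
}
```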
and of org.apache.flume.sink.hdfs.AbstractHDFSWriter:
public abstract class AbstractHDFSWriter implements HDFSWriter {
  ...
  @Override
  public boolean isUnderReplicated() {
    try {
      int numBlocks = getNumCurrentReplicas();
      if (numBlocks == -1) {
        return false;
      }
      int desiredBlocks;
      if (configuredMinReplicas != null) {
        desiredBlocks = configuredMinReplicas;
      } else {
        desiredBlocks = getFsDesiredReplication();
      }
      return numBlocks < desiredBlocks;
    } catch (IllegalAccessException e) {
      logger.error("Unexpected error while checking replication factor", e);
    } catch (InvocationTargetException e) {
      logger.error("Unexpected error while checking replication factor", e);
    } catch (IllegalArgumentException e) {
      logger.error("Unexpected error while checking replication factor", e);
    }
    return false;
  }
  ...
}
The rolling of HDFS files is therefore controlled by four conditions:

1. elapsed time since the file was opened (hdfs.rollInterval, scheduled in open())
2. number of events written (hdfs.rollCount, checked in shouldRotate())
3. number of bytes written (hdfs.rollSize, checked in shouldRotate())
4. the file's HDFS blocks becoming under-replicated (isUnderReplicated())

Tune the corresponding sink properties according to these if-segments in BucketWriter.
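To get a pure time-based 90-second roll, set hdfs.rollInterval to 90 and set the count- and size-based triggers to 0, which disables them (the `if (rollCount > 0)` and `if (rollSize > 0)` guards above then skip those checks). A sketch of such a sink configuration, assuming an agent named agent1 with a sink named hdfsSink (adapt the names and path to your topology):

```properties
# hypothetical agent/sink names; adapt to your setup
agent1.sinks.hdfsSink.type = hdfs
agent1.sinks.hdfsSink.hdfs.path = hdfs://namenode/flume/events/%Y-%m-%d
agent1.sinks.hdfsSink.hdfs.rollInterval = 90
agent1.sinks.hdfsSink.hdfs.rollCount = 0
agent1.sinks.hdfsSink.hdfs.rollSize = 0
# avoid surprise rolls from the under-replication check on small clusters
agent1.sinks.hdfsSink.hdfs.minBlockReplicas = 1
```

Note that hdfs.minBlockReplicas feeds the configuredMinReplicas field used by isUnderReplicated() above, so setting it to 1 effectively suppresses replication-triggered rolls on a cluster whose replication factor is at least 1.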