Question
I've installed Flume and Hadoop manually (that is, not through CDH) and I'm trying to run the Twitter example from Cloudera.
In the apache-flume-1.5.0-SNAPSHOT-bin directory, I start the agent with the following command:
bin/flume-ng agent -c conf -f conf/twitter.conf -Dflume.root.logger=DEBUG,console -n TwitterAgent
My conf/twitter.conf file uses the logger as the sink. conf/flume-env.sh adds flume-sources-1.0-SNAPSHOT.jar, which contains the definition of the Twitter source, to the CLASSPATH. The resulting output is:
(...) [ERROR org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:253)] Unable to start EventDrivenSourceRunner: { source:com.cloudera.flume.source.TwitterSource{name:Twitter,state:IDLE} } - Exception follows.
java.lang.NoSuchMethodError: twitter4j.FilterQuery.setIncludeEntities(Z)Ltwitter4j/FilterQuery;
    at com.cloudera.flume.source.TwitterSource.start(TwitterSource.java:139)
The conflict results from a FilterQuery class that is defined elsewhere in the Flume lib directory and that does not contain the setIncludeEntities method. For me, the file that contains this class is twitter4j-stream-3.0.3.jar, and I cannot exclude the file from the classpath as suggested here.
Answer 1:
I imagine this was quite frustrating for you; it certainly was for me. The main problem is that both flume-sources-1.0-SNAPSHOT.jar and twitter4j-stream-3.0.3.jar contain the same FilterQuery.class. That is why the conflict shows up in the log.
I am not a Java or Big Data expert, but I can offer a workaround. Download twitter4j-stream-2.2.6.jar or a lower version from here and replace twitter4j-stream-3.0.3.jar with it. All the 3.x.x versions contain this class. After the replacement, everything should work fine. You may, however, get a heap error after downloading a large number of tweets. Please search for the solution, as that issue was resolved in the 3.x.x releases.
Edit: Also, don't forget to download and replace all the twitter4j files in the /usr/lib/flume-ng folder, namely twitter4j-media-support-2.2.6.jar, twitter4j-stream-2.2.6.jar and twitter4j-core-2.2.6.jar. Any version mismatch among these files will also cause problems.
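A quick way to spot such a mismatch is to list the distinct version suffixes of the twitter4j jars in the folder. The following is only a sketch: the function name twitter4j_versions is made up for illustration, and it assumes the file names follow the twitter4j-<module>-<version>.jar pattern shown above (a version containing a hyphen, such as a -SNAPSHOT build, would not be parsed correctly).

```shell
# Hypothetical helper: print each distinct version found among
# the twitter4j-*.jar files in the given directory, one per line.
twitter4j_versions() {
  lib_dir="$1"
  for jar in "$lib_dir"/twitter4j-*.jar; do
    [ -e "$jar" ] || continue      # skip the literal pattern when nothing matches
    name=${jar##*/}                # strip the directory part
    name=${name%.jar}              # strip the .jar extension
    printf '%s\n' "${name##*-}"    # keep what follows the last hyphen
  done | sort -u
}

# Example call (path taken from the answer above):
#   twitter4j_versions /usr/lib/flume-ng
```

If this prints more than one line, the twitter4j jars are not all the same version.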
Answer 2:
As suggested in the post, search-contrib-1.0.0-jar-with-dependencies.jar can be a problematic file too.
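To see which jars in a directory bundle the conflicting class, one rough check is to search the jar files for the class's path. This is a heuristic sketch, not an authoritative listing: it relies on entry names being stored as plain text inside jar (zip) archives, and the function name find_class_jars is made up for illustration.

```shell
# Hypothetical helper: print the jars in a directory whose bytes
# contain the given class path (e.g. twitter4j/FilterQuery.class).
find_class_jars() {
  class_path="$1"
  lib_dir="$2"
  # -l prints only the names of matching files; -F treats the pattern literally
  grep -l -F -- "$class_path" "$lib_dir"/*.jar 2>/dev/null
}

# Example call, run from the Flume directory:
#   find_class_jars twitter4j/FilterQuery.class lib
```

Any jar it reports can then be inspected properly with unzip -l before deciding what to replace or remove.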
Answer 3:
You need to recompile flume-sources-1.0-SNAPSHOT.jar from the Git repository: https://github.com/cloudera/cdh-twitter-example
Install Maven, then download the cdh-twitter-example repository.
Unzip it, then run the following inside it (as mentioned there):
$ cd flume-sources
$ mvn package
$ cd ..
This problem appeared when twitter4j was updated from 2.2.6 to 3.x: the setIncludeEntities method was removed, and the prebuilt JAR has not been updated to match.
PS: Do not download the prebuilt version; it is still the old one.
Answer 4:
Simply rename all the twitter4j-stream* jar files and rerun Flume. It will work like a charm. :)
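The rename can be scripted. This is a minimal sketch: the function name disable_twitter4j_stream and the .disabled suffix are arbitrary choices for illustration. Renaming presumably works because the files no longer end in .jar and so are not picked up from the lib directory.

```shell
# Hypothetical helper: rename every twitter4j-stream*.jar in the given
# directory so that it no longer has a .jar extension.
disable_twitter4j_stream() {
  lib_dir="$1"
  for jar in "$lib_dir"/twitter4j-stream*.jar; do
    [ -e "$jar" ] || continue        # nothing matched the pattern
    mv -- "$jar" "$jar.disabled"     # e.g. twitter4j-stream-3.0.3.jar.disabled
  done
}

# Example call, run from the Flume directory:
#   disable_twitter4j_stream lib
```

Renaming rather than deleting keeps the change easy to undo.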
Answer 5:
I had the same problem and finally solved it with these steps:
- First I renamed all the jar files to .jarx: twitter4j-stream-3.0.3.jar -> twitter4j-stream-3.0.3.jarx, ...
This fixed the error, but when it tried to establish a connection, I got a 404 error:
(Twitter Stream consumer-1[Establishing connection])
[INFO - Twitter4j.internal.logging.SLF4JLogger.info(SLF4JLogger.java:83)] 404:
The URI requested is invalid or the resource requested, such as a user, does not exist.)
- After reading this page (https://twittercommunity.com/t/twitter-streaming-api-not-working-with-twitter4j-and-apache-flume/66612/11), I finally solved it by downloading a new version of twitter4j (there's a link on the page). Probably not the best solution, but it worked for me.
Source: https://stackoverflow.com/questions/19189979/cannot-run-flume-because-of-jar-conflict