Sunday, January 7, 2018

Get the total number of tweets on Twitter using Apache Spark (Scala)


I am new to Apache Spark and want to find the total number of tweets posted worldwide on Twitter in every 10-second span of time. I wrote a small snippet that counts hashtags in tweets, but what I need is the total count of tweets.

Please help me resolve this issue.

import java.io._
import org.apache.spark.streaming.{Seconds, StreamingContext}
import StreamingContext._
import org.apache.spark.SparkContext._
import org.apache.spark.streaming.twitter._

object TwitterPopularTags {
  def main(args: Array[String]) {
    val (master, filters) = (args(0), args.slice(5, args.length))

    // Twitter authentication credentials
    System.setProperty("twitter4j.oauth.consumerKey", "xxxx")
    System.setProperty("twitter4j.oauth.consumerSecret", "xxxx")
    System.setProperty("twitter4j.oauth.accessToken", "xxxx")
    System.setProperty("twitter4j.oauth.accessTokenSecret", "xxxx")

    // Streaming context with a 10-second batch interval
    val ssc = new StreamingContext(master, "TwitterPopularTags", Seconds(10),
      System.getenv("SPARK_HOME"), StreamingContext.jarOfClass(this.getClass))

    val tweets = TwitterUtils.createStream(ssc, None)
    val statuses = tweets.map(status => status.getText())
    val words = statuses.flatMap(status => status.split(" "))
    val hashtags = words.filter(word => word.startsWith("#"))

    // Count each hashtag over a 100-second window, sliding every 10 seconds
    val tagCounts = hashtags.window(Seconds(100), Seconds(10)).countByValue()
    tagCounts.print()

    // Nothing runs until the context is started
    ssc.start()
    ssc.awaitTermination()
  }
}

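The tweets DStream contains the tweets received. It is a DStream rather than a single RDD: it yields one RDD per 10-second batch.

tweets.count() gives the number of tweets in each batch; call print() on the result to display it. Note that this counts the tweets the stream delivers to Spark, which may be only a sample of all tweets posted on Twitter.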
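
As a minimal sketch of the suggestion above, the program below prints the number of tweets received in each 10-second batch. It assumes the spark-streaming-twitter artifact is on the classpath and that the four twitter4j.oauth.* system properties are set as in the question's code. The object name TwitterTweetCount and the local[2] master are illustrative choices, not part of the original question.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.twitter.TwitterUtils

// Hypothetical object name, chosen for this example only.
object TwitterTweetCount {
  def main(args: Array[String]): Unit = {
    // Assumption: twitter4j.oauth.consumerKey, consumerSecret, accessToken and
    // accessTokenSecret are already set as system properties.
    val conf = new SparkConf()
      .setAppName("TwitterTweetCount")
      .setMaster("local[2]") // assumption: local run; one thread for the receiver, one for processing

    // 10-second batch interval, matching the question
    val ssc = new StreamingContext(conf, Seconds(10))

    // No filters: take whatever the Twitter stream delivers
    val tweets = TwitterUtils.createStream(ssc, None)

    // count() returns a DStream[Long] holding one value per batch:
    // the number of tweets received in that batch
    tweets.count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Each batch prints a single number. To count over a longer span, the same count() call can be applied to tweets.window(Seconds(100), Seconds(10)), as the question already does for hashtags.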

Tags: scala, twitter, apache-spark
