lin1000/TwitterPublicAPI

From the Public Twitter API to Social Network Analysis Experiments

Configurations

In this sample, twitter4j.properties contains only dummy data. Replace its contents with your own credentials, and make sure the .jar and the .properties file are in the same directory.
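
As a rough guide, a twitter4j.properties file usually holds the four OAuth values shown below. This is a sketch based on the standard Twitter4J property names, not a copy of the file in this repository; the values are placeholders to be replaced with your own credentials.

```properties
# Placeholder values only: substitute the keys issued for your own Twitter app.
debug=false
oauth.consumerKey=YOUR_CONSUMER_KEY
oauth.consumerSecret=YOUR_CONSUMER_SECRET
oauth.accessToken=YOUR_ACCESS_TOKEN
oauth.accessTokenSecret=YOUR_ACCESS_TOKEN_SECRET
```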

Feature List

  • (java) Connect to the Public Twitter API using your own keys and secrets

  • (java) Given a Twitter handle, find the list of its followers' handles

  • (java) Twitter API key resource control: manages concurrency and locking to maximize rate limit utilization

  • (java) Executor thread pool for submitting concurrent tasks

  • (python) Randomly sample accounts and output them as a csv file in the 01SamplingAccount folder

  • (python) Read through the full account list and output it as a csv file in the 01FullAccount folder

  • (python) Compose a gnip query rule from the accounts of interest that complies with the rule limitations

  • (python) Create a historical job that can be sent to gnip

  • (python) Generate csv files grouped by rule tags

  • (spark) Generate json/csv files grouped by rule tags (accelerates processing by parallelizing)

    spark-submit --master "local[*]" --executor-memory 2G --total-executor-cores 20 06GNIPDataGroupByRuleTag-Spark.py > 06GNIPDataGroupByRuleTag-Spark.log 2>&1
    
  • (spark) Generate json/csv files filtered by influencee account (accelerates processing by parallelizing)

  • (spark) Spark GraphX to analyze the social network of randomly sampled followers

  • (java8) CountTweets

    export MAVEN_OPTS="-ea"
    mvn exec:java@0002 -Dexec.args="./output/collect-follower-day4/modelpress.followers.json Scanner"
    
  • (java8) CountTweetsParaller : Uses a parallel stream to parse JSON objects

    export MAVEN_OPTS="-ea"
    mvn exec:java@0003 -Dexec.args="./output/collect-follower-day4/modelpress.followers.json Parallels"
    
  • (node v6) Mapbox visualization of followers' home locations
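
To illustrate the rule-composition feature listed above, here is a minimal sketch of packing account handles into query rules that each stay under a length limit. It is not the script from this repository. The function name `compose_rules`, the `from:` clause format, the `value`/`tag` rule shape, and the `max_rule_chars` default of 1024 are all assumptions made for illustration; check the actual limits that apply to your gnip product before relying on them.

```python
import json


def compose_rules(handles, tag, max_rule_chars=1024):
    """Pack handles into as few 'from:a OR from:b' rules as fit the limit.

    max_rule_chars is a placeholder limit (an assumption), not a documented value.
    """
    rules = []
    clauses = []   # clauses collected for the rule currently being built
    length = 0     # length of " OR ".join(clauses)

    for handle in handles:
        clause = "from:" + handle.lstrip("@")
        if len(clause) > max_rule_chars:
            raise ValueError("single clause exceeds the rule limit: " + clause)

        # Joining adds " OR " (4 characters) before every clause except the first.
        added = len(clause) + (4 if clauses else 0)
        if clauses and length + added > max_rule_chars:
            # The current rule is full: emit it and start a new one.
            rules.append({"value": " OR ".join(clauses), "tag": tag})
            clauses, length = [], 0
            added = len(clause)

        clauses.append(clause)
        length += added

    if clauses:
        rules.append({"value": " OR ".join(clauses), "tag": tag})
    return rules


if __name__ == "__main__":
    # Hypothetical handles and tag, with a tiny limit so the split is visible.
    sample = compose_rules(["@alice", "bob", "carol"], "sample", max_rule_chars=25)
    print(json.dumps({"rules": sample}, indent=2))
```

Because every rule carries the same tag, downstream steps that group output by rule tag can treat the split rules as one logical query.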

Utility

  • (java8) Utility class that lists directories and files recursively using streams. To handle checked exceptions in the stream chain, Throwables.propagate(e) from the Google Guava library is used.
    mvn exec:java@0004
    

GitBook

  • Adding GitBook Integration (experimental)

Languages

Java (Stream, Concurrency, Twitter API)
Python (Data Processing)
Spark (Data Processing)
Node V6 (Mapbox Visualization)

About

This is a place for Public Twitter API experiments
