All posts by Telvis

twitter mining: top tweets by follower count

We can find interesting tweets using the author's follower count and the tweet's timestamp. We store tweets in CouchDB and collect them with tweepy streaming. With these tools we can find the top N tweets per day. The code below uses the couchpy view server to write a view in Python. To set up couchpy, install it and then register it as a query server in /etc/couchdb/local.ini.

Install couchpy and couchdb-python with the following command.

pip install couchdb

Verify that couchpy is installed.

$ which couchpy
/usr/bin/couchpy

Edit /etc/couchdb/local.ini

[query_servers]
python=/usr/bin/couchpy
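A quick way to double-check the edit is to parse the ini text and look up the python entry. This is only a sketch: the helper name python_query_server is made up for illustration, and it reads the config from a string rather than from the real /etc/couchdb/local.ini.

```python
import configparser

def python_query_server(ini_text):
    """Return the command registered for the 'python' query server, or None."""
    # interpolation=None and strict=False keep the parser tolerant of
    # '%' characters and repeated keys that a real ini file may contain.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)
    if not parser.has_section("query_servers"):
        return None
    return parser.get("query_servers", "python", fallback=None)

# The same two lines shown above, used here as sample input.
sample = """
[query_servers]
python=/usr/bin/couchpy
"""

print(python_query_server(sample))
```

To check the real file, read its contents and pass the text to the same function.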

This is a simple view mapper that maps each tweet to a timestamp so we can query by start and end time.


import couchdb
from couchdb.design import ViewDefinition
import sys

server = couchdb.Server('http://localhost:5984')
db = sys.argv[1]
db = server[db]

def tweets_by_created_at(doc):
    if doc.get('created_at'):
        _date = doc['created_at']
    else:
        _date = 0 # Jan 1 1970
    
    if doc.get('user'):
        yield (_date, doc) 
        
view = ViewDefinition('index', 'daily_tweets', tweets_by_created_at, language='python')
view.sync(db)
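To see what the mapper emits without a running CouchDB, we can call the same function on a few plain dicts. The sample documents below are hypothetical and only carry the fields the mapper reads; the sketch assumes created_at is stored as epoch seconds, which is what the integer keys in the query imply.

```python
def tweets_by_created_at(doc):
    """Same logic as the view mapper above, run locally on plain dicts."""
    if doc.get('created_at'):
        _date = doc['created_at']
    else:
        _date = 0  # Jan 1 1970
    if doc.get('user'):
        yield (_date, doc)

# Hypothetical documents, trimmed to the fields the mapper reads.
sample_docs = [
    # 1330905600 is 2012-03-05 00:00:00 UTC
    {'_id': '1', 'created_at': 1330905600, 'user': {'screen_name': 'alice'}},
    {'_id': '2', 'user': {'screen_name': 'bob'}},  # no created_at: key 0
    {'_id': '3', 'created_at': 1330909200},        # no user: skipped
]

emitted = [pair for doc in sample_docs for pair in tweets_by_created_at(doc)]

# CouchDB returns view rows ordered by key, so sort the keys the same way.
keys = sorted(key for key, _ in emitted)
print(keys)
```

Documents without a user are dropped, and documents without a created_at land at key 0, far outside any realistic day range.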

The code below queries the view for all tweets within a date range. Then we sort in memory by the follower count.

import sys
import time
from datetime import datetime

import couchdb

def run(db, date, limit=10):
    """Query a couchdb view for tweets. Sort in memory by follower count.
    Return the top `limit` tweeters and their tweets."""
    print("Finding top %d tweeters" % limit)

    # startkey/endkey are epoch seconds covering the whole day (local time)
    dt = datetime.strptime(date, "%Y-%m-%d")
    stime = int(time.mktime(dt.timetuple()))
    etime = stime + 86400 - 1
    tweeters = {}
    tweets = {}
    for row in db.view('index/daily_tweets', startkey=stime, endkey=etime):
        status = row.value
        screen_name = status['user']['screen_name']
        followers_count = status['user']['followers_count']
        tweeters[screen_name] = int(followers_count)
        tweets.setdefault(screen_name, []).append(status['id_str'])

    # sort tweeters by follower count, highest first
    di = sorted(tweeters.items(), key=lambda x: x[1], reverse=True)
    out = {}
    # slicing avoids an IndexError when fewer than `limit` tweeters were found
    for screen_name, followers_count in di[:limit]:
        out[screen_name] = {}
        out[screen_name]['follower_count'] = followers_count
        out[screen_name]['tweets'] = {}
        for tweetid in tweets[screen_name]:
            orig_text = db[tweetid]['orig_text']
            out[screen_name]['tweets'][tweetid] = orig_text

    return out

server = couchdb.Server('http://localhost:5984')
dbname = sys.argv[1]  # database name from the command line, as in the view script
db = server[dbname]
date = '2012-03-05'
output = run(db, date)
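The date-range and ranking steps can also be tried without CouchDB. The sketch below is not the code from the repository: the helper names day_bounds and top_tweeters and the sample statuses are made up for illustration, and it swaps the full sort for heapq.nlargest, which only keeps the top entries.

```python
import heapq
import time
from datetime import datetime

def day_bounds(date):
    """Return (start, end) epoch seconds for a calendar day in local time."""
    dt = datetime.strptime(date, "%Y-%m-%d")
    start = int(time.mktime(dt.timetuple()))
    return start, start + 86400 - 1

def top_tweeters(statuses, limit=10):
    """Rank tweeters by follower count and collect their tweet ids."""
    followers = {}
    tweets = {}
    for status in statuses:
        name = status['user']['screen_name']
        followers[name] = int(status['user']['followers_count'])
        tweets.setdefault(name, []).append(status['id_str'])
    # nlargest copes with fewer than `limit` tweeters by returning them all
    top = heapq.nlargest(limit, followers.items(), key=lambda kv: kv[1])
    return [(name, count, tweets[name]) for name, count in top]

# Hypothetical statuses with only the fields used above.
sample = [
    {'id_str': '1', 'user': {'screen_name': 'alice', 'followers_count': 500}},
    {'id_str': '2', 'user': {'screen_name': 'bob', 'followers_count': 1200}},
    {'id_str': '3', 'user': {'screen_name': 'alice', 'followers_count': 500}},
    {'id_str': '4', 'user': {'screen_name': 'carol', 'followers_count': 30}},
]

start, end = day_bounds('2012-03-05')
print(end - start)
print(top_tweeters(sample, limit=2))
```

With real data, the statuses would be the row.value entries returned by the view query between start and end.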

Find the complete codebase on github at: https://github.com/telvis07/twitter_mining

twitter_mining: oauth with tweepy

Tweepy provides a module to authenticate with twitter using OAuth. The example below retrieves the auth credentials from a config file and creates a filter stream for the terms 'technical' and 'elvis'. You can get a CONSUMER_KEY and CONSUMER_SECRET by creating a twitter dev account at http://dev.twitter.com/apps/new. The access token and access token secret can be found under the "My Applications" link in your account.

import tweepy
import ConfigParser
import os

class Listener(tweepy.StreamListener):
    def on_status(self, status):
        print "screen_name='%s' tweet='%s'"%(status.author.screen_name, status.text)

def login():
    config = ConfigParser.RawConfigParser()
    fn = os.path.join(os.environ['HOME'],'conf', 'twitter_mining.cfg')
    config.read(fn)

    CONSUMER_KEY = config.get('auth','CONSUMER_KEY')
    CONSUMER_SECRET = config.get('auth','CONSUMER_SECRET')
    ACCESS_TOKEN = config.get('auth','ACCESS_TOKEN')
    ACCESS_TOKEN_SECRET = config.get('auth','ACCESS_TOKEN_SECRET')

    auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
    auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
    return auth

try:
    auth = login()
    streaming_api = tweepy.streaming.Stream(auth, Listener(), timeout=60)
    streaming_api.filter(follow=None, track=['technical','elvis'])
except KeyboardInterrupt:
    print "got keyboardinterrupt"
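The login() function expects an [auth] section with four keys in ~/conf/twitter_mining.cfg. Here is a rough sketch of that layout and a check that all four keys are present. The values are placeholders, the helper name load_auth is made up for illustration, and the sketch uses the Python 3 module name configparser rather than the Python 2 ConfigParser used above.

```python
import configparser

# Placeholder values; the real ones come from your twitter dev account.
SAMPLE_CFG = """
[auth]
CONSUMER_KEY = your-consumer-key
CONSUMER_SECRET = your-consumer-secret
ACCESS_TOKEN = your-access-token
ACCESS_TOKEN_SECRET = your-access-token-secret
"""

REQUIRED = ('CONSUMER_KEY', 'CONSUMER_SECRET',
            'ACCESS_TOKEN', 'ACCESS_TOKEN_SECRET')

def load_auth(cfg_text):
    """Parse config text and return the four credentials as a dict."""
    config = configparser.RawConfigParser()
    config.read_string(cfg_text)
    # has_option returns False when the section itself is missing too
    missing = [key for key in REQUIRED if not config.has_option('auth', key)]
    if missing:
        raise ValueError("missing auth settings: %s" % ", ".join(missing))
    return {key: config.get('auth', key) for key in REQUIRED}

creds = load_auth(SAMPLE_CFG)
print(sorted(creds))
```

Failing early on a missing key gives a clearer error than a failed OAuth handshake later.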

Find the complete codebase on github at: https://github.com/telvis07/twitter_mining

Giants and Grasshoppers

We are "well able" to finish what we've started. Don't let anyone tell you different. "Numbers 13:30-33 KJV: 30 And Caleb stilled the people before Moses, and said, Let us go up at once, and possess it; for we are WELL ABLE to overcome it. 31 But the men who had gone up with [Caleb]...spread a bad report... 33 And there we saw the giants, ... and we were in our own sight as grasshoppers, and so we were in their sight."

new project: twitter mining

I've started a new twitter mining project in an effort to blog regularly and play with interesting tech. The initial code is on github at https://github.com/telvis07/twitter_mining. The code will be updated often and this blog will be updated weekly (hopefully). The goal of this project is to develop novel ways to find the MOST meaningful tweets and tweeters over a given interval. I will blog about the current code and any additions I make.

This project is inspired by my awesome wife Sharon who wants info relevant to her site MidtownSweets. This is also inspired by the book Mining The Social Web by Matthew A. Russell - which is a great book.

New wine

I hope we try new things to help us grow. It's hard to grow with old restricted habits.

Matthew 9:17 NIV "Neither do people pour new wine into old wineskins. If they do, the skins will burst; the wine will run out and the wineskins will be ruined. No, they pour new wine into new wineskins, and both are preserved."

Faith and Forgiveness

I hope we can increase our faith and capacity to forgive today. Mark 11:24, 25 NIV

"Therefore I tell you, whatever you ask for in prayer, believe that you have received it, and it will be yours. And when you stand praying, if you hold anything against anyone, forgive them, so that your Father in heaven may forgive you your sins."