Frequency analysis of Sina Weibo post elements (Word, Screen Name) in Python

CODE:

#!/usr/bin/python 
# -*- coding: utf-8 -*-

'''
Created on 2014-7-9
@author: guaguastd
@name: weiboFrequencyAnalysis.py
'''

if __name__ == '__main__':
    
    # log in and get an API client for the Sina Weibo API
    from sinaWeiboLogin import sinaWeiboLogin
    sinaWeiboApi = sinaWeiboLogin()
    
    # import sinaWeibo
    from sinaWeibo import extractWeiboEntities
    
    # import sinaWeiboStatuses
    from sinaWeiboStatuses import publicTimeline
    
    # import sinaWeiboFrequency
    from sinaWeiboFrequency import weiboFrequencyAnalysis
    
    # fetch the 5 newest posts from the public timeline
    weiboNum = 5
    statuses = publicTimeline(sinaWeiboApi, weiboNum)
    status_texts, screen_names, words = extractWeiboEntities(statuses)

    # run the frequency analysis once per entity type
    for label, data in (('Word', words),
                        ('Screen Name', screen_names)):
        weiboFrequencyAnalysis(label, data, weiboNum)
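
The script imports extractWeiboEntities from a module that is not shown in this post. As a rough idea of what such a helper could do, here is a minimal sketch. It assumes each status is a dict with a `text` field and a nested `user` dict holding `screen_name`, and that words are obtained by splitting the text on whitespace. The function name `extract_entities` and the sample data are hypothetical, not the author's code.

```python
def extract_entities(statuses):
    """Return (status_texts, screen_names, words) from a list of status dicts.

    Assumption: each status looks like
    {"text": "...", "user": {"screen_name": "..."}}.
    """
    # Raw text of every post.
    status_texts = [s.get("text", "") for s in statuses]
    # Screen name of every author, skipping statuses without user info.
    screen_names = [s["user"]["screen_name"] for s in statuses if "user" in s]
    # Whitespace-separated tokens across all posts (no real word segmentation).
    words = [w for text in status_texts for w in text.split()]
    return status_texts, screen_names, words


if __name__ == "__main__":
    # Hypothetical sample data, only for illustration.
    sample = [
        {"text": "hello weibo http://example.com", "user": {"screen_name": "alice"}},
        {"text": "hello again", "user": {"screen_name": "bob"}},
    ]
    texts, names, words = extract_entities(sample)
    print(names)
    print(words)
```

Splitting on whitespace treats a whole unspaced Chinese phrase or a URL as a single "word", which matches the kind of entries shown in the Word table of the result.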

RESULT:

+------------------------------------------+-------+
| Word                                     | Count |
+------------------------------------------+-------+
| http://t.cn/8snKY0S                      |     1 |
| [围观]CANNCI千姿百袋2014新款牛皮菱格女包 |     1 |
| 时尚潮流单肩包                           |     1 |
| 浪漫RI系「喜欢请赞                       |     1 |
| ✲✲✲✲✲✲                             |     1 |
+------------------------------------------+-------+
+--------------------+-------+
| Screen Name        | Count |
+--------------------+-------+
| 马傻强             |     1 |
| 手机用户2360148561 |     1 |
| 潮流爆款搭V        |     1 |
| star爱上泡面猫     |     1 |
| 美容潮搭健康       |     1 |
+--------------------+-------+
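
The weiboFrequencyAnalysis helper that prints these tables is also not shown. A table like this can be produced by counting items and printing the most common ones. The sketch below uses only the standard library (`collections.Counter` and `unicodedata`). The names `frequency_table`, `display_width`, and `pad` are hypothetical, and treating the third argument as a "top N" limit is an assumption about the original helper.

```python
from collections import Counter
import unicodedata


def display_width(text):
    """Approximate terminal width: wide/fullwidth East Asian characters count as 2."""
    return sum(2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
               for ch in text)


def pad(text, width):
    """Left-align text in a cell of the given display width."""
    return text + " " * (width - display_width(text))


def frequency_table(label, data, top_n):
    """Return a text table of the top_n most common items in data."""
    counts = Counter(data).most_common(top_n)
    label_w = max([display_width(label)] + [display_width(item) for item, _ in counts])
    count_w = max([len("Count")] + [len(str(n)) for _, n in counts])
    rule = "+" + "-" * (label_w + 2) + "+" + "-" * (count_w + 2) + "+"
    lines = [rule,
             "| " + pad(label, label_w) + " | " + "Count".rjust(count_w) + " |",
             rule]
    for item, n in counts:
        lines.append("| " + pad(item, label_w) + " | " + str(n).rjust(count_w) + " |")
    lines.append(rule)
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical sample data, only for illustration.
    sample = ["python", "weibo", "python", "api", "weibo", "python"]
    print(frequency_table("Word", sample, 5))
```

With only 5 posts, as in the run above, almost every word and screen name appears once, so the counts are all 1. Fetching more posts gives a more informative distribution.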