From Aicip
Revision as of 11:13, 15 July 2012 by Wzb.zju (talk | contribs)



Existing social networking services recommend potential friends to users based on their social graphs and web actions. This mechanism, however, may not be the most appropriate way to reflect a user’s preferences on friend selection in real life. We present Friendbook, a semantic-based friend recommendation system for social networks. By exploiting recent sociology findings, Friendbook identifies and recommends users with similar life styles. Specifically, taking advantage of developments in text mining, Friendbook models a user’s daily life as life documents that record the frequency of each activity, or bag-of-activity. Friendbook then extracts the life style distributions of users from their life documents using the Latent Dirichlet Allocation (LDA) algorithm. Based on these distributions, Friendbook constructs a friend-matching graph that represents users’ life style similarities. When users send queries to Friendbook for friend recommendations, the Friendbook server analyzes the friend-matching graph, ranks users according to their impact, and sends a list of potential friends in response to the query. To further improve the accuracy of recommendations, Friendbook integrates a feedback mechanism that takes inputs from users and dynamically adjusts internal parameters to optimize online performance. We have implemented Friendbook on Android-based Nexus S mobile phones and evaluated its performance using data collected from 8 users over a period of three months. The results show that the recommendations accurately reflect the preferences of users in choosing friends.


Twenty years ago, people typically made friends with others who lived or worked close to them, such as neighbors or colleagues. We call friends made in this traditional fashion G-friends, which stands for geographical location based friends, because these friendships are influenced by the geographical distance between people. With the rapid advances in social networks, services such as Facebook, Twitter and Google+ have provided us with revolutionary ways of making friends. According to Facebook statistics, a user has an average of 130 friends, perhaps more than at any other time in history [2].

One challenge with existing social networking services is how to recommend a good friend to a user. Most of them rely on pre-existing user relationships to pick friend candidates. For example, Facebook relies on a social link analysis among those who already share common friends and recommends symmetrical users as potential friends. Unfortunately, this approach may not be the most appropriate according to recent sociology findings [24, 23, 11, 21]. According to these studies, the rules that group people together include: 1) habits or life style; 2) attitudes; 3) tastes; 4) moral standards; 5) economic level; 6) people they already know. Apparently, rule #3 and rule #6 are the mainstream factors considered by existing recommendation systems. Rule #1, although probably the most intuitive, is not widely used because users’ life styles are difficult, if not impossible, to capture through web actions. Rather, life styles are usually closely correlated with daily routines and activities. Therefore, if we can gather information on users’ daily routines and activities, we can exploit rule #1 and recommend friends to people based on their similar life styles.

In our everyday lives, we may perform hundreds of activities, which form meaningful sequences that shape our lives. In this paper, we use the word activity to refer specifically to actions that last on the order of seconds, such as “sitting”, “walking”, or “typing”, while we use the phrase life style to refer to higher-level abstractions of daily lives, such as “office work” or “shopping”. For instance, the “shopping” life style mostly consists of the “walking” activity, but may also contain the “standing” or the “sitting” activities. To model daily lives properly, we draw an analogy between people’s lives and documents. Previous research on probabilistic topic models in text mining has treated documents as mixtures of topics, and topics as mixtures of words. Inspired by this, we treat our daily lives (or life documents) as a mixture of life styles (or topics), and each life style as a mixture of activities (or words). In essence, we represent daily lives with “life documents”, whose semantic meanings are reflected through their topics, which are life styles in our study. Just as words serve as the basis of documents, people’s activities naturally serve as the primitive vocabulary of these life documents.
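The document analogy can be made concrete with a small sketch. The Python below is illustrative only: the activity vocabulary, the two life styles, and every probability are made-up values, and the function names are hypothetical rather than part of Friendbook. It shows a day turned into a bag-of-activity count vector, and the topic-model identity p(activity | document) = Σ_k p(activity | life style k) · p(life style k | document).

```python
from collections import Counter

# Hypothetical activity vocabulary (the "words" of a life document).
ACTIVITIES = ["sitting", "walking", "standing", "typing"]

def bag_of_activity(sequence):
    """Count how often each activity occurs in a recognized activity sequence."""
    counts = Counter(sequence)
    return [counts.get(a, 0) for a in ACTIVITIES]

def activity_distribution(lifestyle_mix, lifestyle_activity):
    """Mix per-life-style activity distributions by the document's life style weights.

    p(activity | document) = sum_k p(activity | life style k) * p(life style k | document)
    """
    return [
        sum(weight * lifestyle_activity[k][i] for k, weight in lifestyle_mix.items())
        for i in range(len(ACTIVITIES))
    ]

# Made-up numbers: each life style is a distribution over ACTIVITIES.
LIFESTYLE_ACTIVITY = {
    "office work": [0.5, 0.1, 0.0, 0.4],
    "shopping":    [0.1, 0.7, 0.2, 0.0],
}

day = ["sitting", "typing", "typing", "walking", "sitting", "walking", "standing"]
doc = bag_of_activity(day)                     # life document as activity counts
mix = {"office work": 0.5, "shopping": 0.5}    # assumed life style distribution
p = activity_distribution(mix, LIFESTYLE_ACTIVITY)
```

In a topic model such as LDA, both the per-life-style activity distributions and the per-document life style mixture would be inferred from the count vectors rather than supplied by hand as they are here.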

Our proposed solution is also motivated by recent advances in smartphones, which have become increasingly popular in people’s lives. These smartphones (e.g., iPhone or Android-based smartphones) are equipped with a rich set of embedded sensors, such as GPS, accelerometer, microphone, gyroscope, and camera. Thus, a smartphone is no longer simply a communication device, but also a powerful platform for sensing environmental reality, from which we can extract rich context-aware and content-aware information. From this perspective, smartphones serve as the ideal platform for sensing daily routines, from which people’s life styles can be discovered.

In spite of the powerful sensing capabilities of smartphones, there are still multiple challenges in extracting users’ life styles and recommending potential friends based on their similarities. First, how can life styles be discovered automatically and accurately from noisy and heterogeneous sensor data? Second, how should the similarity of users in terms of life styles be measured? Third, who should be recommended to the user among all the friend candidates? To address these challenges, in this paper, we present Friendbook, a semantic-based friend recommendation system based on sensor-rich smartphones. The contributions of this work are summarized as follows:

• To the best of our knowledge, Friendbook is the first friend recommendation system exploiting a user’s life style information discovered from smartphone sensors.
• Inspired by achievements in the field of text mining, we model the daily life of users as life documents and use the probabilistic topic model to extract life style information of users.
• We propose a unique similarity metric to capture the “just-right” life style relation between users and construct accurate friend-matching graphs.
• We propose an efficient ranking algorithm based on users’ life styles, which considers not only the structure of friend-matching graphs, but also each user’s personality attributes.
• We integrate a linear feedback mechanism that exploits the user’s feedback to improve recommendation accuracy.
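
The graph-and-ranking idea in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not Friendbook's actual method: the text here does not define the similarity metric or the ranking algorithm, so cosine similarity, a fixed threshold, and a weighted PageRank-style iteration are stand-ins chosen for the example. The user names, life style vectors, and function names are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two life style distributions (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def build_graph(lifestyles, threshold):
    """Weighted undirected graph linking users whose similarity >= threshold."""
    users = list(lifestyles)
    graph = {u: {} for u in users}
    for i, u in enumerate(users):
        for v in users[i + 1:]:
            s = cosine(lifestyles[u], lifestyles[v])
            if s >= threshold:
                graph[u][v] = s
                graph[v][u] = s
    return graph

def rank(graph, damping=0.85, iterations=50):
    """PageRank-style power iteration where edge weights split each user's score."""
    n = len(graph)
    score = {u: 1.0 / n for u in graph}
    for _ in range(iterations):
        new = {}
        for u in graph:
            incoming = sum(
                score[v] * w / sum(graph[v].values())
                for v, w in graph[u].items()
            )
            new[u] = (1 - damping) / n + damping * incoming
        score = new
    return score

def recommend(graph, score, user, k):
    """Return up to k graph neighbours of `user`, highest score first."""
    return sorted(graph[user], key=lambda v: score[v], reverse=True)[:k]

# Made-up life style distributions over three hypothetical life styles.
lifestyles = {
    "alice": [0.7, 0.2, 0.1],
    "bob":   [0.6, 0.3, 0.1],
    "carol": [0.1, 0.1, 0.8],
    "dave":  [0.5, 0.4, 0.1],
}
graph = build_graph(lifestyles, threshold=0.9)
scores = rank(graph)
candidates = recommend(graph, scores, "alice", k=2)
```

With these made-up vectors, carol shares no edge with anyone, so she is never recommended to alice. A feedback mechanism like the one in the last bullet could, for instance, adjust `threshold` based on user responses, though the adjustment rule is not specified here.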


Both large-scale simulations and small-scale experiments are conducted to evaluate the performance of Friendbook. In the real-world experiments, eight volunteers from different professions contributed their data to evaluate Friendbook.

Application & Dataset Release

The Friendbook Client Apk:
The Raw Data Collection Client Apk:
The preload sample data set (small):


Wang, Z., Taylor, C.E., Cao, Q., Qi, H., and Wang, Z. "Friendbook: privacy preserving friend matching based on shared interests". In Proceedings of the 9th ACM Conference on Embedded Networked Sensor Systems, pages 397–398, Seattle, WA. [PDF]


[1] Amazon.

[2] Facebook statistics.

[3] Netflix.

[4] Rotten tomatoes.

[5] L. Bao and S. S. Intille. Activity Recognition from User-Annotated Acceleration Data. Pervasive Computing, pages 1–17, 2004.

[6] J. Biagioni, T. Gerlich, T. Merrifield, and J. Eriksson. EasyTracker: Automatic Transit Tracking, Mapping, and Arrival Time Prediction Using Smartphones. In Proc. of SenSys, pages 68–81, 2011.

[7] D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet Allocation. Journal of Machine Learning Research, 3:993–1022, 2003.

[8] J. Dean and S. Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. In Proc. of OSDI, pages 137–150, 2004.

[9] N. Eagle and A. S. Pentland. Reality Mining: Sensing Complex Social Systems. Personal Ubiquitous Computing, 10(4):255–268, March 2006.

[10] K. Farrahi and D. G. Perez. Discovering Routines from Large-scale Human Locations using Probabilistic Topic Models. ACM Transactions on Intelligent Systems and Technology (TIST), 2(1), 2011.

[11] A. Giddens. Modernity and Self-identity: Self and Society in the late Modern Age. Stanford Univ Pr, 1991.

[12] T. Huynh, M. Fritz, and B. Schiele. Discovery of Activity Patterns using Topic Models. In Proc. of UbiComp, 2008.

[13] M. Keally, G. Zhou, G. Xing, J. Wu, and A. Pyles. PBN: Towards Practical Activity Recognition Using Smartphone-Based Body Sensor Networks. In Proc. of SenSys, pages 246–259, 2011.

[14] N. Kern, B. Schiele, and A. Schmidt. Multi-sensor Activity Context Detection for Wearable Computing. Ambient Intelligence, pages 220–232, 2003.

[15] J. Lester, T. Choudhury, N. Kern, G. Borriello, and B. Hannaford. A Hybrid Discriminative/Generative Approach for Modeling Human Activities. In Proc. of IJCAI, pages 766–772, 2005.

[16] Q. Li, J. A. Stankovic, M. A. Hanson, A. T. Barth, J. Lach, and G. Zhou. Accurate, Fast Fall Detection Using Gyroscopes and Accelerometer-Derived Posture Information. In Proc. of BSN, pages 138–143, 2009.

[17] E. Miluzzo, C. T. Cornelius, A. Ramaswamy, T. Choudhury, Z. Liu, and A. T. Campbell. Darwin Phones: the Evolution of Sensing and Inference on Mobile Phones. In Proc. of MobiSys, pages 5–20, 2010.

[18] E. Miluzzo, N. D. Lane, S. B. Eisenman, and A. T. Campbell. CenceMe - Injecting Sensing Presence into Social Networking Applications. In Proc. of EuroSSC, pages 1–28, October 2007.

[19] L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank Citation Ranking: Bringing Order to the Web. Technical Report, Stanford InfoLab, 1999.

[20] S. Reddy, M. Mun, J. Burke, D. Estrin, M. Hansen, and M. Srivastava. Using Mobile Phones to Determine Transportation Modes. ACM Transactions on Sensor Networks (TOSN), 6(2):13, 2010.

[21] I. Røpke. The Dynamics of Willingness to Consume. Ecological Economics, 28(3):399–420, 1999.

[22] J. Shafer, S. Rixner, and A. L. Cox. The Hadoop Distributed Filesystem: Balancing Portability and Performance. In Proc. of ISPASS, pages 122–133, 2010.

[23] G. Spaargaren and B. Van Vliet. Lifestyles, Consumption and the Environment: The Ecological Modernization of Domestic Consumption. Environmental Politics, 9(1):50–76, 2000.

[24] M. Tomlinson. Lifestyle and Social Class. European Sociological Review, 19(1):97–111, 2003.

[25] Y. Zheng, Y. Chen, Q. Li, X. Xie, and W.-Y. Ma. Understanding Transportation Modes Based on GPS Data for Web Applications. ACM Transactions on the Web (TWEB), 4(1):1–36, 2010.