The Social Graph: A Conversation with Marc Smith from Microsoft Research

You have probably been hearing the term “social graph” a lot in recent weeks.

Mark Zuckerberg of Facebook describes the concept in a recent Business Week article:

As he (Zuckerberg) describes it, this is a mathematical construct that maps the real-life connections between every human on the planet. Each of us is a node radiating links to the people we know. “We don’t own the social graph,” he says. “The social graph is this thing that exists in the world, and it always has and it always will. It’s really most natural for people to communicate through it, because it’s with the people around you, friends and business connections or whatever. What [Facebook] needed to do was construct as accurate of a model as possible of the way the social graph looks in the world. So once Facebook knows who you care about, you can upload a photo album and we can send it to all those people automatically.”

Since the Business Week interview, it seems (at least to me) like the concept of the “Social Graph” has taken on a life of its own. The definitions of social graph (at least from what I’ve seen) range from the mathematical construct Zuckerberg describes to a mapping of relationships in a particular network. Others have suggested that the social graph not only maps relationships but also contains activity streams, semantic data, and more.

I’m trying to get my head around this, as I think many are. I tried to think of the smartest person I know who regularly studies social network theory, and Marc Smith from Microsoft Research immediately came to mind. Marc was kind enough to answer my questions via email, and a transcript of that conversation follows.

Q. What is your definition of the “Social Graph”? Can this concept be discussed outside the context of social network theory?

I do not think you can get away from the ideas in social network theory and still make any sense of the concept “social graph”.

Computer Mediated Communication systems are social networks.

“The Social Graph” just means that, since Jacob Moreno’s 1934 work on sociograms, we have recognized that [1] all entities are tied to other entities through relationships and [2] all relationships can be represented as directed graphs, node lists, and matrices, and that each of these data structures is amenable to further analysis. The current fad is just the ever-growing awareness of these facts combined with a very real change in the costs of authoring, collecting, and analyzing these structures in digital media. In a social network, nodes are people, and the edges (the lines that connect the nodes) are relationships.
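To make the three representations concrete, here is a minimal Python sketch of one small directed graph stored as an edge list, a per-node list, and a matrix. The names and the "replied to" relationship are made-up toy data, and the helper functions are illustrative only; they are not taken from any of the tools mentioned in this interview.

```python
# Hypothetical toy data: each pair (a, b) means "a replied to b".
edges = [("ann", "bob"), ("ann", "cat"), ("bob", "ann"), ("cat", "ann")]


def to_adjacency_list(edge_list):
    """Map every person to the list of people they point at."""
    adjacency = {}
    for source, target in edge_list:
        adjacency.setdefault(source, []).append(target)
        adjacency.setdefault(target, [])  # keep people who never reply
    return adjacency


def to_matrix(edge_list):
    """Return (sorted node names, 0/1 matrix); row = source, column = target."""
    nodes = sorted({name for edge in edge_list for name in edge})
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target in edge_list:
        matrix[index[source]][index[target]] = 1
    return nodes, matrix


adjacency = to_adjacency_list(edges)
nodes, matrix = to_matrix(edges)
```

All three structures hold the same information. Which one is convenient depends on the analysis: the edge list is easy to collect, the per-node list is easy to walk, and the matrix suits linear-algebra style measures.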

Our social network research focuses on relationships in older forms of computer-mediated social network services like email lists, newsgroups, web boards and other repositories of threaded conversation. We found interesting “roles” like “answer person” (seen below).

We documented this “answer person” role in a paper we recently published in the Journal of Social Structure, “Visualizing the Signatures of Social Roles in Online Discussion Groups”, which is available from:

Some of the tools we used to do this study along with others are available from our website (

Our research points the way from “page rank” to “people rank” by generating “social accounting metadata”. These measures of author behavior capture the structure of conversations and populations of community participants; the results can provide useful relevance ranking features for improving community search. Eric Brill published on the use of Netscan metadata as a feature of relevance ranking algorithms:

W. Xi, J. Lind and E. Brill, “Learning Effective Ranking Functions for Newsgroup Search,” SIGIR’04, Sheffield, UK, July 2004.

We have published a series of papers in which we demonstrate the value of social accounting metadata to identify authors who display behaviors that are clearly associated with a particular role or function, such as the relatively few “answer people” who provide much of the support in online discussions.

Tammara Turner, Marc Smith, Danyel Fisher and Howard Ted Welser, “Picturing Usenet: Mapping Computer-Mediated Collective Action,” Journal of Computer-Mediated Communication, September 2005.
Fernanda B. Viégas and Marc Smith, “Newsgroup Crowds and AuthorLines: Visualizing the Activity of Individuals in Conversational Cyberspaces,” Proceedings of the Hawaii International Conference on System Sciences (HICSS), 2004.

‘Answer people,’ the folks who contribute much of the value in the Internet, are a small minority of all online users. Our paper reports that less than 2% of authors in Usenet newsgroups are likely to be the helpful ‘answer person’ type: authors who reply to many other people with brief replies. Information visualizations highlight the difference between these helpful folks and other types of contributors. Of course, the remarkable thing is that so few can provide so much to so many.
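As a rough illustration of the "replies to many other people with brief replies" signature, the sketch below flags authors from a list of reply records. This is not the method used in the papers cited above; the record format, the function name `find_answer_people`, and both thresholds are assumptions chosen only to make the idea concrete.

```python
from collections import defaultdict


def find_answer_people(replies, min_people=3, max_avg_words=50):
    """Flag authors who reply to many distinct people with short messages.

    replies: iterable of (author, recipient, word_count) tuples.
    min_people and max_avg_words are arbitrary illustrative thresholds.
    """
    recipients = defaultdict(set)
    word_counts = defaultdict(list)
    for author, recipient, words in replies:
        if author == recipient:
            continue  # ignore self-replies
        recipients[author].add(recipient)
        word_counts[author].append(words)

    flagged = []
    for author, people in recipients.items():
        average = sum(word_counts[author]) / len(word_counts[author])
        if len(people) >= min_people and average <= max_avg_words:
            flagged.append(author)
    return sorted(flagged)


# Hypothetical sample: dana answers many people briefly, eli writes long
# replies, and fay has replied to only one person.
sample = [
    ("dana", "a", 20), ("dana", "b", 30), ("dana", "c", 25),
    ("eli", "a", 400), ("eli", "b", 350), ("eli", "c", 500),
    ("fay", "a", 10),
]
answer_people = find_answer_people(sample)
```

On real newsgroup data the thresholds would need tuning, and the published work relies on richer structural signatures than these two counts.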

Q: Does the concept of the social graph deserve the media attention it’s been getting of late?
Yes. Yes. A critical social structure is suddenly becoming very visible and computable in ways that are novel. I am impressed!

Q: Besides Facebook, what other sites or companies are doing interesting things with the “social graph”?
Everything that is about bringing people into contact with people creates a social graph, so all sorts of things are in this space. Email is about social graphs; it just often lacks the UI for the data structures it generates. That is changing, of course. Now there are applications that natively focus on the directed graph as their data structure. That is new as well. For example, have you noticed that most email clients let you create contact records for each person you know, but almost no email clients allow those contacts to have relationships to one another? Applications that generate one data structure do not always have mechanisms to read or analyze that data structure, or only gain those features as they mature.
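The point that email already generates a directed graph can be sketched with Python's standard `email` module: every message contributes an edge from the sender to each recipient. The function name `edges_from_messages` and the sample messages are hypothetical, and real mailboxes would need more care with aliases, mailing lists, and malformed headers.

```python
from collections import Counter
from email import message_from_string
from email.utils import getaddresses, parseaddr


def edges_from_messages(raw_messages):
    """Count directed sender -> recipient edges across raw email messages."""
    weights = Counter()
    for raw in raw_messages:
        message = message_from_string(raw)
        sender = parseaddr(message.get("From", ""))[1].lower()
        headers = message.get_all("To", []) + message.get_all("Cc", [])
        for _name, address in getaddresses(headers):
            address = address.lower()
            if sender and address and address != sender:
                weights[(sender, address)] += 1
    return weights


# Hypothetical messages using reserved example.com addresses.
messages = [
    "From: Ann <ann@example.com>\n"
    "To: Bob <bob@example.com>, cat@example.com\n"
    "Subject: hi\n\nbody",
    "From: ann@example.com\n"
    "To: bob@example.com\n"
    "Cc: Ann <ann@example.com>\n\nbody",
]
graph = edges_from_messages(messages)
```

The resulting counter is a weighted edge list, which is the kind of data structure an email client produces implicitly but rarely exposes in its interface.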

Marc’s suggestions for further reading & listening:
Audio: Listen to Marc Smith’s portion of the OCLC Symposium

