How does information spread? How do you encourage its spread?
These are fundamental business questions. If you introduce a new product or service, how will customer word-of-mouth travel? And they are questions of equal importance for policymakers and nongovernmental organizations. How do you get entrepreneurs in small villages of developing countries involved with microfinance? Or how do you best spread HIV awareness among homeless youth?
Convention and intuition point to one solution: Find people who hold the most influence, typically those who sit at the center of a social network — the hub in a wheel — and “seed” them with the new information. From there, the idea will efficiently reach new ears through word-of-mouth.
Unfortunately, finding these hubs can be a lengthy and expensive process. Picking the five best seeds in a 200-person network means checking roughly 2.5 billion possible combinations. Now consider a network of 1,000 people, or 1 million people.
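To make the arithmetic concrete, the short Python sketch below counts the distinct five-person seed sets in networks of different sizes. It assumes the search is an exhaustive comparison of every possible seed set; the function name `seed_set_count` is ours, introduced only for illustration.

```python
from math import comb


def seed_set_count(network_size: int, num_seeds: int) -> int:
    """Count the distinct seed sets of size num_seeds in a network of network_size people."""
    return comb(network_size, num_seeds)


# Assumption: "checking" means evaluating every possible seed set once.
for n in (200, 1_000, 1_000_000):
    print(f"{n:>9,} people: {seed_set_count(n, 5):.3e} possible 5-seed sets")
```

For 200 people the count is 2,535,650,040, which matches the 2.5 billion figure. For 1,000 people it is already above 8 trillion.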
Does a simpler approach to spreading information exist?
While tackling this question, a team of Stanford researchers found a remarkable result: Simply seeding a few more people at random avoids the challenge of mapping a network’s contours and can spread information in a way that is essentially indistinguishable from cases involving careful analysis; seeding seven people randomly may result in roughly the same reach as seeding five people optimally. (The results are available in their online working paper, “Just a Few Seeds More: Value of Network Information for Diffusion.”)
“Network information can be super expensive to collect, and finding precisely the right people to help something go viral is unpredictable,” says Mohammad Akbarpour, an assistant professor of economics at Stanford Graduate School of Business and one of the paper’s authors. “You might be better just ignoring the network altogether and seeding a few more people.”
When they set out on this project, Akbarpour, along with Suraj Malladi, a PhD student in economics at Stanford GSB, and Amin Saberi, an associate professor of management science and engineering at Stanford, knew from random graph theory that random seeding might perform well in getting a piece of information to go viral. Curious about how it compared to targeted seeding, they built a model and ran it alongside three past experiments from development economics that used deliberate seeding methods.
“These earlier studies describe how, from a statistical perspective, central individuals in a network help diffuse information, but they don’t necessarily tell you whether this is economically meaningful,” Akbarpour says.
“They also often assume that you can perfectly observe the social network of who talks to whom about some particular piece of information. But suppose you have some noise in the data, which is always the case.”
When the authors compared the results of their model, which used random seeding, with results that relied on careful network analysis, they found that random seeding with one to three additional seeds performed nearly as well as targeted seeding in both the speed and the extent of diffusion. In some cases, random seeding performed even better.
Deliberate seeding efforts rely on the degree to which somebody sits at the center of a network. They look for the people who are most highly connected and thus best positioned to spread information. But this approach can create redundancy that leads to rapidly diminishing returns.
“The interesting thing is that if you use an algorithm that targets people with the most friends, then you are going to pick people who are likely connected to the core of the network,” Saberi says. “And once you’ve talked to a few of these people, the next one will not be as valuable, since you’ve already saturated the core.”
In the meantime, you’ve ignored a bunch of so-called “small communities” — satellite networks on the periphery that are loosely connected to the main network. “This is not to recommend random seeding as a universal policy,” Saberi says, “but to show that central individuals do not always maximize diffusion.”
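The saturation effect Saberi describes can be explored with a toy simulation. The Python sketch below is not the authors' model. It assumes a contrived network (a fully connected core of 20 people plus ten five-person communities, each tied to the core by a single link) and a simple spreading rule in which each newly informed person passes the news to each contact with probability 0.2. All function names and parameter values are hypothetical choices made for illustration.

```python
import random
from itertools import combinations


def build_toy_network(core_size=20, n_satellites=10, satellite_size=5):
    """Return an adjacency dict: a dense core plus small cliques, each linked to the core by one edge."""
    adj = {}

    def add_edge(u, v):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    core = list(range(core_size))
    for u, v in combinations(core, 2):
        add_edge(u, v)
    next_id = core_size
    for s in range(n_satellites):
        members = list(range(next_id, next_id + satellite_size))
        next_id += satellite_size
        for u, v in combinations(members, 2):
            add_edge(u, v)
        # Single bridge: the community is only loosely attached to the core.
        add_edge(members[0], core[s % core_size])
    return adj


def cascade_reach(adj, seeds, p, rng):
    """Run one cascade: each newly informed person gets one chance to inform each contact with probability p."""
    informed = set(seeds)
    frontier = list(informed)
    while frontier:
        newly_informed = []
        for u in frontier:
            for v in adj[u]:
                if v not in informed and rng.random() < p:
                    informed.add(v)
                    newly_informed.append(v)
        frontier = newly_informed
    return len(informed)


def top_degree_seeds(adj, k):
    """Pick the k people with the most contacts (a simple stand-in for targeted seeding)."""
    return sorted(adj, key=lambda node: len(adj[node]), reverse=True)[:k]


def average_reach(adj, pick_seeds, p=0.2, runs=500, seed=0):
    """Average the number of people informed over many simulated cascades."""
    rng = random.Random(seed)
    total = sum(cascade_reach(adj, pick_seeds(rng), p, rng) for _ in range(runs))
    return total / runs


adj = build_toy_network()
targeted = average_reach(adj, lambda rng: top_degree_seeds(adj, 5))
randomised = average_reach(adj, lambda rng: rng.sample(sorted(adj), 7))
print(f"5 highest-degree seeds: {targeted:.1f} people reached on average")
print(f"7 random seeds:         {randomised:.1f} people reached on average")
```

In this layout every highest-degree person sits in the core, so the targeted seeds overlap heavily in whom they can reach, while random seeds tend to land in several peripheral communities. Changing the layout or the transmission probability can change which strategy comes out ahead, which is consistent with Saberi's caution against treating random seeding as a universal policy.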
The effectiveness of random seeding with a few more seeds depends on the nature of the network. And how information spreads depends on what is being talked about. For instance, how farmers decide whom to speak with about new corn-growing techniques may be totally different from how they decide whom to gossip with.
“Careful seeding may matter for reasons not captured in our models,” Malladi says. “If, for instance, farmers adopt a new technology only after a sufficient number of their friends adopt it, then you are probably better off being strategic with whom you seed.”
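Malladi's example can be illustrated with a small threshold model. The Python sketch below is an illustration under stated assumptions, not the paper's model: 20 farmers sit in a ring, each talks to the two nearest neighbors on either side, and a farmer adopts only once at least two contacts have adopted. The names `ring_lattice` and `threshold_adopters` and the threshold of two are hypothetical.

```python
def ring_lattice(n, reach=2):
    """Return an adjacency dict where each node is linked to `reach` neighbors on each side of a ring."""
    return {
        i: {(i + d) % n for d in range(-reach, reach + 1) if d != 0}
        for i in range(n)
    }


def threshold_adopters(adj, seeds, threshold=2):
    """Spread adoption until no one else has at least `threshold` adopting contacts."""
    adopted = set(seeds)
    while True:
        newly = {
            v for v in adj
            if v not in adopted and len(adj[v] & adopted) >= threshold
        }
        if not newly:
            return adopted
        adopted |= newly


farmers = ring_lattice(20)
clustered = threshold_adopters(farmers, seeds=[0, 1])   # two adjacent seeds
scattered = threshold_adopters(farmers, seeds=[0, 10])  # two seeds far apart
print(len(clustered), len(scattered))  # prints "20 2"
```

Two adjacent seeds give their shared neighbors the two adopting contacts they need, and adoption travels around the whole ring. Two seeds on opposite sides leave every other farmer with at most one adopting contact, so nothing spreads. Under a rule like this, where the seeds sit matters far more than it does for simple word-of-mouth.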
Or consider the problem of stanching, rather than encouraging, information flow. For this “vaccination” problem, understanding the structure of the network and its most influential components — who is infected and who is likeliest to spread the infection — is essential for effective intervention.
“If someone attacks a network with a virus or a piece of fake news, random vaccination is unlikely to stop the spread, and vaccinating central individuals — or informing them about the truth in the case of misinformation — is absolutely necessary,” Akbarpour says. “In some sense, this is troubling, because the attacker does not need the network data, but the defender does.”
Finally, if the process of seeding is expensive — for example, seeding requires extensive training in how to use a new technology with the hope that the trainees subsequently teach their friends — then it makes sense to find the most influential people within a social network.
“In the end, the main tradeoff that companies or policymakers face is in costs — whether it is more expensive to seed a few more random people or to collect and analyze the network data,” Akbarpour says. But when uncovering the structure of a network is expensive, as it very often is, “then you may as well just forget network theory and pick more people based on common wisdom — go talk to the village shopkeeper, the teacher, or a random person in the street.”