The Cook Book
Recipe for the week of February 26  March 3
Ultimate Bacon: The Giant Component of a Complex Network
 Let's take a break from our usual fare to discuss an amusing contest
that I entered last weekend. Apparently, sometime within the past few years,
MTV talk show host Jon Stewart devised a parlor game in which contestants
are to link any movie star to Kevin Bacon by a chain of films that share
performers. Thus we imagine actors as vertices in a large network, with
edges between any two who have been in a movie together. The goal is to
find the path of minimal length connecting x to Kevin Bacon,
that length then being the Bacon number, which we denote here as B(x).
Quite a cult has grown up around this pastime, as witnessed by
more than a dozen web pages now devoted to the game.
 Now Brett Tjaden and Glenn Wasson at the University of Virginia have
automated the calculation of B(x) at their Web site,
The Oracle of Bacon.
Their program, which makes use of the marvelous
Internet Movie Database (IMDB), will compute the
Bacon number of any performer you care to specify within a few seconds. Hardcore cultists
seem threatened by the power of the Oracle, but the rest of us welcome this automation
of what can be an arduous evaluation. For instance, B(Bara, Theda) = 3, and in fact
the Oracle confirmed an outstanding conjecture that for any x from the United States,
either B(x) is at most 4 or B(x) is infinite.
 Mathematicians have had their own version of this story for many years,
centered around the Hungarian number theorist and combinatorist
Paul Erdos,
where the links are formed by joint authorship of research papers. To have Erdos
as center is especially fitting since he and his compatriot
Alfred Renyi
developed a beautiful theory of random graphs that captures the
essential qualitative features of complex networks such as these. Thus
Renyi has Erdos number 1. For a list of all mathematicians y with
Erdos number E(y) at most 2, see
The Erdos Number Project.
Alas, E(Griffeath, David) = 3. A key feature of ErdosRenyi theory,
shared by the Movie and Mathematician databases, is that the sizable
majority of members are connected to one another, all belonging to one
Giant Component, while those members not in the Giant Component
belong to very small, isolated clusters. It would be interesting to compute
the average number of films per performer and the average number of
performers per film for the IMDB, in order to compare the network to a bestfit
ErdosRenyi model. This would be a kind of meanfield approximation,
a method commonly used in statistical physics to analyze complex
systems. One could handle the database of mathematicians similarly, but
somehow most folks are more interested in movie stars...
 So last Saturday afternoon I discovered the Oracle of Bacon, and a contest
announced there. The computer program had determined that of the 174,830 performers
in the IMDB, 151,141 (88%) are in the Giant Component. For those, the maximal Bacon
number is 8, and only 8 individuals have this value. A Kevin Bacon video of one's choice was
offered to the first person who could correctly identify a member of {x: B(x) = 8}.
With my lifelong passion for movies, I couldn't resist spending many hours probing the
dark recesses of film history until, at about 10 AM on Sunday, I found an incredibly obscure
1928 Soviet pirate film, Plenniki Morya, starring P. Savin with a Bacon number of 7, and whose
supporting cast of 8 appeared nowhere else. Evidently I had discovered ALL the solutions:
A. Kramer, N. Kutuzov, J. Laptev, V. Podgorny, A. Safroshin, E. Smirnova, I. Strauch, and
Dmitri Vasilyev. As evidenced by
The Bacon Challenge, these folks eventually lead to Roman Polanski (2), Jack
Nicholson (1), and Kevin Bacon (0). That page also documents the lamentable
fact that I came in second to one Stefan Kahrs of the University of Edinburgh Laboratory
for Foundations of Computer Science. I have since learned from Stefan that
he has been instrumental in developing the IMDB over the past two years,
introducing the first links between most of the Soviet Cinema and
Hollywood. I feel less disappointed with my 'honorable mention' knowing
that the winner did the original research on the longest path!
 The story needn't end here, however. There are several intriguing open questions
relating to the INDM network, for instance:
 (a) What is the diameter of the Giant Component?
 (b) What are the size and contents of the second largest component?
 (c) Who is the most central performer (the REAL Kevin Bacon)?
 A natural way to formulate (c) is in terms of minimal average
distance to all members of the Giant Component. Last April, Claus Schotten
determined that several other performers including Ralph Bellamy, James Mason
and Orson Welles had smaller averages than Kevin, but he did not carry out an
exhaustive analysis. Kahrs conjectures that a 'character actor' such as Michael Caine,
Klaus Kinski, Donald Pleasence, or most likely John Carradine should be the
true Center. Of course the database is quite dynamic since new films are
made, and obscure old films are added, all the time. As is our custom, we'll
report any latebreaking developments in a future recipe.
