What's wrong with the YouTube documentary?
posted by Michael
for VJ10, July 2008
As someone who has shot video and programmed web-based interfaces to video over the past decade, I have been excited to see how distributing video via the Internet has become increasingly popular, thanks in large part to video sharing sites like YouTube. At the same time, I continue to design and write software in search of new forms of collaborative and "evolving" documentaries, and yet I, like others around me, feel uninterested in, even averse to, posting videos on YouTube. This essay has two threads: (1) I revisit an earlier essay describing the "Evolving Documentary" model to get at the roots of my enthusiasm for working with video online, and (2) I examine why I find YouTube problematic, and more a reflection of television than of the possibilities that the web offers.
In 1996, I co-authored an essay with Glorianna Davenport, then my teacher and director of the Interactive Cinema group at the MIT Media Lab, called "Automatist storyteller systems and the shifting sands of story". In it, we described a model for supporting "Evolving Documentaries", or an "approach to documentary storytelling that celebrates electronic narrative as a process in which the author(s), a networked presentation system, and the audience actively collaborate in the co-construction of meaning." In this paper, Glorianna included a section entitled "What's wrong with the Television Documentary?" The main points of this argument were as follows:
[... T]elevision consumes the viewer. Sitting passively in front of a TV screen, you may appreciate an hour-long documentary; you may even find the story of interest; however, your ability to learn from the program is less than what it might be if you were actively engaged with it, able to control its shape and probe its contents.
Here, it is crucial to understand what is meant by the word "active". In a naive comparison between the activities of watching television and surfing the web, one might say that the latter is inherently more active in the sense that the process is "driven" by the choices of the user; in the early days of the web it became popular to refer to this split as "lean back vs. lean forward" media. Of course, if one means to talk about cognitive activity, this is clearly misleading: aimlessly surfing the net can be achieved at near-comatose levels of brain function (as any late-night surfer can attest), and watching a particularly sharp television program can be incredibly engaging, even life-changing. Glorianna would often describe her frustration with traditional documentary by observing the vast difference between her own sense of engagement with a story, gained through the process of shooting and editing, and the experience of an audience member who simply views the end result. Thus "active" here relates to the act of authoring and the construction of meaning. Rather than talking about leaning forward or backward, a more useful split might be between reading and writing. Rather than being a question of bad versus good access, the issue becomes one of two interconnected cognitive processes, both hopefully thoughtful and "active." An ideal platform for online documentary would be one that facilitates a fluid movement between moments of reflection (reading) and of construction (writing).
Television severely limits the ways in which an author can "grow" a story. A story must be composed into a fixed, unchanging form before the audience can see and react to it: there is no obvious way to connect viewers to the process of story construction. Similarly, the medium offers no intrinsic, immediately available way to interconnect the larger community of viewers who wish to engage in debate about a particular story.
Part of the promise of crossing video with computation is the potential to combine the computer's ability to construct models and run simulations with the random access possibilities of digitized media. Instead of editing a story down into a fixed form or "final cut", one can program a "storytelling system" that can act as an "editor in software". Thus the system can maintain a dynamic representation of the context of a particular telling, on which to base (or support a viewer in making) editing decisions "on the fly". The "Evolving Documentary" was intended to support complex stories that would develop over time, and which could best be told from a variety of points of view.
Like published books and movies, television is designed for unidirectional, one-to-many transmission to a mass audience, without variation or personalization of presentation. The remote-control unit and the VCR (videocassette recorder) -- currently the only devices that allow the viewer any degree of independent control over the playout of television -- are considered anathema by commercial broadcasters. Grazing, time-shifting, and "commercial zapping" run contrary to the desire of the industry for a demographically correct audience that passively absorbs the programming -- and the intrusive commercial messages -- that the broadcasters offer.
Adding a decentralized means of distribution and feedback such as the Internet provides the final piece of the puzzle in creating a compelling new medium for the evolving documentary. No longer would footage have to be excluded for reasons of reaching a "broad" or average audience. An ideal storytelling system would be one that could connect an individual viewer to whatever material was most personally relevant. The Internet is a unique "mass media" in its potential support for enabling access to non-mainstream, individually relevant and personal subject matter.
What's wrong with the YouTube Documentary?
YouTube has massively popularized the sharing and consumption of video online. That said, most of the core concerns raised in the arguments about television are still relevant to YouTube when it is considered as a platform for online collaborative documentary.
Clips are primarily "view-only"
The format of the clip is fixed and uniform for all kinds of content
Technically, YouTube places some rather arbitrary limits on the format of clips: all clips must contain an image and a sound track and may not be longer than 10 minutes. Furthermore, all clips are treated equally: there is no notion of a "lecture", versus a "slideshow", versus a "music video", nor any sense that these different kinds of material might need to be handled differently. Each clip is compressed in a uniform way, which at the moment means into a Flash-format video file of fixed data rate and screen size.
Clips have no history
Despite these limitations, users of YouTube have found workarounds to, for instance, download clips and rework them into derived clips. Although the derived works are often placed back again on YouTube, the system itself has no means of representing this kind of relationship. The system is unable to model or otherwise make available the "history" of a particular piece of media. Contrast this with a system like Wikipedia, where the full history of an article, with a record of what was changed, by whom, when, and even "meta-level" discussions about the changes (including possible disagreement) is explicitly facilitated.
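To make the idea of a clip "history" concrete, here is a minimal sketch of how a derivation record might be kept. It is only an illustration under stated assumptions: the names `Revision`, `ClipHistory`, `derived_from` and `ancestry` are hypothetical, and nothing here describes how YouTube or Wikipedia actually store their data.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Revision:
    """One entry in a clip's history (all field names are hypothetical)."""
    clip_id: str
    author: str
    note: str  # what was changed, in the editor's own words
    derived_from: List[str] = field(default_factory=list)  # parent clip ids


class ClipHistory:
    """Keeps track of which clips were reworked from which other clips."""

    def __init__(self) -> None:
        self._revisions: Dict[str, Revision] = {}

    def add(self, revision: Revision) -> None:
        # A derived clip may only point at clips the system already knows.
        for parent in revision.derived_from:
            if parent not in self._revisions:
                raise KeyError(f"unknown parent clip: {parent}")
        self._revisions[revision.clip_id] = revision

    def ancestry(self, clip_id: str) -> List[str]:
        """Return every clip this one derives from, nearest ancestors first."""
        seen: List[str] = []
        queue = list(self._revisions[clip_id].derived_from)
        while queue:
            parent = queue.pop(0)
            if parent not in seen:
                seen.append(parent)
                queue.extend(self._revisions[parent].derived_from)
        return seen


history = ClipHistory()
history.add(Revision("interview", "ana", "original upload"))
history.add(Revision("street", "ben", "original upload"))
history.add(Revision("remix", "cho", "intercut the two", ["interview", "street"]))
history.add(Revision("remix-short", "dee", "trimmed the ending", ["remix"]))
print(history.ancestry("remix-short"))
```

With a record like this, a viewer arriving at the shortened remix could be led back to the remix it was cut from and to the two source clips, along with who changed what at each step.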
Weak or "Flat" narrative structure
YouTube's primary model for narrative is a broad (and somewhat obscure) sense of "relatedness" (based on user-defined tags) modulated by popularity. As with many "social networking" and media sharing sites, YouTube relies on "positive feedback" popularity mechanisms, such as view counts, "star" ratings and favorites, to create ranked lists of clips. Entry points like "Videos being watched right now", "Most Viewed", and "Top Favorites" only close the loop of featuring what's already popular to begin with. In addition, YouTube's commercial model of enabling special paid levels of membership leads to ambiguous selection criteria, complicated by language as in the "Promoted Videos" and "Featured Videos" of YouTube's front page (promoting what? featured by whom?).
The "editing logic" threading the user through the various clips is flat, in that a clip is shown the same way regardless of what has been viewed before it. Thus YouTube makes no visible use of a particular viewing history (though the fact that this information is stored has been brought to the attention of the public via the ongoing Viacom lawsuit). As a result, it's difficult to get a sense of being in a particular "story arc" or thread when moving from clip to clip in YouTube, as in a sense each click and each clip restarts the narrative experience.
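By way of contrast, a non-flat editing logic would let the viewing history shape what comes next. The sketch below is one very simple way this could work, assuming clips carry user-defined tags: it picks the unseen clip that shares the most tags with the last few clips viewed. The function name `next_clip`, the `window` parameter and the sample tags are all hypothetical; this is not a description of any existing system.

```python
from typing import Dict, List, Optional, Set


def next_clip(
    tags_by_clip: Dict[str, Set[str]],
    viewed: List[str],
    window: int = 3,
) -> Optional[str]:
    """Pick the unseen clip sharing the most tags with the last `window` views."""
    # Collect the tags of the most recently viewed clips: the current "thread".
    thread_tags: Set[str] = set()
    for clip_id in viewed[-window:]:
        thread_tags |= tags_by_clip.get(clip_id, set())

    best: Optional[str] = None
    best_score = 0
    for clip_id in sorted(tags_by_clip):  # sorted for a deterministic tie-break
        if clip_id in viewed:
            continue
        score = len(tags_by_clip[clip_id] & thread_tags)
        if score > best_score:
            best, best_score = clip_id, score
    return best  # None when nothing unseen continues the thread


clips = {
    "a": {"harbour", "strike"},
    "b": {"strike", "union"},
    "c": {"cats"},
    "d": {"harbour", "union", "strike"},
}
print(next_clip(clips, ["a"]))
print(next_clip(clips, ["c"]))
```

The point of the sketch is only that the same collection of clips yields a different continuation depending on what the viewer has already seen, which is what a sense of "thread" requires.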
No licenses for sharing / reuse
The lack of a download feature in YouTube could be said to protect the interests of those who wish to assert a claim of copyright. However, YouTube ignores and thus obscures the question of license altogether. One can find, for instance, the early films of Hitchcock, now part of the public domain, in 10-minute chunks on YouTube; despite this status (not indicated on the site), these clips are, like all YouTube clips, unavailable for any kind of manipulation. This approach, and the limitations it places on the use of YouTube material, highlights the fact that YouTube is primarily focused on getting users to consume YouTube material, framed in YouTube's media player, on YouTube's terms.
While YouTube is built using open-source software (Python and ffmpeg for instance), the source code of the system itself is closed, leaving little room for negotiation about how the software of the site itself operates. This is a pity on a variety of levels. Free and open source software is inextricably bound to the web, not only in that it provides much of the underlying software (like the Apache web server), but also in the reverse, as the possibilities for collaborative development that the web provides have catalyzed the process of open source development. Software designed to support collaborative work on code, like Subversion, CVS (the Concurrent Versions System) and other version control systems, and platforms for tracking and discussing software (like Trac) provide much richer models of use and relationship to work than those which YouTube offers for video production.
Broadcasting over coherence
From its slogan (Broadcast yourself), to the language the service uses around joining and uploading videos (see images), YouTube falls very much into a traditional model of commercial broadcast television. In this model sharing means getting others to watch your clips, with the more eyeballs the better.
The desire for broadness and the building of a "worldwide" community united only by a desire to "broadcast one's self" means creating coherence is not a top priority. YouTube comments, for instance, seem to suffer from this lack of coherence and context. Given no particular focus, comments seem doomed to be similarly ungrounded and broad. Indeed, comments in YouTube often seem to take on more the character of public toilets than of public broadcasting, replete with the kind of sexism, racism, and homophobia that more or less anonymous "blank wall" access seems to encourage.
A problematic space for "sharing"
The combination of all these aspects makes YouTube for many a problematic space for "sharing" -- particularly when the material is of a personal or particular nature. While on the one hand appearing to pose an alternative platform to television, YouTube unfortunately transposes many of that form's limitations and conventions onto the web.
Looking to the future, what still remains challenging is figuring out how to fuse all those aspects that make the Internet so compelling as a medium and enable them in the realm of online video: the net's decentralized nature, the possibilities for participatory / collaborative production, the ability to draw on diverse sources of knowledge (from "amateur" and home-based, to "expert"). How can successful examples of collaborative text-based projects like Wikipedia inspire new forms of collaborative video online, and in a way that escapes the "heaviness" and inertia of traditional forms of film/video? This fusion can and needs to take place on a variety of levels, from the concept of what a documentary is and can be, to the production tools and content management systems media makers use, to a legal status of media that reflects an understanding that culture is something which is shared, down to the technical details of the formats and codecs carrying the media in a way that facilitates sharing, instead of complicating it.
- http://www.research.ibm.com/journal/sj/363/davenport.html
- The ability to add clips to personal "playlists" comes closest to some sense of "writing" with clips but falls far short of any kind of "editing", and the playlists made by others are not generally visible via the YouTube interface.
- There is a mechanism for posting video responses to other clips, but this mechanism doesn't seem to be used to link "derivative" works.
- http://news.bbc.co.uk/2/hi/technology/7506948.stm