Audiovisual research practice in the global era #VisualANTH

Photo by only4denn

Time has shortened, space has shrunk, social relations have stretched and information keeps coming in intensive flows. This article explores the role audiovisual media play in reshaping time and space in the era of global interconnectedness, outlining two basic principles for the process of image production. First, despite continuous development, technology is still restricted in relation to the multimedia nature of human perception, that is, data in its raw form. This limitation, and its potential for ethnographic practice, should be taken into account in any effort of visual production. Second, as social interaction expands, social actors simultaneously experience the global and the local, reacting to close-up signs while remaining attentive to distant prospects, possessing both near sight and far sight. This introduction presents the work of six young visual ethnographers from the Visual Research Network, who have taken these two seemingly obvious theoretical statements and explored them through visual ethnographic practice.

In producing a thematic thread on visual practice, we aim to show how visual research in the social sciences is adapting to and reinventing how the visual medium is used and, most importantly, how it is read.

Using ethnographic examples from Egypt, Colombia, Kurdistan, Brazil and Congo (DRC), the objective is to show how researchers can use collaboration, experimentation, photography, sound and film to draw clear links between our global and local interconnections, as well as how we sense, feel and understand the world around us.

When the ethnographer chooses visual tools as a research method, there are two significant changes to the processes of fieldwork and data collection. On the one hand, the camera becomes both an element of the material environment in which the ethnographer is participating and essential to the ethnographer’s forms of experience, engagement, and participation in that environment (Pink 2015:100). On the other hand, the material produced becomes part of the understanding of a place, a visual interpretation that reaches audiences and provides new entanglements that reshape place.

In visual ethnography, the visual practice is understood as a co-constituent of the ethnographic place and as such, a presence that lingers within the reality of social aesthetics, relations and interactions.

In this view, place is first constituted through the social, material and sensorial relations of a given site, and it is simultaneously remade as it is recorded by the camera. Place is later reconstituted as viewers imagine and ascribe personal and cultural meanings to the representation. In this process, as images, frames and visual narratives reshape space, time is transformed through editing, which alters, compresses and gives rhythm to the lives, cultures and spaces reflected in the finished film. This threefold process of place making in visual research transforms not only a place but also its time and space, with the outcome being a fragment of a partially true reality.

Visual anthropologists have long studied the complexity of space and place making, focusing on the different forms of experience and representation that develop in ethnographic settings. Using visual media as observational and objectifying tools to represent experience, anthropology offers several insights that can help unpack the different routes of multisensorial knowing in order to research how sensory structures and social aesthetics are created.

In October 2016 a group of young social scientists and visual practitioners set up a project to identify how two theoretical notions crucial to grasping the ethos and logos of the image production process, namely sensoriality and bifocality, could be explored in practice. Having founded the Visual Research Network (VRN), these young visual ethnographers share in this thematic thread the ethnographic encounters and visual experiments they undertook to unveil the potential of visual language to help understand the complexity of our sensory worlds and collaborative experiences.

The first concept is that sensorial experience takes place in different formats, making reception inherently multimedia as a result of the multimodality of our semiotic world. Human perception takes place in the body in the form of sound, objects, visual design, gestures, textures and actions. This is made possible through our senses, which are ‘attuned in a quite specific way to the natural environment’, providing us with differentiated access to the world (Kress & Leeuwen 2000:184). The mixed-media sensory experience received by the body is then translated in the brain into cognition. Following the same process, ‘data’ is what we are able to perceive in the field, which ‘we perceive through all our senses, including sight, hearing, touch, smell and even taste’, so while our means of recording data are limited, data in the form it reaches the researcher is ‘composed of diverse media’ (Dicks, 2006:78).

The disjuncture between the ‘restricted’ media used in ‘data-records’ and the multimedia nature of human perception, including the abstract and unobservable grammars and the ‘non-material resources of meaning-making’, is what Kress and Leeuwen have defined as the multimodality paradigm (Dicks, 2006:78). The camera, as a limited technology, can only represent a portion of human perception, giving viewers the possibility to check the modes of seeing and understanding that audiovisual media propose against their own experience.

The multimodality of human perception and the potential for representation of the visual medium should not be seen as a limitation, but as a point of access to the semiotic world through the visual, not as dominant, but as part of an interconnected sensory experience.

This can be especially useful in research with children, when trying to unpack how their spoken and unspoken grammars translate to our understandings of childhood, education and learning. The first two articles of this visual ethnography thematic thread focus specifically on children’s ethnographies, exploring the possibilities of the medium to communicate and represent the sensoriality, materiality and significance children give to their life experiences. Benjamin Llorens Rocamora and Paloma Yáñez Serrano narrate their experiences of fieldwork in Egypt, filming in collaboration with children and exploring their perception of play and work in post-revolutionary Cairo. Llorens and Serrano guide the reader into a game world created for children to experiment with the dynamics of the city, through the pretend-profession role-play of the alternative education project Mini Medina. They describe how the camera was introduced into the city-game as a tool for those children playing at being reporters. The camera thus stopped being bound to the researcher’s own project and became part of the children’s own game world, revealing not only a representation of the game through given frames or images but also a document of the corporeal embodiment of the filmmaker child. As a result, the viewer can understand not only the context and actions of a given game, but also the ways of playing and seeing from the perspective of the playing child.

Also building on experimental visual methods, and equally concerned with children’s perceptions of urban infrastructure, Camilo Leon-Quijano shares some valuable snippets of his ethnography working with a group of 10 students with cognitive disabilities in the Ecole K. of Sarcelles, the biggest ‘social housing city’ of France. Using photography as a means to (re)create children’s dreams and expectations, Quijano works with a mixture of fiction and reality, guiding children into creative narrative and photo production inspired by their own imaginative life worlds. In a similar way to the use of the game in the Egyptian ethnography, Quijano proposes that, through creative fictional and non-fictional engagement with photography, we can access ‘the emotive, sensual and affective relationship existing between the visual elements of an image, the inventiveness of the author(s) and the material elements of the city’ (this thread). His work explains, through aesthetics, the shapes and structures in our urban environment that enhance or impair the dreams and expectations for the future of our younger and most vulnerable inhabitants.

The second notion, which became clear at the first VRN meeting, is a result of global communication flows: humans simultaneously experience near sight and far sight. Community and locality are no longer given or natural, no longer exclusive to face-to-face relations. They are instead constituted by a wider set of social and spatial relations, ‘conceived less as a matter of ‘ideas’ than of embodied practices that shape identities and enable resistances’ (Gupta & Ferguson 1997:6). The conception of time and space is drawn from a simultaneous understanding of the global and the local, what Peters defined as the condition of ‘bifocality’ in communication theory, suggesting that social actors possess both ‘near sight’ and ‘far sight’ (1997:75). He traces ‘bifocality’ to the eighteenth century, when the creation of the press allowed information to spread across the globe through newspapers, novels, statistics, encyclopedias, dictionaries, and panoramas. Today, we not only receive information but are able to produce it at much higher speed and lower cost, transforming our social role from receivers to creators. This entails a responsibility that is often unaccounted for: connecting global flows of capital, cultures and humans with specific landscapes, bodies and modes of experience that exist and change through time.

Negotiating near and distant is as complex as negotiating filmmaker/researcher/person roles. It is a matter of complex identities that are constantly in motion, a type of movement that film captures in a narrated time, but which will continue to move and change long after the film is done.

When describing her ethnographic encounters, Lana Askari (this thematic thread) suggests that the global is intrinsic to the local when trying to understand future imaginaries and the conditions and structures of urban infrastructure in Kurdistan. Through the story of Mihemed, a Kurdish journalist from Kobane (Syrian Kurdistan) who lives in a refugee camp in Slemani (Iraqi Kurdistan), Askari reveals a story of a turbulent past leading to confused futures. Kobane has been a symbol of Kurdish resistance against the Islamic State since 2015, when Kurdish forces liberated the city, yet proud Kurds like Mihemed cannot return due to the ongoing war. Stuck in Slemani because of conflicting global powers and the imminent threat of ISIS, Mihemed and his family are forced to live in a displaced locality, reinforced by shops, restaurants and a bridge named after Kobane. Visually engaging with the city and the everyday life experiences of Mihemed, Askari makes clear that Mihemed’s conception of the future responds to his near and far sight, the simultaneity of experiencing the realities of Kobane and Slemani. Through the aesthetic of film and her engagement with participants, she offers an account that makes clear that social actors have near and far sight, even when hopes for the future seem to be lacking.

Daniel Lema Vidal and Clara Kleininger take our readers to Rio de Janeiro, Brazil, exploring the intricacies of the collaborative film project Nosso Morro. In a truly collaborative ethnographic piece, they bring their two distinct yet complementary perspectives on how six ethnographic filmmakers ran a documentary workshop for 10 young people from two adjacent districts, Gávea and Rocinha, respectively the neighbourhood with the highest development index and the largest favela of the city. The workshop was a learning place, a ‘meeting point’ for the six international filmmakers and the 10 young participants to make a film about their communities. Lema and Kleininger suggest the workshop served to explore the teenagers’ visions, not only of the narrative as the contextual local, but also of the global flows, networks, influences and power struggles that are translated into violence, fear and marginalisation. The workshop also represented a space to explore feelings beyond their grammatical interface, allowing the participants to get closer to their sensorial experiences in their effort to represent the people in their communities. Lema and Kleininger argue that by including the workshop in the final edit, they not only recognize the ethnographic importance of the visual production but, most importantly, produce a commensurable engagement between the ethnographers (six international researchers), filmmakers (workshop participants) and viewers that goes beyond the entertaining narrative in favour of a sensorial understanding of the ‘meeting point’.

Crapanzano, contesting Geertz’s definition of anthropology as ‘reading a text over the shoulders of those to whom it properly belongs’ (1973:452), claims that this position would ‘cast our shadows over that book’, pushing the ‘other’ to close it (1986:52). Opening a new book through the use of collaborative music videos in Goma, DRC, Eugenio Giorgianni brings our thematic thread to a close. Giorgianni’s ethnography shows how music videos allow for compromised collaboration, mediating the process of sense making while observing and being observed. Unveiling the dynamics, aesthetics, insights and rhythms brought together through the production of the song and video clip Amani Kila Siku – Peace Every Day (kiSwahili), the article shows how musicians and researchers worked together to represent the need of local people to overcome the endless conflict that afflicts the North Kivu region, a warzone until 2013. Underrepresented and underused in social sciences research, the video clip is proposed as a route to understanding how participants see and relate to the world through their musical and visual creations, as well as a route to a collaborative representation of sensory experience.


In brief, our notions are simple: it is enough to look at our own lives.

We live local lives with global connections and, at the same time, there is a visceral certainty that our feelings and sensations are basic to our understanding of the world, beyond what technology can capture.

Our ethnographic attempts are limited, simply because a camera cannot produce a record of perception. Nonetheless, through the lens we can offer subtle hints. In this process, our images, our words and our senses are what matter.



Dicks, B., Soyinka, B., & Coffey, A. (2006). Multimodal ethnography. Qualitative research, 6(1), 77-96.

Gupta, A., & Ferguson, J. (1997). Culture, power, place: Ethnography at the end of an era. In A. Gupta & J. Ferguson (eds.), Culture, power, place: Explorations in critical anthropology, 1-29. Duke University Press.

Kress, G. (2000). Multimodality. Multiliteracies: Literacy learning and the design of social futures, 2, 182-202.

Peters, J. D. (1997). Seeing bifocally: Media, place, culture. In A. Gupta & J. Ferguson (eds.), Culture, power, place: Explorations in critical anthropology, 75-92. Duke University Press.

Pink, S. (2015). Doing sensory ethnography. Sage.
