
More than meets the “I”: A polyphonic approach to video as dialogic meaning-making


This paper summons Bakhtin's principle of visual excess to the field of video research. Bakhtin's dialogic approach emphasises the visual as an effort of the eye, as well as the subjective “I”. Seeing is thus re-cast as an event where subjective and cultural boundaries are encountered, lived, and offer insight to those involved. Video is therefore posited as a visual and axiologic encounter that allows one to perceive beyond one's own limits. Here the researcher does not come with a predetermined set of categories or criteria, but seeks to encounter the form of language and the meaning of those forms, from multiple (polyphonic) visual and ideological standpoints. I argue that taking this approach opens up possibilities for seeing as an opportunity for dialogic speculation and interrogation, one that forms the basis of my research orientation. By way of demonstration the paper will introduce an example of video filmed in an infant educational setting which highlights the additional insights offered through different visual fields and their interpreted meanings. Synchronising four visual fields of the same event - from the view of the infants, teacher and researcher - visual surplus is thus operationalized as a multi-voiced polyphonic event. Dialogues concerning their pedagogical significance - for the teacher and the researcher - are discussed alongside the footage itself. Together they highlight subtle, yet highly significant potentialities for video work that sets out to engage with the experience of the eye as an encounter with ‘other’. I argue that such visually oriented engagement can act as a central source of understanding and insight that far exceeds traditional approaches in educational research that view participants as mere objects for amusement or manipulation. Moreover, this approach poses a new video methodology in which meanings take precedence over what is aesthetically received.

A central dilemma exists in video research, particularly in relation to education. It concerns the trend towards interpreting what is captured on video as an evidence-base from which to draw all-too-certain conclusions about its meaning for all. The claims that arise from such data are often presented without speculation or scrutiny, further aided by ethical barriers that preclude public access to the footage of their origin. Combined with a push for methodological instrumentalism (Mills & Ratcliffe, 2012, p. 152), the opportunities video research offers the field are denied their fullest potential and, as a consequence, all too often fail to deliver their promise as legitimised data sources or, for that matter, in representing “what passes for experience and reality” (Sandywell & Heywood, 2012, p. 6).

This is not a new dilemma in visual studies and the associated field of visual culture (Davis, 2011). Indeed, its origins can be traced back through the use of images in photography and art (DeBord 1994), originating in Plato’s (1952) early critique of the illusionary nature of the eye and culminating in modern rhetoric which suggests that the visual can persuade and coerce certain types of seeing through tactics such as ethos, pathos, logos and kairos:

Vision is by no means an automatic function of our psychological apparatus. There is much evidence that vision is a mode of thinking. When we see, we interpret the world around us and orient ourselves in it. Sharpening our awareness, heightening our sensibility, disciplining our vision, it will increase our power to understand the world, appreciate its richness and cope with its problems (Kepes, 1944, p. 17, cited in Brannon, 2013, p. 280).

In spite of this cautionary philosophical and semiotic heritage, video is now characteristically and increasingly promoted in educational research with young children as evidence for broad pedagogical claims in various sociocultural contexts (Fleer & Ridgeway, 2014; Johansson & White, 2011; White, 2015). Jay (1993, cited in Peters, 2010) describes this phenomenon in the broader field as a “hegemony of vision” that fails to recognise its own authority as a kind of public pedagogy with the power to define what forms of learning and teaching matter, and for whom (see also, Tavin, 2015). In this location, video becomes an ocular-centric means of determining what constitutes teaching and learning by defining what ‘is’ or what ‘should be’ according to a series of moving images as a means of demonstration or exemplification. As Sandywell & Heywood (2012) suggest, this approach casts video as the “paradigmatic aesthetic machine of the nineteenth century” (p. 18), failing to recognise its wider potential as a “grenade of meaning” (p. 37) and a source of multimodal speculation.

The problem with aesthetics

In itself, the deployment of moving image as a means of fuller understanding in context-specific settings is not the problem. Indeed, its utility in developing a broader picture of educational experience is indisputable. Were this not the case the Video Journal of Education and Pedagogy would not exist. However, where the meaning ascribed to video by a researcher (or any other individual for that matter) is either absent from its representation or represented as reality for all, an ethical dilemma arises. Burri (2012) suggests that this literal emphasis on the image fails to examine how the image shapes cultural meaning, how practices are constructed through video and to understand what gives rise to their status as social ‘reality’. Treated as an aesthetic object of validity and reliability in a traditional sense, the use of video to make certain claims concerning its meaning supposes that there is only one way the footage might be interpreted. Moreover, it assumes a certain methodological holism to what can be seen, asserting an ‘eye of God’ reality which has the potential to pervert as much as represent (Baudrillard, 1967). As Mihailovic (1997) asserts - since any product of aesthetic activity (e.g. a video) is immutable - it cannot represent a living reality. This view lends support to the contention of Pink & Leder Mackley (2012) who argue that video in research is primarily methodological rather than empirical, posing pedagogical questions that ask not only how or what are we learning and teaching, but also ‘why’? Such questions shift the field of inquiry to the ideological basis for understanding what matters, and by association, who decides what is ‘seen’ as legitimate learning.

Although moving image was virtually unheard of at the time, a similar dilemma was posed by Mikhail Bakhtin almost a hundred years earlier in his philosophical discussions of aesthetic activity:

Aesthetic activity … is powerless to take possession of that moment of Being which is constituted by the transitiveness and open event-ness of Being. And the product of aesthetic activity is not, with respect to its meaning, actual being in the process of becoming, and, with respect to its being, it enters into communion with Being through a historical act of effective aesthetic intuiting (Bakhtin, 1993, p. 1).

For Bakhtin, uncritical forms of aesthetic activity which are unproblematically ‘received’ as significant or otherwise according to universal principles represent an epistemological and ethical crisis. This is because they lack any consideration of the living, evolving, shifting and located (ideological) nature of meaning in the event itself, as well as its aftermath. They are therefore static, unchanging and received truth – istina. An alternative type of meaning-making is offered instead through an act Bakhtin describes as ‘intuiting’ or lived truth – pravda. In this Bakhtin invokes the capacity for humans to draw upon their intuitive responses in the interpretive event here-and-now, as opposed to relying on a set of universally sanctioned and received definitions or categories that can be overlaid on an already known process. Seeing, according to this interpretation, is a pedagogical engagement requiring the see-er to understand what can be seen as an event of becoming – for themselves as much as for the learner. This idea is enshrined in Bakhtin’s (1990) notion of answerability as a means of accountability, reflexivity and an encounter with one’s own morality in the life of others.

The route to intuitive engagement of this nature is established by Bakhtin (1993) through his notion of visual surplus – a concept which emphasizes the situatedness of interpretation. More specifically, visual surplus accepts that insights are derived from the evaluator’s unique place in the world, from which they ‘see’ (and interpret) accordingly. Located outside of the individual (that is, the ‘seen’) the visual surplus of another holds the capacity to contribute fresh ways of seeing. However, this conception also asserts that any evaluation is also ideologically located, and that the see-er is implicated for the contributions they make. A corresponding work (or ‘effort’) of the eye therefore represents the highest interpretive authority and summons an ethical imperative to interpretation. Since visual surplus holds as much potential for damage as it does for enhancement of meanings, Bakhtin grants attention to the ‘I’ – for other, for oneself and for a mutually animated ‘we’ that resides in the dialogic space. It is here that this research methodology finds its home.

The ‘work of the eye’ in video research

Contemplating the eye as a lived encounter of visual surplus as well as an ethical responsibility calls for a more complex methodological encounter with videoed events:

Here, the eye is re-cast as a visual encounter with others. As such, what can be ‘seen’ is viewed as an authorial gift that draws on the insights of another’s visual field because they offer additional opportunities for understanding and because the eye alone (with the ‘I’ or the insight of another) cannot see to its fuller extent (White 2016a, 2016b).

This axiologic engagement with the footage in tandem with a consideration of its many, perhaps even oppositional, interpreted meanings holds potential for researchers to ‘see’ beyond the limits of their own eye/I and, in doing so, represents a much richer approach to analysis. The premise for such an approach is further expounded, by Bakhtin (1984), through Dostoevsky’s novelistic inspiration, in the notion of polyphony whereby “subjects co-exist as autonomous worlds within the world of the author and contend with him for the readers’ attention” (Krasnov, 1980, p.5). In this sense, polyphony addresses the problem of seeing as an isolated or discrete activity by relocating what can be seen as a mutually animating event. As such, traditional binaries of subject versus object in research are collapsed in order to contemplate interpretation from both the perspective of the see-er and the seen.

Coupled with the Bakhtinian notion of visual surplus, polyphony provides a revisioned way of approaching video research. Now, “emphasis is placed on the author’s ability to allow multiple voices (and voices-within-voices) to remain in play and characters to speak for themselves through the multiple genres employed” (White, 2010, p. 87). Attention is therefore granted to the unspoken as well as the spoken: whispers, sideways glances and gestures alike play an integral role in the interpretations that are shared. Significance is given to the different contexts in which the event takes place, and the intended audience(s) for whom language is oriented. Just as the polyphonic novel draws on the words and actions of those whose narrative story is told, so does video research seek to invite the embodied as well as the articulated interpretations of those who are being videoed as a means of revealing a series of research narratives and their relationships to one another. More importantly, those interpretations are not mere window dressings for the researcher’s authoritative gaze, but play a vital role in influencing what might be seen and how it could be interpreted otherwise.

A polyphonic approach

A polyphonic approach to video research as a central form of visual surplus lies at the heart of this interpretation of dialogic methodology, as presented in the remainder of this paper. I should state from the outset that such an approach is not for the faint-hearted. If certainty is desired, as is so often the case in educational research, then a polyphonic approach will fail to satisfy. If, on the other hand, there is a genuine desire to see richly and to be informed otherwise, polyphony offers such an encounter. In some fields of education where interpretation is less prescribed or certain this approach is, perhaps, easier to contemplate. Early childhood education (ECE) is one such domain, but there are many others that invite similar contemplation. Indeed, as I have tried to argue elsewhere (White 2011a, 2011b), those so-called ‘certain’ domains also benefit richly from suspending authoritative thresholds in order to contemplate what might be accessed in polyphonic chorus with other ways of seeing. Pedagogical research of a multi-perspectival nature is also evident in early years studies with preschool-aged children across diverse cultures where different teachers discuss and compare their pedagogical insights based on classroom video (see, for example, Hayashi & Tobin 2015).

Given my own interest in ECE it is hardly surprising that it is to the youngest learner that I orient by way of demonstrating a polyphonic approach to video research. It is worth pausing for a moment to explain why. As explained elsewhere (White 2016a, 2016b), infants are often misrepresented in educational research due to a lack of understanding or, quite simply, placed in the ‘too hard basket’ and ignored altogether in pedagogical discussions. There are several reasons for this – many of which lie beyond the scope of this paper (for a fuller discussion see Dalli & White, 2016). Suffice to say that their recent entry into the educational realm has not been marked by a flurry of educational research activity beyond a legacy of developmental science and psychology that claims certain realities for infant experience and capability, largely based on laboratory tests with infants and their mothers. As a consequence, the experience of infants in educational settings is largely speculative. This is especially true for infants under the age of one year who are increasingly spending significant hours of their day in ECE settings with non-familial adults who have to work very hard to interpret their language cues (White et al. 2015b).

Taking up this challenge, I set out to action Bakhtin’s visual entreaty by trying to earnestly interpret the ECE experience of infants in polyphonic dialogue with others. Including the adults who work with infants (teachers and parents) as a means of understanding is not new to educational research (Lang et al. 2016) and is easily achieved by interviewing them about their interpretations in a similar way to David Clarke and his associates (Clarke et al. 2006). As a kind of ‘surplus’ this form of data generation goes some way to contributing to an enhanced understanding since adults who live and work with infants are able to offer important and additional insights to the research. However, in isolation from other interpretative approaches that foreground the infant experience from their own visual perspective and that see the same event through different eyes, interviewing adults alone implies that they fully know the infant and can speak on their behalf. Moreover, it denies the infant an opportunity to have their unique perspectives heard or alternative insights considered.

As such, research that speaks on behalf of the infant – as if their perspective were fully known – represents a form of ‘ventriloquisation’ (Tannen, 2010) that fails to recognise the infant’s experience beyond the interpretation of another. Bakhtin has a great deal to say about this from an ethical standpoint, suggesting that an exclusive and intimate approach to evaluative activity alone may lead to a complete consummation of another in the absence of an outsider point-of-view. The same is true for approaches that are exclusively distant from the infant, and make assertions based on monologic claims that homogenise infants as developmentally ‘known’ (Cheeseman et al. 2015). Infant research is characterised by both extremes (Dalli & White, 2016).

Notwithstanding the obvious linguistic, developmental and ethical limitations in providing opportunities for infants to contribute to the research (Elwick et al. 2014), a polyphonic approach deliberately sets out to view the experience through their eyes, in tandem with others. No ventriloquised assertions are made concerning infant interpretations of the event; instead, the analysis builds on what can be seen as a source of insight for all. Emphasis is placed on the language forms and their interpreted meanings in events, and the way participants give form to these through dialogue. This is a dialogic process which summons ‘the work of the eye’ – and the subjective ‘I’ of the researcher – to its fullest extent (White 2016a). As Deborah Hicks (2000) explains: “Rich seeing requires that the contemplator immerses him or herself in the “heaviness” of the social relationship” (p. 232).

Operationalising a polyphonic approach to video data generation therefore entailed a revised form of richly seeing which encountered the visual field of the infant him or herself. Earliest attempts at this approach had revealed insights far in excess of previously held assertions concerning very young children, including their capacity to disorient adults in their understandings (White, 2011b). In order to access this lens I utilised four cameras which simultaneously shot film from lenses worn on the infants’ heads and the teacher’s head, and from my own hand-held device. The role of the teachers is important to note here as the ECE setting operated with a key teacher-buddy system which meant that each infant had a special adult who held primary care responsibility, and who was supported by a back-up – buddy – when they were occupied elsewhere.

Figure 1 provides a view of the four visual fields, including four-month-old Harrison, ten-month-old Lola, Harrison’s key teacher (1) and Lola’s key teacher (2):

Fig. 1. Screen shot of polyphonic footage

In the top left screen teacher 2 and Harrison are in the visual field of teacher 1. In the bottom left screen teacher 2 is (close up) in the visual field of infant 1 - Harrison. In the bottom right screen a different scene is evident in the visual field of infant 2 – Lola – who is in the same room. The top right screen shows the researcher’s visual field taken from a distance. Although all screens are shot in the same place and time, what they reveal is often very different, dependent on the direction of each participant’s head. While this technology cannot claim to track explicit eye movements, and thus cannot account for sideways glances (which are also important in dialogic research according to Sullivan 2013), it does provide a general overview of the visual orientation of each person.

Time synchronised, these visual fields taken over two hours were offered to the teachers for pedagogical interpretation (in an earlier study the family were also invited to offer their perspectives on polyphonic video events – see White 2009a, b). This meant that teachers were invited to select specific events from the polyphonic footage which they considered held pedagogical significance. These insights were shared in a subsequent interview which, in tandem with in-depth analysis by the researchers themselves, provided a rich source of visual surplus (White et al. 2015b). By tracing the field of vision, in tandem with the evaluative eye of researchers and teachers, a means of fuller appreciation of the pedagogical experience for these infants was established. This was seen as particularly important at the time of this study, when infant teachers were being accused of pedagogical incompetence in the absence of an articulated pedagogy that would satisfy the requirements of wider educational discourse (Education Review Office, 2015).

By way of demonstration

The video excerpt that follows offers one small example of insight from the polyphonic video as a means of demonstration. Those watching this footage in a future-oriented world of technology where more sophisticated cameras make visual work much simpler (or perhaps more complex), will probably find the filming most primitive. Indeed, for those who seek production quality the footage may be unpalatable. However, at the time of filming, in late 2013, and given the subjects involved, the (nano-pod) cameras were the best option available. Emphasis is placed on the footage as a means of dialogic engagement rather than a product or outcome as is so often the case in video-based research. As such, participants are not given instructions on what to view or how to view the split-screens, since what they choose to look at and how they approach their viewing is yet another potential means of insight and challenge (Footnote 1).

This scene takes place in a New Zealand early childhood education (ECE) setting catering for infants and toddlers during the early afternoon. Both Lola and Harrison have recently been fed and are playing on the floor of the ECE setting. In this case teacher 2 is Lola’s key teacher while Harrison’s key teacher (1) is occupied in another part of the setting.

Video: “Lola. Harrison & Rachel flattened movie”. The video is available to download on request.

Among a myriad of other insights, what this event highlights is the significance of the three-way relationships that take place between the infants and their teacher, but also the infants themselves. It is difficult to separate the three in this dialogic context. The teachers highlighted this event because they noticed Lola imitating her teacher’s ‘tickley’ act (both in terms of sound and action), but in their dialogues they emphasised a great deal more – traversing their own discoveries as well as articulating their pedagogical practice using language that might otherwise have been overlooked by the researchers:

Teacher 2: I provided the provocation for her at the beginning and then invited an extension on that….She [Lola] takes the blocks, I just love that, straight away.

Teacher 1: And Harrison is watching again.

Teacher 2: And at this moment I thought it was important to talk about how we have no time restrictions, so I’m not like “right, Ok, I’ve got 20 nappies to do. I can’t sit here for this length of time. You know, like I can be in the moment again and I’m not hurried by anything. You know apart from recording nappies and stuff we’re barely looking at our watches…And then I notice that I zone out here and Lola moves away with her freedom to move, rolling. I think because we’re in an environment like this where we’ve got no restrictions, like no baby swings and bouncers and stuff, I actually think that makes you engage more with the children because in an environment where we had say Harrison in a bouncer over there and Lola in a swing over here I don’t think there would be the same amount of engagement as what happens.

Teacher 1: That’s a really good point, because you would be thinking, what should I do now?

Teacher 2: yeah, like give the swing a little push - because these children are lying on the floor we’re engaging with them…I mean we have all these provocations but if you watch most of the time its that engagement. You know that position I’m in here and that L was in with hers. We’re just so awesome [laughs]

Teacher 1: We’re figuring it out all the time. We’d never say we know because its always different. A lot of the most important stuff goes on with the children – there in the moment, just figuring it out.

[Teacher interview]

For these teachers, as much as for the researchers, the insights polyphonic footage provided not only exceeded their own independent visual fields and associated insights, but also revealed their own deeply held pedagogical beliefs concerning these infants and their pedagogical choices accordingly. Taken together, these highlight the embodied work of the early years teacher (Hayashi & Tobin, 2015) as well as the intuitive nature of engagement that calls for moment-by-moment responsivity rather than received categories. This unknown nature of their engagement represents what Shotter (2012) describes as ‘poised resourcefulness’ and represents some of the complexity teachers face when working in a manner that suspends certainty in favour of events of ‘being’ and ‘becoming’ (White 2016a, 2016b).

When asked by the researcher if there were any surprises for them in the footage, the teachers gave replies that highlight the importance of having access to infant visual fields as a tremendous source of insight:

Teacher 1: I really didn’t know how much of Harrison’s time is spent watching absolutely everything going on around him and how obviously how important that is for his learning. That is so significant in this environment with regard to his social, learning about being a social person and just the way he so engages you verbally and the children.

Teacher 2: And with the key teacher it’s interesting how there’s different interactions that happen between key teachers and the buddies.

Teacher 1: I think that’s quite good because it shows, it was a concern for me earlier that I wouldn’t be giving a true picture of Lola’s relationship if we didn’t have [her key teacher 2] involved and yet she is perfectly happy for me to do things for her. The thing that has become apparent to me is that we are doing so much more than we think we are doing. You know, like, its obvious – you’re not only giving a baby a bottle, you’re interacting with another child, you’re scanning the environment, you’re thinking about who is going to need what in the near future and it all looks like you are just feeding the baby. Yeah, so its very involved what’s going on.

Teacher 2: And how we view the children as capable and confident. There was a part where Lola got stuck under the shelving. I didn’t jump in straight away because I view her as capable and confident. I wanted her to engage with her dispositions and get herself out of there because then she will feel like she is empowered to do that. To make her own decisions about how to do that. I think that’s how we view our children.

Teacher 1: Yeah I think that’s so evident in that whole part where Lola was sitting there for such a long time. It is viewing the child as being actively engaged and able to do so. When I was talking to [other teacher] about it, she was like “well imagine if you’d been sitting there passing her things” – you know – which is quite a normal thing for a teacher to do…and she was not only learning about the objects but shes learning about herself as a learner. You know, like she was looking at some beautiful objects like to paua shell and that’s huge - understanding that the environment is the third teacher.

[Teachers interview]

Similarly, the researchers had access to a great deal more understanding of pedagogical events of significance through engaging with teacher dialogues and their own independent analysis of the polyphonic screens. A full depiction of these discoveries is beyond the scope of this paper (Footnote 2); suffice to say that a systematic qualitative and quantitative analysis of the different language forms (including the use of the body and eye movement, a feature of communication the teachers highlighted to the researchers), their sequenced usage in dialogues with teachers and the infants’ proximity to teachers during different events, provided further visual surplus to the videoed events. Taken together, these approaches respond to a dialogic interpretation of utterance as “social phenomenon” (Voloshinov, p. 82) whereby meanings are encountered, negotiated, disputed and refuted (dialogised) rather than received as truth.

Analysing utterance

Utterance as a central unit for analysis offered a way of cross-examining the data and understanding genres in infant dialogues with teachers and/or peers. The figure below provides a screen shot capturing some of the multi-layered complexity in the social events on film as seen through the different visual fields and associated insights which laid the groundwork for a four-tiered analytic process:

Tier 1: Identification of speech genres

In the first instance, events were coded against the language forms used by the infant on film and their articulated meaning(s) - based on the insights offered in dialogue between researcher and teachers as well as the visual cues offered on the video itself. This approach responds to Bakhtin’s explanation of various genres and their employment in certain social settings that are characterised by preferred combinations of language form and content (or meaning). As Mabin explains:

The notion of a genre emerging from social activity switches the focus from a more static tableau-like notion of setting (for example a classroom) to the various different social activities, involving different kinds of speech genres, which may be going on within it.

Identifying speech genres as combinations of form and content provided a means of understanding the complex ways that various language forms might be dialogised by infants and adults alike in the ECE setting. Our initial analysis focused on teacher-infant dialogues (White et al. 2015b). That a large number of these forms were non-verbal and very subtle, requiring replay after replay to reveal further layers of meaning, further legitimated both the importance of video itself and, specifically, the visual fields of the infants.

Tier 2: Visual field analysis of alteric and intersubjective events

Participants (including researchers) were then invited to focus on the infant visual field – as shown through their camera lens – as an additional source of provocation. It is here that alteric insights are generated – that is, insights whereby the infant’s visual field offers fresh perspectives on the event and its meaning for adults. These are contemplated alongside the attention given to intersecting visual fields – between participants – where intersubjectivity and shared meaning were more often emphasized in the analysis. Analysis therefore sought to understand the nature of both alteric and intersubjective events: their duration, content and influence on subsequent events. This was possible due to the access Studiocode offered to time sequences, and the opportunity to discuss events retrospectively and over several episodes. These forms of analysis generated a rich qualitative platform for further quantitative inquiry.
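As a loose illustration of how the duration of alteric and intersubjective events might be tallied once they have been coded against the synchronised timeline, a minimal Python sketch follows. It is not the Studiocode workflow used in the study: the `FieldEvent` record, its field names and the example values are all hypothetical, and the judgement of whether an event is alteric or intersubjective is assumed to have been made beforehand in dialogue with participants.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldEvent:
    """One coded event on the synchronised timeline (hypothetical structure)."""
    start_s: float  # seconds from the start of the footage
    end_s: float    # seconds from the start of the footage
    kind: str       # "alteric" or "intersubjective", as judged by the coders


def summarise_by_kind(events):
    """Return {kind: (number of events, total duration in seconds)}."""
    summary = {}
    for ev in events:
        if ev.end_s < ev.start_s:
            raise ValueError("event ends before it starts")
        count, total = summary.get(ev.kind, (0, 0.0))
        summary[ev.kind] = (count + 1, total + (ev.end_s - ev.start_s))
    return summary


# Invented values for illustration only; these are not data from the study.
example_events = [
    FieldEvent(0.0, 10.0, "alteric"),
    FieldEvent(12.0, 20.0, "intersubjective"),
    FieldEvent(25.0, 40.0, "alteric"),
]

for kind, (count, total_s) in summarise_by_kind(example_events).items():
    print(kind, count, total_s)
```

A tally of this kind only summarises judgements already reached through dialogue; it does not replace the interpretive work of deciding what counts as an alteric or intersubjective event.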

Tier 3: Quantitative analysis of genre

Applying a quantitative approach to analysis provided a means of converting single language events and their meanings into frequencies over time (Fig. 2). Through such means it became possible to ‘see’ patterns in dialogues and, as a result, to begin to draw conclusions about a variety of features in the learning environment that influenced these. For example, the proximity of the key teacher consistently played a vital role in the kind of communication that took place for infants, even when they were interacting with others (White & Redder, 2015). An important finding concerning eye movement revealed the importance of a lingering gaze, as opposed to a glance or a watch, in the dialogic exchanges that took place for infants with their teachers and featured as a pedagogical priority in keeping with the assertions of the teachers themselves (White et al. 2015a). With the aid of polyphonic footage, which was returned to at regular intervals, it became possible to recognise nuanced moments as dialogic events and their significance to others. Importantly these occurred between a variety of people, places and things in the ECE setting, rather than merely in dyadic relationships between adults and infants alone (as traditional research for infants might suggest). It was therefore possible to ‘see’ how events were also influenced by time-space and axiologic coordinates, or what Bakhtin describes as “an intersection of axes and fusions” that make up the chronotopes in which language is located.

Fig. 2 Screen shot of analysis frame
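The conversion of coded language events into frequencies over time, as described for Tier 3, might be sketched along the following lines. This is an illustration only and not the procedure used in the study: the event labels, the time stamps, the 60-second interval and the function name `frequencies_over_time` are all assumptions introduced here, and nothing is implied about how Studiocode stores or exports its codes.

```python
from collections import Counter

# Hypothetical coded events: (start time in seconds, assigned label).
# These values are invented for illustration and are not study data.
events = [
    (12.0, "lingering gaze"),
    (47.5, "glance"),
    (65.0, "lingering gaze"),
    (118.2, "watch"),
    (130.9, "lingering gaze"),
]


def frequencies_over_time(events, bin_seconds=60):
    """Count how often each label occurs within consecutive time intervals."""
    counts = {}
    for start, label in events:
        # Work out which interval the event starts in (0 = first interval).
        bin_index = int(start // bin_seconds)
        counts.setdefault(bin_index, Counter())[label] += 1
    return counts


freq = frequencies_over_time(events)
for bin_index in sorted(freq):
    print(f"interval {bin_index}: {dict(freq[bin_index])}")
```

Counting by interval, rather than across the whole recording, is one simple way of keeping the time dimension visible so that patterns across an episode can be ‘seen’; other interval widths or units of analysis would be equally defensible.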

Tier 4: Beyond the adult visual field

Subsequent analysis highlighted the peer relationships that took place (Redder, 2014), often outside of the adult visual field, and which set the scene for a dialogic encounter far beyond what might otherwise be accessible to research. On many occasions our discoveries drew from the direct visual lens of the infant, while on others it was these visual images that sparked important discussions concerning what was valued, responded to and recorded (in assessment documentation, for example; see White 2009a, b) and, perhaps even more importantly, what was not. These and many other insights would not have been possible without the visual surplus of the teachers, researchers and infants in polyphonic chorus. Together they respond to Bakhtin’s call to “give way to the work of the eye as performance and creativity in a particular place at a particular time” (Bakhtin, 1986, p. 38). Further, it is my contention that these approaches represent a re-visioned methodology for video work by methodologically living out the realities of richly seeing in ethical and subjectively honest ways. In educational research, this methodology provides a practical and theoretical means of understanding the complexity of pedagogical events in the lives of learners and teachers alike. Specifically, for infant research, visual surplus through polyphonic means offers potential for understanding our youngest educational partners as an effort of trying. In both literally ‘seeing’ through infant eyes and figuratively confronting the ‘I’ of the intuiting other, the teacher is morally implicated for their pedagogical expressions and associated actions. This is most certainly a pedagogical imperative also.


While this paper began with a critique of video as a means of ‘knowing’ what is learnt and its pedagogical premise, it ends with the Bakhtinian proposition that it is the effort of knowing that maps out a revised methodological orientation for the field. In so doing it offers an approach that upholds the primacy of the optic as an aesthetic source of insight, whilst paying attention to the subjective ‘I’ of those who seek to understand learners and themselves. Bakhtin’s polyphonic imperative offers a great deal to the field of video research in education in this regard, calling researchers to account for the interpretations they make and the associated claims that are made concerning others. Keenly attuned to its ethical and moral purposes, video-based methodology of this nature therefore calls for greater transparency concerning what might be said, and the visual fields through which the narratives might be told. As Bakhtin (1984) reminds us: “Never use for objectifying or finalizing another’s consciousness anything that might be inaccessible to that consciousness, that might lie outside its field of vision” (p. 278). The multiplicity of what can be seen, and of how it might be interpreted by others, that arises from this entreaty becomes a central source of dialogic provocation and wonder in this view.

The insights generated through the different forms of visual surplus that polyphonic footage provides, alluded to throughout this paper, further highlight the importance of seeing video as a way of understanding pedagogy and, in so doing, deepening an appreciation of the complexity within all events as learning. Paying attention to the nuanced detail of what might be seen, and by whom, sets the scene for a sophisticated engagement with learning as a series of relational, dialogic and deeply ethical encounters with people, places and things. This exceeds any one interpretation but holds great potential for enhancing evaluations when granted a legitimate place in the research. Whether or not it sets out learning agendas for others is, perhaps, of less relevance than supporting those who are ‘in the moment’ to understand the possible impact of their own acts on the lives of others. Attending to this impact is an ethical imperative not only for teachers working with infants, but also for researchers who set out to ‘capture’ their lives on film.

As a polyphonic event in a dialogic space that is largely unknown to educational research, the ECE context provides a revised platform for richly seeing. But it is by no means the only educational setting that warrants the work of the eye or polyphonic access to the different perspectives of those who reside in such spaces. There are, as such, implications that arise from this methodology for the study of all pedagogical relations, especially in populations where the language of participants is not necessarily shared (indeed, from a Bakhtinian stance language rarely is) or is difficult to access. This applies to the broadest fields of educational experience, all of which call for intuitive and ethical approaches to understanding and engagement. Thus the methodology posed in this paper is not merely a provocation for video research for teachers and learners in classrooms where learning takes place. Here, video is not merely a source of knowing. It is also a source of speculation and not knowing, as well as a source of intuitive insight and creativity that, in my view, begins to painstakingly operationalise Bakhtin’s notion of visual surplus in contemplation of 21st century pedagogies.


  1. In my earliest study I began by asking participants to ‘notice’ certain aspects of learning (White 2009a, b). However, more recently I have come to appreciate the importance of paying attention to their unsolicited ‘noticing’ in the first instance, and then sharing insights concerning i) what was seen and its significance; ii) what was not seen and its significance; iii) what kinds of learning are evident when visual fields are shared (i.e. the same scene is evident from different camera angles); and iv) what kinds of learning are evident when visual fields are not shared, and their potential for the learner. In dialogic research these dialogues and their significance to members of the educational community are central to the inquiry and act as a form of validity (Sullivan 2013). They therefore implicate the researcher for his or her interpretations also.

  2. For details concerning the language forms see White, Peter, & Redder (2015b) and White, Peter & Redder (in preparation).


  • Bakhtin MM (1986) In: Emerson C, Holquist M, McGee V (eds) Speech genres and other late essays. University of Texas Press, Austin

  • Bakhtin MM (1990) In: Brostrom K (ed) Art and answerability. University of Texas, Austin

  • Bakhtin MM (1993) In: Holquist M, Liapunov V, Liapunov TB (eds) Towards a philosophy of the act. University of Texas Press, Austin

  • Bakhtin MM, Emerson C (1984) Problems of Dostoevsky's poetics, vol 8. University of Minnesota Press, Minneapolis

  • Baudrillard J (1967) Simulacra and simulation. University of Michigan Press, Ann Arbor

  • Brannon M (2013) Standardized spaces: Satellite imagery in the age of big data. Configurations 21(3):271–299

  • Burri RV (2012) Visual rationalities: Towards a sociology of image. Curr Sociol 60:45–60

  • Cheeseman S, Press F, Sumsion J (2015) An encounter with the ‘sayings’ of curriculum: Levinas and the formalisation of infants’ learning. Educ Philos Theory 47(8):822–832

  • Clarke B, Keitel, Shimizu (eds) (2006) Mathematics classrooms in twelve countries: Insiders perspectives. Sense Publishers, Rotterdam

  • Dalli C, White EJ (2016) Group-based early childhood education and care for under-2-year-olds: Quality debates, pedagogy and lived experience. In: Farrell A, Kagan SL, Tisdall EK (eds) The Sage handbook of early childhood research. Sage Publications, London, California, New Delhi & Singapore, pp 36–54

  • Davis W (2011) A general theory of visual culture. Princeton University Press, Princeton

  • DeBord G (1994) In: Nicholson-Smith D (ed) The society of the spectacle. Zone Books, New York

  • Education Review Office (2015) Infants and toddlers: Competent and confident communicators and explorers. Retrieved from

  • Elwick S, Bradley B, Sumsion J (2014) Infants as others: Uncertainties, difficulties, and (im)possibilities in researching infants’ lives. Int J Qual Stud Educ 27(2):198–213

  • Fleer M, Ridgway A (eds) (2014) Visual methodologies and digital tools for researching with young children: Transforming visuality. Springer, Dordrecht

  • Hayashi A, Tobin J (2015) Teaching embodied: Cultural practices in Japanese preschools. The University of Chicago Press, London

  • Hicks D (2000) Self and other in Bakhtin’s early philosophical essays: Prelude to a theory of prose consciousness. Mind Cult Act: An Int J 7(3):227–242

  • Johansson E, White EJ (2011) Educational Research with our Youngest: Voices of Infants and Toddlers. Springer, Dordrecht, pp 65–85

  • Krasnov V (1980) Solzhenitsyn and Dostoevsky: a study in the polyphonic novel. University of Georgia Press, Georgia

  • Lang S, Yolbert A, Schoppe-Sullivan S, Bonomi A (2016) A cocaring framework for infants and toddlers: Applying a model of coparenting to parent-teacher relationships. Early Child Res Q 34:40–52

  • Mihailovic A (1997) Corporeal words: Mikhail Bakhtin’s theology of discourse. Northwestern University Press, Illinois

  • Mills D, Ratcliffe R (2012) After method? Ethnography in the knowledge economy. Qual Res 12(2):147–164

  • Peters M (2010) Pedagogies of the image: Economies of the gaze. Anal metaphysics 9:42–61

  • Pink S, Leder Mackley K (2012) Video and a sense of the invisible: Approaching domestic energy consumption through the sensory home. Sociol Res Online 17(1):3

  • Plato (1952) In: Hackforth R (ed) Plato’s Phaedrus. Cambridge University Press, Cambridge

  • Redder B (2014) Infant and peer relationships in curriculum (Unpublished Master of Education thesis). University of Waikato, Hamilton, Available from

  • Sandywell B, Heywood I (2012) Critical approaches to the study of visual culture: An introduction to the handbook. In: Heywood I, Sandywell B (eds) The handbook of visual culture. Berg, London, pp 1–58

  • Shotter J (2012) Agentive spaces, the ‘background’ and other not well articulated influences in shaping our lives. J Theory Soc Behav 43(2):133–154

  • Sullivan P (2013) Qualitative analysis using a dialogical approach. Sage, London

  • Tannen (2010) Abduction and identity in family interaction: Ventriloquizing as indirectness. J Pragmat 42(2):307–316

  • Tavin K (2015) Art, visual culture and media education: Promises, problems and possibilities,

  • White EJ (2009a) A Bakhtinian homecoming: Operationalizing dialogism in the context of an early childhood centre in Wellington, New Zealand. J Early Child Res 7(3):299–323

  • White EJ (2009b) Assessment in New Zealand early childhood education: A Bakhtinian analysis of toddlers metaphoricity. Doctoral thesis: Monash University, Australia.

  • White EJ (2010) Polyphonic Portrayals: A Dostoevskian dream or a researcher’s reality? Proceedings of the Second International Interdisciplinary Conference on Perspectives and Limits of Dialogism in Mikhail Bakhtin. Stockholm University, Stockholm, Sweden, 3-5 June 2009, pp. 87-96.!/menu/standard/file/publication_2010_bakhtin_conf_sthlm_2009_correct_ISBN.pdf

  • White EJ (2011a) Aesthetics of the beautiful: Ideologic tensions in contemporary assessment. In: White EJ, Peters MA (eds) Bakhtinian pedagogy: Opportunities and challenges for research, policy and practice in education across the globe. Peter Lang, New York

  • White EJ (2011b) “Now you see me, now you do not”: Dialogic loopholes in authorship activity with the very young. Psychol Res 1(6)

  • White EJ (2015) A philosophy of seeing: The work of the eye/’I’ in early years educational practice. Journal of Philosophy of Education. Retrieved from early view

  • White EJ (2016a) The ‘work of the eye’ in infant research: A visual encounter. In: Li L, Quinones G, Ridgway A (eds) Studying Babies and Toddlers: Relationships in Cultural Contexts. Springer, Dordrecht, The Netherlands

  • White EJ (2016a) Introducing dialogic pedagogy: Provocations for the early years. Routledge, London

  • White EJ, Redder B (2015) Proximity with under two-year-olds in early childhood education: A silent pedagogical encounter. Early Child Dev Care. doi:10.1080/03004430.2015.1028386

  • White EJ, Redder B, Peter M (2015a) The Work of the Eye in Infant Pedagogy: A Dialogic Encounter of ‘Seeing’ in an Education and Care Setting. Int J Early Child 47(2)

  • White EJ, Peter M, Redder B (2015b) Infant-teacher dialogues: A pedagogical imperative? Early Child Res Q 30:160–173


Neither this paper nor the research that underpins it would have been possible without the generosity of the early years teachers and infants who feature in this video, and of their families, who granted permission for it to be shared through this very public medium. Deepest gratitude is offered to them all.

Authors’ contributions

Sole authored.

Authors’ information

Information that may be of interest to the reader and which explains my orientation:

Jayne has a long-standing interest in education, with particular emphasis on early years pedagogy, spanning over thirty years. As Associate of the Centre for Global Studies, and a member of the Early Years Research Centre, Jayne's work focuses on the complex processes and practices of meaning-making in contemporary ‘open’ societies, with an emphasis on the early years. She engages with a variety of methods to support her work, including the extensive and original use of ‘polyphonic video’ - and other means of visual ethnography, which emphasise ‘seeing’ as an interpretative event of ‘between-ness’. At the heart of her practice lies a strong emphasis on dialogic pedagogy, and the ways in which teachers can best engage within complex learning relationships - regardless of the age of the learner. To this end, Jayne explores philosophical ideas and their potential contribution to pedagogy. Jayne has written extensively in the field, including her recent sole-authored book “Introducing dialogic pedagogy: provocations for the early years” with Routledge, and edits for a number of journals (including Knowledge Cultures, International Journal of Early Childhood Education and Video Journal of Education and Pedagogy). She is co-Director of the Video Lab at University of Waikato, NZ.

Competing interests

The author declares that she has no competing interests.

Ethics approval and consent to participate

Ethics approval for this study was obtained from the University of Waikato, Faculty of Education. The approved application number for this project is EDU051/11. Ethics approval included consent for the use of footage obtained during the study for the purposes of this journal (VJEP) and open access.

Author information

Authors and Affiliations


Corresponding author

Correspondence to E. Jayne White.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

White, E.J. More than meets the “I”: A polyphonic approach to video as dialogic meaning-making. Video J. of Educ. and Pedagogy 1, 6 (2016).
