Ātea Presence: Enabling Virtual Storytelling, Presence, and Tele-Co-Presence in an Indigenous Setting

Developing, evaluating, and disseminating IT research prototypes for and with indigenous partners is both challenging and rewarding. In conjunction with our domain expert collaborators, Te Rau Aroha Marae (Bluff, Aotearoa/New Zealand), and our academic colleagues at the universities of Waikato and Canterbury, we are implementing a mixed reality telepresence system to connect a diasporic Māori community to their historical, cultural and geographic mātauranga (knowledge). In this article we describe our project, Ātea Presence, which is guided by the principles of partnership, participation, and protection. We describe the design and evaluation of the system developed and the collaborative process we undertook with Te Rau Aroha Marae and our Māori academic colleagues, and report on lessons learned along the way.


Background
Māori are the indigenous people of Aotearoa/New Zealand (NZ), descended from the Polynesians who arrived in Aotearoa/NZ during the 13th and 14th centuries, some 500 years before European arrival. For Māori, the marae (recognisable as a complex of buildings and outdoor spaces) is a focal space for the community. The marae is used as a communal, social, cultural, and spiritual space and place to greet, meet, eat, celebrate and debate, and to host visitors. In more recent times, the urbanisation and shift of Māori to large cities and overseas has increased the size of the diaspora, with more than 80% of Māori living outside of their tribal areas. The impact of colonisation and the increased physical distance for many Māori who live outside their tribal (iwi) boundaries have also created a situation in which a growing number of Māori are seeking ways to interact with and connect to their geographical and cultural associations.
In earlier times, temporary venues or urban marae were constructed to accommodate Māori cultural practices away from the tribal marae (Meredith, 2015). Today, social networking sites, such as Facebook, are commonly used by Māori to maintain their connection to their tribal home (O'Carroll, 2013). Many marae have homepages where events and news are uploaded for the tribal community. In the North Island, the community from Te Pahou Marae has uploaded multiple 360° screenshots to Google Earth, so that one can virtually visit the place from the entrance to the inside of the central meeting house (Forbes, 2017). The Whispering Tales research project uses location-based AR to overlay meaningful Māori sites with imagery and video/audio of narratives that users can access with their mobile devices (Marques, 2019). Similarly, Māui Studios are using AR to enrich Māori graphic novel stories to promote te reo (the language) Māori (Māui Studios, 2019). The Mātaatua VR project recorded a kaumātua (elder) speaker using a large multiple-camera dome recording system, reconstructed the wharenui in 3D using laser scanners, and played the kōrero back in VR so that families can listen up close (Awanuiārangi, 2020).

Figure 1: The Te Rau Aroha marae and its wharerau at Motupōhue
This desire to interact and connect manifests in a number of different ways, e.g. seeking avenues to hear, learn and speak te reo Māori (Māori language), learning, practicing and experiencing tikanga (customs and protocols), and accessing mātauranga (knowledge) and tribal histories.
For the purpose of the research presented here, three central questions emerged from the discussions with members of Te Rau Aroha Marae and our Māori academic colleagues about the nature and characteristics of this relationship and how it could be addressed technologically. As a result, we are developing, evaluating, and disseminating prototypical solutions which address these three questions: (1) How can a sense of place and space be re-established? (2) How can oral, direct communication be supported? (3) How can the feeling of co-presence ('virtually being face-to-face') with others be created when they are geographically dispersed?
Recent technological advances play an enabling role here, namely in 3D reconstruction and visualisation techniques, virtual and mixed reality, and fast and reliable telecommunication technology.
The project builds on a solid partnership model. Strong levels of trust and strong relationships must be built between Māori researchers and practitioners and Pākehā (non-Māori) researchers and technologists. The partners work closely together with the aim of enabling the continuation of traditional Māori storytelling, presence in the appropriate environmental context, and tele-co-presence for Māori over national and international network connections. Such a partnership model requires the iterative, participatory development of tangible artefacts (e.g. concrete software prototypes in contrast to just written communication), frequent communication about world views and protocols, and a continuous development of capabilities for Māori researchers and practitioners. The latter also means that, in the context of "by Māori for Māori," the Pākehā within the partnership aim for their own redundancy during the project, so that eventually the Māori partners are in a position to continue on their own.
To initiate and maintain a constructive and trustful relationship between all Ātea project partners we follow the guiding model of partnership, participation, and protection. The foundations for this model have been laid by the Treaty of Waitangi Act (1975), addressing how to work together in decision making, enabling inclusiveness for Māori (and Pākehā) in all levels and aspects of society, and actively protecting Māori rights as citizens, including language, knowledge, values, and other taonga. The main challenge, though, is to implement such a model. In collaboration with and guided by our Māori academic collaborators (one of whom is closely associated with our iwi partners), we went through a long process to establish the relationship with our iwi partners. This included several marae visits, undertaking personal development, and giving our Māori partners authority to place bounds on our technological activities. The process was not only necessary to fulfil the above goals, but also very gratifying.
We believe that aspects of our work can be of value for readers who try to find solutions to work together as groups with strong cultural diversity. The value of our project for them could either come from the partnership process (whakawhanaungatanga, process of establishing relationships, see above) itself or from the actual technologies and procedures we have developed as part of the Ātea Presence project, for example, when oral and embodied communication is key and the desire for (perceived) non-mediation of technology is high.
In the remainder of this article, we describe the three main project components: virtual wharenui, voxelvideo storytelling, and 3D tele-co-presence in more detail. We will conclude the article with lessons learned and future directions.

Virtual Wharenui
Most marae have a communal gathering place, the wharenui, wharepuni, or here wharerau (Te Rau Aroha), which takes a central and prominent position on the marae. To emphasise tūrangawaewae or a "place to stand" / "sense of belonging" for dispersed Māori, the marae takes centre stage with the wharerau providing the focal point. The artistry evident at Te Rau Aroha, created by renowned artist and master carver Cliff Whiting, has resulted in one of the most contemporary and artistically stunning marae in Aotearoa. The wharerau is uniquely octagonal, modelled on the temporary dwellings that were used for extended food gathering excursions. Each of the eight walls focuses on a particular cultural theme and aspect, highlighted by intricate carvings and weavings. In addition, the wharerau is home to twelve 9-foot high wooden female tīpuna (ancestors) representing the primary genealogical ancestors for descendants connected to this marae (see figure 2). To create a sense of "being there", a virtual representation of the actual marae should depict the characteristic elements of the building in sufficient detail and with appropriate cultural respect. While the physical wharenui (wharerau) is very rich in fine, artistic detail, an exploration with virtual reality techniques requires simplification in order to enable an interactive real-time experience. There is an apparent conflict to be resolved: Which parts of the whare (building) should be represented in what detail for a meaningful and suitable experience? In addition, the creation process should be affordable and time-efficient so that it can easily be handed over to Māori sovereignty. In partnership with the marae we decided on a hybrid model of fine detail reconstruction and overall sense of presence.
According to Schubert et al. (2001), the sense of presence can be decomposed into spatial presence, involvement, and realism. Spatial presence focuses on the embodied relationship between the user and the (to scale) virtual environment. Involvement mainly stems from the way people interact with and self-navigate within the virtual environment. The weakest factor for presence is the perceived realism, which allows us to neglect this aspect for some parts of the building. Hence, we produced models with varying degrees of fidelity and shared them with the domain experts to decide on the required levels of detail.
For the 3D reconstruction of the inside of the wharerau building, we use a hybrid modelling and photogrammetric approach. In a first iterative step, the (simplified) architectural shape is modelled and the eight walls are textured with photographs. This already allows for an immersive exploration of the wharerau and provides a sense of being there. However, all structures are flat (2D), including the artistically shaped tīpuna figures. In a second iterative step, hundreds of photos are taken of one of the eight walls, including the two accompanying tīpuna figures to its left and right. The programs COLMAP and MeshLab are used to reconstruct this part of the building in 3D, and the resulting 3D mesh component is combined with the rest of the 3D model (see Figure 3).

Figure 3: Hybrid 2D/3D model of wharerau interior
Finally, in a third iterative step, the entire interior architecture of the wharerau is photogrammetrically reconstructed from thousands of photos, including 3D structures (e.g., four additional tīpuna figures around a central column). The resulting model had to be simplified (by reducing the number of triangles to be rendered) so that a real-time, immersive experience is achieved. The photogrammetric reconstruction and model reduction were computed using RealityCapture, a proprietary pay-per-pixels software, and the immersive experience was implemented using Unreal Engine (v4.26.2). The virtual wharerau can be experienced on any computer which has a dedicated GPU with at least 4GB VRAM, a modern CPU, and at least 8GB RAM. While the software supports a wide range of VR peripherals, we are using an Oculus Rift CV1 or S for our system (figure 4 right). The Oculus head-mounted display system allows the user to walk physically, and simultaneously virtually, within a three-meter circle. For navigation beyond this circle, as is necessary in our wharerau of ~14 meters in diameter, we implemented a simple teleportation mechanism (see figure 4 left). Pointing at the virtual floor and pressing the controller button displays a "cardboard figure" to indicate where one would be teleported to, while twisting the controller changes the figure's orientation, indicating the user's target orientation. With this interface, any viewing angle can be achieved within the environment, including close-up views of the artwork.
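The teleportation mechanism described above can be illustrated with a minimal Python sketch. It is not the project's Unreal Engine implementation. The coordinate conventions (y axis up, floor at y = 0, building centred at the origin), the function name `teleport_target`, and the 7 m radius (half of the ~14 m diameter) are assumptions made for illustration only.

```python
import math


def teleport_target(origin, direction, twist_rad, max_radius=7.0):
    """Return (x, z, yaw) for a teleport, or None if no valid target exists.

    origin, direction: controller position and pointing direction as (x, y, z)
    tuples, with y pointing up and the floor at y = 0 (assumed conventions).
    twist_rad: controller twist in radians, used directly as the target yaw.
    max_radius: usable floor radius in metres (hypothetical value).
    """
    ox, oy, oz = origin
    dx, dy, dz = direction
    if dy >= 0 or oy <= 0:
        return None  # pointing level or upwards: the ray never hits the floor
    t = -oy / dy  # ray parameter at which origin + t * direction reaches y = 0
    x, z = ox + t * dx, oz + t * dz
    if math.hypot(x, z) > max_radius:
        return None  # intersection lies outside the circular floor area
    yaw = twist_rad % (2 * math.pi)  # normalise the orientation to [0, 2*pi)
    return (x, z, yaw)
```

In such a sketch the returned position would place the "cardboard figure" and the yaw would set its facing direction. A `None` result corresponds to showing no figure at all.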
We informally tested the experience for feasibility, usability, and felt presence in the virtual wharerau. Different user groups were exposed to the system, Māori and Pākehā: experienced and first-time users of VR, people who knew the real wharerau and had been there before, and people who had never been to this marae or to any marae at all. The Pākehā partners have been invited to the marae many times and are now considered as "part of the whānau (family)". This allowed the system to be presented at a very important event, the Waitangi Day celebration, at the marae with over 500 visitors attending.
It was surprising and satisfying to see that almost all people felt a self-reported sense of being there at the wharerau, and that even novice users had little to no difficulty in exploring the virtual building.
A few people reported light dizziness during or after use, but many felt disoriented after (abruptly) doffing the head-mounted display. Also, the teleportation user interface normally needed instruction and a couple of minutes to get used to. In addition to reported excitement about the system and its (potential and current) capabilities we also observed behaviour very similar to real-world behaviour in a wharerau.
Overall, the desired experience and feasibility could be validated by our informal lab studies, as well as at subsequent presentations and gatherings outside the lab. After achieving the enabling sense of presence in the virtual wharerau, we added virtual storytelling to the experience. In order to maintain the sense of presence (cf. Slater et al., 2003), we opted for the integration of three-dimensional, volumetric videos in the form of voxels in the building. These voxelvideos (cf. Regenbrecht et al., 2021) allow for a coarse "holographic" in-space visualisation of a storyteller, including a spatial audio rendering, with the aim of giving the impression of a person being co-present in the same space with the user.

Voxelvideo Storytelling
The passing on of knowledge (mātauranga) and customs and protocols (tikanga) is traditionally done by way of oral storytelling and is often supported by something physical including whakairo (carving) and tukutuku (lattice-work decoration) on the walls which also have their own kōrero (narrative). While other forms of knowledge sharing and preservation have been used since early contact, such as written, audio, and video accounts, the preferred and most true-to-values approach is relating stories orally face-to-face. Also, the storyteller's standing in the community, expressed as mana, is of high importance and will determine the depth and breadth of the story to be shared. If true face-to-face storytelling isn't possible or feasible, like in our project scenario, alternative forms have to be developed and provided.
Writing down and reading (aloud or to oneself) stories has been common practice since the early Pākehā settlers arrived in New Zealand. While this form has the potential to be an accurate representation of the story to be told, it also requires a deep understanding of Māori culture and how this can be translated into (the Western form of) writing. A prominent and highly important example is Te Tiriti o Waitangi, a document settling the relationship between the British Crown, the settlers, and the indigenous population of Māori. It exists in te reo Māori and in a translated English version, and specific differences in interpretation are debated to the present day. Hence, for our purposes, written stories are a potential but probably sub-optimal option, and we opt for a spoken account of Māori storytelling over written stories.
Sound recordings of stories are one way to provide rich potential for expression and adhere to the original (oral) language; they can be easily recorded and played back and have the ability to display the character and mana of the speaker or storyteller. Their main shortcoming is the lack of non-verbal expression, such as gestures which normally accompany the story. They also lack the integration of props, artefacts, and the natural and built environment into the story. For instance, when referring to carved or woven elements of a wharerau wall, one cannot simply point at those artefacts, but has to verbally describe them.
Videos are even more media-rich and have the potential to not only present the words of the story in combination with seeing the storyteller, but can also depict parts of the environment, like the sky, details of buildings, such as the interior of a wharerau, and other objects and people. The main shortcoming here is the lack of interactivity, in particular the story receiver's ability to explore the scene, as in real life, and to interact with the storyteller in real-time and in a meaningful way.
The use of recently researched and selectively available volumetric video technology addresses some of the above shortcomings. These videos are three-dimensional representations of a scene and storyteller, usually recorded with an array of specialised cameras and experienced with head-worn displays. The viewer can physically and virtually walk around within the scene and can take any viewing angle they want. Usually, the recording of volumetric videos is very time and labour intensive, and requires specialised recording studios (cf. Awanuiārangi, 2020). For instance, the Volucap system (Schreer et al., 2019), one of the most advanced volumetric video capturing systems producing very high-quality results, requires a studio environment with 16 carefully calibrated stereo camera pairs for recording a 3 m × 3 m scene, and each minute of recording requires hundreds of hours of post-production time by specially trained technicians.
We are using an alternative approach: voxelvideos. Here, the volumetric video is recorded with voxels (volumetric pixels) of rather coarse resolution, using only three off-the-shelf RGB-D (colour and depth) cameras. The system can be used in a studio setup with controlled lighting etc., or in the field using tripods, e.g. inside the wharerau on a marae. The achievable quality is significantly inferior to approaches like the ones presented by Schreer et al. (2019) or Awanuiārangi et al. (2020), but there is no significant post-processing time required, the system can be operated by trained laypeople, and it is a very affordable technique, making it ideal for our partnership model enabling Māori to record stories for Māori.
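To illustrate what recording "with voxels of rather coarse resolution" can mean in practice, the following Python sketch quantises coloured 3D points into a sparse voxel grid. It is an illustrative assumption, not the Ātea system's actual pipeline. The function name `voxelise`, the input layout, and the colour averaging are hypothetical, and camera calibration and the merging of the three RGB-D views are assumed to have happened beforehand.

```python
import math


def voxelise(points, voxel_size=0.008):
    """Quantise coloured 3D points into a sparse voxel grid.

    points: iterable of (x, y, z, r, g, b) with positions in metres, already
    merged from all cameras into one coordinate frame (assumed input layout).
    voxel_size: edge length of a voxel in metres (0.008 = 8 mm).
    Returns {(i, j, k): (r, g, b)} with one averaged colour per occupied voxel.
    """
    sums = {}
    for x, y, z, r, g, b in points:
        # Integer grid index of the voxel that contains this point.
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        acc = sums.setdefault(key, [0, 0, 0, 0])
        acc[0] += r
        acc[1] += g
        acc[2] += b
        acc[3] += 1  # number of points that fell into this voxel
    # Average the colour of all points that share a voxel.
    return {k: (a[0] // a[3], a[1] // a[3], a[2] // a[3]) for k, a in sums.items()}
```

Because every frame reduces to a dictionary of occupied cells, no mesh reconstruction or manual clean-up is involved in this sketch, which mirrors the trade-off described above: lower visual quality in exchange for negligible post-processing.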
We have recorded about a dozen kōrero (stories) with our voxelvideo system, including "Te Waka o Tama Rereti," a kōrero explaining how the stars ended up in the sky, along with each of the eight stories behind the walls in the wharerau. Viewers of these stories experience them in a life-like and interactive way. For example, we have received reports of people immersed in the virtual wharerau experiencing not only a sense of presence (being there in the environment), but also a sense of co-presence (being together with the storyteller). This means that, even though the virtual storyteller is not really 'there', the viewers still treat them as though they are. That is, they avoid 'bumping into' or 'walking through' the virtual storyteller and will also not leave the virtual room before the storyteller has finished speaking. In addition, we noticed that some people were reluctant to interrupt the storyteller, took off their shoes before entering the virtual wharerau, or spoke confirmatory words during storytelling (e.g., "Kia ora").
The main shortcomings of our approach are the relatively low resolution of the voxels (4-8mm) and the lack of true interactivity with the storyteller, since it is a recorded story. The low resolution did not prevent viewers from developing a sense of co-presence, though; the quality was sufficient for this feeling to develop. However, a number of respondents were uncomfortable with aspects of the "ghost-like" voxelised representations which enabled the participants to see through the storyteller's head (a sacred body part). The low resolution also made it difficult to judge facial expressions of the storyteller, so a higher quality is still desirable. The lack of interactivity with the storyteller can be addressed either by algorithmically animating the storyteller's postural and gestural behaviour, as video-game behaviour engines would, or by providing a real-time system in which the storyteller is captured at the same time as the receiver is experiencing the story. The latter option is supported by our voxel tele-co-presence approach and system.

3D Tele-Co-Presence
Our approach to tele-co-presence with our voxel-based system stems from two affordances: (1) to make the storytelling experience interactive in real-time for both the storyteller and the story receiver, and more importantly, (2) to connect dispersed Māori to their marae, if they are unable or hesitant to do so physically.
We extend the voxelvideo recording and playback aspect of our system towards real-time transmission of voxel streams from one geographic location to another. With such an approach to interactive storytelling it does not matter whether the storytelling participants are in the same region or on opposite sides of the Earth.
Two or more participants wear immersive, tracked head-mounted displays and are simultaneously captured by a minimum of three RGB-D cameras at each location. Each party's computer processes the camera data to form a voxel stream which is then transmitted to the other party. All parties also have three-dimensional models of the virtual place to meet in, here our wharerau model, including its artistic interior. The real-time voxelvideos are placed within the virtual model together with optional pre-recorded voxelvideo stories and other 3D artefacts. All systems are synchronised so that each participant has the impression of being in a shared environment with the others (see figures 6 and 7).
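How a single frame of such a voxel stream might be packed for transmission can be sketched as follows. The wire format shown here is purely hypothetical: the source does not describe the actual encoding, and the header fields, the 16-bit grid indices, and the function names `pack_frame` and `unpack_frame` are assumptions made for illustration. Compression and network transport are deliberately left out.

```python
import struct

# Hypothetical layout (little-endian, no padding):
# header = frame number (uint32), voxel size in mm (uint16), voxel count (uint32)
HEADER = struct.Struct("<IHI")
# voxel  = three signed 16-bit grid indices followed by an 8-bit RGB colour
VOXEL = struct.Struct("<hhhBBB")


def pack_frame(frame_no, voxel_size_mm, voxels):
    """Serialise {(i, j, k): (r, g, b)} into a byte string for one frame."""
    parts = [HEADER.pack(frame_no, voxel_size_mm, len(voxels))]
    for (i, j, k), (r, g, b) in sorted(voxels.items()):
        parts.append(VOXEL.pack(i, j, k, r, g, b))
    return b"".join(parts)


def unpack_frame(payload):
    """Inverse of pack_frame: returns (frame_no, voxel_size_mm, voxels)."""
    frame_no, voxel_size_mm, count = HEADER.unpack_from(payload, 0)
    voxels = {}
    offset = HEADER.size
    for _ in range(count):
        i, j, k, r, g, b = VOXEL.unpack_from(payload, offset)
        voxels[(i, j, k)] = (r, g, b)
        offset += VOXEL.size
    return frame_no, voxel_size_mm, voxels
```

With this assumed layout each occupied voxel costs nine bytes, so the size of a frame grows linearly with the number of occupied voxels. The frame number gives the receiving side a simple handle for ordering frames when placing them in the shared virtual model.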

Figure 6: Principle of two people meeting in virtual wharerau with virtual storyteller
The desired outcome is the feeling of tele-co-presence for all participants. The "tele-presence" part means the feeling of "being there" in the virtual marae, i.e., of being at Te Rau Aroha (there in Bluff). Additionally, the "co-presence" part means the feeling of being together (Zhao, 2003), side-by-side with the other participants and with the pre-recorded storyteller as well. While tele-presence alone means that one is present in a remote place, tele-co-presence is the feeling of presence in that space with another person or people. Our system also allows for variations on tele-presence: instead of meeting in a virtual space (in our case a previously created 3D reconstruction), participants can also meet in each other's physical environment, as it can be voxelised and transmitted in addition to the user's body.
To date, we have informally tested and evaluated our tele-co-presence approach in studies between two adjacent lab rooms, at public expositions, at our partner marae, between two cities in Aotearoa, and, as a proof of concept, between Dunedin (Aotearoa/NZ) and London (UK). In the latter test, one has to cope with a latency of about half a second, which arises from the long transmission distance (ultimately bounded by the speed of light). The main restrictions of our immersive tele-co-presence approach lie in the use of head-mounted displays, which obscure the participants' faces, and in the limitations on fine gestural expressions imposed by the coarse voxel resolution. However, we could observe that larger gestures and general posture expressions (like pointing, body swaying, greetings, hugging) work very well.

Lessons Learned and Future Work
We have had the opportunity and privilege to work with and to present to experts in tikanga and te ao Māori (worldview). For instance, at a recent major event in Bluff, the annual Ngāi Tahu Waitangi Day Celebrations which this year coincided with a Treaty of Waitangi discussion on certain indigenous rights, we were able to demonstrate many aspects of our project to representatives of the wider Ngāi Tahu iwi, Māori communities, kaumātua (Māori elders), the general public, etc. We are learning a lot about different world views; how to work together in partnership with indigenous communities; and how to design, integrate, and use the appropriate technology in suitable forms, to achieve the mutual learning and connectivity solutions we are aiming for. Every end of an iterative step is a new beginning for all. While the system in its current form is effective and was received very positively, we consider a number of future directions.
While the voxel resolutions used (typically 8mm or 4mm, depending on the size of the environment or the person's interaction space) are sufficient to stimulate a sense of presence and co-presence, finer detail would be desirable. We are working on solutions with 4, 2, and 1mm resolutions, which lead to a significant, non-linear increase in bandwidth and processing requirements. More importantly for tele-co-presence situations, we wish to overcome the use of face-blocking head-mounted displays. For example, optical see-through displays could be used and would allow for more of the faces of the participating people to be seen.
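The non-linear growth in data volume at finer resolutions follows from simple geometry, which the short Python sketch below makes concrete. It rests on an idealised assumption rather than on measurements from the system: if only surfaces are captured, the number of occupied voxels grows roughly with the square of the refinement factor, and if volumes are filled it grows with the cube. The function name `voxel_growth` is hypothetical.

```python
def voxel_growth(coarse_mm, fine_mm):
    """Idealised growth in voxel count when refining the voxel edge length.

    Returns multiplicative factors for a captured surface (quadratic growth)
    and for a completely filled volume (cubic growth).
    """
    ratio = coarse_mm / fine_mm  # how many fine voxels fit along one coarse edge
    return {"surface": ratio ** 2, "volume": ratio ** 3}


if __name__ == "__main__":
    for fine in (4, 2, 1):
        print(f"8 mm -> {fine} mm:", voxel_growth(8, fine))
```

Under these assumptions, going from 8mm to 1mm multiplies the voxel count by a factor between 64 (surface only) and 512 (filled volume), before any compression is considered.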
More important are the cultural aspects of the experience. Visitors to a marae are normally required to partake in a welcoming protocol, known as pōwhiri, before they may freely move around the marae, in particular within the wharerau. Knowledge of the pōwhiri process is helpful and in some cases essential. Using our virtual wharerau environment, people are able to expose themselves to the formal pōwhiri process, learn the protocols, and rehearse the ceremony in a safe and non-embarrassing way. In addition to using our system to learn the real-world pōwhiri protocol, the question arises whether a virtual pōwhiri is required before one is free to enter the virtual wharerau. As a project team, we are investigating the tikanga, and the possible differences between the physical and virtual pōwhiri requirements.
Multi-sensory experiences are believed to increase the sense of presence, to lead to higher effectiveness, and to be more fun (see e.g. Feng et al., 2016). We are investigating whether the addition of vibro-tactile feedback, wind, smell, etc. can be applied to our storytelling and tele-co-presence context, which forms and combinations are culturally appropriate, and whether they deliver benefits from a Māori storytelling perspective beyond an initial "wow" effect of new technology.
The storytellers' "ghost-like" voxelised representations are both beneficial and challenging. While this representation makes it apparent to the users that virtual characters are acting here (an ethical consideration), it also sparks spiritual questions on body ownership and representation, the relationship between dead and living people, aspects of spiritual harm, and the provision of safe spaces. These and other similar concerns were encountered while presenting our system to kaumātua, some of whom are language and tikanga experts. For instance, a balance has to be found between true and truthful representation of a person (storyteller, telepresence participant) and (artificially achieved) completeness of such a representation. For example, should the presented 3D body be "hollow" or filled with arbitrary voxels? Currently we are opting for a "what is captured is what you get" approach without "auto-completion" or AI methods for generative interpolation.
Similarly, as requested by some of our Māori partners, should we animate recorded storytellers to give users the ability to interact with the simulated storyteller? And, if so, which behaviours should be applied and presented? What is a truthful representation of that person? What if the person is no longer alive? We started this process by simply animating an idling storyteller (not paying attention to anything in particular, not speaking) in such a way that when a user comes into closer proximity, the storyteller's virtual attention is directed towards the user by looking at and gaze-following them. This very simple artificial behaviour already conveys a certain "scariness," akin to an uncanny valley effect (cf. Mori et al., 2012). Much care must be taken with this approach. We plan to experiment with other techniques of applying behaviour to virtual people in the environment, e.g., responses to an outstretched hand, pointing and deictic references, and text-to-speech-to-animation approaches. Whether we will extend this to fully scripted and interactive behaviour will be determined by continuous evaluation and consultation with our Māori expert partners.

Conclusion
We are still far from simulating (and extending) a comprehensive set of real-world appearances and behaviours, which is the ultimate, jointly developed goal of virtually reviving marae life through extended design, deployment, and management. Nevertheless, we are making steady and confirmed progress towards it. We are approaching our goal in iterative steps, for example, through proper documentation, frequent demonstrations, educating students and partners, and continuous communication within the partnership and with the involved communities. We, collectively, find this challenge very rewarding.
Our project approach is based on the principles of partnership, participation, and protection. As with probably any culturally sensitive project, forming partnerships between a diverse range of stakeholders and project members is challenging and requires tolerance, a willingness to listen and learn, and the ability to constructively criticise. True participation demands effort and energy from all partners and the development of mutually agreeable methods of taking part in the analysis, design, implementation, and evaluation of artefacts, here our Ātea system. Protection is required for all parties involved and includes physical and mental well-being, respect for culture and privacy, and intellectual and other property rights.
We would like to stress that a fourth "P" is advisable in such project settings: patience. While many contemporary research projects in the computing sciences are characterised by high-paced implementation, evaluation, and reporting, such an approach would not work here, based on our experience so far. Truly sustainable outcomes and relationships demand the acceptance of a communication model and research culture around a holistic learning approach, which is termed ako in te reo Māori. Ako means to learn, to study, to instruct, to teach, and to advise together in a mutually respectful and long-term beneficial way.