Augmented reality and mixed reality for healthcare education beyond surgery: an integrative review

Jaris Gerup; Camilla B. Soerensen; Peter Dieckmann

doi:10.5116/ijme.5e01.eb1a

Jaris Gerup¹, Camilla B. Soerensen² and Peter Dieckmann³

¹School of Medical Sciences, University of Copenhagen, Denmark

²Department of Pediatrics, Herlev and Gentofte Hospital, Denmark

³Copenhagen Academy of Medical Education and Simulation (CAMES), Center for Human Resources, Herlev and Gentofte Hospital, Denmark

Submitted: 27/04/2019; Accepted: 24/12/2019; Published: 18/01/2020

Int J Med Educ. 2020; 11:1-18; doi: 10.5116/ijme.5e01.eb1a

© 2020 Jaris Gerup et al. This is an Open Access article distributed under the terms of the Creative Commons Attribution License which permits unrestricted use of work provided the original work is properly cited. http://creativecommons.org/licenses/by/3.0

Abstract

Objectives: This study aimed to review and synthesize the current research and state of augmented reality (AR), mixed reality (MR) and the applications developed for healthcare education beyond surgery.

Methods: An integrative review was conducted on all relevant material, drawing on different data sources, including the databases PubMed, PsycINFO, and ERIC, from January 2013 to September 2018. Inductive content analysis and qualitative synthesis were performed. Additionally, the quality of the studies was assessed with different structured tools.

Results: Twenty-six studies were included. Studies based on both AR and MR involved established applications in 27% of all cases (n=6), the rest being prototypes. The most frequently studied subjects were related to anatomy and anesthesia (n=13). All studies showed several healthcare educational benefits of AR and MR, significantly outperforming traditional learning approaches in 11 studies examining various outcomes. Studies had a low-to-medium quality overall with a MERSQI mean of 12.26 (SD=2.63), while the single qualitative study had high quality.

Conclusions: This review suggests that learning approaches based on AR and MR are progressing for various medical subjects, while the research base is moving away from feasibility studies on prototypes. Yet the lacking validity of study conclusions, the heterogeneity of research designs and widely varied reporting challenge the transferability of the findings in the included studies. Future studies should examine suitable research designs and instructional objectives achievable by AR and MR-based applications to strengthen the evidence base, making it relevant for medical educators and institutions to apply the technologies.

Introduction

The integration of digital strategies has brought healthcare education to a paradigm shift, now reflected in many educational curricula.¹ Modern teaching curricula aim to educate trainees efficiently in safe environments to establish transferability into the clinical context. Augmented reality (AR) and mixed reality (MR) have long been expected to be disruptive technologies, with potential uses in medical education, training, surgical planning and the guidance of complex procedures.² While virtual reality (VR) has mainly led the way for the implementation of the display technologies, it is criticized for several limitations.³,⁴ The term display technologies will hereafter be used to refer to AR and MR, although in principle it also covers VR. The latter, however, is beyond the scope of this review.

AR describes display-based systems that combine real and virtual imagery, are interactive in real time and register the real-world environment to be augmented by virtual imagery.⁵ The visual display technology augments the physical environment in two principal forms: see-through (transparent) head-mounted displays and non-immersive monitor-based video (window on the world).⁶ AR systems are based on the combination of the physical and the virtual environment. In VR systems, by contrast, the participant is totally immersed in a completely virtual environment.

MR is defined as the merging of real and virtual worlds and can be seen as a larger class of technologies covering the display environments of AR and augmented virtuality (AV).⁷ Where virtual information augments the real view in AR, real-world information augments the virtual scene in AV. External inputs providing real-world context are also seen in VR but were classified as MR in this review. The term MR was included to embrace new technology labeled as MR, which attempts to draw a clear distinction between AR and MR even though there is none.⁸

The abilities to provide situated and authentic experience connected with the real environment and to enhance interaction between the physical and virtual content, while preserving a feeling of presence, explain the growing expectations that AR and MR may be suitable for healthcare education in various contexts.⁹

Healthcare education is the process of teaching, learning and training, with an ongoing integration of knowledge, experience, skills and responsibility, that qualifies an individual to practice medicine.¹⁰ Within medical education, several authors have called for eliminating outdated, inefficient and passive learning approaches and embracing these newer methodologies of learning.¹¹ Surgeons have historically been quick to adopt new technology and develop new treatment and learning methodologies, while physicians have been slower.¹² Today most studies on display technologies stem from surgery. In an integrative review on AR in healthcare education from 2014, surgical studies accounted for 64% (n=16) of the studies included.¹³ A recent systematic review on AR for the surgeon highlights the current lack of systematic reviews for physicians and ultimately educators within the field of medicine.¹⁴ Many internists and other medical specialists no longer diagnose and treat illnesses using only their knowledge of pathophysiology and pharmacology.¹⁵ Today, many physicians have taken up procedures and surgical treatment initiatives by operation or manipulation, defined as the use of hands to produce the desired movement or therapeutic effect in part of the body.¹⁶ Nevertheless, medicine consists essentially of non-surgical treatment, procedures and other approaches to diagnostics and prevention of disease that need to be taught, learned and trained with an ongoing evaluation of adaptations. AR and MR may help medical educators achieve such instructional objectives for medical education, as they are already being used for surgical training.

According to the review by Zhu and colleagues, publications in the field of AR increased significantly in 2008.¹³ Now, ten years after that surge in publications, a new review is warranted. To the best of our knowledge, current reviews on AR and MR have not specifically studied applications for medical subjects in healthcare education. Most reviews predominantly include surgical studies, and only a few have focused on AR in either otolaryngology or medical training.¹,³,⁴,⁹,¹³,¹⁷ Currently, no adequate reviews are available that uncover the educational profile of both AR and MR-based applications across different medical specialties, subjects and target groups.

The aim of this integrative review was to investigate the current research and state of AR and MR-based applications for healthcare education beyond surgery, providing an overview of the findings, strengths and weaknesses of the reported studies.

Methods

We chose to conduct an integrative review, given that previous reviews showed only a few studies relevant for the current scope.³,⁴,¹³,¹⁷ This is thought to be the broadest type of review, as it allows the inclusion of various research designs and information sources.¹⁸ The method also integrates a quality assessment of the included studies, which may qualify the integrative review for recommending practice and answering complex search questions.¹⁹,²⁰ The digital databases PubMed, PsycINFO and ERIC were searched. The journal Medical Teacher was hand-searched. TED Talks and podcasts on the iTunes Podcast app were included, acknowledging the increasing importance of “new media”.²¹,²² Studies published between January 2013 and September 2018 were included. Relevant word groups, combinations and open-ended terms used for the search were: “Augmented reality OR mixed reality” AND “medicine OR medical OR healthcare” AND “educat* OR simulat* OR train* OR learn*”. We did not add a ‘NOT virtual reality OR surgery’ filter to our search string, to avoid missing relevant studies that examined non-surgical elements despite being termed surgical studies.
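The Boolean structure of the search terms quoted above can be sketched in a few lines of Python. This is only an illustration: the helper name `build_query` and the quoting of multi-word phrases are assumptions, and each database (PubMed, PsycINFO, ERIC) has its own query syntax that is not reproduced here.

```python
# Term groups as quoted in the search description above.
TERM_GROUPS = [
    ["Augmented reality", "mixed reality"],
    ["medicine", "medical", "healthcare"],
    ["educat*", "simulat*", "train*", "learn*"],
]


def build_query(groups):
    """Join terms with OR inside a group and join the groups with AND."""
    clauses = []
    for terms in groups:
        # Quote multi-word phrases so they are treated as one unit
        # (assumed syntax; check the rules of each database).
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        clauses.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(clauses)


query = build_query(TERM_GROUPS)
print(query)
```

Note that no NOT clause is generated, which mirrors the decision described above to leave out a filter on virtual reality or surgery.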

Eligibility criteria

The selection process followed three overall criteria regarding research, focus on technology and content. Under the research criterion, studies were included if they described 1) a goal or research question, 2) an appropriate study design, 3) data collection and analysis methods and 4) a discussion of results. Research articles were excluded if they 1) described neither a goal nor a research question, 2) were review papers or 3) focused on system descriptions without evaluation or other data. Table 1 provides the inclusion and exclusion criteria for the study.

Study selection

All abstracts were read by JG, who assessed whether they met the inclusion criteria. In case of doubt, JG discussed the inclusion of studies with the other authors. All duplicates were removed.

Data extraction and synthesis

Study characteristics and information of all articles were extracted and described by JG. Characteristics were authors, study aim, subject of healthcare education, design, participants, outcome measures, results, application/technologies, training time and display system. Content analysis was used to describe the study designs and to inductively identify the strengths and weaknesses of AR and MR as described by the studies included.

Table 1. Inclusion and exclusion criteria

Quality assessment

The methodological quality of quantitative and mixed methods studies was evaluated with the Medical Education Research Study Quality Instrument (MERSQI).²³ This 10-item instrument has been thoroughly assessed and evaluated for its correlation with other assessment tools for research quality.²⁴ MERSQI covers six domains: study design, sampling, type of data, validity of the evaluation instrument, data analysis and outcome. Each domain assigns 0-3 points, giving the study a final score between 0 and 18, with a larger number indicating better study quality. Scores are presented as mean, standard deviation (SD) and range in parentheses. Each study was scored at the highest possible level. If a study reported more than one outcome, the rating for the highest outcome score was recorded, without differentiating between primary and secondary outcomes.
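The MERSQI aggregation described above can be sketched as follows. The domain names follow the text; the function names, the three example studies and their scores are hypothetical and are not taken from the reviewed studies. The sample standard deviation is used here as an assumption, since the text does not say which SD formula was applied.

```python
from statistics import mean, stdev

# Six MERSQI domains as listed in the text; each contributes 0-3 points.
DOMAINS = ("study_design", "sampling", "type_of_data",
           "validity", "data_analysis", "outcome")
MAX_PER_DOMAIN = 3.0


def mersqi_total(scores):
    """Sum six domain scores (0-3 each) into a 0-18 total."""
    missing = set(DOMAINS) - set(scores)
    if missing:
        raise ValueError(f"missing domains: {sorted(missing)}")
    for name in DOMAINS:
        if not 0 <= scores[name] <= MAX_PER_DOMAIN:
            raise ValueError(f"{name} must lie between 0 and 3")
    return sum(scores[name] for name in DOMAINS)


def summarize(totals):
    """Format totals as 'mean, SD (min-max)', the style used in this review."""
    return (f"mean {mean(totals):.2f}, SD={stdev(totals):.2f} "
            f"({min(totals):g}-{max(totals):g})")


# Hypothetical scores for illustration only.
hypothetical_studies = [
    {"study_design": 1.0, "sampling": 1.5, "type_of_data": 3.0,
     "validity": 0.0, "data_analysis": 3.0, "outcome": 1.5},
    {"study_design": 3.0, "sampling": 2.0, "type_of_data": 3.0,
     "validity": 2.0, "data_analysis": 3.0, "outcome": 2.0},
    {"study_design": 2.0, "sampling": 1.0, "type_of_data": 1.0,
     "validity": 1.0, "data_analysis": 2.0, "outcome": 1.0},
]
totals = [mersqi_total(s) for s in hypothetical_studies]
print(summarize(totals))
```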

The quality assessment of all studies was done by JG. In addition, to assess the quality of JG's evaluation, approximately 20% of the studies were randomly selected for assessment by the co-authors and independently evaluated by at least two authors. We computed the intraclass correlation coefficient (ICC) to estimate the inter-rater reliability (IRR) between all authors.
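For readers unfamiliar with the ICC, the sketch below computes one common variant, ICC(2,1) (two-way random effects, absolute agreement, single rater), from a studies-by-raters table. The choice of this variant is an assumption: the text does not state which ICC form or software was used. The function name `icc_2_1` and the example ratings are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one row per study, one column per rater.
    """
    n = len(ratings)          # number of rated studies
    k = len(ratings[0])       # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 studies and raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                 # between-study mean square
    ms_cols = ss_cols / (k - 1)                 # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical MERSQI totals for five studies rated by three raters.
example = [
    [12.0, 11.5, 13.0],
    [9.5, 10.0, 9.0],
    [15.0, 14.0, 15.5],
    [7.0, 8.5, 7.5],
    [13.5, 12.0, 12.5],
]
print(round(icc_2_1(example), 3))
```

With only about five rated studies, as in this review, such an estimate is very imprecise, so it is best read as a rough indication of agreement.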

The methodological quality of qualitative studies was evaluated with a 12-item grid for Appraising Qualitative Research Articles in Medical Education, which the authors of this review converted into a quality assessment tool (AQRAME).²⁵ The instrument covers five domains: introduction, methods, results, discussion and conclusion. The methods domain assigns 0-5 points and the conclusion domain 0-1 point, while the three remaining domains assign 0-2 points each. The score ranges between 0 and 12 points, with a larger number indicating better study quality. A score of 0.5 was given when the answer was unclear (neither yes nor no). Scores are presented as mean, SD and range in parentheses.
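A minimal sketch of the AQRAME scoring follows. It assumes one point per item answered "yes", which is inferred from the domain ranges and the 12-item grid rather than stated explicitly; the names `ITEMS_PER_DOMAIN`, `aqrame_total` and the example appraisal are hypothetical.

```python
# Items per domain follow the point ranges given in the text (2+5+2+2+1 = 12).
ITEMS_PER_DOMAIN = {"introduction": 2, "methods": 5, "results": 2,
                    "discussion": 2, "conclusion": 1}
# Assumed mapping: yes = 1, unclear = 0.5 (as stated in the text), no = 0.
POINTS = {"yes": 1.0, "unclear": 0.5, "no": 0.0}


def aqrame_total(answers):
    """Sum the item answers of all five domains into a 0-12 total."""
    total = 0.0
    for domain, n_items in ITEMS_PER_DOMAIN.items():
        given = answers[domain]
        if len(given) != n_items:
            raise ValueError(f"{domain} expects {n_items} answers")
        total += sum(POINTS[a] for a in given)
    return total


# Hypothetical appraisal with one unclear item in the methods domain.
appraisal = {
    "introduction": ["yes", "yes"],
    "methods": ["yes", "yes", "unclear", "yes", "yes"],
    "results": ["yes", "yes"],
    "discussion": ["yes", "yes"],
    "conclusion": ["yes"],
}
print(aqrame_total(appraisal))
```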

An overall quality assessment tool was developed for rating all included studies regardless of their methodological design, assigning a figure of 1 to 7, with the larger number indicating better study quality. This was introduced to challenge the relative judgements of the MERSQI and AQRAME, acknowledging that different research questions inherently require different study designs. The appraisal was based on the need to be explicit about the role and assessment of the researcher in qualitative research.²⁶ For studies with mixed-method designs, we applied the MERSQI tool only, rating the quantitative parts of the study.

Results

Out of the 315 papers initially identified, four duplicates were removed, three articles in Chinese were excluded, and one article could not be retrieved. No reporting of research was found in 14 TED Talks and iTunes podcasts. Three hundred seven publications were screened and 281 excluded as they did not meet the inclusion criteria. Study subjects related to nasogastric tube insertion, facet joint injection, catheterization or needle guidance were interpreted to be clinically related to medicine as a practice of diagnosis, and so these studies were classified as fulfilling the inclusion criteria. One study focusing on resection planning was included and categorized as preoperative visualization.²⁷ Needle insertion itself, however, was interpreted not to produce a desired movement or therapeutic effect in part of the body and was therefore not classified as a surgical procedure. This resulted in a total of 26 studies being included in the integrative review. The flow chart of publications selected for inclusion in this integrative review is displayed in Figure 1.
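The selection counts reported above can be checked with simple arithmetic. The variable names are illustrative; the numbers are those given in the text.

```python
identified = 315
removed = {"duplicates": 4, "articles in Chinese": 3, "not retrievable": 1}

# Papers left for screening after the initial removals.
screened = identified - sum(removed.values())

excluded_at_screening = 281
included = screened - excluded_at_screening

print(f"screened: {screened}, included: {included}")
```

The result matches the 307 screened publications and the 26 included studies.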

Study characteristics

The studies applied AR and MR primarily by integrating the display technologies into knowledge platforms and guidance systems for simulator practice. Some studies offered feedback on the practice of a skill or on a field of knowledge, while others provided immersion into scenarios and remote assessment and training for telemedicine. The display technologies showed the ability to stimulate the learning process and support the learner in several competencies: understanding spatial relationships and constructing mental 3D models of anatomy with or without the help of 2D imaging; acquiring cognitive-psychomotor abilities; prolonging learning retention; experiencing student-centered motivation; and gaining the flexibility to learn anytime and anywhere at their own pace and in their own style. Furthermore, the studies suggested that AR and MR could complement practice in safe simulation environments, contributing to patient safety and a higher degree of confidence (see Appendix 1, “Summary of results”).

Figure 1. Selection process of studies

Technical specifications

The majority of studies (n=22) examined an actual application of AR.²⁸⁻⁴⁹ The rest (n=4) investigated an application based on MR.²⁷,⁵⁰⁻⁵² Six applications developed by companies were reported in 10 studies.³⁰,³¹,³⁷,³⁹,⁴⁰,⁴³,⁴⁷,⁴⁸,⁵⁰,⁵¹ The remaining studies (n=16) involved self-developed applications primarily developed at universities and hospitals.

Mobile device-based (tablets and smartphones) applications were used in nine studies.³³,³⁵,³⁷,³⁹,⁴¹,⁴²,⁴⁷⁻⁴⁹ Of these, two thirds (n=6) involved camera and marker-based recognition, and three studies did not report any further on the applications developed.⁴¹,⁴⁷,⁴⁸

Eight studies implemented a head-mounted display.²⁷,²⁸,³⁸,⁴⁰,⁴³⁻⁴⁶ Two studies utilized the same head-mounted display.⁴⁰,⁴³ The head-mounted display-integrated applications had marker-based recognition in four of the studies.²⁸,⁴⁰,⁴³,⁴⁴ One study recognized the hands and gestures of a mentor and projected these into the trainee’s display.⁴⁶ Two studies implemented a foot pedal to interact with the application.²⁷,³⁸ For one study this included toggling between AR and MR mode.²⁷

Computers were used in 11 studies.³⁰,³¹,³⁴,³⁶,³⁸,⁴⁰,⁴³,⁴⁶,⁵⁰⁻⁵² These delivered the computing power for head-mounted display-based applications in four studies.³⁸,⁴⁰,⁴³,⁴⁶ One computer-based application had marker-based recognition.³⁶ Seven studies were sensor-based.³⁰,³¹,³⁴,⁴⁶,⁵⁰⁻⁵² Two studies recognized landmarks of the user’s body.³⁰,³¹ Four studies recognized a virtual model registered with a phantom, characterized as MR.²⁷,⁵⁰⁻⁵² Eleven studies reported using external cameras and tracking devices.²⁷,²⁸,³¹,³²,³⁴,³⁶,⁴⁴,⁵⁰⁻⁵² Two studies used applications based on projectors, one recognizing markers on a phantom and one projecting images directly onto a phantom without using a tracking device.²⁹,⁵¹

Methodological quality

Of the 26 included studies, nine were solely quantitative, 16 used mixed methods and one was qualitative. Based on rating comparisons of the approximately 20% (n=5) randomly selected papers, the authors agreed to use the ratings by JG for MERSQI, AQRAME and the overall score for the remaining papers. The mean total MERSQI score of the 25 quantitative and mixed methods studies was 12.26, SD=2.63 (7-15.5). The ICC between all raters was computed as IRR=.50 for the MERSQI overall score, which corresponds to moderate reliability.⁵³ Nearly one-third of all studies (n=8) either had no evaluation tool or did not report any validity of the instrument used.²⁸⁻³⁵

The qualitative study involved semi-structured face-to-face interviews that explored the needs and challenges of applying AR for healthcare education. The study demonstrated a detailed clarity and rigor according to the individual AQRAME score of all three authors corresponding to 12 (JG), 11.5 (CBS), and 12 (PD). As there was only one qualitative study, we did not report any IRR for the AQRAME overall score.

The mean overall quality score of all studies was 4.08, SD=1.65 (1-7), with an adjusted ICC of IRR=.429, also corresponding to moderate reliability.⁵³ The scores of the individual studies and the study characteristics are reported in Appendix 1.

Strengths and weaknesses of AR and MR

Three themes were inductively identified indicating the strengths and weaknesses of AR and MR in healthcare education beyond surgery.

Strengths

Implemented across various subjects for learner types of all levels spanning different sectors

The most frequently studied subjects of healthcare education were found within anatomy (n=6) and anesthesia (n=7), the latter represented by four studies focusing on central vein catheterization.²⁹,³⁸,⁴⁴,⁵² Study participants were divided into 12 different categories: pre-medical, medical, nursing, and health science students, novices, residents, fellows and established clinicians of different specialties, technicians, non-clinicians, non-specified participants and managers. The mean number of participants was 77.1, SD=170.6 (1-880); the sample size was set to one for a study that did not report or specify the study participants.³³ The distribution of studies across subjects of healthcare education related to the number of participants enrolled is described in Appendix 2.

The rich diversity of research and outcome focus

A total of six proof-of-concept, pilot or user studies sought to introduce an application or assess initial validity.²⁸,²⁹,³³⁻³⁵,⁴⁷ Eight studies focused on evaluating training by an application for strengthening the validity of the construct.³⁰,³⁷,³⁹,⁴⁰,⁴²,⁴³,⁵⁰,⁵¹ The remaining studies (n=12) focused on the application-based assessment of a specific skill or procedure, possibly correlating the performance to other outcomes such as cognitive load.²⁷,³¹,³⁶,³⁸,⁴¹,⁴⁴⁻⁴⁶,⁴⁸,⁴⁹,⁵¹,⁵² Technical test outcomes were reported in 17 studies and concerned primarily needle insertion in terms of accuracy and precision (n=11).²⁷⁻²⁹,³¹,³³,⁴⁰,⁴³,⁴⁴,⁵⁰⁻⁵² The second most frequently reported technical test outcome was procedure time (n=9).²⁷,²⁹,³⁸,⁴³,⁴⁴,⁴⁶,⁵⁰⁻⁵² Nineteen studies investigated learning experience and user acceptance, mostly with Likert scales.³⁰⁻³²,³⁴⁻⁴²,⁴⁴⁻⁴⁹,⁵² Other questionnaire-based outcomes were cognitive load, stress response, adverse health effects and ergonomics.³⁸,³⁹,⁴¹,⁴⁴⁻⁴⁶ Knowledge tests were examined in combination with questionnaire-based outcomes in six studies.³⁶,³⁷,³⁹,⁴¹,⁴²,⁴⁹ One study included an observational method to determine learning behavior.⁴⁹

Growing evidence for improving learning

In 11 studies, AR and MR were claimed to significantly improve the learning process or associated part-tasks in all or the majority of outcome measures.²⁷,²⁹,³⁶,³⁷,³⁹,⁴⁰,⁴³,⁴⁸⁻⁵⁰,⁵² Four out of six studies examining the acquisition of anatomy knowledge reported significantly improved learning.³⁶,³⁷,³⁹,⁴⁹ Significant positive findings were found in six of 11 studies concerning skill training of needle insertion, favoring both students and established clinicians.²⁷,²⁹,⁴⁰,⁴³,⁵⁰,⁵² Procedure time was significantly reduced in three of nine studies.²⁷,²⁹,⁵² Of the 19 studies examining different questionnaire-based aspects of the learning experience and user acceptance, four demonstrated significant positive findings advocating the usability of the display technologies.³⁶,³⁷,³⁹,⁴⁸ Fifteen studies found no significant positive results, but all suggested that the AR and MR-based applications may outperform traditional learning approaches within the involved subjects of healthcare education.²⁸,³⁰⁻³⁵,³⁸,⁴¹,⁴²,⁴⁴⁻⁴⁷,⁵¹ Other promising learning factors facilitated by the display technologies were related to visualization, directing attention, intrinsic benefits of motivation, physical interaction activating kinesthetic schemes, patient safety, skill retention, simulation confidence related to transferability, mobile learning and using oneself as a learning object.³⁹,⁴¹,⁴²,⁴⁵,⁴⁹,⁵¹

Weaknesses

Reporting of prototypes, technological limitations and poor ergonomics

Sixteen studies presented a prototype, typically in preliminary feasibility studies that did not adequately report on the educational impact of the prototype tested.²⁷⁻²⁹,³²⁻³⁶,³⁸,⁴¹,⁴²,⁴⁴⁻⁴⁶,⁴⁹,⁵² Ten studies were conducted on one of six established applications.³⁰,³¹,³⁷,³⁹,⁴⁰,⁴³,⁴⁷,⁴⁸,⁵⁰,⁵¹ The studies of head-mounted display-based applications (n=8) addressed technological limitations related to limited computing power, occlusion of the user’s field of view, and poor ergonomics when head-mounted displays were tethered to workstations or worn over glasses.²⁷,⁴⁴,⁴⁶

Shortcomings of the study designs for transferability

Four studies were designed as single-group user studies only, making strong conclusions difficult.³¹⁻³³,³⁵ Twenty-two studies used a group design or comparison, of which most (n=17) compared two groups.²⁷⁻³⁰,³⁴,³⁶,³⁸⁻⁴⁰,⁴²,⁴⁴,⁴⁵,⁴⁷⁻⁵¹ Only two studies did not compare AR or MR with another medium, such as lectures, books, video, virtual reality, mobile devices, conventional training platforms or a full telemedical setup.²⁸,³⁴ Two studies compared mobile devices with and without AR content provided to one of the groups.⁴¹,⁴² Five studies encompassed three groups.³⁷,⁴¹,⁴³,⁴⁶,⁵² Two of the two-group studies used a cross-over design.²⁹,³⁰ No study involved patients in an authentic context, but two studies included patient data.²⁷,³²

Lacking evidence for improving learning

Eight studies reported descriptive frequencies of self-reported evaluations and measures without any statistical analysis of significance.²⁸,³⁰⁻³⁵,⁴⁷ Seven studies claimed the display technologies offered no significant impact for improving learning in all or in the majority of outcome measures.³⁸,⁴¹,⁴²,⁴⁴⁻⁴⁶,⁵¹ The two studies that compared AR within the same media of mobile devices found no significant difference in any of the outcome measures.⁴¹,⁴² Only a single study presented a significant negative finding of prolonged completion time of an ultrasound examination in the AR group.⁴⁶ Potentially conflicting factors were addressed in terms of visual misperception, media or technology enthusiasm-based motivation, negation of patient discomfort related to patient safety, and missing translation of performance from simulation to clinical setting.²⁷,⁴¹,⁵⁰,⁵¹

Discussion

Virtual augmentation and guidance by AR and MR are increasingly used in applications for medical subjects of healthcare education. The quality of the existing studies and applications, including the educational benefits of the display technologies, remains unclear at the moment.

We reviewed the current research and state of AR and MR-based applications for healthcare education in medical disciplines beyond surgery. Our integrative review identified 26 original studies examining various applications of both display technologies. The applications were found to measure numerous outcomes related to the learning process, acquisition of knowledge and skill training while providing feedback on patient care-related outcomes such as complication rates, insertion time and needle path related to tissue damage. This differs greatly from the findings of a systematic review by Barsom and colleagues on applications for medical training for professionals, in which none were developed to measure the prevention of errors for the interest of patient safety.⁴

Our work revealed an increased emergence of established applications corresponding to 27% (n=6) investigated in 10 studies against 16 prototypes. A prior review by Zhu and colleagues only found one established application for laparoscopic colorectal surgery.¹³ In the same review, the authors found the application designs lacking guidance by learning theories only resting on traditional learning strategies. We observed that the applications of AR and MR still have not exploited the integration of learning theories and strategies into their design. Still, the increased number of established applications is a step towards turning the research base away from feasibility studies examining prototypes.
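One plausible reading of the 27% figure is sketched below. It assumes the denominator is the number of distinct applications (six established ones plus 16 prototypes, taking each prototype study as one application) rather than the 26 studies; this interpretation is ours and is not stated explicitly in the text.

```python
established_apps = 6
prototypes = 16  # assumed: one self-developed application per prototype study

# Share of established applications among all applications.
share = established_apps / (established_apps + prototypes)
print(f"{share:.0%}")
```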

We conclude that the studies overall were of low-to-medium quality. This is consistent with the low to modest strength of evidence reported in previous systematic reviews.⁴,¹⁷ The single qualitative study was found to be of high quality in terms of clarity and rigor, while the relative judgement of the overall quality was found to be low-to-medium. The greatest limitation across the pool of studies, noted in nearly one-third of all studies (n=8), was either the complete lack or the poor reporting of the validity of the evaluation instruments that indirectly provide the evidence base for the study findings. Additionally, the statistical analyses reported incomplete results or were unclearly interpreted. Shortcomings of the reviewed studies further included heterogeneity of research designs, unstandardized outcome measures and wide variation in the details given. Widespread heterogeneity among studies is stated to be one of the greatest challenges of quantitatively synthesizing research evidence.⁵⁴ At the same time, it has been argued that media-comparative studies in learning are virtually useless and not valid for comparison.⁵⁵ From this perspective, the studies failed to determine which media or technologies were best for healthcare education but rather informed practice with the specific application. These limitations are common to much education research but may be especially pronounced for research at the nexus of learning and technology.⁵⁶ Nevertheless, we did not exclude studies based on their quality, because our aim was to provide an overview of the strengths and weaknesses of all relevant research in AR and MR for healthcare education beyond surgery during the past half-decade.

Limitations and recommendations for future studies

To our knowledge, this is the first integrative review of AR and MR solely focusing on medical subjects of healthcare education. Three articles in Chinese were not included, meaning that we possibly excluded relevant knowledge. Moreover, we may have missed relevant research, published or unpublished, in technical journals, as our main focus was on databases for healthcare and education. Our finding that all included studies suggested or reported significant positive findings should be interpreted with caution since publication bias cannot be excluded. We tried to minimize the loss of relevant material by including unpublished work from new online sources such as TED Talks and iTunes podcasts. However, the designs and presentations of these sources varied too extensively to enhance the quality and usefulness of the review.

Our study abstained from addressing the educational profile of AV compared to AR, both being encompassed by MR. This could not be done due to the low number of studies measuring AV-based learning, possibly related to the limited technological and conceptual understanding of MR across the research field and industry.

The quality of the included studies was assessed with the MERSQI scale, which revealed inconsistencies across a few domains in the process of rating. This was mainly due to missing information in the reviewed studies as well as a lack of clarity in the MERSQI guidelines. Though moderate reliability was found between all raters for the MERSQI and the overall quality assessment tool, one could argue that the sample size of the rating, corresponding to approximately 20% (n=5) of the studies, does not allow reliable calculations beyond descriptive analysis. Finally, the self-developed AQRAME assessment tool has not been validated for quality scoring of qualitative research, despite relying on a known 12-item grid for quality appraisal. This tool was introduced since we were not aware of any validated evaluation instruments for quality assessment of qualitative research in healthcare education.

A variety of applications for subjects of healthcare education beyond surgery have been developed, and their benefits were supported by this integrative review. We expect that more research will be done in the field as more institutions explore and apply applications based on AR and MR in the future. Randomized controlled trials should continue to be organized for evaluating clinical performance and patient-care related outcomes. Specifically, the actual effects on real patients and physician behaviors towards patients in a real context are yet to be elucidated. We recommend that future studies justify and validate metrics and report the reliability of measures for higher-quality evaluations. Established guidelines and recommendations for high-quality research formulating joint standards could promote the adoption of the display technologies and facilitate exchange among researchers, educators and developers with widely different experiences and approaches.⁵⁷

Echoing David A. Cook, professor of medicine and medical education, we suggest placing more emphasis on the ‘How’ and ‘When’ to use AR and MR-based learning and focusing less on ‘Whether’.⁵⁵ To answer these questions, researchers, educators and developers should share and evaluate instructional designs and learning theory-based methods while looking into the effective use of simulation and the integration of the display technologies within and between institutions. Eventually, this could also provide an understanding of the learning concepts revealed by the included studies, involving intrinsic benefits of motivation, physical interaction activating kinesthetic schemes, skill retention, transferability of simulation confidence, mobile learning and using oneself as a learning object. With instructional objectives defined beforehand, the display technologies should be used only when they could refine or even replace training programs and curricula.

That said, partially immersive environments such as AR and MR may offer unique qualities, specifically for assessing and training procedural strategies that integrate real patient data without compromising patient safety. By using non-invasive sensors for imaging, the display technologies could complement the established imaging technologies of MRI, CT and ultrasound for monitoring technical performance with an objective-comparative function, as observed in our review.²⁷,²⁹,⁵⁰ To tap the full potential of the display technologies, study and application design must be based on a thorough investigation of the educational context, learner types and learning objectives, whether the latter are cognitive, technical, or non-technical (for example situational awareness, communication, or stress coping).

Conclusions

This review reports the current state of AR- and MR-based applications for healthcare education beyond surgery. Studies based on both display technologies across various specialties and subjects show an increasing number of established applications, moving the research base away from feasibility studies on prototypes. All included studies suggested various healthcare educational benefits of the display technologies, which significantly outperformed traditional learning approaches in 11 studies, specifically regarding the acquisition of anatomy knowledge and needle insertion skills. Yet, this review identifies multiple shortcomings of the studies. Study quality was low-to-medium, especially due to the lacking validity of the evaluation instruments, the heterogeneity of research designs and widely varying reporting. Future studies are thus needed for researchers, educators and developers to build an evidence base that defines suitable research designs and the instructional objectives achievable by AR- and MR-based applications, so that these can complement conventional learning and curricula and help transform healthcare education.

Acknowledgements

We would like to thank the Copenhagen Academy of Medical Education and Simulation (CAMES) for financial support through its institutional funds, and the employees of the academy for their valuable feedback.

Conflict of Interest

PD holds a professorship with the University of Stavanger, Norway that is supported unconditionally by a grant from the Laerdal Foundation in Norway.

Supplementary file 1

Appendix 1. Study characteristics including quality scores (S1.pdf, 172 kb)

Supplementary file 2

Appendix 2. Distribution of studies across medical specialty or health science, number of studies, and participants enrolled according to number of studies (S2.pdf, 96 kb)

References

Kamphuis C, Barsom E, Schijven M and Christoph N. Augmented reality in medical education? Perspect Med Educ. 2014; 3: 300-311.
Chen L, Day TW, Tang W, John NW. Recent developments and future challenges in medical mixed reality. Proceedings of 16th IEEE International Symposium on Mixed and Augmented Reality (ISMAR); 9-13 October 2017. Nantes, France: IEEE; 2017.
Silva JNA, Southworth M, Raptis C and Silva J. Emerging applications of virtual reality in cardiovascular medicine. JACC Basic Transl Sci. 2018; 3: 420-430.
Barsom EZ, Graafland M and Schijven MP. Systematic review on the effectiveness of augmented reality applications in medical training. Surg Endosc. 2016; 30: 4174-4183.
Azuma R, Baillot Y, Behringer R, Feiner S, Julier S and MacIntyre B. Recent advances in augmented reality. IEEE Comput Grap Appl. 2001; 21: 34-47.
Milgram P, Takemura H, Utsumi A, Kishino F. Augmented reality: a class of displays on the reality-virtuality continuum. Telemanipulator and Telepresence Technologies. 1995;2351:282-92.
Milgram P, Kishino F. Taxonomy of mixed reality visual displays. IEICE Transactions on Information Systems. 1994;E77-D(12):449-459.
Brigham TJ. Reality check: basics of augmented, virtual, and mixed reality. Med Ref Serv Q. 2017; 36: 171-178.
Zhu E, Lilienthal A, Shluzas LA, Masiello I and Zary N. Design of mobile augmented reality in health care education: a theory-driven framework. JMIR Med Educ. 2015; 1: e10.
Wojtczak A. Glossary of medical education terms: Part 4. Med Teach. 2002; 24: 567-568.
Smith ML and Foley MR. Transforming clinical education in obstetrics and gynecology: gone is the day of the sage on the stage. Obstet Gynecol. 2016; 127: 763-767.
Ellis H, Abdalla S. A history of surgery. Boca Raton: CRC Press; 2018.
Zhu E, Hadadgar A, Masiello I and Zary N. Augmented reality in healthcare education: an integrative review. PeerJ. 2014; 2: e469.
Yoon JW, Chen RE, Kim EJ, Akinduro OO, Kerezoudis P, Han PK, Si P, Freeman WD, Diaz RJ, Komotar RJ, Pirris SM, Brown BL, Bydon M, Wang MY, Wharen RE and Quinones-Hinojosa A. Augmented reality for the surgeon: systematic review. Int J Med Robot. 2018; 14: 1914.
Aggarwal A. The evolving relationship between surgery and medicine. Virtual Mentor. 2010; 12: 119-123.
Martin E. Concise medical dictionary. Oxford: Oxford University Press; 2015.
Wong K, Yee HM, Xavier BA and Grillone GA. Applications of augmented reality in otolaryngology: a systematic review. Otolaryngol Head Neck Surg. 2018; 159: 956-967.
Whittemore R. Combining evidence in nursing research: methods and implications. Nurs Res. 2005; 54: 56-62.
Whittemore R and Knafl K. The integrative review: updated methodology. J Adv Nurs. 2005; 52: 546-553.
Grant MJ and Booth A. A typology of reviews: an analysis of 14 review types and associated methodologies. Health Info Libr J. 2009; 26: 91-108.
TED. TED: Ideas worth spreading. [Cited 1 September 2018]; Available from: https://www.ted.com.
iTunes Podcast app. Podcasts Downloads on iTunes. [Cited 1 September 2018]; Available from: https://itunes.apple.com/gb/genre/podcasts/id26.
Reed DA, Cook DA, Beckman TJ, Levine RB, Kern DE and Wright SM. Association between funding and quality of published medical education research. JAMA. 2007; 298: 1002.
Cook DA and Reed DA. Appraising the quality of medical education research methods: the Medical Education Research Study Quality Instrument and the Newcastle-Ottawa Scale-Education. Acad Med. 2015; 90: 1067-1076.
Côté L and Turgeon J. Appraising qualitative research articles in medicine and medical education. Med Teach. 2005; 27: 71-75.
Stacy R and Spencer J. Assessing the evidence in qualitative medical education research. Med Educ. 2000; 34: 498-500.
Abhari K, Baxter JSH, Chen ECS, Khan AR, Peters TM, de Ribaupierre S and Eagleson R. Training for planning tumour resection: augmented reality and human factors. IEEE Trans Biomed Eng. 2015; 62: 1466-1477.
Bifulco P, Narducci F, Vertucci R, Ambruosi P, Cesarelli M and Romano M. Telemedicine supported by Augmented Reality: an interactive guide for untrained people in performing an ECG test. Biomed Eng Online. 2014; 13: 153.
Jeon Y, Choi S and Kim H. Evaluation of a simplified augmented reality device for ultrasound-guided vascular access in a vascular phantom. J Clin Anesth. 2014; 26: 485-489.
Kugelmann D, Stratmann L, Nühlen N, Bork F, Hoffmann S, Samarbarksh G, Pferschy A, von der Heide AM, Eimannsberger A, Fallavollita P, Navab N and Waschke J. An Augmented Reality magic mirror as additive teaching device for gross anatomy. Ann Anat. 2018; 215: 71-77.
Ma M, Fallavollita P, Seelbach I, Von Der Heide AM, Euler E, Waschke J and Navab N. Personalized augmented reality for anatomy education. Clin Anat. 2016; 29: 446-453.
Mewes A, Heinrich F, Kägebein U, Hensen B, Wacker F and Hansen C. Projector-based augmented reality system for interventional visualization inside MRI scanners. Int J Med Robot. 2019; 15: 1950.
Solbiati M, Passera KM, Rotilio A, Oliva F, Marre I, Goldberg SN, Ierace T and Solbiati L. Augmented reality for interventional oncology: proof-of-concept study of a novel high-end guidance system platform. Eur Radiol Exp. 2018; 2: 18.
Sutherland C, Hashtrudi-Zaad K, Sellens R, Abolmaesumi P and Mousavi P. An augmented reality haptic training simulator for spinal needle procedures. IEEE Trans Biomed Eng. 2013; 60: 3009-3018.
Wang LL, Wu HH, Bilici N and Tenney-Soeiro R. Gunner goggles: implementing augmented reality into medical education. Stud Health Technol Inform. 2016; 220: 446-449.
Ferrer-Torregrosa J, Torralba J, Jimenez MA, García S and Barcia JM. ARBOOK: development and assessment of a tool based on augmented reality for anatomy. J Sci Educ Technol. 2015; 24: 119-124.
Ferrer-Torregrosa J, Jiménez-Rodríguez M, Torralba-Estelles J, Garzón-Farinós F, Pérez-Bermejo M and Fernández-Ehrling N. Distance learning ects and flipped classroom in the anatomy learning: comparative study of the use of augmented reality, video and notes. BMC Med Educ. 2016; 16: 230.
Huang CY, Thomas JB, Alismail A, Cohen A, Almutairi W, Daher NS, Terry MH and Tan LD. The use of augmented reality glasses in central line simulation: "see one, simulate many, do one competently, and teach everyone". Adv Med Educ Pract. 2018; Volume 9: 357-363.
Küçük S, Kapakin S and Göktaş Y. Learning anatomy via mobile augmented reality: effects on achievement and cognitive load. Anat Sci Educ. 2016; 9: 411-421.
Leitritz MA, Ziemssen F, Suesskind D, Partsch M, Voykov B, Bartz-Schmidt KU, et al. Critical evaluation of the usability of augmented reality ophthalmoscopy for the training of inexperienced examiners. Retina. 2014;34(4):785–91.
Moro C, Štromberga Z, Raikos A and Stirling A. The effectiveness of virtual and augmented reality in health sciences and medical anatomy. Anat Sci Educ. 2017; 10: 549-559.
Noll C, von Jan U, Raap U and Albrecht UV. Mobile augmented reality as a feature for self-oriented, blended learning in medicine: randomized controlled trial. JMIR Mhealth Uhealth. 2017; 5: e139.
Rai AS, Rai AS, Mavrikakis E and Lam WC. Teaching binocular indirect ophthalmoscopy to novice residents using an augmented reality simulator. Can J Ophthalmol. 2017; 52: 430-434.
Rochlen LR, Levine R and Tait AR. First-person point-of-view-augmented reality for central line insertion training: a usability and feasibility study. Simul Healthc. 2017; 12: 57-62.
Siebert JN, Ehrler F, Gervaix A, Haddad K, Lacroix L, Schrurs P, Sahin A, Lovis C and Manzano S. Adherence to AHA guidelines when adapted for augmented reality glasses for assisted pediatric cardiopulmonary resuscitation: a randomized controlled trial. J Med Internet Res. 2017; 19: e183.
Wang S, Parsons M, Stone-McLean J, Rogers P, Boyd S, Hoover K, Meruvia-Pastor O, Gong M and Smith A. Augmented reality as a telemedicine platform for remote procedural training. Sensors (Basel). 2017; 17: .
Zhu E, Fors U and Smedberg Å. Exploring the needs and possibilities of physicians’ continuing professional development - An explorative qualitative study in a Chinese primary care context. PLoS ONE. 2018; 13: e0202635.
Aebersold M, Voepel-Lewis T, Cherara L, Weber M, Khouri C, Levine R and Tait AR. Interactive anatomy-augmented virtual simulation training. Clin Simul Nurs. 2018; 15: 34-41.
Albrecht UV, Folta-Schoofs K, Behrends M and von Jan U. Effects of mobile augmented reality learning compared to textbook learning on medical students: randomized controlled pilot study. J Med Internet Res. 2013; 15: e182.
Keri Z, Sydor D, Ungi T, Holden MS, McGraw R, Mousavi P, Borschneck DP, Fichtinger G and Jaeger M. Computerized training system for ultrasound-guided lumbar puncture on abnormal spine models: a randomized controlled trial. Can J Anaesth. 2015; 62: 777-784.
Moult E, Ungi T, Welch M, Lu J, McGraw RC and Fichtinger G. Ultrasound-guided facet joint injection training using Perk Tutor. Int J Comput Assist Radiol Surg. 2013; 8: 831-836.
Robinson AR, Gravenstein N, Cooper LA, Lizdas D, Luria I and Lampotang S. A mixed-reality part-task trainer for subclavian venous access. Simul Healthc. 2014; 9: 56-64.
Landis JR and Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977; 33: 159-174.
Barry Issenberg S, Mcgaghie WC, Petrusa ER, Lee Gordon D and Scalese RJ. Features and uses of high-fidelity medical simulations that lead to effective learning: a BEME systematic review. Med Teach. 2005; 27: 10-28.
Cook DA. Where are we with Web-based learning in medical education? Med Teach. 2006; 28: 594-598.
Jensen L and Konradsen F. A review of the use of virtual reality head-mounted displays in education and training. Education and Information Technologies. 2018; 23: 1515-29.
Cheng A, Kessler D, Mackinnon R, Chang TP, Nadkarni VM, Hunt EA, Duval-Arnould J, Lin Y, Cook DA, Pusic M, Hui J, Moher D, Egger M and Auerbach M. Reporting guidelines for health care simulation research: extensions to the CONSORT and STROBE statements. Simul Healthc. 2016; 11: 238-248.
