Resident self-assessment versus faculty assessment of laparoscopic technical skills using a global rating scale

Objectives: Accurate self assessment of laparoscopic technical skill can accelerate a trainee’s learning and guide life-long development for surgeons. The aim of this study was to measure the accuracy of obstetrics and gynecology resident self assessments as compared to faculty assessments of laparoscopic technical skill performance using a technical skill global rating scale. Methods: Thirty-seven residents completed four laparoscopic procedures as part of a porcine simulation teaching laboratory: tubal ligation, ovarian cystectomy, salpingostomy and hysterectomy. At the conclusion of the session, residents and faculty completed global rating scale assessments for all laparoscopic procedures completed. Mann-Whitney U analysis was used to compare faculty and resident results. Results: Overall, resident self assessment was lower than faculty evaluation (148 cases). Post Graduate Year 1 residents (n=8/n=31 cases) reported significantly lower scores on 5 of 10 items. Post Graduate Year 2 residents (n=8/n=35 cases) reported significantly lower scores on 6 of 10 items. Post Graduate Year 3 residents (n=10/n=39 cases) reported lower scores on all ten items (p<0.05). Post Graduate Year 4 residents (n=11/n=43 cases) reported significantly lower scores on 9 of 10 items. Conclusions: Residents are more critical of their performance than faculty when using a global rating scale during a laparoscopic simulation teaching session. Inaccurate self perceptions can slow the learning development of trainees. Further research is needed to examine the accuracy of resident self perceptions and faculty assessments of performance in the operating room.


Introduction
Laparoscopic skill competency is a central requirement of gynecological residency training. 1 The development of procedural knowledge and the ability to perform the technical tasks associated with each laparoscopic procedure require modeling, deliberate practice and feedback. 2,3 For adult learners, the ability to accurately self assess performance can facilitate additional learning and growth. 4,5 One of the aims of training programs is to nurture the skills residents need to be life-long learners. 6 Self reflection and valid assessment of performance are necessary for this process.
Educational learning theory describes the importance of accurate self assessment. 7 Stemming from this assertion, several studies in the medical education literature have reported on the impact of residents' inability to accurately assess their performance. 6,8,9 Early research focused on cognitive knowledge rather than technical skill, 6,9,10 but in the last five years there have been increasing reports of resident technical skill self-assessment in the operating room and in simulated environments. [11][12][13][14][15][16][17] These studies have focused on improving resident self-assessment and on the reliability and validity of instruments in measuring true and perceived performance.
To fully understand this issue, we also need a definitive "gold standard" assessment mechanism. Currently, faculty evaluation is assumed to be this standard, but many have noted its limitations. 15 The American Board of Obstetrics and Gynecology and the Council on Resident Education in Obstetrics and Gynecology have initiated comprehensive in-service and board examinations that measure cognitive knowledge and decision making, but measurement of technical skill relies solely on the individual training programs to certify competency. There have been several attempts by obstetrics and gynecology faculty to develop objective assessment instruments that are reliable and valid. One of these instruments is the Objective Structured Assessment of Technical Skills (OSATS) global rating scale, originally developed for use in general surgery 18 and adopted by several obstetrics and gynecology training programs as an instrument of objective assessment. 19,20 This 10-item Likert scale global rating instrument measures aspects of technical skill, anatomy and preparation. Its strengths in application are the minimal rater training it requires and its feasibility for faculty evaluations.
This study investigated the following questions: (1) Do residents accurately self assess laparoscopic technical skill as compared to faculty evaluators using a global rating scale instrument? (2) Do residents at different stages in their training and development have differing accuracy in their self assessments of their laparoscopic technical skill?

Participants
Thirty-seven obstetrics and gynecology residents (post graduate years 1 to 4) from the University of Southern California and Los Angeles County Medical Center participated in this study. Four of the participants were male and 33 were female. This study was a retrospective review of evaluation data collected from five laparoscopic simulation teaching sessions. No eligible participants declined to participate. This research protocol received ethical approval from the Institutional Review Board at the University of Southern California.

Task
Prior to the beginning of each laparoscopic porcine laboratory, residents participated in a 30-minute didactic lecture that focused on planning for surgery, instrumentation, dealing with complications, and general technical considerations of laparoscopic surgery. Residents from the same training level were assigned in pairs to an animal. A faculty preceptor was assigned to each group to provide guidance and feedback on performance during the laboratory. All groups were allowed two hours to practice skills and were asked to complete the procedures of tubal ligation, ovarian cystectomy, salpingostomy and hysterectomy.
Residents were asked to share technical responsibilities for each case, which enabled each resident to engage in and complete a significant amount of each procedure. Residents were encouraged to ask questions of, and seek clarification from, their faculty preceptors on their technical skill and the elements of the procedures.

Assessment
At the conclusion of each laboratory session, each resident was asked to complete a self assessment of their laparoscopic technical skill for each of the four procedures. The instrument was a 10-item global rating instrument that used a five-point Likert scale with descriptive anchors. The items were preparedness, respect for tissue, time and motion, knot tying, instrument handling, knowledge of instruments and suture, use of assistants, anatomy, and knowledge of specific procedure, together with an overall performance rating ranging from substandard performance to the level of competent to perform without supervision. Faculty physicians were asked to fill out this same instrument for each of the two residents that they observed for all four procedures. The same faculty participated in all five teaching sessions, and all had previously been trained in using the instrument to evaluate resident performance. Average resident and faculty scores on the five-point Likert scale, by training class, were compared using Mann-Whitney U analysis to test for significant differences.
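As an illustration of the kind of comparison described above, the following Python sketch computes a two-sided Mann-Whitney U test for one rating item. It is a minimal sketch under stated assumptions, not the analysis actually run in this study: the software used is not reported here, the function names (`midranks`, `mann_whitney_u`) are hypothetical, the p-value uses the normal approximation with a tie correction and no continuity correction, and the Likert scores shown are invented for demonstration only.

```python
from collections import Counter
from math import erfc, sqrt


def midranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U_x, U_y, two-sided p) using a tie-corrected normal approximation."""
    n1, n2 = len(x), len(y)
    combined = list(x) + list(y)
    ranks = midranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    n = n1 + n2
    # Tie correction: sum of (t^3 - t) over each group of t tied values.
    tie_term = sum(t**3 - t for t in Counter(combined).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:  # every observation identical: no evidence of a difference
        return u1, u2, 1.0
    z = (u1 - n1 * n2 / 2) / sqrt(variance)
    p_value = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u1, u2, p_value


# Hypothetical five-point Likert scores for a single item (not study data).
resident_scores = [2, 3, 3, 2, 3, 2, 3, 3]
faculty_scores = [4, 4, 3, 5, 4, 4, 3, 4]

u_res, u_fac, p = mann_whitney_u(resident_scores, faculty_scores)
print(f"U(resident)={u_res}, U(faculty)={u_fac}, p={p:.4f}")
```

Because Likert data are ordinal and heavily tied, the midrank and tie-correction steps matter here; with small samples an exact test may be preferable to the normal approximation used in this sketch.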

Results
Faculty and resident assessment data were analyzed for 148 procedures completed during five porcine laparoscopic technical skill sessions. Overall, residents' self assessments of technical skill were lower than faculty evaluations, with two items reaching significance: preparedness (p<0.01) and knot tying (p<0.01). Results were grouped by post graduate year (PGY) class and the technical skill of each procedure to analyze global performance. PGY 1 residents (n=8 residents/n=31 completed case evaluations) reported significantly lower scores in 5 of 10 categories: preparedness (p=0.01), respect for tissue (p=0.02), instrument handling (p=0.01), knowledge of instruments/suture (p=0.02) and overall performance (p=0.04). PGY 2 residents (n=8 residents/n=35 completed case evaluations) reported significantly lower scores in 6 of 10 categories: respect for tissue (p<0.01), time and motion (p<0.01), knot tying (p=0.03), instrument handling (p=0.01), and use of assistants and anatomy (p=0.03). PGY 3 residents (n=10 residents/n=39 completed case evaluations) reported significantly lower scores on all ten items of the scale (p<0.03) (Figure 1). PGY 4 residents (n=11 residents/n=43 completed case evaluations) reported significantly lower scores in 9 of 10 categories: respect for tissue (p=0.02), time and motion (p<0.01), knot tying (p<0.01), instrument handling (p<0.01), knowledge of instruments (p<0.01), use of assistants (p<0.01), anatomy (p=0.02), knowledge of specific procedure (p<0.01) and overall performance (p=0.02). All results are displayed in Table 1. Overall, the obstetrics and gynecology residents who participated in this study assessed their own technical skills to be lower than faculty preceptors did. Within each resident training class, differences between faculty and residents were significant. These results also indicate that the more senior the resident, the less likely they are to report an accurate assessment of their technical skill.
In other words, the gap between self assessment and faculty assessment of laparoscopic surgical skill widens when looking at senior residents as compared to junior residents.

Discussion
The development of expertise in any skill requires that a practitioner recognize their capabilities and limitations. 21 During residency training, individuals receive performance input for their development and growth. However, if a resident does not receive or accurately process faculty formative or summative evaluations, they can develop an inaccurate perception of their abilities. If a resident has low self efficacy concerning technical performance, their motivation may decline and their learning development may be negatively affected. Considering the technological advances of the specialty, it is crucial for obstetrics and gynecology residents to accurately self assess laparoscopic performance so they can develop competency and confidence in handling the breadth of procedures required. In addition, faculty need to be accurate in their assessments of resident performance to facilitate growth. The results presented in this study imply that residents have an inaccurate self perception, that faculty are not sufficiently critical of senior residents and therefore do not provide accurate assessment, or both.
The medical education literature has examined this issue from various points of view that have implications for this body of research. Anthony found in a study of medical students that those who had the least proficiency tended to be more inaccurate in self-assessment and to overestimate their abilities. 10 The results presented here show the opposite: the more senior the residents are in their training, the wider the gap in their self assessment. To improve self assessment for all learners, review of videotaped performance and repetition on a simulator have both been shown to be effective. 14, 17 Mandel and colleagues reported that obstetrics and gynecology residents using global rating scales can self-assess their surgical skills with good reliability and validity on bench models. 15 Interestingly, that body of research also showed that residents rate their open skills higher than their laparoscopic skills, but all skills were rated lower than faculty observations. Those results parallel the ones demonstrated in this research. Further investigation is needed into why the technical demands of laparoscopic surgery create different resident self perceptions compared to open procedures.
A limitation of this study is that performance evaluation data were collected during a simulation training session in which multiple residents performed a single operation. This constraint was largely due to funding for animals and the number of faculty available to participate. Other limitations are that only a single faculty rater was used for each performance and that performance data were collected using a cross-sectional methodology. A longitudinal examination would give a better description of participants' self assessment and, in particular, of their development of technical skill. Despite the limitations specific to this body of research, simulation has been widely shown to improve the acquisition of technical skill in surgical trainees, so in our opinion it is an appropriate venue in which to measure technical skill development. Specific to obstetrics and gynecology, Banks et al. have shown that, compared with apprenticeship teaching alone, a surgical simulator laboratory on laparoscopic tubal ligation improved resident knowledge and performance in the operating room. 22 In a different study, Banks also showed that a surgical skills laboratory improved residents' knowledge and performance in the clinical setting. Improvement was greatest for PGY-1 residents. 23 There have also been several studies of the efficacy of simulation training for shoulder dystocia skill development and retention. 24,25 General surgery simulation research has found similar results. [26][27][28][29] In examining resident performance, if we consider faculty evaluations to be a measure of true performance, then lower resident self perception is a problem that needs to be addressed. If residents cannot accurately assess their skill level, their development will be delayed and they may be prevented from acquiring new skills.
This study examined resident performance in a simulated environment; if this perception is consistent, there is ultimately a need to measure whether this self perception affects operating room performance. One possible role of resident self reflection is to compare resident perceptions with faculty perceptions so that gaps can be identified and residents can develop a more accurate analysis of their surgical skills.