ORIGINAL RESEARCH

Linguistic analysis of empathy in medical school admission essays

Mary Yaden1, David Yaden2, Anneke Buffone3, Johannes Eichstaedt3, Patrick Crutchley3, Laura Smith3, Jonathan Cass4, Clara Callahan4, Susan Rosenthal4, Lyle Ungar5, Andrew Schwartz6 and Mohammadreza Hojat4

1Department of Psychiatry at the University of Pennsylvania, Philadelphia, PA, USA

2Department of Psychiatry and Behavioral Sciences at Johns Hopkins Medicine, Baltimore, MD, USA

3Department of Psychology at the University of Pennsylvania, Philadelphia, PA, USA

4Sidney Kimmel Medical College at Thomas Jefferson University, Philadelphia, PA, USA

5Department of Computer and Information Science at the University of Pennsylvania, Philadelphia, PA, USA

6Department of Computer Science at Stony Brook University, Stony Brook, New York, USA

Submitted: 26/10/2019; Accepted: 07/08/2020; Published: 18/09/2020

Int J Med Educ. 2020; 11:186-190; doi: 10.5116/ijme.5f2d.0359

© 2020 Mary Yaden et al. This is an Open Access article distributed under the terms of the Creative Commons Attribution License which permits unrestricted use of work provided the original work is properly cited. http://creativecommons.org/licenses/by/3.0

Objectives: This study aimed to determine whether words used in medical school admissions essays can predict physician empathy.

Methods: A computational form of linguistic analysis was used for the content analysis of medical school admissions essays. Words in the essays were computationally grouped into 20 'topics', which were then correlated with scores on the Jefferson Scale of Empathy. The study sample included 1,805 matriculants (2008 to 2015) at a single medical college in the northeastern United States who wrote an admissions essay and completed the Jefferson Scale of Empathy at matriculation.

Results: After correcting for multiple comparisons and controlling for gender, Jefferson Scale of Empathy scores significantly correlated with one linguistic topic (r = .074, p < .05). This topic comprised specific words used in essays, such as "understanding," "compassion," "empathy," "feeling," and "trust." These words are related to themes emphasized in both theoretical writing and empirical studies on physician empathy.

Conclusions: This study demonstrates that physician empathy can be predicted from medical school admission essays. The implications of this methodological capability, i.e., to quantitatively associate linguistic features or words with psychometric outcomes, bear on the future of medical education research and admissions. In particular, these findings suggest that those responsible for medical school admissions could identify more empathetic applicants based on the language of their application essays.

Initiatives to improve interpersonal aspects of patient care often forefront the empathy of medical providers.1 Empathy has been variously defined, with a rich theoretical and empirical literature.2-5 In patient care contexts, physician empathy has been defined as a predominantly cognitive attribute that involves understanding patient experiences, combined with a capacity to communicate this understanding to patients and an intention to help.6,7 Decades of theoretical and qualitative work on physician empathy preceded the development of quantitative measures of physician empathy,8-10 such as the Jefferson Scale of Empathy (JSE).7,11 Physician empathy measurably affects patient outcomes. For example, diabetic patients under the care of more empathic physicians have lower rates of metabolic complications.12 Diabetic patients treated by more empathic physicians also show better control of their disease, as indicated by laboratory tests of glycemic control and cholesterol.12 More empathic physician communication is also associated with improved patient satisfaction and patient compliance.13 Further, medical students with higher empathy do better during their clinical clerkships14 and are rated as more competent in patient encounters.15 In sum, there is good evidence that physician empathy is linked to better patient outcomes and patient satisfaction. Research indicates that both patients and clinicians can benefit from empathic engagement.6

Teaching empathy has been included in the curriculum of several medical schools. Over the past few decades, educational programs have been initiated with varying degrees of success.16-21 Medical humanities programs have also been implemented with the intention of increasing empathy.22 Despite these initiatives, a significant drop in empathy occurs during the third year of medical school.23

Some medical educators have suggested that empathy should be included as a selection criterion for medical school applicants.24,25 Similar proposals have been made to adjust selection criteria for applicants with interest in primary care26 or to favor those with some background in the arts or humanities.27 However, self-reported measures of empathy can be influenced by social desirability response bias. An unobtrusive measure of physician empathy would be ideal, but developing unobtrusive measures presents a methodological challenge. A knowledge gap exists within medical education for measuring traits such as empathy using methods other than self-report.

Computational linguistic analysis, a quantitative method of corpus analysis, has been used in recent years to predict health issues such as heart disease mortality at the county level using language from posts on Twitter.28 A number of demographic and personality characteristics have also been explored using this technique with language from Facebook posts.29 In general, in cases where a link exists between language data and an outcome variable of interest in a given population, then this linguistic analysis method can identify the words that most correlate with scores on a given outcome measure.

In this study, we aimed to identify the words used in medical school admissions essays that are associated with self-reported physician empathy. The association of language use with physician empathy fills a knowledge gap by determining whether high empathy applicants can be detected through words used in medical school application essays. Our objective was to provide particular words from admissions essays that are most predictive of physician empathy.

Procedures

We used data from the Jefferson Longitudinal Study of Medical Education, an ongoing study that surveys medical students on a yearly basis across a number of topics, including physician empathy.30 We also requested and received permission from the Association of American Medical Colleges (AAMC) to use medical school application essays written by the study participants. The texts of the essays were then merged with data from the Jefferson Longitudinal Study (Jefferson Scale of Empathy scores and demographics). This study was approved by the Thomas Jefferson University institutional review board (IRB).

Study participants

Research participants included N = 1,805 matriculants to Sidney Kimmel Medical College at Thomas Jefferson University between 2008 and 2015 who completed a survey, including a measure of physician empathy, at the beginning of medical school. This sample represents 85% of all matriculants (2,118) during that time period.

The study sample comprised 893 (49.5%) men and 912 (50.5%) women, with a mean age of 23.5 years. The gender composition and age of the study sample were similar to those of all matriculants in the study period. Because entries with lower word counts are less reliable, participants had to have written at least 500 words in their essays to be included in the sample. The 500-word cut-off also removed applicants from the sample for whom a full personal statement was not required.

Instruments

Jefferson Scale of Empathy: We used the Jefferson Scale of Empathy (JSE), a 20-item, validated instrument specifically developed to measure empathy in the context of patient care in medical and other health professions students and practitioners. We used the 'S-version' of the JSE, which was developed for administration to medical students. Evidence in support of the JSE's validity and reliability6,11,14 has been reported. The possible score range is 20 to 140; a higher score on this scale indicates a greater orientation toward empathic engagement in patient care. The typical Cronbach's alpha for this instrument, which has been reported in many studies, is around .75.6 A sample item on this scale is: "It is difficult for a physician to view things from patients' perspectives." The JSE was completed by all of the medical students in this sample at matriculation.

Data analysis

We used the process of Differential Language Analysis (DLA)29 to automatically identify clusters of words associated with a given outcome. DLA proceeds in two steps: (1) linguistic feature extraction, which quantifies how often groups of words are mentioned, and (2) correlation analysis, which finds the association of linguistic features with given outcomes. The analysis was carried out with the Differential Language Analysis ToolKit (DLATK);31 the specific methods for each step follow. DLA has been used previously in a number of studies to predict population health issues as well as to explore linguistic correlates of personality and gender.28,29

For linguistic feature extraction, we first broke the admissions essays into words using DLATK's tokenizer, which separates sentences into words at spaces or other white space and at punctuation.31 Based on a tokenized version of the entire corpus of essays, we then grouped words into related clusters, known as "topics", using two well-established topic modeling approaches: Latent Dirichlet Allocation (LDA)32 and Non-negative Matrix Factorization (NMF).33 These statistical techniques find words that often appear in essays in similar linguistic contexts. This approach leverages the individual advantages of LDA and NMF, allowing LDA to produce coherent topics when they are larger in number34 and NMF to reduce dimensions while effectively maintaining the variance of count data.35 In other words, while LDA is often run alone to produce hundreds of topics, our relatively small sample size limited our statistical power and required a lower number of linguistic variables. More broadly, these techniques reduce a very large number of words to a limited set of language variables called topics, which are comprised of words that share semantic similarities.32
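The tokenization step can be illustrated with a minimal Python sketch. This is not DLATK's tokenizer, whose exact rules are not described here; the regular expression and the function names `tokenize` and `word_counts` are assumptions chosen only to show splitting on white space and punctuation.

```python
import re
from collections import Counter

# Assumed pattern (not DLATK's): runs of letters, optionally joined by an
# internal apostrophe or hyphen, so "patient's" stays a single token.
TOKEN_PATTERN = re.compile(r"[a-z]+(?:['-][a-z]+)*")


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    White space and punctuation act as separators and are discarded.
    """
    return TOKEN_PATTERN.findall(text.lower())


def word_counts(text):
    """Count how often each token occurs in one essay."""
    return Counter(tokenize(text))


if __name__ == "__main__":
    essay = "Empathy matters. Understanding a patient's perspective builds trust."
    print(tokenize(essay))
    print(word_counts(essay).most_common(3))
```

The per-essay word counts produced this way are the kind of input that topic models such as LDA and NMF operate on.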

Based on power analyses for effect sizes of r > 0.05, we calculated that approximately 20 topics as variables were sufficient while also correcting for false discovery rate in our significance tests. At the end of this process, each essay had a usage score for each of the 20 topics, which can be interpreted as the relative amount the topic's words were mentioned within the essay.
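As a rough illustration of a topic usage score, the sketch below computes a weighted relative frequency of a topic's words within one essay. The weighting scheme, the function name `topic_usage`, and the example weights are assumptions for illustration; the scores in this study were produced by DLATK from the fitted topic models.

```python
from collections import Counter


def topic_usage(tokens, topic_weights):
    """Relative usage of one topic in one essay.

    tokens: list of word tokens from the essay.
    topic_weights: dict mapping each topic word to a non-negative weight
        (hypothetical values; equal weights of 1.0 give a plain
        relative frequency of the topic's words).
    """
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    total = len(tokens)
    # Sum, over topic words, of weight times the word's share of the essay.
    return sum(weight * counts[word] / total
               for word, weight in topic_weights.items())


if __name__ == "__main__":
    tokens = ["empathy", "and", "trust", "build", "trust"]
    weights = {"empathy": 1.0, "trust": 1.0, "compassion": 1.0}
    print(topic_usage(tokens, weights))  # 3 of the 5 tokens are topic words
```

Dividing by essay length keeps long and short essays comparable, which matches the interpretation of the score as a relative amount.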

For correlation analysis, the usage scores for the 20 topics were treated as independent variables and associated with scores on the JSE, the dependent variable, using multivariate linear regression. Specifically, ordinary least squares linear regression was used with input variables standardized and with gender included as a covariate, since gender has been shown to be a significant factor in empathy in previous research.36 The correlations for all 20 topics were recorded along with p-values, which were corrected for multiple comparisons at p < .05 using the Benjamini-Hochberg False Discovery Rate procedure.37
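The two ingredients of this step can be sketched in plain Python. The sketch assumes, for illustration only, that each topic is tested in its own model with gender as the single covariate; in that case the standardized OLS coefficient has a closed form in terms of pairwise Pearson correlations. The function names are hypothetical, and the actual analysis was run in DLATK.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def standardized_beta(topic, outcome, covariate):
    """Standardized OLS coefficient for `topic` when the model also
    contains one covariate (for example, gender coded 0/1)."""
    r_ty = pearson(topic, outcome)
    r_tc = pearson(topic, covariate)
    r_cy = pearson(covariate, outcome)
    return (r_ty - r_tc * r_cy) / (1 - r_tc ** 2)


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, True where the corresponding hypothesis
    is rejected while controlling the false discovery rate at `alpha`.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k whose sorted p-value satisfies p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    rejected = [False] * m
    for idx in order[:max_rank]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Made-up p-values, not results from this study.
    print(benjamini_hochberg([0.001, 0.04, 0.03, 0.5]))
```

With 20 topics, `benjamini_hochberg` would receive the 20 p-values and flag only those topics that remain significant after the correction.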

A linguistic topic was correlated with physician empathy (r = .074, p < .05), after correcting for multiple comparisons and controlling for gender. This topic consisted of words associated with key features of empathic engagement in patient care, such as "empathy," "understanding," "compassion," "perspective," "caring," and "trust." Figure 1 shows the language topic, extracted by linguistic analysis of admission essays, that was significantly correlated with scores on the JSE.

This study shows that some language used in medical school application essays predicts physician empathy. This finding could inform medical school admissions contexts, which are increasingly interested in selecting for more empathetic future physicians. Further, the observed linguistic findings provide insight into how empathy is expressed in language by future physicians more generally.

The words that were associated with empathy may suggest a primary focus on the experience of the patient. The top three words associated with empathy in our sample were "health," "patient," and "care." While these findings may seem nonspecific in a sample of students pursuing a career in medicine, they suggest an interest in patients rather than other aspects of medical practice such as technology, financial gain, professional prestige, or career-related motivations. This finding is interesting in the context of healthcare's current emphasis on patient-centered medicine. Where the physician's role was once to dictate a diagnosis and course of treatment, practitioners are now encouraged to understand and address the individual values and needs of patients in clinical contexts.38 Further research might explore how teaching patient-centered approaches to medicine impacts the empathy of medical students in their clinical training. The other words associated with high empathy scores also reflect key components of empathic engagement in patient care, such as "understanding," "compassion," "human," "feeling," "knowledge," and "trust." In general, our findings provide further support for current characterizations of empathy in healthcare as a cognitive attribute that involves understanding of patients' experiences, coupled with compassionate concern to minimize suffering.

These language results are in line with several specific findings in the research literature on physician empathy. Empathy in medical students is correlated with sociability,38 emotional intelligence,39 and conscientiousness.40 Empathic concern has been linked to prosocial behaviors such as higher rates of organ donation.41 More empathic healthcare practitioners also have more positive attitudes toward integrative care, and cooperative attitudes towards one another.42 Medical students nominated by their classmates for excellence in clinical competence had higher than average empathy scores.43 More empathic medical students also tended to choose people-oriented over technology-oriented specialties.44 These findings complement the language results in the present study by suggesting a link between higher levels of empathy and an orientation towards others, compassionate concern, and emotional intelligence.

Our study had several limitations. First, we had a relatively small sample size by linguistic analysis standards. While the study includes 1,805 participants, a larger than average sample size in most educational studies, many studies using computational linguistic analysis involve an order of magnitude more participants (closer to N = 10,000). Second, this study was conducted at a single private medical school in the northeast of the United States, so care should be taken when generalizing beyond this context, especially in regard to the international medical education community.

Figure 1. Language topic correlated with high physician empathy. This linguistic topic positively correlated with scores on the Jefferson Scale of Empathy (r = .074, p < .05). Larger words indicate a higher correlation strength.

Third, while effect sizes are within standard ranges in linguistic analysis studies, they are relatively low in absolute magnitude. These small effect sizes may be due to a self-presentation bias in responses to the empathy scale and within the essays themselves, which together may have produced a ceiling effect that constrained variance and decreased signal in the language data. In other words, the task of an admissions essay is to represent oneself in the best possible light; therefore, other more spontaneously generated sources of natural language might provide more variance in empathy and should be explored in future studies. Fourth, because our sample included only students who were accepted and chose to attend medical school at a single institution, the variation could be limited by admission selection criteria.

Despite these limitations, the present study represents a step toward better understanding and selecting for more empathic medical students. Previous research has shown that there are no formal criteria for readers of medical school personal statements.45 In one study, ratings of personal statements had no predictive validity for future success.46 For these reasons, some have suggested that a more quantitative assessment should complement the essay reading process.47 While most self-report measures are subject to social desirability bias, linguistic analysis offers one method of bypassing some of these demand characteristics, particularly if applicants are not aware of the traits being considered or the language models that are associated with them.

Computational linguistic analysis is a method currently used in evaluating job applications at large companies48 and may soon be used for admissions purposes in many academic contexts. Computational linguistic models are capable of predicting scores related to a variety of outcome measures based on language alone.49 In other words, our findings suggest that an empathy score could, through future linguistic modeling and validation work, be automatically generated for each medical school admissions essay using this technology.

Empathy assessments have received increasingly widespread attention in medical education. The language findings in the present study shed light on the words correlated with empathy and suggest that physician empathy can be identified in medical school admission essays using these methods. Demonstration of this technological capability to associate empathic orientation with linguistic features is the first step towards admissions committees selecting for more empathetic medical school applicants. Future research should follow up on the specific language themes identified in this study to further specify their relationship with empathy. These linguistic insights may not only impact our understanding of physician empathy but also inform selection committees responsible for medical student admissions.

Acknowledgments

Contributors: The authors would like to thank Dr. Elizabeth Y. Brooks, DPM for her enthusiasm and support of this project as well as Dr. Martin E. P. Seligman, PhD.

Funders: This publication was made possible through the support of a grant from the Templeton Religion Trust. The opinions expressed in this publication are those of the authors and do not necessarily reflect the views of the Templeton Religion Trust. We also thank the Noguchi Medical Research Institute in Tokyo, Japan.

Conflict of Interest

The authors declare that they have no conflict of interest.

  1. Klemenc-Ketis Z and Vrecko H. Development and validation of a professionalism assessment scale for medical students. Int J Med Educ. 2014; 5: 205-211.
  2. Batson CD. The altruism question: toward a social-psychological answer. New York: Psychology Press; 2014.
  3. Bloom P. How do morals change? Nature. 2010; 464: 490.
  4. Buffone AE and Poulin MJ. Empathy, target distress, and neurohormone genes interact to predict aggression for others-even without provocation. Pers Soc Psychol Bull. 2014; 40: 1406-1422.
  5. Davis MH. Measuring individual differences in empathy: Evidence for a multidimensional approach. Journal of Personality and Social Psychology. 1983; 44: 113-126.
  6. Hojat M. Empathy in health professions education and patient care. New York: Springer; 2016.
  7. Hojat M, Vergare MJ, Maxwell K, Brainard G, Herrine SK, Isenberg GA, Veloski J and Gonnella JS. The devil is in the third year: a longitudinal study of erosion of empathy in medical school. Acad Med. 2009; 84: 1182-1191.
  8. Aring CD. Sympathy and empathy. J Am Med Assoc. 1958; 167: 448-452.
  9. Bellet PS and Maloney MJ. The importance of empathy as an interviewing skill in medicine. JAMA. 1991; 266: 1831-1832.
  10. Levinson W. Physician-patient communication. A key to malpractice prevention. JAMA. 1994; 272: 1619-1620.
  11. Hojat M, Mangione S, Nasca TJ, Cohen MJM, Gonnella JS, Erdmann JB, Veloski J and Magee M. The Jefferson Scale of Physician Empathy: development and preliminary psychometric data. Educational and Psychological Measurement. 2001; 61: 349-365.
  12. Hojat M, Louis DZ, Markham FW, Wender R, Rabinowitz C and Gonnella JS. Physiciansʼ empathy and clinical outcomes for diabetic patients. Academic Medicine. 2011; 86: 359-364.
  13. Kim SS, Kaplowitz S and Johnston MV. The effects of physician empathy on patient satisfaction and compliance. Eval Health Prof. 2004; 27: 237-251.
  14. Hojat M, Gonnella JS, Mangione S, Nasca TJ, Veloski JJ, Erdmann JB, Callahan CA and Magee M. Empathy in medical students as related to academic performance, clinical competence and gender. Med Educ. 2002; 36: 522-527.
  15. Ogle J, Bushnell JA and Caputi P. Empathy is related to clinical competence in medical care. Med Educ. 2013; 47: 824-831.
  16. Diseker RA and Michielutte R. An analysis of empathy in medical students before and following clinical experience. Academic Medicine. 1981; 56: 1004-10.
  17. Elizur A and Rosenheim E. Empathy and attitudes among medical students. Academic Medicine. 1982; 57: 675-83.
  18. Fine VK and Therrien ME. Empathy in the doctor-patient relationship. Academic Medicine. 1977; 52: 752-7.
  19. Kramer D, Ber R and Moore M. Impact of workshop on studentsʼ and physiciansʼ rejecting behaviors in patient interviews. Academic Medicine. 1987; 62: 904-10.
  20. Sanson-Fisher RW and Poole AD. Training medical students to empathize: an experimental study. Med J Aust. 1978; 1: 473-476.
  21. Winefield HR and Chur-Hansen A. Evaluating the outcome of communication skill teaching for entry-level medical students: does knowledge of empathy increase? Med Educ. 2000; 34: 90-94.
  22. Charon R. The patient-physician relationship. Narrative medicine: a model for empathy, reflection, profession, and trust. JAMA. 2001; 286: 1897-1902.
  23. Hojat M, Vergare MJ, Maxwell K, Brainard G, Herrine SK, Isenberg GA, Veloski J and Gonnella JS. The devil is in the third year: a longitudinal study of erosion of empathy in medical school. Acad Med. 2009; 84: 1182-1191.
  24. Hemmerdinger JM, Stoddart SD and Lilford RJ. A systematic review of tests of empathy in medicine. BMC Med Educ. 2007; 7: 24.
  25. Hojat M. Assessments of empathy in medical school admissions: what additional evidence is needed? Int J Med Educ. 2014; 5: 7-10.
  26. Schroeder SA, Zones JS and Showstack JA. Academic medicine as a public trust. JAMA. 1989; 262: 803-812.
  27. Muller D. Reforming premedical education--out with the old, in with the new. N Engl J Med. 2013; 368: 1567-1569.
  28. Eichstaedt JC, Schwartz HA, Kern ML, Park G, Labarthe DR, Merchant RM, Jha S, Agrawal M, Dziurzynski LA, Sap M, Weeg C, Larson EE, Ungar LH and Seligman ME. Psychological language on Twitter predicts county-level heart disease mortality. Psychol Sci. 2015; 26: 159-169.
  29. Schwartz HA, Eichstaedt JC, Kern ML, Dziurzynski L, Ramones SM, Agrawal M, Shah A, Kosinski M, Stillwell D, Seligman ME and Ungar LH. Personality, gender, and age in the language of social media: the open-vocabulary approach. PLoS ONE. 2013; 8: 73791.
  30. Gonnella JS, Hojat M and Veloski J. AM last page: the Jefferson longitudinal study of medical education. Acad Med. 2011; 86: 404.
  31. Schwartz HA, Giorgi S, Sap M, Crutchley P, Ungar L, Eichstaedt J. DLATK: Differential language analysis toolkit. Proceedings of the 2017 conference on empirical methods in natural language processing: System demonstrations. Copenhagen, Denmark: Association for Computational Linguistics; 2017. DOI: 10.18653/v1/D17-2010
  32. Blei DM, Ng AY, Jordan MI. Latent dirichlet allocation. Journal of Machine Learning Research. 2003; 3: 993-1022.
  33. Lee DD, Seung HS. Algorithms for non-negative matrix factorization. 14th Annual Neural Information Processing Systems Conference, NIPS 2000; 27 Nov 2000-2 Dec 2000; Denver, CO, United States. Neural information processing systems foundation; 2000.
  34. Schwartz HA and Ungar LH. Data-driven content analysis of social media. Ann Am Acad Pol Soc Sci. 2015; 659: 78-94.
  35. Cichocki A and Phan AH. Fast local algorithms for large scale nonnegative matrix and tensor factorizations. IEICE Trans Fundamentals. 2009; E92-A: 708-721.
  36. Hojat M, Zuckerman M, Magee M, Mangione S, Nasca T, Vergare M and Gonnella JS. Empathy in medical students as related to specialty interest, personality, and perceptions of mother and father. Personality and Individual Differences. 2005; 39: 1205-1215.
  37. Simes RJ. An improved Bonferroni procedure for multiple tests of significance. Biometrika. 1986; 73: 751-754.
  38. Laine C and Davidoff F. Patient-centered medicine. A professional evolution. JAMA. 1996; 275: 152-156.
  39. Austin EJ, Evans P, Goldwater R and Potter V. A preliminary study of emotional intelligence, empathy and exam performance in first year medical students. Personality and Individual Differences. 2005; 39: 1395-1405.
  40. Costa P, Alves R, Neto I, Marvão P, Portela M and Costa MJ. Associations between medical student empathy and personality: a multi-institutional study. PLoS ONE. 2014; 9: 89254.
  41. Cohen EL and Hoffner C. Gifts of giving: the role of empathy and perceived benefits to others and self in young adults' decisions to become organ donors. J Health Psychol. 2013; 18: 128-138.
  42. Calabrese LH, Bianco JA, Mann D, Massello D and Hojat M. Correlates and changes in empathy and attitudes toward interprofessional collaboration in osteopathic medical students. J Am Osteopath Assoc. 2013; 113: 898-907.
  43. Pohl CA, Hojat M and Arnold L. Peer nominations as related to academic attainment, empathy, personality, and specialty interest. Acad Med. 2011; 86: 747-751.
  44. Chen D, Lew R, Hershman W and Orlander J. A cross-sectional measurement of medical student empathy. J Gen Intern Med. 2007; 22: 1434-1438.
  45. Musson DM. Personality and medical education. Med Educ. 2009; 43: 395-397.
  46. Ferguson E, James D and Madeley L. Factors associated with success in medical school: systematic review of the literature. BMJ. 2002; 324: 952-957.
  47. Haque OS and Waytz A. Dehumanization in Medicine. Perspect Psychol Sci. 2012; 7: 176-186.
  48. Faliagka E, Iliadis L, Karydis I, Rigou M, Sioutas S, Tsakalidis A and Tzimas G. On-line consistent ranking on e-recruitment: seeking the truth behind a well-formed CV. Artif Intell Rev. 2014; 42: 515-528.
  49. Guntuku SC, Yaden DB, Kern ML, Ungar LH and Eichstaedt JC. Detecting depression and mental illness on social media: an integrative review. Current Opinion in Behavioral Sciences. 2017; 18: 43-49.