Analyzing U.S. Army Officer Evaluation Reports with Natural Language Processing: A Log-Odds and Latent Dirichlet Allocation Exploration
Issue section:
Research
Keywords:
Text Mining, Topic Modeling, Latent Dirichlet Allocation, Officer Evaluation Reports (OERs)
Abstract
Each job field (branch) in the Army requires a unique set of skills and talents from the officers assigned to it. Officers who demonstrate the required skills are often more successful in their assigned branch. To better understand how success is described across branches, this research applied text mining and text analysis to a data set of Officer Evaluation Reports (OERs). It looked for common trends and discrepancies across individual branches and groups of similar branches by analyzing the narrative portion of OERs. Text analysis methods examined the words and bigrams commonly used to describe varying degrees of officer performance. Topic modeling using Latent Dirichlet Allocation (LDA) was also conducted on top-rated narratives to investigate trends and discrepancies in how the narratives cluster. Findings show that, regardless of branch, the qualitative narratives for the top two performance designations fail to differentiate between officers’ levels of performance.
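
To make the word-level comparison concrete, the following is a minimal sketch of a smoothed log-odds ratio between two groups of narratives. It is an illustration only: the abstract does not state the exact log-odds formulation used, so the additive smoothing constant `alpha`, the whitespace tokenizer, and the function name `log_odds` are assumptions, and the four short narratives are invented placeholders rather than text from any OER.

```python
import math
from collections import Counter


def log_odds(group_a, group_b, alpha=0.5):
    """Smoothed log-odds ratio of each word in group_a relative to group_b.

    Positive scores mark words more characteristic of group_a,
    negative scores mark words more characteristic of group_b.
    `alpha` is an assumed additive smoothing constant.
    """
    # Naive lowercase/whitespace tokenization (an assumption for brevity).
    counts_a = Counter(w for doc in group_a for w in doc.lower().split())
    counts_b = Counter(w for doc in group_b for w in doc.lower().split())
    vocab = set(counts_a) | set(counts_b)
    n_a, n_b, v = sum(counts_a.values()), sum(counts_b.values()), len(vocab)

    scores = {}
    for w in vocab:
        # Smoothed probability of the word in each group.
        p_a = (counts_a[w] + alpha) / (n_a + alpha * v)
        p_b = (counts_b[w] + alpha) / (n_b + alpha * v)
        # Difference of log-odds between the two groups.
        scores[w] = math.log(p_a / (1 - p_a)) - math.log(p_b / (1 - p_b))
    return scores


# Invented placeholder narratives, not drawn from real evaluation reports.
top_rated = ["best officer in the brigade", "promote ahead of peers"]
second_rated = ["solid officer in the battalion", "promote along with peers"]

scores = log_odds(top_rated, second_rated)
for word, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{word:10s} {score:+.3f}")
```

Words with scores near zero appear at similar rates in both groups. If most of the vocabulary falls near zero, the two groups of narratives are hard to tell apart on word choice alone, which is the kind of pattern the findings describe for the top two performance designations.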