Deep learning algorithms for automatic classification of electronic medical records starting from free text

Zoom Meeting - Padova - Unit of Biostatistics, Epidemiology and Public Health Biomedicine Data Mining and Artificial Intelligence Labs

09.12.2021

Deep learning algorithms for automatic classification of electronic medical records starting from free text - Techniques, methods, and deep learning procedures


In this presentation, we will go over some automatic text classification techniques, with particular attention to deep learning techniques, methods, and procedures, analyzing their steps, peculiarities, and opportunities, and comparing them with classic machine learning techniques and more traditional approaches to text analysis. The case study uses the Pedianet project's database, specifically a snapshot of 6,903,035 medical records of pediatric visits between January 1, 2004, and August 23, 2017, collected by 144 pediatricians throughout Italy. It will show how free-text notes, such as pediatricians' diaries, prescriptions, specialist examination results, and other types of textual notes, can be used to automatically find otitis cases and classify records into six categories: non-otitis, non-middle ear infections, non-acute otitis media, acute otitis media (AOM) that is neither recurrent nor accompanied by perforation of the tympanic membrane, AOM with perforation, and recurrent AOM. Classifying ear infections by type is of great general interest in the pediatric field, especially in Italy. Ear infections are the leading cause of antibiotic prescriptions in children; in addition, Italian pediatricians are under no obligation to use international disease codes or other standard codes, so diagnoses are mostly reported in free-text notes. Searching millions of free-text notes "by hand" is so time-consuming and resource-intensive that it is not practically feasible. A subset of 6,531 records was manually classified by two experts in pediatrics and then verified by a specialist pediatrician; one part was used as a training set and a separate part as a verification set for the developed models. On the verification set, the final, fully automated ensemble model performed in line with, and slightly better than, both experts.

Link: https://bit.ly/UBEP-KAROLINSKA-INVITED-ROUNDTABLE
For information: Dario Gregori.