A case study of named entity recognition in biomedicine

: — — , , (NER)? , , "Machine Learning Deep Learning" .






. , , .





, QuickUMLS. QuickUMLS [1] — — (, , ) , (UMLS). . QuickUMLS . QuickUMLS MedMentions [2].





Figure 1. A schematic description of how QuickUMLS works. Given a string and a UMLS database converted to a simstring database, the model returns the best matches, concept identifiers, and semantic types.

, NER

, , NER. NER (, , , . .) . , , , . , , " ", , — , - . , , "" — , "", , .





NER , , , , "." (hospital), " / " "/" (alcohol). , , , . , "alcohol" " alcohol" [ , , alcohol]. , , , , . NER . Slimmer AI, .





, , , , , . (UMLS), , . , "" "", . , "alcohol" .





UMLS (CUI), , (STY), , , , . , UMLS , , — . UMLS 2020AB, , 3 . , .





MedMentions

MedMentions. 4 392 ( ), Pubmed 2016 ; 352 K ( CUI) UMLS. 34 — 1 % UMLS. , UMLS , .





, MedMentions CUI, . , , UMLS . UMLS 127 , . MedMentions — st21pv, , , 21 .





45,3 F- [2]. , BlueBERT [3] BioBERT [4], 56,3 , [5]. , , , . , . MedMentions.





QuickUMLS:

BERT QuickUMLS , , . , QuickUMLS — , . , , , , . :





  1. . , .





  2. . , , . — zero-shot.





Zero-shot learning (ZSL) — , , , , .





, , MedMentions. , MedMentions UMLS, . , MedMentions , .





QuickUMLS

QuickUMLS . spacy. n-, , , -.  , n-, . [1]. UMLS , , n-. , simstring [6]. QuickUMLS, , UMLS . , “ ”, ( ) 0,7, :





patient:

{'term': 'Inpatient', 'cui': 'C1548438', 'similarity': 0.71, 'semtypes': {'T078'}, 'preferred': 1},
{'term': 'Inpatient', 'cui': 'C1549404', 'similarity': 0.71, 'semtypes': {'T078'}, 'preferred': 1},
{'term': 'Inpatient', 'cui': 'C1555324', 'similarity': 0.71, 'semtypes': {'T058'}, 'preferred': 1},
{'term': '*^patient', 'cui': 'C0030705', 'similarity': 0.71, 'semtypes': {'T101'}, 'preferred': 1},
{'term': 'patient', 'cui': 'C0030705', 'similarity': 1.0, 'semtypes': {'T101'}, 'preferred': 0},
{'term': 'inpatient', 'cui': 'C0021562', 'similarity': 0.71, 'semtypes': {'T101'}, 'preferred': 0}



hemorrhage:

{'term': 'No hemorrhage', 'cui': 'C1861265', 'similarity': 0.72, 'semtypes': {'T033'}, 'preferred': 1},
{'term': 'hemorrhagin', 'cui': 'C0121419', 'similarity': 0.7, 'semtypes': {'T116', 'T126'}, 'preferred': 1},
{'term': 'hemorrhagic', 'cui': 'C0333275', 'similarity': 0.7, 'semtypes': {'T080'}, 'preferred': 1},
{'term': 'hemorrhage', 'cui': 'C0019080', 'similarity': 1.0, 'semtypes': {'T046'}, 'preferred': 0},
{'term': 'GI hemorrhage', 'cui': 'C0017181', 'similarity': 0.72, 'semtypes': {'T046'}, 'preferred': 0},
{'term': 'Hemorrhages', 'cui': 'C0019080', 'similarity': 0.7, 'semtypes': {'T046'}, 'preferred': 0}



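As the listings show, the word "patient" has three candidates with the correct semantic type (T101) and two with the correct concept (C0030705). The word "hemorrhage" also attracts spurious candidates, including the concept "No hemorrhage". Even so, when the candidates are ranked by similarity, the top one is correct in both cases.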
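The candidate filtering described above can be illustrated with a minimal sketch. It is not QuickUMLS or simstring code: the padded character trigrams, the helper names (`char_ngrams`, `jaccard`, `candidates`) and the toy dictionary are assumptions made for illustration, with terms and CUIs copied from the listings. Scores from this sketch will therefore not necessarily match the similarities QuickUMLS reports.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams of a lowercased, space-padded string."""
    padded = " " * (n - 1) + text.lower() + " " * (n - 1)
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity between two feature sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def candidates(query, dictionary, threshold=0.7):
    """Return (term, cui, similarity) tuples at or above the threshold, best first."""
    q = char_ngrams(query)
    scored = [(term, cui, jaccard(q, char_ngrams(term))) for term, cui in dictionary]
    kept = [c for c in scored if c[2] >= threshold]
    return sorted(kept, key=lambda c: c[2], reverse=True)


# Toy dictionary (assumption): a handful of terms and CUIs from the listings.
TOY_UMLS = [
    ("hemorrhage", "C0019080"),
    ("Hemorrhages", "C0019080"),
    ("hemorrhagic", "C0333275"),
    ("No hemorrhage", "C1861265"),
    ("patient", "C0030705"),
]

matches = candidates("hemorrhage", TOY_UMLS)        # default threshold 0.7
loose = candidates("hemorrhage", TOY_UMLS, 0.5)     # a looser threshold admits near matches
for term, cui, similarity in loose:
    print(f"{term:15s} {cui} {similarity:.2f}")
```

Lowering the threshold lets in more near matches, which is the trade-off the similarity threshold controls: more candidates to disambiguate in exchange for tolerance to spelling variation.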





QuickUMLS , , 1, . () — (baseline model). seqeval , [5].





╔═══╦══════╦═══════╗
║   ║ BERT ║ QUMLS ║
╠═══╬══════╬═══════╣
║ P ║ .53  ║ .27   ║
║ R ║ .58  ║ .36   ║
║ F ║ .56  ║ .31   ║
╚═══╩══════╩═══════╝
Table 1. Baseline results (P: precision, R: recall, F: F-score)
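For readers who want to see how entity-level precision, recall and F-score are obtained, here is a small plain-Python sketch. It is a stand-in for illustration, not seqeval itself: it assumes that a prediction counts as correct only when its span boundaries and label match a gold annotation exactly. The function name `prf` and the example spans are hypothetical.

```python
def prf(gold, predicted):
    """Exact-match precision, recall and F1 over sets of (start, end, label) spans."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations: (start offset, end offset, semantic type).
gold = {(0, 7, "T101"), (14, 24, "T046"), (30, 37, "T033")}
pred = {(0, 7, "T101"), (14, 24, "T080")}  # second span has the wrong type

p, r, f = prf(gold, pred)
print(f"P={p:.2f} R={r:.2f} F={f:.2f}")
```

Because the F-score is the harmonic mean of precision and recall, a model with a precision of .27 and a recall of .36 scores about .31, which matches the QuickUMLS column.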
      
      



, ? , , . , .





QuickUMLS

QuickUMLS . -, , , QuickUMLS, spacy, . . en_core_web_sm. , , . spacy scispacy [7], en_core_sci_sm. - .





╔═══╦══════╦═══════╦═════════╗
║   ║ BERT ║ QUMLS ║ + Spacy ║
╠═══╬══════╬═══════╬═════════╣
║ P ║ .53  ║ .27   ║ .29     ║
║ R ║ .58  ║ .36   ║ .37     ║
║ F ║ .56  ║ .31   ║ .32     ║
╚═══╩══════╩═══════╩═════════╝
Table 2. Results with scispacy
      
      



, . QuickUMLS , - . , “” : , , , .





QuickUMLS

QuickUMLS 0,7 . , , “Jaccard”, “cosine”, “overlap” “dice”. , . 0,99, , SimString “Jaccard”, . , BERT.





╔═══╦══════╦═══════╦═════════╦════════╗
║   ║ BERT ║ QUMLS ║ + Spacy ║ + Grid ║
╠═══╬══════╬═══════╬═════════╬════════╣
║ P ║ .53  ║ .27   ║ .29     ║ .37    ║
║ R ║ .58  ║ .36   ║ .37     ║ .37    ║
║ F ║ .56  ║ .31   ║ .32     ║ .37    ║
╚═══╩══════╩═══════╩═════════╩════════╝
Table 3. Results after grid search
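The four similarity measures named above ("Jaccard", "cosine", "overlap" and "dice") can be compared on a development set with a simple grid search over measure and threshold. The sketch below is only an illustration under stated assumptions: the set-based definitions are the usual ones for feature sets, while the trigram features, the tiny `DEV_PAIRS` set, the threshold grid and the helper names are hypothetical and are not taken from QuickUMLS or from the experiments reported here.

```python
import math
from itertools import product

# Standard set-based similarity measures over feature sets x and y.
SIMILARITIES = {
    "jaccard": lambda x, y: len(x & y) / len(x | y),
    "cosine": lambda x, y: len(x & y) / math.sqrt(len(x) * len(y)),
    "dice": lambda x, y: 2 * len(x & y) / (len(x) + len(y)),
    "overlap": lambda x, y: len(x & y) / min(len(x), len(y)),
}


def trigrams(text):
    """Character trigrams of a lowercased, space-padded string (an assumption)."""
    padded = f"  {text.lower()}  "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


# Hypothetical development pairs: (mention, dictionary term, is the term correct?).
DEV_PAIRS = [
    ("hemorrhage", "hemorrhage", True),
    ("hemorrhage", "No hemorrhage", False),
    ("hemorrhage", "hemorrhagic", False),
    ("patient", "patient", True),
    ("patient", "Inpatient", False),
]


def f1_for(name, threshold, pairs):
    """F1 of accepting every pair whose similarity reaches the threshold."""
    sim = SIMILARITIES[name]
    tp = fp = fn = 0
    for mention, term, is_correct in pairs:
        accepted = sim(trigrams(mention), trigrams(term)) >= threshold
        if accepted and is_correct:
            tp += 1
        elif accepted:
            fp += 1
        elif is_correct:
            fn += 1
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def grid_search(pairs, thresholds=(0.7, 0.8, 0.9, 0.99)):
    """Return the first (measure, threshold) setting with the highest F1."""
    grid = product(SIMILARITIES, thresholds)
    return max(grid, key=lambda setting: f1_for(*setting, pairs))


best_name, best_threshold = grid_search(DEV_PAIRS)
print(best_name, best_threshold)
```

On a toy set like this one several settings tie, so the search simply returns the first of them. On real data the grid would be scored against held-out annotations rather than hand-written pairs.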
      
      



, , , , . , , , “alcohol”. , , , . , , , , . .





, . , , , , , . . , .





╔═══╦══════╦═══════╦═════════╦════════╦══════════╗
║   ║ BERT ║ QUMLS ║ + Spacy ║ + Grid ║ + Priors ║
╠═══╬══════╬═══════╬═════════╬════════╬══════════╣
║ P ║ .53  ║ .27   ║ .29     ║ .37    ║ .39      ║
║ R ║ .58  ║ .36   ║ .37     ║ .37    ║ .39      ║
║ F ║ .56  ║ .31   ║ .32     ║ .37    ║ .39      ║
╚═══╩══════╩═══════╩═════════╩════════╩══════════╝
Table 4. Results with priors
      
      



, , , QuickUMLS. , 0,99, , QuickUMLS. , QuickUMLS.





: ?

, . -, , . , , : , , “alcohol” , . -, , . “ ”. . — “ ”, “”. - , , UMLS , . , :





, , QuickUMLS, . , , , . , QuickUMLS , .





, NER . , R&D . QuickUMLS , . , , , . QuickUMLS , github. , , , , , .





— , — : , , , .










[1] L. Soldaini and N. Goharian, QuickUMLS: a fast, unsupervised approach for medical concept extraction, (2016), MedIR workshop, SIGIR

[2] S. Mohan and D. Li, MedMentions: a large biomedical corpus annotated with UMLS concepts, (2019), arXiv preprint arXiv:1902.09476

[3] Y. Peng, Q. Chen, and Z. Lu, An empirical study of multi-task learning on BERT for biomedical text mining, (2020), arXiv preprint arXiv:2005.02799

[4] J. Lee, W. Yoon, S. Kim, D. Kim, S. Kim, C.H. So, and J. Kang, BioBERT: a pre-trained biomedical language representation model for biomedical text mining, (2020), Bioinformatics, 36(4)

[5] K.C. Fraser, I. Nejadgholi, B. De Bruijn, M. Li, A. LaPlante, and K.Z.E. Abidine, Extracting UMLS concepts from medical text using general and domain-specific deep learning models, (2019), arXiv preprint arXiv:1910.01274

[6] N. Okazaki and J.I. Tsujii, Simple and efficient algorithm for approximate dictionary matching, (2010), Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

[7] M. Neumann, D. King, I. Beltagy, and W. Ammar, ScispaCy: fast and robust models for biomedical natural language processing, (2019), arXiv preprint arXiv:1902.07669




