Title Određivanje slijednosti teksta metodama dubokog učenja (Recognizing Textual Entailment Using Deep Learning Methods)
Author Dinko Ždravac
Mentor Tomislav Šmuc (mentor)
Committee member Tomislav Šmuc (committee chair)
Committee member Zlatko Drmač (committee member)
Committee member Miljenko Marušić (committee member)
Committee member Dijana Ilišević (committee member)
Granter University of Zagreb Faculty of Science (Department of Mathematics) Zagreb
Defense date and country 2018-11-28, Croatia
Scientific / art field, discipline and subdiscipline NATURAL SCIENCES Mathematics
Abstract U ovom radu obrađen je problem prepoznavanja slijednosti teksta, kao bitne komponente mnogih drugih problema obrade prirodnog jezika. Opisan je pristup rješavanju problema tehnikama dubokog učenja. Istaknute su prednosti modela dubokog učenja u odnosu na tradicionalno oblikovane modele strojnog učenja s ručnim probirom značajki. Izneseni su osnovni koncepti strojnog učenja potrebni za razumijevanje izloženog. Prezentirana su tri znanstvena rada koji predstavljaju najnovije tehnike rješavanja zadatka slijednosti. Radovi su razloženi po cjelinama, analizirani i uspoređeni međusobno. Time je pružen pregled suvremenih pristupa rješavanju problema slijednosti dubokim učenjem. Posljednji dio predstavlja praktičan aspekt rada, s programskom implementacijom prezentiranog ESIM modela, u drugom programskom okviru (Keras). Kao polazna korištena je tuđa implementacija bazirana na ESIM modelu, koja inicijalno nije bila usklađena s arhitekturom originalnog ESIM-a. Dorađena Keras ESIM implementacija usklađena je s arhitekturom originalnog ESIM-a te postiže rezultate blizu rezultata originalnog modela. Provedena je Bayesova optimizacija hiperparametara nad modelom, što je rezultiralo poboljšanjem performansi. Rezultati optimizacije su predstavljeni i analizirani. Također, Bayesova optimizacija istaknula je doprinos pojedinih hiperparametara na uspjeh modela u odnosu na druge, što predstavlja dobru osnovu za nastavak rada nad ovim rezultatima. Model je testiran nad novo objavljenim skupom podataka velikog volumena, MNLI, te su postignuti rezultati blizu onima referentnog ESIM modela iz tog rada. Naposljetku, testiran je prijenos znanja modela na različite skupove i analizirani su rezultati po kategorijama. U usporedbi s referentnim modelom, ovdje korišten Keras ESIM postiže općenito tek nešto niže performanse, prestižući ga u nekim kategorijama.
Također, pokazana su poboljšanja ESIM modela treniranog nad združenim MNLI i SNLI skupovima u odnosu na treniranje samo nad jednim skupom. Premda MNLI skup predstavlja bogat i raznolik skup za treniranje dubokih modela u odnosu na SNLI, pokazan je i značajan doprinos SNLI skupa performansama modela u odnosu na treniranje samo nad MNLI-em.
Abstract (English) This thesis addresses recognizing textual entailment (RTE), an important component of many other natural language processing tasks. It describes a deep learning approach to the problem and highlights the advantages of deep models over traditionally designed machine learning models that rely on hand-crafted features. The basic machine learning concepts needed to follow the material are explained. Three scientific papers are presented that introduce recent techniques for textual entailment recognition using deep learning models. The papers are broken down into parts, analyzed, and compared with one another, giving an overview of contemporary deep learning approaches to natural language inference. The last part of the thesis covers the practical work: an implementation of the described ESIM model in a different framework (Keras). An existing third-party implementation based on ESIM, which initially differed from the original code, was used as a starting point. After that implementation was aligned with the original ESIM model code, the resulting Keras ESIM achieved results close to those of the original model and was used for the remaining experiments. Model performance was improved through Bayesian optimization of hyperparameters, and the optimization results are presented and analyzed. Besides improving results, Bayesian optimization showed that some hyperparameters contribute more to model performance than others, which lays the groundwork for further work. The model was also tested on MNLI, a newly released large dataset for natural language inference, achieving results close to those of the reference ESIM model from the MNLI paper. Lastly, knowledge transfer was tested across multiple datasets, with results analyzed by category. Compared to the reference model, the Keras ESIM from this thesis achieves only marginally lower performance overall and even surpasses the reference ESIM in some categories.
In addition, an ESIM model trained jointly on the combined MNLI and SNLI datasets shows performance improvements over a model trained on a single dataset. Although MNLI is a richer and more linguistically diverse dataset than SNLI for training deep models, combining the two datasets yields a significant improvement over training on MNLI alone, which shows that the SNLI corpus contributes to model training.
Keywords
problem prepoznavanja slijednosti teksta
obrada prirodnog jezika
duboko učenje
strojno učenje
ESIM model
Bayesova optimizacija
Keywords (English)
recognizing textual entailment
RTE
natural language processing
deep learning
machine learning
ESIM model
Bayesian optimization
Language Croatian
URN:NBN urn:nbn:hr:217:366777
Study programme Title: Computer Science and Mathematics Study programme type: university Study level: graduate Academic / professional title: magistar/magistra računarstva i matematike (Master of Computer Science and Mathematics)
Type of resource Text
File origin Born digital
Access conditions Open access
Created on 2019-07-09 10:42:03