Title Generalizirani linearni modeli
Title (english) Generalized linear models
Author Antonija Andrijević
Mentor Miljenko Huzak (mentor)
Committee member Miljenko Huzak (committee chair)
Committee member Nenad Antonić (committee member)
Committee member Ozren Perše (committee member)
Committee member Ilja Gogić (committee member)
Granter University of Zagreb Faculty of Science (Department of Mathematics) Zagreb
Defense date and country 2024-09-27, Croatia
Scientific / art field, discipline and subdiscipline NATURAL SCIENCES Mathematics
Abstract Za vektor međusobno nezavisnih slučajnih varijabli $\boldsymbol{Y} = (Y_1, \ldots, Y_n)^T$ za koje pretpostavljamo da ovise o vrijednostima $x_1, \ldots, x_p$ generalizirane linearne modele definiramo kao
\begin{equation*} \begin{cases} Y_i \sim EFD(\theta_i) \\ \mathbb{E}[Y_i] = \mu_i = g^{-1}(\boldsymbol{x}_i^T\boldsymbol{\beta}), \end{cases} \end{equation*}
za $i = 1, \ldots, n$. Slučajne varijable $Y_i \sim EFD(\theta_i)$ dolaze iz eksponencijalne familije distribucija u standardnoj formi, čija gustoća ovisi o parametru $\theta_i$. Ta familija uključuje brojne poznate statističke distribucije, uključujući binomnu, normalnu, Poissonovu i gama distribuciju te mnoge druge. Na ovaj način generalizirani linearni modeli omogućuju modeliranje zavisne varijable $\boldsymbol{Y}$ koja pripada i drugim distribucijama, ne samo normalnoj. Nadalje, funkcija $g$ iz gornjeg zapisa je monotona diferencijabilna funkcija koju nazivamo funkcija poveznica. Ona povezuje distribuciju zavisne varijable, njeno očekivanje i varijancu, s linearnom kombinacijom nezavisnih varijabli $\boldsymbol{x}_i^T\boldsymbol{\beta}$. Na ovaj nam način generalizirani linearni modeli omogućuju modeliranje i nelinearnih veza. Nepoznate parametre $\boldsymbol{\beta} = (\beta_0, \beta_1, \ldots, \beta_p)^T$ procjenjujemo metodom najveće vjerodostojnosti, tražeći točku maksimuma $\boldsymbol{b} = (b_0, b_1, \ldots, b_p)^T$ funkcije log-vjerodostojnosti na temelju uzorka $\boldsymbol{y}$. Maksimizacija se svodi na traženje nultočaka parcijalnih derivacija log-vjerodostojnosti. Ovisno o složenosti, nultočke možemo tražiti analitički ili iterativnom težinskom metodom najmanjih kvadrata koja koristi Fisherov algoritam za poboljšanje procjena parametara putem težinske matrice i pseudo-odgovora. Nakon što smo procijenili parametre generaliziranog linearnog modela i dobili jednadžbu modela, želimo provjeriti preciznost modela statističkim zaključivanjem koje uključuje testiranje statističkih hipoteza o značajnosti parametara i modela te usporedbu modela korištenjem asimptotske $N(0,1)$ i $\chi^2$ distribucije, kao i izračunavanje pouzdanih intervala. Nakon što potvrdimo da smo dobili precizan model, možemo ga koristiti za donošenje odluka.
Abstract (english) For a vector of independent random variables $\boldsymbol{Y} = (Y_1, \ldots, Y_n)^T$ that are assumed to depend on the values $x_1, \ldots, x_p$, generalized linear models are defined as
\begin{equation*} \begin{cases} Y_i \sim EFD(\theta_i) \\ \mathbb{E}[Y_i] = \mu_i = g^{-1}(\boldsymbol{x}_i^T\boldsymbol{\beta}), \end{cases} \end{equation*}
for $i = 1, \ldots, n$. The random variables $Y_i \sim EFD(\theta_i)$ come from the exponential family of distributions in standard form, whose density depends on the parameter $\theta_i$. This family includes many well-known statistical distributions, such as the binomial, normal, Poisson, and gamma distributions. Generalized linear models therefore allow the dependent variable $\boldsymbol{Y}$ to follow distributions other than the normal. Furthermore, the function $g$ in the above expression is a monotonic differentiable function called the link function. It connects the distribution of the dependent variable, through its expectation and variance, with the linear combination of the independent variables $\boldsymbol{x}_i^T\boldsymbol{\beta}$. In this way, generalized linear models can also capture nonlinear relationships. The unknown parameters $\boldsymbol{\beta} = (\beta_0, \beta_1, \ldots, \beta_p)^T$ are estimated by the method of maximum likelihood, that is, by finding the maximizer $\boldsymbol{b} = (b_0, b_1, \ldots, b_p)^T$ of the log-likelihood function based on the sample $\boldsymbol{y}$. Maximization reduces to finding the zeros of the partial derivatives of the log-likelihood. Depending on the complexity of the model, the zeros can be found either analytically or by the iteratively weighted least squares method, which uses Fisher's scoring algorithm to improve the parameter estimates via a weight matrix and pseudo-responses. Once the parameters of the generalized linear model have been estimated and the model equation obtained, the accuracy of the model is assessed through statistical inference: hypothesis tests for the significance of individual parameters and of the model as a whole, model comparison based on the asymptotic $N(0,1)$ and $\chi^2$ distributions, and the computation of confidence intervals. Once the accuracy of the model has been confirmed, it can be used for decision-making.
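The estimation and inference steps described in the abstract can be illustrated with a minimal sketch. It is not taken from the thesis: it assumes a Poisson response with the log link $g(\mu) = \log \mu$, and the function name `irls_poisson`, the simulated data, and the starting value $\boldsymbol{b} = \boldsymbol{0}$ are hypothetical choices made for illustration only. Each iteration builds the weights and pseudo-responses, solves a weighted least squares problem, and the final inverse Fisher information gives approximate standard errors for statistics based on the asymptotic $N(0,1)$ distribution.

```python
import numpy as np

def irls_poisson(X, y, n_iter=50, tol=1e-10):
    """Sketch: fit a Poisson GLM with log link by iteratively weighted least squares."""
    b = np.zeros(X.shape[1])             # assumed starting value
    for _ in range(n_iter):
        eta = X @ b                      # linear predictor x_i^T b
        mu = np.exp(eta)                 # inverse link g^{-1}(eta)
        w = mu                           # weights (dmu/deta)^2 / Var(Y_i) = mu_i
        z = eta + (y - mu) / mu          # pseudo-responses
        XtW = X.T * w                    # X^T W without forming the diagonal matrix
        b_new = np.linalg.solve(XtW @ X, XtW @ z)
        converged = np.max(np.abs(b_new - b)) < tol
        b = b_new
        if converged:
            break
    mu = np.exp(X @ b)
    cov = np.linalg.inv((X.T * mu) @ X)  # inverse Fisher information at b
    se = np.sqrt(np.diag(cov))           # approximate standard errors
    return b, se

# Simulated data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([0.5, 0.3])
y = rng.poisson(np.exp(X @ beta_true))

b, se = irls_poisson(X, y)
z_stats = b / se                         # statistics for testing H0: beta_j = 0
ci_lower = b - 1.96 * se                 # approximate 95% confidence intervals
ci_upper = b + 1.96 * se
print("estimates:", b)
print("z statistics:", z_stats)
```

At convergence the score $\boldsymbol{X}^T(\boldsymbol{y} - \boldsymbol{\mu})$ is numerically zero, which is the zero of the partial derivatives of the log-likelihood that the abstract refers to. For other distributions or link functions the weights and pseudo-responses take a different form.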
Keywords
slučajne varijable
generalizirani linearni modeli
Keywords (english)
random variables
generalized linear models
Language Croatian
URN:NBN urn:nbn:hr:217:531545
Study programme Title: Mathematical Statistics Study programme type: university Study level: graduate Academic / professional title: sveučilišni magistar matematike (sveučilišni magistar matematike)
Type of resource Text
File origin Born digital
Access conditions Open access
Created on 2025-02-05 12:26:27