Journal article
License

ClosedAccess (closed access)
 

Improving Group Lasso for High-Dimensional Categorical Data

cris.lastimport.scopus: 2024-02-12T20:24:05Z
dc.abstract.en: Sparse modeling or model selection with categorical data is challenging even for a moderate number of variables, because roughly one parameter is needed to encode one category or level. The Group Lasso is a well-known, efficient algorithm for selection of continuous or categorical variables, but all estimates related to a selected factor usually differ. Therefore, a fitted model may not be sparse, which makes model interpretation difficult. To obtain a sparse solution of the Group Lasso, we propose the following two-step procedure: first, we reduce data dimensionality using the Group Lasso; then, to choose the final model, we use an information criterion on a small family of models prepared by clustering levels of individual factors. As a consequence, our procedure reduces the dimensionality of the Group Lasso and strongly improves the interpretability of the final model. Importantly, this reduction results in only a small increase in the prediction error. In the paper we investigate the selection correctness of the algorithm in a sparse high-dimensional scenario. We also test our method on synthetic as well as real data sets and show that it outperforms state-of-the-art algorithms with respect to prediction accuracy, model dimension and execution time. Our procedure is implemented in the R package DMRnet, which is available in the CRAN repository.
dc.affiliation: Uniwersytet Warszawski
dc.conference.country: Czech Republic
dc.conference.datefinish: 2023-07-05
dc.conference.datestart: 2023-07-03
dc.conference.place: Prague
dc.conference.series: International Conference on Computational Science
dc.conference.seriesshortcut: ICCS
dc.conference.shortcut: ICCS 2023
dc.conference.weblink: https://www.iccs-meeting.org/iccs2023/
dc.contributor.author: Sołtys, Agnieszka
dc.contributor.author: Rejchel, Wojciech
dc.contributor.author: Pokarowski, Piotr
dc.contributor.author: Nowakowski, Szymon
dc.date.accessioned: 2024-01-25T03:59:01Z
dc.date.available: 2024-01-25T03:59:01Z
dc.date.issued: 2023
dc.description.finance: No-cost publication
dc.identifier.doi: 10.1007/978-3-031-36021-3_47
dc.identifier.uri: https://repozytorium.uw.edu.pl//handle/item/109118
dc.identifier.weblink: https://dl.acm.org/doi/abs/10.1007/978-3-031-36021-3_47
dc.language: eng
dc.pbn.affiliation: computer and information sciences
dc.relation.pages: 455-470
dc.rights: ClosedAccess
dc.sciencecloud: nosend
dc.title: Improving Group Lasso for High-Dimensional Categorical Data
dc.type: JournalArticle
dspace.entity.type: Publication