Journal article
Scalable Machine Learning with Granulated Data Summaries: A Case of Feature Selection
Author
Publication date
2017
Abstract (EN)
We investigate how to use the histogram-based data summaries that are created and stored by one of the approximate database engines available on the market to redesign and accelerate machine learning algorithms. As an example, we consider one of the popular minimum redundancy maximum relevance (mRMR) feature selection methods based on mutual information. We use granulated data summaries to approximately calculate the entropy-based mutual information scores and compare the resulting mRMR outcomes with those obtained from the actual scores derived from the original data.
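As a rough illustration of the idea in the abstract, the following minimal Python sketch estimates mutual information from coarse bucket counts and uses it in a greedy mRMR loop. It is not the method from the article: equal-width binning of raw columns is only an assumed stand-in for the engine's histogram summaries, which this record does not describe, the mRMR score is the plain "relevance minus mean redundancy" variant, and all function names are hypothetical.

```python
import math
from collections import Counter


def bin_column(values, n_bins):
    """Map numeric values to equal-width bucket indices.
    Assumption: this stands in for a stored histogram summary."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(xs, ys):
    """Mutual information (in bits) of two equal-length discrete sequences,
    computed only from joint and marginal counts."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def mrmr(columns, target, k, n_bins=2):
    """Greedy mRMR: repeatedly pick the feature with the highest
    relevance to the target minus mean redundancy with chosen features."""
    binned = {name: bin_column(col, n_bins) for name, col in columns.items()}
    relevance = {name: mutual_information(b, target) for name, b in binned.items()}
    selected = []
    while len(selected) < min(k, len(binned)):
        best_name, best_score = None, -math.inf
        for name, b in binned.items():
            if name in selected:
                continue
            redundancy = 0.0
            if selected:
                redundancy = sum(
                    mutual_information(b, binned[s]) for s in selected
                ) / len(selected)
            score = relevance[name] - redundancy
            if score > best_score:  # ties keep the earlier feature
                best_name, best_score = name, score
        selected.append(best_name)
    return selected


# Toy data: "b" duplicates "a", while "c" carries complementary information.
x = [0.0, 0.0, 1.0, 1.0] * 2
y = [0.0, 1.0, 0.0, 1.0] * 2
target = [int(a + 2 * b) for a, b in zip(x, y)]
print(mrmr({"a": x, "b": list(x), "c": y}, target, k=2))  # → ['a', 'c']
```

In this toy case the duplicate feature "b" is skipped because its redundancy with "a" cancels its relevance. Coarser buckets make the scores cheaper to compute but less exact, which is the kind of trade-off the abstract refers to.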
Keywords (EN)
Data granulation
Approximate query
Feature selection
PBN discipline
computer science
Pages (from-to)
519-529
Open access license
Closed access