Please use this identifier to cite or link to this item: https://hdl.handle.net/1889/2296
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Cerioli, Andrea | -
dc.contributor.author | Morelli, Gianluca | -
dc.date.accessioned | 2013-07-17T09:47:52Z | -
dc.date.available | 2013-07-17T09:47:52Z | -
dc.date.issued | 2013-04-11 | -
dc.identifier.uri | http://hdl.handle.net/1889/2296 | -
dc.description.abstract | Cluster analysis is the generic name for the techniques that aggregate n units into k groups, where k is usually much smaller than n. Classification is useful in many fields, including machine learning, pattern recognition, image analysis, information retrieval, bioinformatics and market research. More generally, cluster analysis is appropriate whenever we need to identify groups of units that behave similarly. The main objective of this work is to find an effective cluster analysis method that can be applied in different settings, and in particular to market research. To this end, we compare different methods in order to identify the strongest classification method, if one exists, given the data structure, and so obtain an optimal allocation for each data set. We compare existing methods with new ones based on robust approaches, which have shown high efficiency in many simulations performed so far. The computational part of the work was carried out in MATLAB. The thesis is structured as follows. The first chapter focuses on the problem of identifying outliers and on how they affect the different classification techniques. In particular we consider: a) k-means, which is the reference benchmark given its widespread use in the economic sciences; b) trimmed k-means, a robust version of k-means developed in the late 1990s; c) TCLUST, one of the robust methods attracting the most research effort in the statistical literature; d) the Forward Search, a robust method developed in large part within the Department of Economics of the University of Parma and the London School of Economics, whose potential for classification is still largely unexplored. The second chapter tests these methods on simulated data sets generated from various types of distributions with different degrees of overlap among observations. The purpose is to understand which method, and which calibration of its parameters, gives the best classification. The classification results are then measured through performance indices of correct allocation, which make it possible to compare the different methods. In the third chapter we test the methods on a real data set of marketing interest. Finally, the thesis concludes with an appendix that describes the computational contributions of the work. | it
dc.language.iso | English | it
dc.publisher | Università di Parma. Dipartimento di Economia | it
dc.relation.ispartofseries | Dottorato di ricerca in Economia | it
dc.rights | © Gianluca Morelli, 2013 | it
dc.subject | Classification | it
dc.subject | Graphical dynamic clustering | it
dc.subject | Robustness | it
dc.title | A comparison of different classification methods | it
dc.type | Doctoral thesis | it
dc.subject.soggettario | Campionamento a grappoli (cluster sampling) | it
dc.subject.soggettario | Classificazione - Metodi matematici (classification - mathematical methods) | it
dc.subject.soggettario | Analisi multivariata (multivariate analysis) | it
dc.subject.miur | SECS-S/01 | it
Appears in Collections: Economia. Tesi di dottorato

Files in This Item:
File | Description | Size | Format
G_Morelli_phd_thesis.pdf (Until 2101-01-01) | Doctoral thesis | 6.09 MB | Adobe PDF

