Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/1524
Title: Positive region: An enhancement of partitioning attribute based rough set for categorical data
Authors: Baroad, Muftah Mohamed 
Mohd Hashim, Siti Zaiton 
Ahsan, Jamal Uddin 
Zainal, Anazida 
Keywords: Clustering, Rough Set Theory; Performance, Partitioning Categorical Data; Attribute Dependency
Issue Date: 2020
Publisher: Periodicals of Engineering and Natural Sciences
Journal: Periodicals of Engineering and Natural Sciences 
Abstract: 
Datasets with multi-valued attributes arise in many domains, such as pattern recognition, machine learning and data mining, and often need to be partitioned. Selecting a partitioning attribute is the clustering step that splits the whole data set for further processing. Several prominent rough set-based approaches already exist for grouping objects and handling uncertain data; they use indiscernibility and mean roughness measures to select the partitioning attribute. Nevertheless, most partitioning attribute selection methods for clustering categorical data are incapable of optimal partitioning. The indiscernibility and mean roughness measures require calculating the lower approximation, which is less accurate and expensive to compute. This reduces the growth of the set of attributes and neglects the data found within the boundary region. This paper presents a new concept, Positive Region Based Dependency (PRD), for calculating attribute dependency. The method is defined by a positive region-based mean dependency measure, which determines the mean dependency of the attributes and is suitable for categorical datasets. By avoiding the lower approximation, PRD is an optimal substitute for the conventional dependency measure in partitioning attribute selection. In contrast to traditional RST partitioning methods, the proposed method can be employed as a measure of data output uncertainty and as a tailback for larger and multiple data clustering. The performance of the proposed method is evaluated and compared with the Information-Theoretical Dependence Roughness (ITDR) and Maximum Indiscernible Attribute (MIA) algorithms.
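The abstract does not give the PRD algorithm itself, so the following is only a minimal sketch of the general idea of a positive region-based dependency between two categorical attributes, under stated assumptions. It uses the standard rough set dependency degree, the fraction of objects whose equivalence class under one attribute falls entirely inside a single class of the other, and counts those objects directly by grouping instead of building lower approximation sets. The function names `dependency` and `mean_dependency`, the direction of the dependency, the rule of averaging over all other attributes, and the toy data are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def dependency(rows, cond, dec):
    """Fraction of objects in the positive region of `dec` with respect to `cond`.

    An object is counted when every object sharing its `cond` value
    also shares its `dec` value (assumed standard dependency degree).
    """
    dec_values = defaultdict(set)   # cond value -> set of dec values seen with it
    class_size = defaultdict(int)   # cond value -> number of objects with that value
    for row in rows:
        dec_values[row[cond]].add(row[dec])
        class_size[row[cond]] += 1
    # A cond-class belongs to the positive region if it maps to exactly one dec value.
    positive = sum(n for v, n in class_size.items() if len(dec_values[v]) == 1)
    return positive / len(rows)


def mean_dependency(rows, attr):
    """Average dependency of every other attribute on `attr` (hypothetical scoring rule)."""
    others = [a for a in rows[0] if a != attr]
    return sum(dependency(rows, attr, other) for other in others) / len(others)


# Toy categorical data set, invented for illustration only.
rows = [
    {"color": "red", "size": "S", "shape": "round"},
    {"color": "red", "size": "S", "shape": "square"},
    {"color": "blue", "size": "L", "shape": "round"},
    {"color": "blue", "size": "L", "shape": "round"},
]

scores = {attr: mean_dependency(rows, attr) for attr in rows[0]}
best = max(scores, key=scores.get)  # candidate partitioning attribute under this sketch
```

On this toy data, `color` and `size` each score 0.75 and `shape` scores 0.25, so the sketch would pick `color` (the first of the tied attributes) as the partitioning attribute. How the paper breaks ties or handles the boundary region is not stated in the abstract.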
Description: 
Others
URI: http://hdl.handle.net/123456789/1524
ISSN: 2303-4521
Appears in Collections: Faculty of Bioengineering and Technology - Other Publication
