Analysis and Design of Temporal Data Farming Algorithms
Abstract
Data farming is the process of growing sufficient data for mining and decision making. In
this thesis, we present a temporal data farming method for a cardiac patient dataset. The
temporal data farming methodology consists of four steps: (i) data fertilization, (ii) data
cultivation, (iii) data plantation and (iv) data harvesting. The goal of data farming is to
increase a performance measure (such as classification accuracy, cluster density, or the
support and confidence of an association rule) and to reduce the data collection cost. In
this thesis, we propose algorithms that increase the classification accuracy on the farmed
dataset compared to the seed data sample or original dataset. We present an analysis of
various methods to fertilize the available seed data: filling missing values with the mean,
the median or the mode, and filling by various regressions.
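
As a rough illustration of the simplest fertilization options named above (fill by mean, median or mode), the following Python sketch fills the missing entries of a single numeric column. It is only a minimal sketch under stated assumptions, not the thesis's implementation: the function name `fertilize`, the use of `None` to mark missing values, and the sample column are all hypothetical, and the regression-based fills are not shown.

```python
from statistics import mean, median, mode


def fertilize(values, strategy="mean"):
    """Return a copy of `values` with None entries replaced by a statistic
    (mean, median or mode) of the observed entries.

    Hypothetical helper for illustration; None marks a missing value.
    """
    fillers = {"mean": mean, "median": median, "mode": mode}
    if strategy not in fillers:
        raise ValueError(f"unknown strategy: {strategy!r}")

    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to fill from")

    fill = fillers[strategy](observed)
    return [fill if v is None else v for v in values]


# Hypothetical seed column with two missing readings (not data from the thesis).
seed = [70, None, 80, 70, None, 90]

filled_mean = fertilize(seed, "mean")
filled_median = fertilize(seed, "median")
filled_mode = fertilize(seed, "mode")
```

Which strategy suits a given attribute depends on its distribution: the median is less sensitive to outliers than the mean, and the mode is the natural choice for categorical attributes.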
The thesis includes an algorithm to farm the prediction vector, namely the dose of the
medicine dobutamine given to heart patients, by applying regression and iterative
prediction. We also propose another algorithm that derives generalized IF-THEN rules by
forming clusters with k-means clustering; these rules are then used to farm the dataset.
After data fertilization and data cultivation, we obtain fertile seed data. For the data
plantation step of the data farming process, we propose an algorithm that plants these
fertile seed data and farmed data as crops. The proposed algorithm is implemented using the
graphical user interface of MATLAB 7.0. Another algorithm is proposed and analyzed for the
data plantation and harvesting steps, including the effects of temporal events in the
patient's medical history such as (1) diabetes, (2) myocardial infarction (MI) or heart
attack, (3) revascularization by percutaneous transluminal coronary angioplasty (PTCA) and
(4) coronary artery bypass grafting surgery (CABG). The proposed algorithm uses a weight
function to estimate the effect of these events, taking into account the time of their
occurrence. We further improve the effectiveness of the weight function in such a manner
that the smaller time