Development of an Efficient Semantic Code Clone Detection Technique
Abstract
Over the last few years, code clones have emerged as an active area of research because of their wide range of applications in different domains of software engineering. Similar code fragments that exist at different locations are called code clones, and they typically result from copy-and-paste activities. Code clones are reported in the form of clone pairs, which are further clustered to form code clone groups. Code clones are broadly categorized into four types, Type 1 to Type 4. In the literature, numerous code clone detection techniques exist to find different types of code clones. Knowledge extraction from existing software resources for maintenance, re-engineering and bug removal through code clone detection is an integral part of software systems. Code clone detection techniques are mainly classified into text-based, token-based, tree-based, metric-based and semantic techniques. Most of the existing semantic code clone detection techniques in the literature are based on the comparison of program dependence graphs through subgraph isomorphism, which is NP-complete. Moreover, these techniques are unable to provide a heuristic solution for problems such as statement reordering, inversion of control predicates and insertion of irrelevant statements, which may cause a performance bottleneck. To address these issues, we propose a novel approach that finds semantic code clones between code fragments using data flow analysis on the basis of reaching definition and liveness analysis. The algorithm based on reaching definition and liveness analysis is designed to find similar code fragments which are structurally divergent but semantically equivalent. The results obtained demonstrate that the proposed approach using reaching definition and liveness analysis is effective in the detection of semantic code clones for various applications.
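To make the idea of comparing fragments by data flow rather than by structure concrete, the following is a minimal illustrative sketch, not the algorithm developed in this work. It assumes straight-line code only (no branches or loops, so no fixed-point iteration over a control flow graph is needed), and the statement encoding and the names `live_statements`, `signature` and `COMMUTATIVE` are hypothetical. A backward liveness pass discards statements whose results are never needed, and a forward pass substitutes, for each use, the definition that reaches it, so that reordered statements and inserted irrelevant statements do not change the result. Inversion of control predicates is not covered by this sketch.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical encoding: (target, operator, operands); operands are
# variable names or literals written as strings.
Stmt = Tuple[str, str, Tuple[str, ...]]

COMMUTATIVE = {"+", "*"}  # assumption: only these operators commute


def live_statements(stmts: List[Stmt], live_out: Set[str]) -> List[int]:
    """Backward liveness pass: indices of statements whose result is needed."""
    live = set(live_out)
    needed = []
    for i in range(len(stmts) - 1, -1, -1):
        target, _, operands = stmts[i]
        if target in live:
            live.discard(target)  # this definition kills the variable
            live.update(o for o in operands if o.isidentifier())
            needed.append(i)
    return sorted(needed)


def signature(stmts: List[Stmt], live_out: Set[str]) -> FrozenSet[tuple]:
    """Forward pass: expand each use into the definition that reaches it."""
    reaching: Dict[str, tuple] = {}  # variable -> expression of its reaching definition
    for i in live_statements(stmts, live_out):
        target, op, operands = stmts[i]
        args = tuple(reaching.get(o, o) for o in operands)
        if op in COMMUTATIVE:
            args = tuple(sorted(args, key=repr))  # ignore operand order
        reaching[target] = (op, args)
    return frozenset((v, reaching.get(v, v)) for v in live_out)


# Fragment A: t = a + b; u = c * 2; r = t - u
frag_a = [("t", "+", ("a", "b")), ("u", "*", ("c", "2")), ("r", "-", ("t", "u"))]
# Fragment B: reordered statements plus an irrelevant statement (dbg)
frag_b = [("u", "*", ("c", "2")), ("dbg", "*", ("a", "a")),
          ("t", "+", ("b", "a")), ("r", "-", ("t", "u"))]
# Fragment C: computes u - t instead, so it is not a clone of A
frag_c = [("t", "+", ("a", "b")), ("u", "*", ("c", "2")), ("r", "-", ("u", "t"))]

print(signature(frag_a, {"r"}) == signature(frag_b, {"r"}))
print(signature(frag_a, {"r"}) == signature(frag_c, {"r"}))
```

In this sketch, fragments A and B yield the same signature even though their statement order and length differ, while fragment C does not, which mirrors the notion of fragments that are structurally divergent but semantically equivalent.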
Results obtained on subject systems taken from the DaCapo benchmark confirm the effectiveness of the proposed approach. Further, code clone groups are