Cloud data storage and comparison using advanced enhanced position aware sampling AEPAS algorithm

Abstract

Cloud computing is a technology with the potential to change a large part of the IT industry, making software more attractive as a service and shaping the way IT hardware is designed and purchased. With the growth of information technology, massive volumes of data have become difficult to manage, so users depend on service providers to store and manage their data. This creates the potential for data redundancy in the cloud, because many users save the same files. According to an IDC report, almost 75% of the data stored worldwide is redundant. This redundant data consumes IT resources and, because it is accessed over the internet, network bandwidth.

The few deduplication techniques available in the literature for eliminating redundant data operate mostly at the file level. This thesis proposes a novel sampling-based similarity detection algorithm called Advanced Enhanced Position Aware Sampling (AEPAS). The algorithm detects similarity among files in the cloud using the concept of the file length modulo. In existing techniques, a slight modification to a file significantly shifts the sampling positions. The proposed AEPAS algorithm samples data blocks from both the beginning and the end of each file. Furthermore, this thesis describes a query algorithm that reduces the time overhead incurred in detecting similarity. Metrics such as query time, CPU utilization, and memory utilization are used to compare the performance of the proposed algorithm with existing similarity detection algorithms such as Shingling, Simhash, Traits, TSA, PAS, and EPAS.
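The idea of sampling blocks from both ends of a file can be illustrated with a minimal sketch. This is not the AEPAS algorithm itself, whose details are not given in this abstract. The function names (sample_blocks, similarity), the fixed block length and stride, the SHA-1 digests, and the Jaccard-style score are all assumptions chosen for illustration. The sketch shows only why anchoring half of the samples to the end of the file keeps them stable when bytes are inserted earlier in the file.

```python
import hashlib


def sample_blocks(data: bytes, n_samples: int = 4,
                  block_len: int = 16, stride: int = 64) -> set:
    """Return tagged digests of blocks sampled from the head and tail.

    Hypothetical helper: parameters and layout are illustrative
    assumptions, not values taken from the thesis.
    """
    size = len(data)
    digests = set()
    for i in range(n_samples):
        head = i * stride                     # offset counted from the start
        tail = size - block_len - i * stride  # offset counted from the end
        if tail < 0:
            break  # file too short for further samples
        head_digest = hashlib.sha1(data[head:head + block_len]).hexdigest()
        tail_digest = hashlib.sha1(data[tail:tail + block_len]).hexdigest()
        digests.add(("head", i, head_digest))
        digests.add(("tail", i, tail_digest))
    return digests


def similarity(a: bytes, b: bytes) -> float:
    """Jaccard similarity of the two sampled fingerprints (an assumption)."""
    fa, fb = sample_blocks(a), sample_blocks(b)
    if not fa and not fb:
        return 1.0 if a == b else 0.0
    return len(fa & fb) / len(fa | fb)


original = bytes(range(256)) * 4
# Insert bytes in the middle: head and tail samples are both unaffected.
edited_middle = original[:500] + b"INSERTED" + original[500:]
# Prepend one byte: head samples shift, tail samples still match.
edited_front = b"X" + original

print(similarity(original, edited_middle))
print(similarity(original, edited_front))
```

In this sketch, a one-byte insertion at the front changes every head sample, but the tail samples still match, so the two files are still recognized as partially similar. A scheme that samples only from the head would report no similarity in that case.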
The precision, recall, accuracy, and F-measure results demonstrate that AEPAS is very efficient at identifying file similarity compared to the existing algorithms. The experimental results also imply that the time overhead, CPU utilization, and memory utilization of AEPAS are minimal when compared to
