Time Efficient Space Efficient and Fault Tolerant Social Network Algorithms for Static and Dynamic Graphs

Abstract

The explosive growth of interconnected data has elevated the role of graph analytics in newlinedomains ranging from social media and e-commerce to transportation and biological systems. newlineHowever, the massive scale and dynamic nature of real-world graphs pose substantial newlinechallenges to traditional graph processing methods, which are often sequential, newlinememory-intensive, and ill-suited for rapid updates. This thesis addresses these limitations newlineby developing high-performance, memory-efficient, and fault-tolerant algorithms newlinefor analyzing both static and dynamic graphs, with a focus on community detection, link newlineprediction, and PageRank computation. newlineWe first introduce GVE-Louvain and GVE-Leiden parallel, shared-memory implementations newlinethat significantly accelerate community detection by optimizing both the newlinelocal-moving and aggregation phases. These algorithms employ techniques such as preallocated newlineCSR structures, per-thread hash tables, dynamic OpenMP scheduling, and a newlinerefinement step for Leiden. On a 3.8B-edge graph, GVE-Louvain and GVE-Leiden newlineachieve processing rates of 560M and 403M edges/s, respectively, offering up to 50× newlinemean speedup over existing methods while maintaining or improving modularity. To address newlinememory constraints, we propose weighted-sketch-based variants of Louvain, Leiden, newlineand LPA that replace per-thread hash tables with Misra-Gries and Boyer-Moore newlinesketches. These methods maintain over 99% of the community quality while reducing newlinememory usage to a few kilobytes per thread and incurring only modest runtime overhead. newlineFor link prediction, we introduce DLH (Disregard Large Hubs), a parallel algorithm newlinethat restricts similarity computations to 2-hop neighborhoods and skips high-degree hubs newlineto improve both efficiency and accuracy. DLH achieves up to 1622× mean speedup over newlinebaseline methods and reaches processing rates of 38.1M edges/s on billion-scale graphs. newlineIn the dynamic setting, we develop asynchronous PageRank update strategies (DF newlineand DF-P) that selectively recompute ranks based on local changes, as well as

Description

Keywords

Citation

item.page.endorsement

item.page.review

item.page.supplemented

item.page.referenced