Find here links to my Google Scholar and DBLP profiles.

- Fast Parallel Algorithms for Counting and Listing Triangles in Big Graphs
- Overcoming MPI Communication Overhead for Distributed Community Detection
- Fast Stochastic Block Partitioning using a Single Commodity Machine
- Distributed Community Detection in Large Networks using An Information-Theoretic Approach
- Detecting Web Spams in Webgraphs with Predictive Model Analysis
- Understanding Performance Bottleneck to Improve Parallel Efficiency of Louvain Algorithm
- Assessing the Dependability of Apache Spark System: Streaming Analytics on Large-scale Ocean Data
- On the Assessment of Security and Performance Bugs in Chromium Open-source Project
- Scalable Mining, Analysis, and Visualization of Protein-protein Interaction Networks
- A Comparative Analysis of Large-scale Network Visualization Tools
- Parallel Algorithms for Mining Large-scale Time-varying (Dynamic) Graphs
- Parallelizing Louvain Algorithm: Distributed Memory Challenges
- Data Mining in-IDE Activities: Why Software Developers Fail
- Scalable Mining and Analysis of Protein-Protein Interaction Networks
- A Space-efficient Parallel Algorithm for Counting Exact Triangles in Massive Networks
- A Fast Parallel Algorithm for Counting Triangles in Graphs using Dynamic Load Balancing
- A Fast Parallel Conversion of Edge List to Adjacency List for Large-scale Graphs
- CINET 2.0: A Cyber-Infrastructure for Network Science
- PATRIC: A Parallel Algorithm for Counting Triangles in Massive Networks
- CINET: A Cyber-Infrastructure for Network Science
- Parallel Mining and Analysis of Triangles and Communities in Big Networks
- Distributed Memory Parallel Algorithms for Massive Graphs
- A Comparative Analysis of Parallel Louvain Algorithms for Community Detection
- Scalable Community Detection Using Parallel Louvain Algorithms
- A Comparative Analysis of Parallel Louvain Algorithms for Community Detection
- Parallel Algorithms for Counting Triangles and Computing Clustering Coefficients
- Distributed Streaming Analytics on Large-scale Oceanographic Data using Apache Spark
- Distributed-Memory Parallel Algorithms for Counting and Listing Triangles in Big Graphs
- Generating a Synthetic Population of the United States
- Parallel Algorithms for Counting Triangles in Networks with Large Degrees

*Keywords:* Parallel algorithms, approximate algorithm, complexity analysis, scalability analysis, time and space efficient algorithms, triangle counting, graph data mining, large-scale graph data

*Keywords:* Parallel algorithms, distributed-memory algorithms, Message Passing Interface (MPI), communication overhead, community detection, Louvain algorithm, graph (network) data mining, large-scale graph data

*Keywords:* Parallel algorithms, stochastic block partitioning, scalability analysis, community detection, graph data mining, large-scale graph data

*Keywords:* Parallel algorithms, complexity analysis, scalability analysis, community detection, graph data mining, large-scale graph data

*Keywords:* graph data mining, large-scale graph data, machine learning, spam detection, predictive analysis

WIP paper in * PDSW workshop in SC19 Conference*, to appear, Nov 2019.

*Keywords:* Parallel algorithms, complexity analysis, scalability analysis, performance analysis, community detection, graph data mining, large-scale graph data

*Keywords:* large-scale analysis, Apache Spark, Ocean data, Streaming analytics

*Keywords:* data analysis, open-source software project, security bugs, performance bugs

*Keywords:* Scalable tool, parallel algorithms, protein interaction network, graph (network) analysis, graph visualization, large-scale datasets, scalable computing

*Keywords:* Big data visualizaiton, graph (network) analysis, graph visualization, large-scale datasets, scalable computing

In PDSW-DISCS Workshop in SC'18, Dallas, TX, USA, Nov 2018

*Keywords:* Parallel algorithms, dynamic graph, temporal graph data, graph (network) analysis, graph visualization, large-scale datasets, scalable computing

In 2018 IEEE 16th Intl Conf on Dependable, Autonomic and Secure Computing (DASC), page 695-701, Athens, Greece, Aug 2018.

Louvain algorithm is a well-known and efficient method for detecting communities or clusters in social and information networks (graphs). The emergence of large network data necessitates parallelization of this algorithms for high performance computing platforms. There exist several shared-memory based parallel algorithms for Louvain method. However, those algorithms do not scale to a large number of cores and large networks. Distributed memory systems are widely available nowadays, which offer a large number of processing nodes. However, the existing only MPI (message passing interface) based distributed-memory parallel implementation of Louvain algorithm has shown scalability to only 16 processors. In this paper, we implement both shared- and distributed-memory based parallel algorithms and identify issues that hinder scalability. In our shared-memory based algorithm using OpenMP, we get 4-fold speedup for several real-world networks. However, this speedup is limited only by the physical cores available to our system. We then design a distributed-memory based parallel algorithms using message passing interface. Our results demonstrate an scalability to a moderate number of processors. We also provide an empirical analysis that shows how communication overhead poses the most crucial threat for deisgning scalable parallel Louvain algorithm in a distributed-memory setting.

**Bibtex:**

@inproceedings{arifuzzaman-louv18,

author = {Sattar, Naw Safrin and Arifuzzaman, Shaikh},

title = {Parallelizing Louvain Algorithm: Distributed Memory Challenges},

booktitle = {Proceedings of 2018 IEEE 16th Intl Conf on Dependable, Autonomic and Secure Computing (DASC), Athens, Greece},

pages = {695--701},

year = {2018},

month= {Aug}

}

In proceeding of the 27th International Conference on Software Engineering and Data Engineering (SEDE), pp. 97 - 102, New Orleans, LA, USA, 2018.

This paper addresses the hindrances behind software developers' failures as perceived from software build failures. We capture and correlate the routines and patterns of developers' interactions in IDE, their backgrounds, expertise, and geographic locations with their failure instances. Our study is based on a large dataset of 85 developers' 11 million interactions/events in Microsoft Visual Studio IDE over 15,000 work-hours. The findings from this study will help developers and organizations in shaping their working style for higher success rate.

In Proceeding of the 3rd IEEE International Conference on Big Data Intelligence and Computing (DataCom), Pages 1098--1105, Orlando, FL, Nov 2017.

Protein-protein interaction (PPI) networks are the
networks of protein complexes formed by biochemical events and
electrostatic forces. PPI networks can be used to study diseases
and discover drugs. The causes of diseases are evident on a
protein interaction level. For instance, an elevation of interaction
edge weights of oncogenes is manifested in cancers. Further, the
majority of approved drugs target a particular PPI, and thus
studying PPI networks is vital to drug discovery.
The availability of large datasets and need for efficient analysis
necessitate the design of scalable methods leveraging modern
high-performance computing (HPC) platforms. In this paper,
we design a lighweight framework on a distributed-memory
parallel system, which includes scalable algorithmic and analytic
techniques to study PPI networks and visualize them. Our
study of PPIs will be based on network-centric mining and
analysis approaches. Since PPI networks are signed (labeled)
and weighted, many existing network mining methods working
on simple unweighted networks will be unsuitable to study PPIs.
Further, the large volume and variety of such data limits the
use of sequential tool or methods. Many existing tools also do
not support a convenient workflow starting from automated
data preprocessing to visualizing results and reports for efficient
extraction of intelligence from large-scale PPI networks. Our
framework support automated analytics based on a large range
of extensible methods for extracting signed motifs, computing
centrality, and finding functional units. We design MPI (Message
Passing Interface) based parallel methods and workflow, which
scale to large networks. The framework is also extensible and
sufficiently generic. To the best of our knowledge, all these
capabilities collectively make our tool novel.

**Bibtex:**

@inproceedings{arifuzzaman-ppi-networks,

author = {Arifuzzaman, S. and Pandey, Bikesh},

title = {Scalable Mining and Analysis of Protein-Protein Interaction Networks},

booktitle = {Proceedings of the 3rd IEEE International Conference on Big Data Intelligence and Computing (DataCom), Orlando, FL, USA},

pages = {1098-1105},

year = {2017},

month= {November}

}

In Proc. of the 17th IEEE International Conference on High Performance Computing and Communications (HPCC), pages 527–534, 2015.

In this paper, we present a space-efficient MPI based parallel
algorithm for counting exact number of triangles in massive
networks. The algorithm divides the network into nonoverlapping
partitions. Our results demonstrate up to 25-fold
space saving over the algorithm with overlapping partitions.
This space efficiency allows the algorithm to deal with networks
which are 25 times larger. We present a novel approach that
reduces communication cost drastically (up to 90%) leading to
both a space- and runtime-efficient algorithm. Our adaptation
of a parallel partitioning scheme by computing a novel weight
function adds further to the efficiency of the algorithm.

**Bibtex:**

@inproceedings{arifuzzaman-space-triangle,

author = {Arifuzzaman, S. and Khan, Maleq and Marathe, Madhav},

title = {A Space-efficient Parallel Algorithm for Counting Exact Triangles in Massive Networks},

booktitle = {Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications (HPCC 2015), New York City, USA},

pages = {527--534},

year = {2015},

month= {August}

}

In proc. of 2015 IEEE International Conference on Big Data (BigData), pages 1839–1847, 2015.

In this paper, we present an efficient MPI-based parallel algorithm for counting triangles in large graph. We consider the case where the main memory of each compute node is large enough to contain the entire graph. We observe that for such a case, computation load can be balanced dynamically and present a dynamic load balancing scheme which improves the performance of the algorithm significantly. Our algorithm demonstrates very good speedups and scales to a large number of processors. The algorithm computes the exact number of triangles in a network with 1 billion edges in 2 minutes with only 100 processors. Our results demonstrate that the algorithm is significantly faster than the related algorithms with static partitioning. In fact, for the real-world networks we experimented on, our algorithm achieves at least 2 times runtime efficiency over the fastest algorithm with static load balancing.

**Bibtex:**

@inproceedings{arifuzzaman_triangle_dynamic,

author = {Arifuzzaman, S. and Khan, M and Marathe, M.},

title = {A Fast Parallel Algorithm for Counting Triangles in Graphs using Dynamic Load Balancing},

booktitle = {Proceedings of 2015 IEEE International Conference on Big Data (IEEE BigData 2015), Santa Clara, CA, USA},

pages = {1839--1847},

year = {2015},

month= {October},

location = {CA, USA}

}

In proc. of the 23rd High Performance Computing Symposium (HPC), pages 17–24, 2015.

In this paper, we present efficient MPI-based distributed memory parallel algorithms for converting edge lists to adjacency lists. To the best of our knowledge, this is the first work on this problem. To address the critical load balancing issue, we present a parallel load balancing scheme which improves both time and space efficiency significantly. Our fast parallel algorithm works on massive graphs, achieves very good speedups, and scales to large number of processors. The algorithm can convert an edge list of a graph with 20 billion edges to the adjacency list in less than 2 minutes using 1024 processors. Denoting the number of nodes, edges and processors by n, m, and P, respectively, the time complexity of our algorithm is O(m/p + n + P) which provides a speedup factor of at least Ω(min{P, d_avg}), where davg is the average degree of the nodes. The algorithm has a space complexity of O(m/p), which is optimal.

**Bibtex:**

@inproceedings{graph_conversion,

author = {Arifuzzaman, S. and Khan, M.},

title = {Fast Parallel Conversion of Edge List to Adjacency List for Large-scale Graphs},

booktitle = {Proceedings of the 23rd High Performance Computing Symposium (HPC 2015), Alexandria, VA, USA},

pages = {17--24},

year = {2015},

month= {April},

location = {Alexandria, VA, USA}

}

In proc. of the 10th IEEE International Conference on eScience (eScience), pages 324–331, 2014.

Analysis of structural properties and dynamics of
networks is currently a central topic in many disciplines including
Social Sciences, Biology and Business. CINET, a cyberinfrastructure
for such studies, introduced the concept of supporting
network analysis as a service. The basic idea is to allow experts in
various disciplines to focus on obtaining domain-specific insights
from the results of network analyses instead of worrying about
programming details and allocation of computational resources
needed to carry out the analyses. A basic version of CINET
was released in May 2012. This paper discusses CINET 2.0, a
significantly enhanced version that supports complex network
analyses through a web portal. CINET 2.0 has already been used
for teaching courses related to Network Science at several US
universities. In this paper, we discuss how CINET 2.0 significantly
extends CINET 1.0 through enhancements to some components
and the addition of new components.

**Bibtex:**

@inproceedings{cinet2_2014,

author = {Sherif Hanie El Meligy Abdelhamid and Md. Maksudul Alam and Richard Al{\'{o}} and Shaikh Arifuzzaman and Peter H. Beckman and Tirtha Bhattacharjee and Md Hasanuzzaman Bhuiyan and Keith R. Bisset and Stephen Eubank and Albert C. Esterline and Edward A. Fox and Geoffrey Fox and S. M. Shamimul Hasan and Harshal Hayatnagarkar and Maleq Khan and Chris J. Kuhlman and Madhav V. Marathe and Natarajan Meghanathan and Henning S. Mortveit and Judy Qiu and S. S. Ravi and Zalia Shams and Ongard Sirisaengtaksin and Samarth Swarup and Anil Kumar S. Vullikanti and Tak{-}Lon Wu},

title = {{CINET} 2.0: {A} CyberInfrastructure for Network Science},

booktitle = {Proceedings of the 10th {IEEE} International Conference on e-Science (e-Science 2014), Sao Paulo, Brazil},

pages = {324--331},

year = {2014},

month = {October}

}

In proc. of the 22nd ACM International Conference on Information and Knowledge Management (CIKM), pages 529–538, 2013.

In this paper, we present an efficient MPI-based distributed memory parallel algorithm, called PATRIC, for counting triangles in massive networks. PATRIC scales well to networks with billions of nodes and can compute the exact number of triangles in a network with one billion nodes and 10 billion edges in 16 minutes. Balancing computational loads among processors for a graph problem like counting triangles is a challenging issue. We present and analyze several schemes for balancing load among processors for the triangle counting problem. These schemes achieve very good load balancing. We also show how our parallel algorithm can adapt an existing edge sparsification technique to approximate the number of triangles with very high accuracy. This modification allows us to count triangles in even larger networks.

**Bibtex:**

@inproceedings{arifuzzaman_triangle_cikm13,

author = {Shaikh Arifuzzaman and Maleq Khan and Madhav V. Marathe},

title = {{PATRIC:} a parallel algorithm for counting triangles in massive networks},

booktitle = {Proceedings of the 22nd {ACM} International Conference on Information and Knowledge Management (CIKM 2013), San Francisco, CA, USA},

pages = {529--538},

year = {2013},

month ={October}

}

In proc. of the 8th IEEE International Conference on eScience (eScience), pages 1–8, 2012.

Networks are an effective abstraction for representing real systems. Consequently, network science is increasingly used in academia and industry to solve problems in many fields. Computations that determine structure properties and dynamical behaviors of networks are useful because they give insights into the characteristics of real systems. We introduce a newly built and deployed cyberinfrastructure for network science (CINET) that performs such computations, with the following features: (i) it offers realistic networks from the literature and various random and deterministic network generators; (ii) it provides many algorithmic modules and measures to study and characterize networks; (iii) it is designed for efficient execution of complex algorithms on distributed high performance computers so that they scale to large networks; and (iv) it is hosted with web interfaces so that those without direct access to high performance computing resources and those who are not computing experts can still reap the system benefits. It is a combination of application design and cyberinfrastructure that makes these features possible. To our knowledge, these capabilities collectively make CINET novel. We describe the system and illustrative use cases, with a focus on the CINET user.

**Bibtex:**

@inproceedings{cinet_2012,

author = {Sherif Elmeligy Abdelhamid and Richard Al{\'{o}} and S. M. Arifuzzaman and Peter H. Beckman and Md Hasanuzzaman Bhuiyan and Keith R. Bisset and Edward A. Fox and Geoffrey Charles Fox and Kevin Hall and S. M. Shamimul Hasan and Anurodh Joshi and Maleq Khan and Chris J. Kuhlman and Spencer J. Lee and Jonathan Leidig and Hemanth Makkapati and Madhav V. Marathe and Henning S. Mortveit and Judy Qiu and S. S. Ravi and Zalia Shams and Ongard Sirisaengtaksin and Rajesh Subbiah and Samarth Swarup and Nick Trebon and Anil Vullikanti and Zhao Zhao},

title = {{CINET:} {A} cyberinfrastructure for network science},

booktitle = {Proceedings of the 8th {IEEE} International Conference on e-Science (e-Science 2012), Chicago, IL, USA},

pages = {1--8},

year = {2012},

month = {October}

}

Ph.D. Dissertation, Dept. of Computer Science, Virginia Tech, August 2016.

We design MPI-based distributed-memory parallel algorithms for counting triangles and detecting communities in big networks and present related analysis. The dissertation consists of four parts. In Part I, we devise parallel algorithms for counting and enumerating triangles. The first algorithm employs an overlapping partitioning scheme and novel load-balancing schemes leading to a fast algorithm. We also design a space-efficient algorithm using non-overlapping partitioning and an efficient communication scheme. We then present our third parallel algorithm based on dynamic load balancing. In Part II, we characterize networks by quantifying the number of common neighbors and demonstrate its relationship to community structure of networks. In Part III, we design parallel algorithms for detecting communities in big networks. Finally, in Part IV, we present scalable parallel algorithms for a useful graph preprocessing problem-- converting edge list to adjacency list. We present non-trivial parallelization with efficient HPC-based techniques leading to fast and space-efficient algorithms.

**Bibtex:**

@phdthesis{Arif2016Triangle,

title = {Parallel Mining and Analysis of Triangles and Communities in Big Networks},

school = {Dept. of Computer Science, Virginia Tech},

author = {Shaikh Arifuzzaman},

year = {2016},

month = {August}

}

Book Chapter in *Parallel Graph Algorithms.* Ed. David Bader, Chapman & Hall/ CRC Computational Science, 2015. | **ISBN-10:** 1466573260

Graph theoretic problems are representative of fundamental kernels in traditional and emerging scientific applications, such as complex network analysis, data mining, and computational biology, as well as applications in national security. Graph abstractions are also extensively used to understand and solve challenging problems in scientific computing. Real-world systems, such as the Internet, telephone networks, social interactions, and transportation networks, are analyzed by modeling them as graphs. To efficiently solve large-scale graph problems, it is necessary to design high performance computing systems and novel parallel algorithms. In this book, some of the world’s most renowned experts explore the latest research and applications in this important area.

Keywords: Parallel computing, commmunity detection algorithm, Louvain algorithm

Keywords: Parallel computing, commmunity detection algorithm, Louvain algorithm

Keywords: Parallel computing, commmunity detection algorithm, Louvain algorithm

In 2012 SC Companion: High Performance Computing, Networking Storage and Analysis (SC'12), pages 1448–1449, 2012.

We present MPI-based parallel algorithms for counting triangles and computing clustering coefficients in massive networks. Counting triangles is important in the analysis of various networks, e.g., social, biological, web etc. Emerging massive networks do not fit in the main memory of a single machine and are very challenging to work with. Our distributed-memory parallel algorithm allows us to deal with such massive networks in a time- and space-efficient manner. We were able to count triangles in a graph with 2 billions of nodes and 50 billions of edges in 10 minutes. Our parallel algorithm for computing clustering coefficients uses efficient external memory aggregation. We also show how edge sparsification technique can be used with our parallel algorithm to find approximate number of triangles without sacrificing the accuracy of estimation. In addition, we propose a simple modification of a state-of-the-art sequential algorithm that improves both runtime and space requirement.

**Bibtex:**

@inproceedings{extabst_sc12,

author = {S. M. Arifuzzaman and Maleq Khan and Madhav V. Marathe},

title = {Abstract: Parallel Algorithms for Counting Triangles and Computing Clustering Coefficients},

booktitle = {2012 {SC} Companion: High Performance Computing, Networking Storage and Analysis, Salt Lake City, UT, USA},

pages = {1448--1449},

year = {2012},

month = {November}

}

Keywords: Machine learning, graph mining, web graphs, spam detection

Preprint. 2019.

Keywords: streaming analytics, apache spark, ocean data, large-scale computing

CoRR abs/1706.05151, pages 1-30, 2017.

In this paper, we present two efficient MPI-based distributed memory parallel algorithms for counting triangles in big graphs. The first algorithm employs overlapping partitioning and efficient load balancing schemes to provide a very fast parallel algorithm. The algorithm scales well to networks with billions of nodes and can compute the exact number of triangles in a network with 10 billion edges in 16 minutes. The second algorithm divides the network into non-overlapping partitions leading to a space-efficient algorithm. Our results on both artificial and real-world networks demonstrate a significant space saving with this algorithm. We also present a novel approach that reduces communication cost drastically leading the algorithm to both a space- and runtime-efficient algorithm. Further, we demonstrate how our algorithms can be used to list all triangles in a graph and compute clustering coefficients of nodes. Our algorithm can also be adapted to a parallel approximation algorithm using an edge sparsification method.

**Bibtex:**

@article{Arif2017Triangle,

title = {Distributed-Memory Parallel Algorithms for Counting and Listing Triangles in Big Graphs},

author = {Shaikh Arifuzzaman and Maleq Khan and Madhav Marathe},

journal = {CoRR},

volume = {1706},

issue = {05151},

pages = {1--30},

year = {2017}

}

Technical Report NDSSL 15-009. Network Dynamics and Simulation Science Laboratory

We describe the methodology for generating a synthetic population of the United States. A synthetic population integrates a variety of databases from commercial and public sources into a common architecture for data exchange. The process preserves the confidentiality of the individuals in the original data sets, yet produces realistic attributes and demographics for the synthetic individuals. The synthetic population is a set of synthetic people and households, located geographically, each associated with demographic variables recorded in the census. Joint demographic distributions are reconstructed from the marginal distributions available in typical census data using an iterative proportional fitting (IPF) technique. Each synthetic individual is placed in a household with other synthetic individuals. Each household is located geographically using land-use data and data pertaining to transportation networks. The process guarantees that a census of our synthetic population is statistically indistinguishable from the original census.

CoRR abs/1406.5687, pages 1-10, 2014.

**Bibtex:**

@article{Arifuzzaman_triangle_14,

author = {Shaikh Arifuzzaman and Maleq Khan and Madhav V. Marathe},

title = {Parallel Algorithms for Counting Triangles in Networks with Large Degrees},

journal = {CoRR},

volume = {abs/1406.5687},

pages = {1--10},

year = {2014}

}