Shaikh M. Arifuzzaman

Big Data & Scalable Computing Lab

Algorithms/Systems for Scalable Computing
Data Science at Scale/Big Data Analytics
Artificial Intelligence/Machine Learning
High Performance Computing
Graph Data Mining, Learning, and Visualization

Research Summary

The Big Data and Scalable Computing Lab (BDSC) at UNO focuses on research at the intersection of large-scale algorithmics, data science/ML, and high performance computing (HPC). The group is currently working on scalable algorithms and machine learning models for graph problems (e.g., enumerating triangles, community detection), methods for spatio-temporal and dynamic data, scaling up AI/ML applications and methods, developing tools for biological data, and designing innovative architectures and predictive models for high-performance data analytics, among others. The lab consists of 4 PhD students, several MS students, and a couple of undergraduate students. Dr. Arifuzzaman is the director of the BDSC Lab.

Current Ph.D. Students:

1. Naw Safrin Sattar: Joined in 2017, working on big data and AI/ML methods. Expected graduation: Summer 2022.
2. Md Abdul Motaleb Faysal: Joined in 2017, working on big data and HPC. Expected graduation: Fall 2022.
3. Ted Holmberg: Joined in 2019, working on spatio-temporal data analytics. Expected graduation: Spring 2024.
4. Austin Schmidt: Joined in 2021, working on large-scale data analytics. Expected graduation: Spring 2025.

Research Projects

Parallel Algorithms and Machine Learning Models for Scalable Community Detection (Clustering) in Graphs

Keywords: community detection, social networks, biological networks, large-scale graphs, Louvain algorithm, InfoMap, label-propagation algorithm
[IEEE BigData 2020], [IEEE DASC 2018]

Complex systems are organized in clusters or communities, each having a distinct role or function. In the corresponding network representation, each functional unit (community) appears as a tightly-knit set of nodes with denser connections inside the set than to the rest of the network. Finding communities may reveal the organization of complex systems and their function. We are currently working on designing scalable parallel algorithms for detecting communities in large-scale networks.
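
As a small illustration of the idea, the following is a minimal sequential sketch of label propagation, one of the techniques named in the keywords above. It is not the lab's parallel implementation: the function name `label_propagation`, the adjacency-dictionary input format, and the tie-breaking rule are assumptions made for this example only.

```python
import random
from collections import Counter


def label_propagation(adj, max_iters=100, seed=0):
    """Sequential label propagation on an undirected graph (illustrative sketch).

    adj: dict mapping each node to the set of its neighbors.
    Returns a dict mapping each node to a community label.
    """
    rng = random.Random(seed)
    labels = {v: v for v in adj}  # every node starts in its own community
    nodes = list(adj)
    for _ in range(max_iters):
        rng.shuffle(nodes)  # visit nodes in a random order each pass
        changed = False
        for v in nodes:
            if not adj[v]:
                continue  # isolated nodes keep their own label
            counts = Counter(labels[u] for u in adj[v])
            best = max(counts.values())
            candidates = [lab for lab, c in counts.items() if c == best]
            # Assumed tie-break: keep the current label if it is among the
            # most frequent ones, otherwise take the smallest candidate.
            if labels[v] in candidates:
                continue
            labels[v] = min(candidates)
            changed = True
        if not changed:
            break  # no label changed in a full pass: converged
    return labels


# Two disjoint triangles: each one ends up with a single shared label.
graph = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1},
    3: {4, 5}, 4: {3, 5}, 5: {3, 4},
}
communities = label_propagation(graph)
```

On graphs with weaker community structure the result of label propagation depends on the visiting order and tie-breaking, which is one reason quality-driven methods such as Louvain or InfoMap are often used instead.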
Parallel and Approximation Algorithms for Counting and Listing Triangles in Massive Graph Data

Keywords: triangle counting, clustering coefficients, distributed-memory algorithms, load balancing, fast and space efficient

Download Code*
[ACM TKDD 2020]

Counting triangles in a network is an important algorithmic problem arising in the study of complex networks. An efficient solution to the triangle counting problem can also lead to efficient solutions for many other graph-theoretic problems, e.g., computing clustering coefficients, transitivity, and triangular connectivity. Further, triangle counting has important applications in graph analysis. We design efficient parallel algorithms for counting triangles.
* Note that the above code is research code and is intended for friendly use. The authors will try their best to address any questions or issues. Users are advised to contact the authors for any newer (or optimized) version of the code. However, for most general use cases, the provided code should suffice.
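
To make the triangle counting problem concrete, here is a minimal sequential sketch. It is an illustration only and is unrelated to the downloadable research code: the function names and the adjacency-dictionary input are assumptions for this example. It uses a common degree-ordering idea so that each triangle is counted exactly once, and then derives transitivity from the triangle count.

```python
def count_triangles(adj):
    """Count triangles in an undirected graph (illustrative sketch).

    adj: dict mapping each node to the set of its neighbors; node ids
    must be mutually comparable (e.g., integers).
    """
    # Order nodes by (degree, id) and keep only edges pointing to
    # higher-ranked neighbors, so each triangle is found exactly once.
    rank = {v: (len(adj[v]), v) for v in adj}
    forward = {v: {u for u in adj[v] if rank[u] > rank[v]} for v in adj}
    total = 0
    for v in adj:
        for u in forward[v]:
            # Common higher-ranked neighbors of v and u close a triangle.
            total += len(forward[v] & forward[u])
    return total


def transitivity(adj):
    """Return 3 * (number of triangles) / (number of connected triples)."""
    triples = sum(len(nbrs) * (len(nbrs) - 1) // 2 for nbrs in adj.values())
    if triples == 0:
        return 0.0
    return 3 * count_triangles(adj) / triples


# A complete graph on four nodes contains four triangles.
k4 = {v: {u for u in range(4) if u != v} for v in range(4)}
num_triangles = count_triangles(k4)
```

A parallel version would additionally have to partition the nodes and balance the intersection work across processors, which is where the load balancing and space efficiency mentioned in the keywords come in.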
Innovative Architecture for High Performance Data Analytics

Keywords: scalable algorithms, performance modeling, predictive models, specialized hardware architecture
[IEEE HPEC 2021]

This project is an ongoing research collaboration with Lawrence Berkeley National Laboratory/University of California.
Scalable Methods for Mining and Analyzing Dynamic Graphs

Keywords: Parallel algorithms, temporal patterns, community discovery, evolution of structures

We are collaborating with the Performance and Algorithms Group at Lawrence Berkeley National Laboratory on this project. Real complex systems are inherently time-varying and can be modeled as temporal graphs (networks). Examples include social, transportation, and many forms of biological networks. Standard graph metrics introduced so far in complex network theory are mainly suited for static graphs, i.e., graphs in which the links do not change over time. In this work, we aim to design scalable parallel algorithms for mining large time-varying networks.
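
One simple way to see why static metrics fall short is to represent a temporal graph as a list of timestamped edges and cut it into fixed-width snapshots. The sketch below is only an illustration under that assumption and does not reflect the project's actual data model; the names `snapshots` and `edge_overlap` and the `(u, v, t)` edge format are hypothetical.

```python
from collections import defaultdict


def snapshots(temporal_edges, window):
    """Group timestamped edges (u, v, t) into fixed-width time windows.

    Returns a dict mapping window index (t // window) to the set of
    undirected edges active in that window.
    """
    snaps = defaultdict(set)
    for u, v, t in temporal_edges:
        snaps[t // window].add(frozenset((u, v)))
    return dict(snaps)


def edge_overlap(snaps):
    """Jaccard similarity of the edge sets of consecutive non-empty windows.

    Returns a list of (earlier_window, later_window, similarity) tuples.
    """
    keys = sorted(snaps)
    result = []
    for a, b in zip(keys, keys[1:]):
        union = snaps[a] | snaps[b]
        shared = snaps[a] & snaps[b]
        result.append((a, b, len(shared) / len(union)))
    return result


# Edge (0, 1) persists across both windows; the other edges do not.
edges = [(0, 1, 0), (1, 2, 1), (0, 1, 5), (2, 3, 6)]
overlap = edge_overlap(snapshots(edges, window=5))
```

A static metric computed on the union of all edges would hide the fact that only one of the three distinct links is present in both windows.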
Graph-centric Analysis of Human Brain Data (High Performance Human Connectome Network Analysis)

Keywords: Application of graph methods, brain image, image to network, bio/health/medical informatics

We are a multidisciplinary team consisting of faculty from Psychology/Neuroscience and Computer Science working together to extract insights from human brain data. Collaborators: Dr. Elliot Beaton and Dr. Vassil Roussev (UNO). Funded by a UNO ORSP Interdisciplinary grant.
Large-scale Graph Visualization

Keywords: Big networks; Visualization; Visual analytics; Network analytics; Graph mining; Scalable algorithms
[IEEE BigData 2020], [IJBDI 2019]

In this project, we identify several popular network visualization tools and provide a comparative analysis based on the features and operations these tools support. We demonstrate empirically how those tools scale to large networks. We also provide several case studies of visual analytics on large network data and assess the performance of the tools.
Characterizing Graph Data based on Local Structures and Properties

Keywords: local neighborhood, Jaccard coefficient, community structure, triangle-dense graphs
[ACM TKDD 2020]

Characterizing real-world social and information networks by graph-theoretic metrics or properties has been of growing interest. Among the most explored metrics are degree distribution, number of triangles, and clustering coefficients. An important triangle-related property of many networks is high transitivity: two nodes (vertices) that share common neighbor(s) have an elevated probability of being neighbors of one another. We present a characterization of networks based on a quantification of common neighbors.
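
The common-neighbor idea can be illustrated with the Jaccard coefficient listed in the keywords. The sketch below is a minimal example and not necessarily the exact quantification used in the project; the function names and the adjacency-dictionary format are assumptions.

```python
def common_neighbors(adj, u, v):
    """Number of neighbors shared by nodes u and v."""
    return len(adj[u] & adj[v])


def jaccard(adj, u, v):
    """Jaccard coefficient of the neighborhoods of u and v.

    Defined as |N(u) & N(v)| / |N(u) | N(v)|, and taken to be 0.0 when
    both neighborhoods are empty.
    """
    union = adj[u] | adj[v]
    if not union:
        return 0.0
    return len(adj[u] & adj[v]) / len(union)


# Node 0 is a hub connected to 1, 2, and 3; nodes 1 and 2 are also linked.
graph = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
shared = common_neighbors(graph, 1, 2)  # nodes 1 and 2 share neighbor 0
score = jaccard(graph, 1, 3)
```

Computing such scores over all adjacent pairs gives a distribution that can be compared across networks, in the spirit of the characterization described above.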
Scalable Mining and Analysis of Protein-Protein Interaction Networks

Keywords: PPI networks, functional units, scalable framework, disease analysis, drug discovery
[IJBDI 2019]

We are working to design scalable algorithmic and analytic techniques to study PPI networks. Our study of PPIs will be based on network-centric mining and analysis approaches. We will design specialized methods for extracting signed motifs, computing centrality, and finding functional units in PPI networks.