DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING | DWIGHT LOOK COLLEGE OF ENGINEERING | TEXAS A&M UNIVERSITY

 

HOME

ABOUT

COURSES

PEOPLE

PROJECTS

PUBLICATIONS

CONTACT

LINKS

Motif listing in large graphs

Abstract

Motifs are small subgraphs (e.g., triangles, four-cycles) whose appearance in nature is much more frequent than in classical random graphs. Their discovery (enumeration, or listing) plays an important role in various fields. This project aims to analyze complexity of the involved algorithms and design techniques that can handle trillion-edge graphs with limited RAM.

Conference Papers

 
bullet

D. Xiao, Y. Cui, D.B.H. Cline, and D. Loguinov, "On Asymptotic Cost of Triangle Listing in Random Graphs," ACM PODS, May 2017.

 
 
bullet

Y. Cui, D. Xiao, and D. Loguinov, "On Efficient External-Memory Triangle Listing," IEEE ICDM, December 2016.

PDF, PPT

Datasets

The next four undirected graphs are used in the ICDM 2016 paper. Each consists of two files -- (source node, degree) pairs and all adjacency lists dumped one after the other. All node IDs and degrees are unsigned 4-byte integers, LSB order. The IDs are sequential, with no gaps. Source nodes and neighbor lists are sorted ascending.

bulletIRLbot domain (14 GB): degree, edges
bulletIRLbot host (53 GB): degree, edges
bulletIRLbot IP (6 GB): degree, edges
bulletFull ClueWeb09 webgraph (358 GB): degree, edges


     Copyright 2002-2017 IRL at Texas A&M. All Rights Reserved.