Pattern Detection in Bipartite Temporal Network
Pattern detection in social networks has attracted great interest recently because it reveals insights into how people communicate. In graph mining, this is referred to as the frequent subgraph mining problem, and it has many variations depending on the structure of the network, i.e. how much information the graph holds. As the problem has evolved, additional information such as extra data dimensions or more advanced network structures has been incorporated into the input to find more interesting hidden patterns. This thesis concerns finding patterns in Twitter data modeled as a bipartite network with an additional temporal dimension. After three topology-based types of patterns are defined mathematically, three pattern detection methods are developed, implemented, and tested on real-world data collected from Twitter. The results reveal the most popular patterns of each type in the Twitter data, a decaying tendency in reply time as a conversation develops, and other interesting observations. They also show that the time difference between two consecutive messages in a pattern (the time leap) could be a good alternative to a time window as the time constraint in pattern detection. Although the methods were run on only a small number of test cases, the results demonstrate the potential of studying chain-like patterns separately from the dominant star-like ones in social networks like Twitter.
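
To make the distinction between the two time constraints concrete, the following is a minimal sketch and not the implementation used in the thesis. It assumes a pattern is represented only by the timestamps of its messages; the function names `within_time_window` and `within_time_leap`, the thresholds, and the sample timestamps are all hypothetical and chosen for illustration.

```python
from typing import Sequence


def within_time_window(timestamps: Sequence[float], window: float) -> bool:
    """Time-window constraint: the whole pattern must fit inside one window,
    i.e. (latest timestamp - earliest timestamp) <= window."""
    if len(timestamps) < 2:
        return True
    return max(timestamps) - min(timestamps) <= window


def within_time_leap(timestamps: Sequence[float], leap: float) -> bool:
    """Time-leap constraint: every gap between two consecutive messages
    (in time order) must be <= leap; the total duration is unbounded."""
    ordered = sorted(timestamps)
    return all(later - earlier <= leap
               for earlier, later in zip(ordered, ordered[1:]))


# Hypothetical message times (seconds) for a chain-like conversation.
chain = [0, 50, 100, 150, 200]

print(within_time_window(chain, window=120))  # False: total span is 200 s
print(within_time_leap(chain, leap=60))       # True: every gap is 50 s
```

Under these assumptions, a long conversation whose replies keep arriving promptly is rejected by the window constraint but accepted by the leap constraint, which illustrates why the latter may suit chain-like patterns.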