Entities such as people, topics, and events are connected through multiple, heterogeneous, hidden links. We are developing data structures that encode this information by (a) annotating multiple data sources and grouping the mentions that refer to the same real-world entity, and (b) identifying relations between entities, thus providing an analytical framework for knowledge discovery and hypothesis generation. This effort reveals hidden correlations and structural events. Mining such hidden networks will have a strong impact on identifying hidden themes and on analyzing (anti)social networks and other sub-communities.
Stream Data Mining
Many homeland security applications require that data be analyzed on the fly, as it arrives in a data stream.
We are investigating stream data mining extensively and have developed effective methods for computing stream data cubes, clustering high-dimensional and evolving data streams, and classifying data streams.
We are extending these methods, as well as spatiotemporal data mining methods for social network analysis, so that the detection of suspicious persons or activities can be performed on-the-fly and be integrated into operational activities.
This research has applications to classification, clustering, pattern discovery, and outlier detection for computer network information streams, power control systems, traffic systems, and multimedia information flow.
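As a rough illustration of what on-the-fly outlier detection over a stream can look like, the following minimal sketch flags readings that deviate strongly from the running mean. It is not one of the methods developed in this project: the class name, the z-score rule, the threshold, and the warm-up length are all assumptions chosen for illustration. It uses Welford's online algorithm so that each item is processed once and no history is stored.

```python
import math


class StreamingOutlierDetector:
    """Hypothetical one-pass outlier detector based on a running z-score."""

    def __init__(self, threshold=3.0, warmup=10):
        self.threshold = threshold  # assumed cutoff, in standard deviations
        self.warmup = warmup        # items to observe before flagging anything
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def update(self, value):
        """Test the value against the statistics so far, then fold it in."""
        is_outlier = False
        if self.count >= self.warmup:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0 and abs(value - self.mean) > self.threshold * std:
                is_outlier = True
        # Welford's update: constant memory, one pass over the stream.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        return is_outlier


# Made-up sensor readings with one obvious spike.
readings = [10.0, 10.2, 9.9, 10.1, 9.8, 10.0, 10.3, 9.7, 10.1, 9.9,
            10.0, 25.0, 10.2]
detector = StreamingOutlierDetector()
flags = [detector.update(x) for x in readings]
flagged = [x for x, f in zip(readings, flags) if f]
```

A real stream would also need to handle concept drift, for example by decaying old statistics, which this sketch deliberately leaves out.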
The design and development of powerful mechanisms for managing and mining large datasets of moving-object information is an emerging direction in science and informatics. The subject becomes increasingly important as the world, along with national security threats, becomes more mobile. We are designing and developing innovative methods for querying, analyzing, and mining spatiotemporal information to find typical characteristics of moving-object trajectories and to uncover suspicious motion in large moving-object datasets. These datasets take the form of either stored data or transient data streams. The project designs and implements a MotionEye system prototype consisting of four subsystems: MotionQuest(DB) and MotionQuest(Stream) for querying and hypothesis validation in moving-object databases and data streams, respectively, and MotionMine(DB) and MotionMine(Stream) for data mining in moving-object databases and data streams, respectively. Our research investigates efficient and effective approaches to implementing these subsystems. The project also strives to ensure that the developed technology does not sacrifice individual privacy. This work enables the development of more advanced information systems in homeland security, law enforcement, traffic control, and other domains that deal with moving objects.
Link Analysis for Classification, Object Distinction, and Veracity Analysis
Algorithms such as PageRank and HITS exploit the links among Web pages to discover authoritative pages and hubs. Links are also widely used in citation analysis and social network analysis. However, there has been no systematic treatment of how to fully exploit the power of links in scalable data analysis. We show that the hidden power of links can be unleashed to improve the effectiveness and efficiency of typical data analysis tasks, including classification, object distinction, and veracity analysis.
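For readers unfamiliar with link-based ranking, the sketch below shows the basic PageRank power iteration on a toy graph. It illustrates only the general idea of scoring nodes by their incoming links, not the classification, object distinction, or veracity analysis methods of this project. The function name, the damping factor of 0.85, the fixed iteration count, and the toy graph are assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Basic PageRank by power iteration.

    links maps each node to the list of nodes it points to.
    """
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the same small "teleport" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Split this node's rank evenly among the nodes it links to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" is linked to by three other pages.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
top_page = max(ranks, key=ranks.get)
```

In this toy graph, the page with the most incoming links ends up with the highest score, and the page with no incoming links ends up with the lowest.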
Recent Publications