(313) 593-4998
Applications: Bioinformatics, Spatio-Temporal Data Applications, Streaming Data Processing, Vehicular Information Applications

Methodologies: Data Integration, Data Mining, Data Privacy, K-nearest Neighbor Search, Multidimensional Indexing, Query Processing and Optimization

Relevant Projects: NSF, IBM, Ford

Connections: ACM SIGMOD; IEEE Computer Society; IEEE SEM; International Society for Computers and Their Applications.

Qiang Zhu

Professor, Computer and Information Science

Dr. Zhu’s group conducts research in data science on topics ranging from foundational methodologies to challenging applications. In particular, the group has been investigating the fundamental issues and techniques for supporting various types of queries (including range queries, box queries, k-NN queries, and hybrid queries) on large datasets in a non-ordered discrete data space (NDDS). The group has developed a number of novel indexing and searching techniques that exploit the unique characteristics of an NDDS.

The group has also been studying the issues and techniques for storing and searching large-scale k-mer datasets for various genome sequence analysis applications in bioinformatics. It has proposed a virtual approximate store approach to supporting repetitive big data in genome sequence analyses, along with several new sequence analysis techniques.

In addition, the group has been researching the challenges and methods for processing and optimizing progressive queries, a new type of query that a user formulates on the fly in multiple steps. Such queries are widely used in many application domains, including e-commerce, social media, business intelligence, and decision support. Other research topics studied by the group include streaming data processing, self-managing databases, spatio-temporal data indexing, data privacy, Web information management, and vehicle drive-through wireless services.

Figure: Using a data-partitioning-based index tree (the ND-tree) to find sequences that are similar (with distance 1) to a given query sequence from a large sequence database in a Non-ordered Discrete Data Space (NDDS).
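The similarity query described above can be illustrated with a minimal sketch. It assumes that "distance" means Hamming distance (the number of positions at which two equal-length sequences differ) and uses a plain linear scan as a baseline. It is not the ND-tree itself, which is designed to avoid examining every sequence; the names `hamming`, `range_query`, and the sample data are hypothetical.

```python
from typing import Iterable, List


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def range_query(database: Iterable[str], query: str, radius: int = 1) -> List[str]:
    """Return every sequence within `radius` of `query` (assumed Hamming distance).

    Baseline linear scan: each sequence in the database is compared to the query.
    Sequences of a different length than the query are skipped.
    """
    return [
        s for s in database
        if len(s) == len(query) and hamming(s, query) <= radius
    ]


# Hypothetical sample data over the alphabet {A, C, G, T}.
database = ["ACGT", "ACGA", "TCGT", "TTTT", "ACCT"]
print(range_query(database, "ACGT", radius=1))
# ['ACGT', 'ACGA', 'TCGT', 'ACCT']
```

The alphabet {A, C, G, T} has no natural ordering, so the sequences cannot be sorted along an axis the way numeric coordinates can. That is why index structures built for ordered, continuous spaces do not apply directly to this kind of data.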