I am Bolin Ding, a Senior Staff Engineer in the Data Analytics and Intelligence Lab (DAIL) at Alibaba Group. Prior to joining Alibaba, I worked as a researcher in the DMX group at Microsoft Research. I completed my Ph.D. in Computer Science at the University of Illinois at Urbana-Champaign under the supervision of Prof. Jiawei Han, my M.Phil. at The Chinese University of Hong Kong, advised by Jeffery Xu Yu, and my B.S. at Renmin University of China, advised by Shan Wang and Qing Zhu.

We are hiring research scientists and engineers in the Data Analytics and Intelligence Lab at Alibaba Group!

Research Interests

My research goals and interests center on large-scale data management and analytics, including interactive querying and exploration of "big" data, privacy-preserving data analytics, query processing and optimization, and data mining algorithms. I am particularly interested in algorithms that have theoretical guarantees and are effective and implementable in practice. More recently, I have enjoyed developing algorithms and building systems in the following areas:

Approximate Query Processing: how to process analytical queries on large-scale data (e.g., with billions of rows) and return approximate answers within interactive response times (e.g., one hundred milliseconds). New operators, rewritten query plans, precomputed samples, indexes, etc., are essential to such approximate query processing engines. There are both theoretical and system challenges (e.g., how to handle frequent data updates and the needs of different application pipelines).
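To make the idea of answering from a precomputed sample concrete, here is a minimal sketch of the textbook approach: estimate a SUM from a uniform random sample and attach a normal-approximation confidence interval. It is only an illustration under simple assumptions (uniform sampling, a synthetic single-column table), not the design of any particular engine; the function name `approximate_sum` and the data are hypothetical.

```python
import math
import random


def approximate_sum(sample, population_size, z=1.96):
    """Estimate SUM over a table from a uniform random sample of its rows.

    Returns (estimate, half_width), where estimate +/- half_width is an
    approximate 95% confidence interval (normal approximation, z = 1.96).
    """
    n = len(sample)
    mean = sum(sample) / n
    # Unbiased sample variance of the sampled values.
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    # Scale the sample mean up to the full table.
    estimate = population_size * mean
    half_width = z * population_size * math.sqrt(variance / n)
    return estimate, half_width


# Synthetic single-column "table" (an assumption made for illustration).
random.seed(0)
table = [random.randint(1, 100) for _ in range(100_000)]

# In a real system the sample would be precomputed; here it is drawn on the fly.
sample = random.sample(table, 1_000)

estimate, half_width = approximate_sum(sample, len(table))
exact = sum(table)
print(f"approximate SUM = {estimate:.0f} +/- {half_width:.0f}, exact SUM = {exact}")
```

The query touches 1,000 rows instead of 100,000, and the confidence interval tells the user how far the answer may be from the exact one. Handling joins, selective predicates, and updates is where the challenges mentioned above arise.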

Privacy-Preserving Data Analytics: how to process analytical queries and analytics tasks with precision guarantees while protecting data owners’ privacy under formal notions, e.g., differential privacy. My Ph.D. thesis is about building differentially private data cubes as private “APIs” to support OLAP queries. We have also revised the notion of differential privacy to tune privacy-utility trade-offs. More recently, under the local model of differential privacy, we have invented data collection mechanisms for different data types and paired them with estimation algorithms to support approximate data analytics (e.g., heavy hitters and A/B testing).
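As a simple illustration of pairing a local collection mechanism with an estimation algorithm, the sketch below uses classic binary randomized response, which satisfies epsilon-local differential privacy. It is a standard textbook mechanism, not one of the mechanisms referred to above; the function names and the synthetic population are assumptions made for illustration.

```python
import math
import random


def randomize_bit(bit, epsilon, rng):
    """Report the true bit with probability e^eps / (1 + e^eps), else flip it."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < p_truth else 1 - bit


def estimate_fraction(reports, epsilon):
    """Unbiased estimate of the true fraction of 1s from the noisy reports."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    # E[observed] = f * p_truth + (1 - f) * (1 - p_truth); solve for f.
    return (observed - (1.0 - p_truth)) / (2.0 * p_truth - 1.0)


rng = random.Random(0)
epsilon = 1.0

# Synthetic population: 30% of users hold the bit 1 (illustrative data only).
true_bits = [1 if i < 30_000 else 0 for i in range(100_000)]

# Each user randomizes locally; the collector only ever sees the noisy reports.
reports = [randomize_bit(b, epsilon, rng) for b in true_bits]

estimated = estimate_fraction(reports, epsilon)
print(f"estimated fraction of 1s: {estimated:.3f} (true fraction: 0.300)")
```

A smaller epsilon gives each user stronger privacy but makes the estimate noisier, which is the privacy-utility trade-off in its simplest form.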

Querying and Searching Large-Scale Data: search models, algorithms, and indexes to help people explore large-scale structured or semi-structured data, e.g., text and knowledge graphs; query optimization and query processing, e.g., a fast operator for set intersection and an estimator for query-processing progress.

Data Mining: developing data mining algorithms for various applications, e.g., failure localization and pattern mining.