AlayaDB: The Data Foundation for Efficient and Effective Long-context LLM Inference
Proceedings of the ACM Conference on Management of Data (SIGMOD, CCF-A), 2025
AlayaDB is a cutting-edge vector database system natively architected for efficient and effective long-context inference for Large Language Models (LLMs) at AlayaDB AI. Specifically, it decouples the KV cache and attention computation from the LLM inference systems and encapsulates them into a novel vector database system. For Model as a Service (MaaS) providers, AlayaDB consumes fewer hardware resources and offers higher generation quality for various workloads with different kinds of Service Level Objectives (SLOs), compared with existing alternative solutions (e.g., KV cache disaggregation, retrieval-based sparse attention). The crux of AlayaDB is that it abstracts the attention computation and cache management for LLM inference into a query processing procedure, and optimizes the performance via a native query optimizer. In this work, we demonstrate the effectiveness of AlayaDB via (i) three use cases from our industry partners, and (ii) extensive experimental results on LLM inference benchmarks.
Tao: Improving Resource Utilization while Guaranteeing SLO in Multi-tenant Relational Database-as-a-Service
Proceedings of the ACM on Management of Data (SIGMOD, CCF-A), 2025
It is an open challenge for cloud database service providers to guarantee tenants' service-level objectives (SLOs) while achieving high resource utilization. In this work, we propose a novel system, Tao, to address this challenge. Tao consists of three key components: (i) a tasklet-based DAG generator, (ii) a tasklet-based DAG executor, and (iii) an SLO-guaranteed scheduler. The core concept in Tao is the tasklet, a coroutine-based lightweight execution unit of the physical execution plan. In particular, we first convert each SQL operator in the traditional physical execution plan into a set of fine-grained tasklets by the tasklet-based DAG generator. Then, we abstract the tasklet-based DAG execution procedure and implement the tasklet-based DAG executor using C++20 coroutines. Finally, we introduce the SLO-guaranteed scheduler for scheduling tenants' tasklets across CPU cores. This scheduler guarantees tenants' SLOs with a token bucket model and improves resource utilization with an on-demand core adjustment strategy. We build Tao on an open-source relational database, Hyrise, and conduct extensive experimental studies to demonstrate its superiority over existing solutions.
Athena: An Effective Learning-based Framework for Query Optimizer Performance Improvement
Proceedings of the ACM on Management of Data (SIGMOD, CCF-A), 2025
GPH: An Efficient and Effective Perfect Hashing Scheme for GPU Architecture
Proceedings of the ACM on Management of Data (SIGMOD, CCF-A), 2025
Nezha: An Efficient Distributed Graph Processing System on Heterogeneous Hardware
Proceedings of the ACM on Management of Data (SIGMOD, CCF-A), 2025
DiskGNN: Bridging I/O Efficiency and Model Accuracy for Out-of-Core GNN Training
Proceedings of the ACM on Management of Data (SIGMOD, CCF-A), 2025
Graph neural networks (GNNs) are models specialized for graph data and widely used in applications. To train GNNs on large graphs that exceed CPU memory, several systems have been designed to store data on disk and conduct out-of-core processing. However, these systems suffer from either read amplification when conducting random reads for node features that are smaller than a disk page, or degraded model accuracy by treating the graph as disconnected partitions. To close this gap, we build DiskGNN for high I/O efficiency and fast training without model accuracy degradation. The key technique is offline sampling, which decouples graph sampling from model computation. In particular, by conducting graph sampling beforehand for multiple mini-batches, DiskGNN acquires the node features that will be accessed during model computation and conducts pre-processing to pack the node features of each mini-batch contiguously on disk to avoid read amplification during computation. Given the feature access information acquired by offline sampling, DiskGNN also adopts designs including a four-level feature store that fully utilizes the memory hierarchy of GPU and CPU to cache hot node features and reduce disk access, batched packing to accelerate feature packing during pre-processing, and pipelined training to overlap disk access with other operations. We compare DiskGNN with state-of-the-art out-of-core GNN training systems. The results show that DiskGNN achieves more than 8× speedup over existing systems while matching their best model accuracy. DiskGNN is open-source at https://github.com/Liu-rj/DiskGNN.
QOVIS: Understanding and Diagnosing Query Optimizer via a Visualization-assisted Approach
Proceedings of the VLDB Endowment (PVLDB, CCF-A), 2025
Efficient and Effective Algorithms for A Family of Influence Maximization Problems with A Matroid Constraint
Proceedings of the VLDB Endowment (PVLDB, CCF-A), 2025
nsDB: Architecting the Next Generation Database by Integrating Neural and Symbolic Systems (vision)
Proceedings of the VLDB Endowment (PVLDB, CCF-A), 2024
CGgraph: An Ultra-fast Graph Processing System on Modern Commodity CPU-GPU Co-processor
Proceedings of the VLDB Endowment (PVLDB, CCF-A), 2024
CoroGraph: Bridging Cache Efficiency and Work Efficiency for Graph Algorithm Execution
Proceedings of the VLDB Endowment (PVLDB, CCF-A), 2024
Marrying Top-k with Skyline Queries: Operators with Relaxed Preference Input and Controllable Output Size
ACM Transactions on Database Systems (TODS, CCF-A), 2024
How Does Software Prefetching Work on GPU Query Processing?
Proceedings of the 20th International Workshop on Data Management on New Hardware (DaMoN, CCF-A), 2024
LearnSC: An Efficient and Unified Learning-based Framework for Subgraph Counting Problem
IEEE International Conference on Data Engineering (ICDE, CCF-A), 2024
QSRP: Efficient Reverse k-Ranks Query Processing on High-dimensional Embeddings
IEEE International Conference on Data Engineering (ICDE, CCF-A), 2024
Fair Top-k Query on Alpha-Fairness
IEEE International Conference on Data Engineering (ICDE, CCF-A), 2024
Information Diffusion Meets Invitation Mechanism
International World Wide Web Conference (WWW, CCF-A), 2024
Debiasing Recommendation with Personal Popularity
International World Wide Web Conference (WWW, CCF-A), 2024
The Design of a Lossless Deduplication Scheme to Eliminate Fine-grained Redundancy for JPEG Image Storage Systems.
IEEE Transactions on Computers (TOC, CCF-A), 2024
CheetahTraj: Efficient Visualization for Large Trajectory Dataset with Quality Guarantee.
IEEE Transactions on Knowledge and Data Engineering (TKDE, CCF-A), 2024
CDSBen: Benchmarking the Performance of Storage Services in Cloud-native Database System at ByteDance.
Proceedings of the VLDB Endowment (PVLDB, CCF-A), 2023
Speeding Up End-to-end Query Execution via Learning-based Progressive Cardinality Estimation.
Proceedings of the ACM on Management of Data (SIGMOD, CCF-A), 2023
EAR-Oracle: On Efficient Indexing for Distance Queries between Arbitrary Points on Terrain Surface.
Proceedings of the ACM on Management of Data (SIGMOD, CCF-A), 2023
Effective and Efficient PageRank-based Positioning for Graph Visualization
Proceedings of the ACM on Management of Data (SIGMOD, CCF-A), 2023
QEVIS: Multi-grained Visualizing of Distributed Query Execution.
IEEE Transactions on Visualization and Computer Graphics (TVCG, CCF-A), 2023
Towards Building The Next Generation Computation Engine
Proceedings of the ACM Turing Award Celebration Conference-China (TURC, CCF-A), 2023
Quantifying the Competitiveness of a Dataset in Relation to General Preferences
The VLDB Journal (VLDBJ, CCF-A), 2023
Capacity Constrained Influence Maximization in Social Networks.
ACM International Conference on Knowledge Discovery and Data Mining (KDD, CCF-A), 2023
Efficient Approximation Algorithms for Spanning Centrality
ACM International Conference on Knowledge Discovery and Data Mining (KDD, CCF-A), 2023
DGI: An Easy and Efficient Framework for GNN Model Evaluation
ACM International Conference on Knowledge Discovery and Data Mining (KDD, CCF-A), 2023
EEPH: An Efficient Extendible Perfect Hashing for Hybrid PMem-DRAM.
IEEE International Conference on Data Engineering (ICDE, CCF-A), 2023
DHive: Query Execution Performance Analysis via Dataflow in Apache Hive.
Proceedings of the VLDB Endowment (PVLDB, CCF-A), 2023
Analyzing and Combating Attribute Bias for Face Restoration
International Joint Conference on Artificial Intelligence (IJCAI, CCF-A), 2023
Extracting Top-k Frequent and Diversified Patterns in Knowledge Graphs
IEEE Transactions on Knowledge and Data Engineering (TKDE, CCF-A), 2023
Data-Scarce Animal Face Alignment via Bi-Directional Cross-Species Knowledge Transfer.
Proceedings of ACM International Conference on Multimedia (ACM MM, CCF-A), 2023
Multi-domain Recommendation with Embedding Disentangling and Domain Alignment.
Proceedings of the ACM International Conference on Information and Knowledge Management (CIKM, CCF-B), 2023
𝜏-LevelIndex: Towards Efficient Query Processing in Continuous Preference Space.
Proceedings of the ACM on Management of Data (SIGMOD, CCF-A), 2022
Spatial Data Quality in the IoT Era: Management and Exploitation.
Proceedings of the ACM on Management of Data (SIGMOD, CCF-A), 2022
GHive: A Demonstration of GPU-Accelerated Query Processing in Apache Hive.
Proceedings of the ACM on Management of Data (SIGMOD, CCF-A), 2022
Efficient and Error-bounded Spatiotemporal Quantile Monitoring in Edge Computing Environments.
Proceedings of the VLDB Endowment (PVLDB, CCF-A), 2022
Manu: A Cloud Native Vector Database Management System.
Proceedings of the VLDB Endowment (PVLDB, CCF-A), 2022
Constructing Compact Time Series Index for Efficient Window Query Processing.
IEEE International Conference on Data Engineering (ICDE, CCF-A), 2022
imDedup: a Lossless Deduplication Scheme to Eliminate Fine-grained Redundancy among Images.
IEEE International Conference on Data Engineering (ICDE, CCF-A), 2022
CheetahKG: A Demonstration for Core-based Top-k Frequent Pattern Discovery on Knowledge Graphs.
IEEE International Conference on Data Engineering (ICDE, CCF-A), 2022
Fast Error-Bounded Distance Distribution Computation (Extended Abstract).
IEEE International Conference on Data Engineering (ICDE, CCF-A), 2022
Face2Exp: Combating Data Biases for Facial Expression Recognition.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR, CCF-A), 2022
Spatial Data Quality in the Internet of Things: Management, Exploitation, and Prospects.
ACM Computing Surveys, 55(3), Article 57, February 2022 (ACM Computing Surveys, CCF-A), 2022
GHive: Accelerating Analytical Query Processing in Apache Hive via CPU-GPU Heterogeneous Computing.
Proceedings of Symposium on Cloud Computing (SoCC, CCF-B), 2022
Measuring Friendship Closeness: A Perspective of Social Identity Theory.
Proceedings of the ACM International Conference on Information and Knowledge Management (CIKM, CCF-B), 2022
Automatic Meta-Path Discovery for Effective Graph-Based Recommendation.
Proceedings of the ACM International Conference on Information and Knowledge Management (CIKM, CCF-B), 2022
Rethinking the Use of Network Cycle in Time-Sensitive Networking (TSN) Flow Scheduling.
IFIP International Workshop on QoS (IWQoS, CCF-B), 2022
On m-Impact Regions and Standing Top-k Influence Problems.
Proceedings of the ACM on Management of Data (SIGMOD, CCF-A), 2021
Marrying Top-k with Skyline Queries: Relaxing the Preference Input while Producing Output of Controllable Size.
Proceedings of the ACM on Management of Data (SIGMOD, CCF-A), 2021
Vertex-Centric Visual Programming for Graph Neural Networks.
Proceedings of the ACM on Management of Data (SIGMOD, CCF-A), 2021
Fast Error-Bounded Distance Distribution Computation.
IEEE Transactions on Knowledge and Data Engineering (TKDE, CCF-A), 2021
GAIPS: Accelerating Maximum Inner Product Search with GPU.
ACM International Conference on Research and Development in Information Retrieval (SIGIR, CCF-A), 2021
Fast Core-based Top-k Frequent Pattern Discovery in Knowledge Graphs.
IEEE International Conference on Data Engineering (ICDE, CCF-A), 2021
Towards Efficient MaxBRNN Computation for Streaming Updates.
IEEE International Conference on Data Engineering (ICDE, CCF-A), 2021
GRAB: Finding Time Series Natural Structures via A Novel Graph-Based Scheme.
IEEE International Conference on Data Engineering (ICDE, CCF-A), 2021
On Discovering Motifs and Frequent Patterns in Spatial Trajectories with Discrete Frechet Distance.
GeoInformatica (GeoInformatica, CCF-B), 2021
RCELF: A Residual-based Approach for Influence Maximization Problem.
Information Systems (Inform Systems, None), 2021
Drug repurposing for the treatment of COVID-19: a knowledge graph approach.
Advanced Therapeutics (Advanced Therapeutics, None), 2021
Effective and Efficient Summarization for Non-hierarchical Data
Ivannikov Ispras Open Conference (ISPRAS) (ISPRAS, None), 2021
CheetahVIS: A Visual Analytical System for Large Urban Bus Data.
Proceedings of the VLDB Endowment (PVLDB, CCF-A), 2020
Draft and Edit: Automatic Storytelling Through Multi-Pass Hierarchical Conditional Variational Autoencoder.
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI, CCF-A), 2020
Towards Self-Tuning Parameter Servers
IEEE International Conference on Big Data (IEEE BigData, CCF-C), 2020
CheetahER: A Fast Entity Resolution System for Heterogeneous Camera Data.
SIGMOD 2020 Programming Contest Finalist Paper (DI2KG), 2020
Creating Top Ranking Options in the Continuous Option and Preference Space.
Proceedings of the VLDB Endowment (PVLDB, CCF-A), 2019
Accelerating Exact Inner Product Retrieval by CPU-GPU System.
Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR, CCF-A), 2019
Vaite: a Visualization-Assisted Interactive Big Urban Trajectory Data Exploration System.
International Conference on Data Engineering (ICDE, CCF-A), 2019
Insufficient Data Can Also Rock! Learning to Converse Using Smaller Data with Augmentation.
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI, CCF-A), 2019
Find a Reasonable Ending for Stories: Does Logic Relation Help the Story Cloze Test?
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI, CCF-A), 2019
Fast Trajectory Range Query with Discrete Frechet Distance.
Advances in Database Technology (EDBT, CCF-B), 2019
Mining Heterogeneous Urban Data for Retail Store Placement.
Proceedings of the ACM Turing Celebration Conference-China (SIGMOD China), 2019
Exact Processing of Uncertain Top-k Queries in Multi-criteria Settings.
Proceedings of the VLDB Endowment (PVLDB, CCF-A), 2018
Deriving Real-time City Crowd Flows by Heterogeneous Big Urban Data.
IEEE International Conference on Big Data (IEEE BigData, CCF-C), 2018
Efficient Longest Streak Discovery in Multidimensional Sequence Data.
Web and Big Data: Second International Joint Conference (APWeb-WAIM, CCF-C), 2018
Joint Face Alignment and 3D Face Reconstruction with Application to Face Recognition
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2018
A Five-layer Architecture for Big Data Processing and Analytics.
International Journal of Big Data Intelligence (IJBDI), 2018
Determining the Impact Regions of Competing Options in Preference Space.
Proceedings of the ACM on Management of Data (SIGMOD, CCF-A), 2017
Extracting Top-K Insights from Multi-dimensional Data.
Proceedings of the ACM on Management of Data (SIGMOD, CCF-A), 2017
Efficient Motif Discovery in Spatial Trajectories Using Discrete Fréchet Distance.
Extending Database Technology (EDBT, CCF-B), 2017