Huawei Cloud GaussDB Takes Center Stage at VLDB 2025 with Multiple Papers Accepted
Sep 12, 2025
Recently, the prestigious international database conference, VLDB 2025, was held in London, UK. The event brought together leading experts and scholars from around the world to share cutting-edge research findings and discuss how databases will evolve in the future. Huawei had 18 papers accepted and delivered impressive presentations at multiple sessions.
Wang Lei, Director of Huawei Gauss Laboratory, delivered a keynote titled Unified AI-Native & Cloud-Native Database Platform for the Data + AI Era. He highlighted GaussDB's integration with emerging technologies such as AI and cloud native. Wang noted that GaussDB has been using these technologies to build an intelligent O&M system with automated monitoring, diagnosis, and tuning. The platform has also introduced vector database and smart Q&A capabilities, and it uses cloud-native technologies to implement a transparent multi-write architecture and intelligent routing. Looking ahead, GaussDB aims to provide an integrated data platform that helps enterprises build digital and intelligent applications, supporting digital and intelligent transformation across industries.
Among the selected papers, two standouts from GaussDB were GaussDB-Vector: A Large-Scale Persistent Real-Time Vector Database for LLM Applications and GRewriter: Practical Query Rewriting with Automatic Rule Set Expansion in GaussDB.
These papers delve into innovation in vector databases and query rewriting, both of which are key database technologies.
Paper 1: GaussDB-Vector: A Large-Scale Persistent Real-Time Vector Database for LLM Applications
Vector databases have become essential tools for addressing the limitations of large language models (LLMs) and have seen widespread adoption. However, existing vector databases either cater only to niche use cases requiring low-latency in-memory search, or trade off performance for comprehensive data management.
To address these limitations, the paper proposes GaussDB-Vector, a high-performance, real-time, persistent vector database that excels in low-latency scalable search, real-time INSERTs and DELETEs, high availability, large-scale distributed search, and hybrid scalar-vector filtered search. These features rest on an innovative storage architecture that is designed specifically for graph-based vector indexes, optimized for I/O operations, and adaptable to various dataset sizes and dimensions. GaussDB-Vector also provides novel buffering strategies to further minimize I/O overhead. To accelerate queries further, it supports product quantization, parallel search, and hardware acceleration via SIMD, GPUs, and NPUs. Experimental results show that GaussDB-Vector outperforms competitive baselines by 1x to 5x.
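To make the idea of hybrid scalar-vector filtered search concrete, the following is a minimal Python sketch, not GaussDB-Vector's implementation. It assumes a brute-force scan in place of the graph-based index described in the paper, and the row layout, the function names (l2_distance, filtered_knn), and the sample data are all hypothetical. It shows only the semantics: a scalar predicate narrows the candidate set, and the nearest vectors are then chosen from what remains.

```python
import heapq
import math
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical row layout: (row id, scalar attributes, embedding vector).
Row = Tuple[int, Dict[str, object], Sequence[float]]


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def filtered_knn(
    rows: List[Row],
    query: Sequence[float],
    predicate: Callable[[Dict[str, object]], bool],
    k: int,
) -> List[Tuple[float, int]]:
    """Return (distance, row id) for the k nearest rows that pass the scalar predicate.

    This is an exhaustive scan used for illustration; a production system
    would consult a vector index instead of comparing against every row.
    """
    candidates = (
        (l2_distance(vec, query), row_id)
        for row_id, attrs, vec in rows
        if predicate(attrs)  # the scalar filter is applied before ranking by distance
    )
    return heapq.nsmallest(k, candidates)


rows = [
    (1, {"lang": "en", "year": 2024}, [0.0, 0.0]),
    (2, {"lang": "en", "year": 2025}, [1.0, 0.0]),
    (3, {"lang": "de", "year": 2025}, [0.1, 0.1]),
    (4, {"lang": "en", "year": 2025}, [3.0, 4.0]),
]

# Nearest English-language rows from 2025 to the origin.
result = filtered_knn(
    rows,
    query=[0.0, 0.0],
    predicate=lambda a: a["lang"] == "en" and a["year"] == 2025,
    k=2,
)
print(result)  # [(1.0, 2), (5.0, 4)]
```

Rows 1 and 3 lie closer to the query than row 4, but they fail the scalar predicate and are excluded, which is the behavior that distinguishes filtered search from a plain nearest-neighbor lookup.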
Paper 2: GRewriter: Practical Query Rewriting with Automatic Rule Set Expansion in GaussDB
Rewriting complex queries into faster equivalents is critical for database systems. However, GaussDB's existing query rewriter has limited extensibility: generic, broadly applicable rewrites are hard to identify, and programming them into the system is also difficult.
This paper presents GRewriter, GaussDB's new extensible query rewriter. GRewriter sits atop the existing optimizer stack to explore useful rewrites, allowing a variety of rules to work together and be selected on a per-query basis. A new rule language, G-DSL, is used to express rewrite rules so that the rewrite engine is not coupled with specific rules. To improve rewrite efficiency, a new rule index structure and a rewrite history cache have been introduced.
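The roles of a rule index and a rewrite history cache can be illustrated with a toy Python sketch. It is an assumption-laden illustration rather than GRewriter's design: the article does not show G-DSL syntax, so rules here are plain Python functions, query plans are nested tuples, and every name (RULE_INDEX, HISTORY_CACHE, rewrite, and the three sample rules) is hypothetical. The index maps a plan node's root operator to the rules that could match it, and the cache returns the stored result when the same plan is seen again.

```python
from typing import Callable, Dict, List, Optional, Tuple

Plan = Tuple  # nested tuples, e.g. ("filter", predicate, child)
Rule = Callable[[Plan], Optional[Plan]]


def drop_true_filter(node: Plan) -> Optional[Plan]:
    # filter(TRUE, child) -> child
    if node[1] == ("const", True):
        return node[2]
    return None


def merge_filters(node: Plan) -> Optional[Plan]:
    # filter(p, filter(q, child)) -> filter(p AND q, child)
    _, pred, child = node
    if child[0] == "filter":
        return ("filter", ("and", pred, child[1]), child[2])
    return None


def collapse_distinct(node: Plan) -> Optional[Plan]:
    # distinct(distinct(x)) -> distinct(x)
    if node[1][0] == "distinct":
        return node[1]
    return None


# Toy rule index: only rules registered under a node's root operator are tried.
RULE_INDEX: Dict[str, List[Rule]] = {
    "filter": [drop_true_filter, merge_filters],
    "distinct": [collapse_distinct],
}

# Toy rewrite history cache: original plan -> rewritten plan.
HISTORY_CACHE: Dict[Plan, Plan] = {}


def rewrite(node: Plan) -> Plan:
    """Rewrite a plan bottom-up until no indexed rule applies at any node."""
    if node in HISTORY_CACHE:
        return HISTORY_CACHE[node]
    # Rewrite tuple-valued children first; strings and scalars pass through.
    current = tuple(rewrite(c) if isinstance(c, tuple) else c for c in node)
    changed = True
    while changed:
        changed = False
        for rule in RULE_INDEX.get(current[0], []):
            out = rule(current)
            if out is not None:
                current = out
                changed = True
                break  # look up rules again, since the root operator may differ
    HISTORY_CACHE[node] = current
    return current


plan = (
    "distinct",
    ("distinct",
     ("filter", ("const", True),
      ("filter", ("gt", "price", 10),
       ("filter", ("eq", "region", "EU"), ("scan", "orders"))))),
)
rewritten = rewrite(plan)
print(rewritten)
```

In this sketch the duplicate distinct and the always-true filter disappear, and the two remaining filters merge into one conjunctive filter over the scan. Keeping the rules in a lookup table rather than hard-coding them in the engine mirrors, in a very simplified way, the decoupling of engine and rules that the paper attributes to G-DSL.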
Rules in GRewriter are produced by an offline rule generator. Using an innovative enumeration technique and a new equivalence theorem, the rule generator can efficiently identify formally verified rules that are much more expressive than those of prior research prototypes. For operational convenience, GRewriter also supports manual rule authoring and interactive rule management through familiar SQL interfaces.
GRewriter has been integrated into GaussDB and is gradually being rolled out to customers. It equips GaussDB with over 100 rules while adding negligible performance overhead (less than 1%). These new rewrite rules have enhanced query performance for two key applications, an ERP system and a banking transaction system, reducing production query latency by 99.9%, from 26 seconds to just 17 milliseconds.
During the conference, the first Industrial Data Systems Research (IDSR) Workshop was successfully held. It was organized by the Technical University of Crete, Athena Research Center, Tsinghua University, Microsoft Research, and Huawei Cloud and sponsored by Huawei. The workshop gathered leading industry researchers and practitioners to discuss the latest trends, research findings, current challenges, and future research directions in the field of industrial data systems.
Over the years, GaussDB has steadily increased its influence in the global academic and industrial communities through comprehensive collaboration among industry, academia, and research institutions, as well as continuous international engagement. Its presence at this event drew significant attention and discussion. Moving forward, Huawei will continue to join hands with global partners to upgrade database technologies and cultivate a robust industrial ecosystem, laying a solid foundation for the digital and intelligent transformation of businesses.