|
曾立
Li Zeng, senior algorithm researcher and programmer in computer science and technology.
Email: bookug AT qq DOT com
GitHub,
Google Scholar
|
Biography
|
I received my B.S. degree from PKU in 2016 and then studied as a master's student for two years.
My main work during this period was developing and optimizing the graph database system gStore.
I then completed my Ph.D. in three years, researching the optimization of graph algorithms on heterogeneous CPU-GPU platforms.
My current research interests include graph computing, vector databases, and LLM acceleration and maintenance.
|
Publications
|
|
Li Zeng, et al. WindGP: Efficient Graph Partitioning on Heterogenous Machines.
arXiv, 2024.
|
Li Zeng (second author). LocMoE: A Low-Overhead MoE for Large Language Model Training.
IJCAI, 2024.
|
Li Zeng, et al. KBQA: Accelerate Fuzzy Path Query on Knowledge Graph.
DEXA, 2023.
|
Yu Gao, Meng Qin, Yibin Ding, Li Zeng, et al. RaftGP: Random Fast Graph Partitioning.
IEEE High Performance Extreme Computing, 2023.
(GraphChallenge Innovation Award)
|
Zhijie Sun, Jing Li, Jun Xie, Binfan Zheng, Li Zeng, et al. indexPDT: A High Scalable Distributed Classification Approach with Novel Cache Structure for Geo-location.
HPCC, 2023.
|
Li Zeng, Lei Zou, M. Tamer Özsu. SGSI: A scalable GPU-friendly Subgraph Isomorphism Algorithm.
IEEE Transactions on Knowledge and Data Engineering, 2022.
(CCF A journal)
|
Li Zeng, et al. HTC: Hybrid vertex-parallel and edge-parallel Triangle Counting.
IEEE High Performance Extreme Computing, 2022.
(GraphChallenge Innovation Award)
|
Li Zeng, et al. SQLG+: Efficient k-hop Query Processing on RDBMS.
International Conference on Database Systems for Advanced Applications, 2022.
(CCF B conference)
|
Li Zeng, Yan Jiang, Weixin Lu, Lei Zou. Deep Analysis on Subgraph Isomorphism.
arXiv, 2021. [pdf]
|
Li Zeng, Lei Zou, M. Tamer Özsu, et al. GSI: GPU-friendly Subgraph Isomorphism.
IEEE International Conference on Data Engineering, 2020. [pdf]
(CCF A conference)
|
Fan Zhang, Lei Zou, Li Zeng, Xiangyang Gou. Dolha - an efficient and exact data structure for streaming graphs.
World Wide Web Journal, 2020.
(CCF B journal)
|
Li Zeng, Lei Zou. Redesign of the gStore system.
Frontiers of Computer Science, 2018. [pdf]
(CCF C journal)
|
Yu Zhang, Li Zeng, Lei Zou. Regular Path Queries on Large Graph Data.
Natural Language Processing and Chinese Computing, 2018.
(CCF C conference)
|
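Yu Zhang, Li Zeng, Lei Zou. Regular Path Queries on Large-Scale Graph Data (in Chinese).
Acta Scientiarum Naturalium Universitatis Pekinensis, 2018.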
|
|
|
Projects
|
Improvement of Graph Machine Learning System
- Distributed data loading of hybrid features: redesigned the architecture of the data loading module to support hybrid features on nodes and edges, and optimized feature-string parsing, yielding a >2× speedup.
- Distributed real-time sampling in data loading: created a memory clipping module. When memory runs short during data loading, the graph store is clipped using a user-defined sampling function.
Contribution: greatly reduced memory consumption (31%) with little loss in model performance (1%).
|
Acceleration of Large Scale Graph Algorithms
- Survey and optimization of subgraph isomorphism on CPU: surveyed the best existing CPU solutions for subgraph isomorphism and proposed four general techniques for improvement
- Acceleration of subgraph isomorphism on GPU: designed a novel data structure and join algorithm, achieving a >10× speedup
- Acceleration of other graph algorithms on GPU: optimized solutions for shortest path and triangle counting, each achieving a >2× speedup
|
Development of Graph Database System
As the primary developer of the graph database gStore, I cumulatively updated 5,000,000 lines of code and greatly improved its performance (100×) and scale (40×).
I also gained experience leading a team of more than 10 people, training the members to become qualified for their respective modules.
- Optimization of query plans: accelerated SPARQL queries and added support for predicate variables and property paths
- Improvement of indices: continually optimized the disk-based key-value indices (specially designed for gStore)
- Others: improved the user interface, created a web server, standardized the development process, and added design and usage documentation
|
|
Recommended GitHub Repositories
|
|
- PaperNotes: collect papers, write notes and search quickly based on multiple tags
- LinuxProgramming: basic knowledge and some advanced topics in Linux programming
- GraphBenchmark: a benchmark for generating all kinds of graphs and queries
- SIEP: the state-of-the-art subgraph matching algorithms on CPU
- GSI: subgraph isomorphism on GPU; see also my implementations of GunrockSM, GpSM, and gutil
- gStore: graph database system
|
|
|
Last modified: Jun 1, 2024
|
|