LibAlchemy: A Two-Layer Persistent Summary Design for Taming Third-Party Libraries in Static Bug-Finding Systems

Abstract

Despite the benefits of using third-party libraries (TPLs), the misuse of TPL functions raises quality and security concerns. Using traditional static analysis to detect bugs caused by TPL function misuse is non-trivial. One promising solution is to automatically generate and persist summaries of TPL functions offline and then reuse these summaries in compositional static analysis online. However, when dealing with millions of lines of TPL code, the summaries designed by existing studies suffer from an unresolved paradox: a highly precise form of summary incurs unaffordable space and time overhead, while an imprecise one seriously hurts the precision or recall of the analysis. To address the paradox, we propose a novel two-layer summary design. The first layer utilizes a line-sized program representation known as the program dependence graph to compactly encode path conditions, while the second layer encodes bug-specific properties. We implemented our idea as a tool called LibAlchemy and evaluated it on fifteen mature and extensively checked open-source projects. Experimental results show that LibAlchemy can check over ten million lines of code within ten hours. LibAlchemy has detected 55 true bugs with a high precision of 90.16%, six of which have been assigned CVE IDs. Compared to whole-program analysis and the conventional design of path-sensitively precise summaries, LibAlchemy achieves an 18.56× and 12.77× speedup and saves 91.49% and 90.51% of memory usage, respectively.

Publication
In International Conference on Software Engineering
Rongxin Wu
Associate Professor

I am currently an associate professor in the Department of Computer Science and Technology at Xiamen University. My research interests include software security, program analysis, and software engineering.

Yuxuan He
MSc, 2021–

My research interests include vulnerability detection and pointer analysis.

Jiafeng Huang
MSc, 2021–2024

My research interests include code clone detection, third-party library vulnerability detection, and vulnerability API accessibility analysis.