Int J Performability Eng, 2024, Vol. 20, Issue (7): 460-467. DOI: 10.23940/ijpe.24.07.p6.460467


A Two-Stage Code Generation Method using Large Language Models

Dapeng Zhao a,* and Tongcheng Geng b

  a. Cyber Security Bureau of the Ministry of Public Security, Beijing, China;
  b. Department of Information and Network Security, The State Information Center, Beijing, China
  • Contact: *E-mail address: zdp2019@foxmail.com

Abstract: Large language models can generate source code in a zero-shot manner to produce programs that meet user functional requirements. However, when faced with complex business requirements, the generated source code may fail to satisfy user needs. To address the challenge of understanding software requirements, we propose a two-stage code generation approach. First, the large language model generates pseudocode from the user's functional requirements and refines it through an iterative process driven by user feedback. Second, the model generates source code from the finalized pseudocode. We conducted empirical studies on an open code generation dataset; experimental results with models such as GPT-4, Claude 3 Sonnet, and Gemini 1.5 Pro demonstrate that our method outperforms zero-shot prompt learning in scenarios with complex user requirements, with improvements in pass@k of up to 15%.
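As a rough illustration of the two-stage pipeline described in the abstract, the sketch below separates pseudocode drafting (with a user-feedback loop) from final source-code generation. This is a minimal sketch, not the authors' implementation: the llm_complete stub, the prompt wording, and the max_rounds parameter are all assumptions introduced here for illustration.

```python
# Minimal sketch of a two-stage code generation pipeline (illustrative only).
# `llm_complete` is a hypothetical stand-in for any chat/completions API.

def llm_complete(prompt: str) -> str:
    """Hypothetical LLM call; replace with a real API client."""
    raise NotImplementedError("plug in your LLM provider here")

def generate_pseudocode(requirements: str, max_rounds: int = 3) -> str:
    """Stage 1: draft pseudocode, then refine it with user feedback."""
    pseudocode = llm_complete(
        f"Write step-by-step pseudocode for this requirement:\n{requirements}"
    )
    for _ in range(max_rounds):
        feedback = input("Feedback on the pseudocode (empty to accept): ").strip()
        if not feedback:  # user accepts the current draft
            break
        pseudocode = llm_complete(
            "Revise the pseudocode to address the user's feedback.\n"
            f"Pseudocode:\n{pseudocode}\nFeedback:\n{feedback}"
        )
    return pseudocode

def generate_source_code(pseudocode: str, language: str = "python") -> str:
    """Stage 2: translate the accepted pseudocode into source code."""
    return llm_complete(
        f"Implement the following pseudocode in {language}:\n{pseudocode}"
    )

if __name__ == "__main__":
    spec = "Read a CSV of orders and report total revenue per customer."
    draft = generate_pseudocode(spec)
    print(generate_source_code(draft))
```

The design point the sketch captures is that user feedback is collected against the pseudocode, where misunderstandings of the requirements are cheap to correct, before any source code is committed to.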

Key words: large language models, two-stage code generation, zero-shot prompt learning
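For reference, the pass@k metric cited in the abstract is conventionally computed with the unbiased estimator of Chen et al. (2021), where n samples are generated per problem and c of them pass the unit tests; the exact evaluation protocol used in this paper is an assumption here:

\[
\text{pass@}k \;=\; \mathbb{E}_{\text{problems}}\!\left[\, 1 - \frac{\binom{n-c}{k}}{\binom{n}{k}} \,\right]
\]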