Biography:

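Yanlin Wang (王焱林) is an assistant professor and master's supervisor. Wang was selected for the university's Hundred Talents Program in July 2022 and joined the school, after working as a Senior Researcher at Microsoft Research Asia. From 2014 to 2019, Wang studied in the Department of Computer Science at the University of Hong Kong under Prof. Bruno Oliveira and received a PhD. From 2010 to 2014, Wang studied at Zhejiang University and received a bachelor's degree. Main research interests are LLM-driven intelligent software engineering and large language model technology. Over the past five years, Wang has published more than 30 papers in international conferences and journals, including ICSE, ASE, AAAI, ACL, KDD, TKDE, EMSE, EMNLP, CIKM, and ICSME, which are CCF A/B top-tier venues in software engineering, artificial intelligence, and natural language processing.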

Homepage: https://yanlin.info/

Google Scholar: https://scholar.google.com/citations?hl=en&user=xSCoImAAAAAJ 

 

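Research Interests:

- LLM-driven intelligent software engineering: code search, code completion, code generation, code summarization, commit message generation, and related tasks.

- Large language model technology: LLM evaluation, LLM safety, LLM memory mechanisms, LLM training, and efficient fine-tuning.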

 

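Research and Admissions:

The group admits one to three master's students each year, through both recommendation-based admission and the national entrance examination. Group members have many kinds of opportunities for academic exchange and visits in China and abroad, and the group actively helps outstanding undergraduate and master's students obtain internship and job referrals at major internet companies in China and abroad. Interested students are welcome to contact me. Second- and third-year undergraduates at the university who plan to pursue further study are also welcome to get in touch, start research early, and publish high-quality papers.

Email: wangylin36@mail.sysu.edu.cn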

 

Academic Service:

Journal and conference reviewer: TKDE, TNNLS, CSUR, JSS, JCST, AUSE, FSE industry track, ICSME industry track, FSE IVR, COLING, Scala, and other well-known international journals and conferences.

 

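Teaching:

SSE405 Software Source Code Analysis (required undergraduate major course)

SSE407 Software Source Code Analysis (required undergraduate major course)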

 

Representative Publications (20 CCF-A papers):

  • When to Stop? Towards Efficient Code Generation in LLMs with Excess Token Prevention 

    Lianghong Guo, Yanlin Wang* (corresponding author), Ensheng Shi, Wanjun Zhong, Hongyu Zhang, Jiachi Chen, Ruikai Zhang, Yuchi Ma, Zibin Zheng.

    In The ACM SIGSOFT International Symposium on Software Testing and Analysis. (ISSTA 2024, CCF-A).

  • RLCoder: Reinforcement Learning for Repository-Level Code Completion

    Yanlin Wang, Yanli Wang, Daya Guo, Jiachi Chen, Ruikai Zhang, Yuchi Ma, Zibin Zheng.

    In International Conference on Software Engineering. (ICSE 2025, CCF-A).

  • KADEL: Knowledge-Aware Denoising Learning for Commit Message Generation

    Wei Tao, Yucheng Zhou, Yanlin Wang* (co-corresponding author), Hongyu Zhang, Haofen Wang, Wenqiang Zhang*

    In ACM Transactions on Software Engineering and Methodology 2024. (TOSEM 2024, CCF-A).

  • Demystifying and Detecting Cryptographic Defects in Ethereum Smart Contracts

    Jiashuo Zhang, Yiming Shen, Jiachi Chen, Jianzhong Su, Yanlin Wang, Ting Chen, Jianbo Gao, Zhong Chen

    In International Conference on Software Engineering. (ICSE 2025, CCF-A).

  • Hyperion: Unveiling DApp Inconsistencies using LLM and Dataflow-Guided Symbolic Execution

    Shuo Yang, Xingwei Lin, Jiachi Chen, Qingyuan Zhong, Lei Xiao, Renke Huang, Yanlin Wang, Zibin Zheng

    In International Conference on Software Engineering. (ICSE 2025, CCF-A).

  • SoTaNa: The Open-Source Software Development Assistant

    Ensheng Shi, Fengji Zhang, Yanlin Wang, Bei Chen, Lun Du, Hongyu Zhang, Shi Han, Dongmei Zhang, Hongbin Sun

  • Beyond Functional Correctness: Investigating Coding Style Inconsistencies in Large Language Models

    Yanlin Wang, Tianyue Jiang, Mingwei Liu, Jiachi Chen, Zibin Zheng

  • CoSQA+: Enhancing Code Search Dataset with Matching Code

    Jing Gong, Yanghui Wu, Linxi Liang, Zibin Zheng, Yanlin Wang* (corresponding author)

  • Towards an Understanding of Large Language Models in Software Engineering Tasks

    Chong Chen, Jianzhong Su, Jiachi Chen, Yanlin Wang, Tingting Bi, Yanli Wang, Xingwei Lin, Ting Chen, Zibin Zheng

  • A Survey of Large Language Models for Code: Evolution, Benchmarking, and Future Trends

    Zibin Zheng, Kaiwen Ning, Yanlin Wang (corresponding author), Jingwen Zhang, Dewu Zheng, Mingxi Ye, Jiachi Chen

  • When ChatGPT Meets Smart Contract Vulnerability Detection: How Far Are We?

    Zibin Zheng, Kaiwen Ning, Jiachi Chen, Yanlin Wang, Wenqing Chen, Lianghong Guo, Weicheng Wang

  • YODA: Teacher-Student Progressive Learning for Language Models

    Jianqiao Lu, Wanjun Zhong, Yufei Wang, Zhijiang Guo, Qi Zhu, Wenyong Huang, Yanlin Wang, Fei Mi, Baojun Wang, Yasheng Wang, Lifeng Shang, Xin Jiang, Qun Liu

  • Tackling Long Code Search with Splitting, Encoding, and Aggregating

    Fan Hu, Yanlin Wang* (corresponding author), Lun Du, Hongyu Zhang, Dongmei Zhang and Xirong Li

    In International Conference on Computational Linguistics, Language Resources and Evaluation. (LREC-COLING 2024, CCF-B).

  • AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models

    Wanjun Zhong, Ruixiang Cui, Yiduo Guo, Yaobo Liang, Shuai Lu, Yanlin Wang, Amin Saied, Weizhu Chen, Nan Duan

  • MoonBit: Explore the Design of an AI-Friendly Programming Language

    Haoxiang Fei, Yu Zhang, Hongbo Zhang, Yanlin Wang, Qing Liu

    In LLM4Code Workshop 2024.

  • Adaptive-Solver Framework for Dynamic Strategy Selection in Large Language Model Reasoning

    Jianpeng Zhou, Wanjun Zhong, Yanlin Wang, Jiahai Wang

  • An Empirical Study on Low Code Programming using Traditional vs Large Language Model Support.

    Yongkun Liu, Jiachi Chen, Tingting Bi, John Grundy, Yanlin Wang, Ting Chen, Yutian Tang, Zibin Zheng

  • MemoryBank: Enhancing Large Language Models with Long-Term Memory

    Wanjun Zhong, Lianghong Guo, Qiqi Gao, He Ye, Yanlin Wang (corresponding author)

    In Proceedings of the 38th AAAI Conference on Artificial Intelligence. (AAAI 2024, CCF-A).

  • SparseCoder: Identifier-Aware Sparse Transformer for File-Level Code Summarization

    Yanlin Wang, Yanxian Huang, Daya Guo, Hongyu Zhang and Zibin Zheng

    The IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER 2024, CCF-B).

  • Towards Efficient Fine-tuning of Pre-trained Code Models: An Experimental Study and Beyond

    Ensheng Shi, Yanlin Wang* (co-corresponding author), Hongyu Zhang, Lun Du, Shi Han, Dongmei Zhang, Hongbin Sun*

    In The ACM SIGSOFT International Symposium on Software Testing and Analysis. (ISSTA 2023, CCF-A).

  • Re2BERT: A Two-stage Pre-trained Framework for Automatic Rename Refactoring

    Hao Liu, Yanlin Wang, Zhao Wei, Yong Xu, Juhong Wang, Hui Li, Rongrong Ji

    In The ACM SIGSOFT International Symposium on Software Testing and Analysis. (ISSTA 2023, CCF-A).

  • DeFiTainter: Detecting Price Manipulation Vulnerabilities in DeFi Protocols

    Queping Kong, Jiachi Chen, Yanlin Wang, Zigui Jiang, Zibin Zheng

    In The ACM SIGSOFT International Symposium on Software Testing and Analysis. (ISSTA 2023, CCF-A).

  • Toward Automated Detecting Unanticipated Price Feed in Smart Contract

    Yifan Mo, Jiachi Chen, Yanlin Wang, Zibin Zheng

    In The ACM SIGSOFT International Symposium on Software Testing and Analysis. (ISSTA 2023, CCF-A).

  • Snippet Comment Generation Based on Code Context Expansion

    Hanyang Guo, Xiangping Chen, Yuan Huang, Yanlin Wang, Xi Ding, Zibin Zheng, Xiaocong Zhou, Hong-Ning Dai

    In ACM Transactions on Software Engineering and Methodology (TOSEM 2023, CCF-A)

  • CoCoAST: Representing Source Code via Hierarchical Splitting and Reconstruction of Abstract Syntax Trees

    Ensheng Shi#, Yanlin Wang#, Lun Du, Hongyu Zhang, Shi Han, Dongmei Zhang, Hongbin Sun

    In Empirical Software Engineering (EMSE 2023, CCF-B)

  • You Augment Me: Exploring ChatGPT-based Data Augmentation for Semantic Code Search

    Yanlin Wang, Lianghong Guo, Ensheng Shi, Wenqing Chen, Jiachi Chen, Wanjun Zhong, Menghan Wang, Hui Li, Ziyu Lyu, Hongyu Zhang and Zibin Zheng

    In 39th IEEE International Conference on Software Maintenance and Evolution. (ICSME 2023, CCF-B)

  • PrivateRec: Differentially Private Model Training and Online Serving for Federated News Recommendation

    Ruixuan Liu, Yang Cao, Yanlin Wang, Lingjuan Lyu, Yun Chen, Hong Chen.

    In KDD 2023 Applied Data Science (KDD 2023, CCF-A).

  • Unveiling the Black Box of PLMs with Semantic Anchors: Towards Interpretable Neural Semantic Parsing

    Lunyiu Nie, Jiuding Sun, Yanlin Wang (corresponding author), Lun Du, Lei Hou, Juanzi Li, Shi Han, Dongmei Zhang, Jidong Zhai

    In Proceedings of the 37th AAAI Conference on Artificial Intelligence. (AAAI 2023, CCF-A).

  • CoCoSoDa: Effective Contrastive Learning for Code Search

    Ensheng Shi, Yanlin Wang* (co-corresponding author), Wenchao Gu, Lun Du, Hongyu Zhang, Shi Han, Dongmei Zhang, Hongbin Sun*

    In Proceedings of the 45th IEEE/ACM International Conference on Software Engineering (ICSE 2023, CCF-A).

  • A large-scale empirical study of commit message generation: models, datasets and evaluation

    Wei Tao, Yanlin Wang* (corresponding author), Ensheng Shi, Lun Du, Shi Han, Hongyu Zhang, Dongmei Zhang, Wenqiang Zhang

    In Empirical Software Engineering. (EMSE 2022, CCF-B).

  • Revisiting Code Search in a Two-Stage Paradigm

    Fan Hu, Yanlin Wang* (corresponding author), Lun Du, Xirong Li, Hongyu Zhang, Shi Han, Dongmei Zhang

    In 15th ACM International WSDM Conference. (WSDM 2023, CCF-B).

  • RACE: Retrieval-Augmented Commit Message Generation

    Ensheng Shi, Yanlin Wang* (co-corresponding author), Wei Tao, Lun Du, Hongyu Zhang, Shi Han, Dongmei Zhang and Hongbin Sun*

    In The 2022 Conference on Empirical Methods in Natural Language Processing. (EMNLP 2022).

  • Exploring Representation-level Augmentation for Code Search

    Haochen Li, Chunyan Miao, Yanxian Huang, Yuan Huang, Hongyu Zhang and Yanlin Wang* (corresponding author)

    In The 2022 Conference on Empirical Methods in Natural Language Processing. (EMNLP 2022).

  • No One Left Behind: Inclusive Federated Learning over Heterogeneous Devices

    Ruixuan Liu, Fangzhao Wu, Chuhan Wu, Yanlin Wang, Lingjuan Lyu, Hong Chen, Xing Xie

    In ACM SIGKDD 2022 Applied Data Science Track. (KDD 2022).

  • Accelerating Code Search with Deep Hashing and Code Classification

    Wenchao Gu, Yanlin Wang* (corresponding author), Lun Du, Hongyu Zhang, Shi Han, Dongmei Zhang, Michael Lyu

    In 60th Annual Meeting of the Association for Computational Linguistics. (ACL 2022).

  • UniXcoder: Unified Cross-Modal Pre-training for Code Representation

    Daya Guo, Shuai Lu, Nan Duan, Yanlin Wang* (corresponding author), Ming Zhou, Jian Yin

    In 60th Annual Meeting of the Association for Computational Linguistics. (ACL 2022).

  • On the Evaluation of Neural Code Summarization

    Ensheng Shi, Yanlin Wang* (co-corresponding author), Lun Du, Junjie Chen, Shi Han, Hongyu Zhang, Dongmei Zhang, Hongbin Sun*

    In International Conference on Software Engineering. (ICSE 2022).

  • LibDB: An Effective and Efficient Framework for Detecting Third-Party Libraries in Binaries

    Wei Tang, Yanlin Wang* (corresponding author), Hongyu Zhang, Shi Han, Ping Luo, Dongmei Zhang

    In Mining Software Repositories 2022. (MSR 2022).

  • CAST: Enhancing Code Summarization with Hierarchical Splitting and Reconstruction of Abstract Syntax Trees

    Ensheng Shi#, Yanlin Wang#, Lun Du, Hongyu Zhang, Shi Han, Dongmei Zhang, Hongbin Sun

    In 2021 Conference on Empirical Methods in Natural Language Processing. (EMNLP 2021).

  • Is a Single Model Enough? MuCoS: A Multi-Model Ensemble Learning for Semantic Code Search

    Lun Du, Xiaozhou Shi, Yanlin Wang, Ensheng Shi, Shi Han, Dongmei Zhang

    In 30th ACM International Conference on Information and Knowledge Management. (CIKM 2021).

  • On the Evaluation of Commit Message Generation Models: An Experimental Study

    Wei Tao, Yanlin Wang* (corresponding author), Ensheng Shi, Lun Du, Shi Han, Hongyu Zhang, Dongmei Zhang and Wenqiang Zhang

    In 37th International Conference on Software Maintenance and Evolution. (ICSME 2021).

  • Code Completion by Modeling Flattened Abstract Syntax Trees as Graphs.

    Yanlin Wang and Hui Li

    In Proceedings of the 35th AAAI Conference on Artificial Intelligence. (AAAI 2021).

  • CoCoSUM: Contextual Code Summarization with Multi-Relational Graph Neural Network

    Yanlin Wang, Ensheng Shi, Lun Du, Xiaodi Yang, Yuxuan Hu, Shi Han, Hongyu Zhang, Dongmei Zhang

    In Microsoft Research Technical Report.

  • Multi-task Learning for Recommendation over Heterogeneous Information Network.

    Hui Li, Yanlin Wang* (corresponding author), Ziyu Lyu and Jieming Shi

    In IEEE Transactions on Knowledge and Data Engineering (TKDE 2020).

  • FHJ: A Formal Model for Hierarchical Dispatching and Overriding.

    Yanlin Wang, Haoyuan Zhang, Bruno C. d. S. Oliveira and Marco Servetto

    In Proceedings of the 32nd European Conference on Object-Oriented Programming. (ECOOP 2018).

  • Classless Java.

    Yanlin Wang, Haoyuan Zhang, Marco Servetto and Bruno C. d. S. Oliveira

    In International Conference on Generative Programming: Concepts and Experiences. (GPCE 2016).

  • The Expression Problem, Trivially!

    Yanlin Wang, Bruno C. d. S. Oliveira

    In Proceedings of the 15th International Conference on Modularity. (Modularity 2016, Best Paper Award).

  • Product Lines of Interpreters Using Truffle with Object Algebras.

    Yanlin Wang, Tomas Tauber and Bruno C. d. S. Oliveira

    In Proceedings of the 1st Truffle/Graal Languages Workshop, 29th European Conference on Object-Oriented Programming. (Truffle@ECOOP 2015).

 

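Student Publications:

Graduate students:

  • 伍嘉棟 (master's student, 2022 cohort): [Internetware'23]
  • 黃炎賢 (master's student, 2022 cohort): [EMNLP'22] [SANER'24]
  • 郭亮宏 (master's student, 2023 cohort): [AAAI'24] [ICSME'24] [ISSTA'24]
  • 王巖立 (master's student, 2023 cohort): [ICSE'25] [BlockSys'23]

Undergraduate students:

  • 李一凡 (undergraduate, 2020 cohort): [Internetware'24]

Co-supervised students:

  • 石恩升 (PhD student, 2019 cohort): 12 papers (4 CCF-A, 8 CCF-B)
  • 陶瑋 (PhD student, 2019 cohort): 7 papers (1 CCF-A, 5 CCF-B, 1 SCI)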

 

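Projects:

2023 - 2024: CCF-Huawei Populus Grove Fund, principal investigator

2023 - 2024: University Young Faculty Development Program, principal investigator

2023 - 2023: Intelligent Auditing System, principal investigator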