HICAI-ZJU/K-MSE

Boosting LLM’s Molecular Structure Elucidation with Knowledge Enhanced Tree Search Reasoning

This repository is the official implementation of K-MSE, which has been accepted to ACL 2025.

🌟 Overview

In this work, we introduce K-MSE, a Knowledge-enhanced reasoning framework for Molecular Structure Elucidation that uses Monte Carlo Tree Search as a plug-in for test-time scaling. Specifically, we construct an external molecular substructure knowledge base to extend the LLMs' coverage of the chemical structure space. Furthermore, we design a specialized molecule-spectrum scorer that acts as a reward model for the reasoning process, addressing LLMs' inaccurate evaluation of candidate solutions.
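To make the interplay between tree search and an external scorer concrete, here is a minimal, generic Monte Carlo Tree Search sketch in Python. It is not the code from this repository: `search`, `toy_propose`, and `toy_score` are hypothetical names, `toy_propose` merely stands in for an LLM proposing refinements with help from a substructure knowledge base, and `toy_score` stands in for a molecule-spectrum scorer by comparing a candidate string against a fixed target. The sketch also evaluates the first new child after each expansion, which is a simplification.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Node:
    candidate: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)
    visits: int = 0
    total_reward: float = 0.0

    def uct(self, c: float = 1.4) -> float:
        """Upper confidence bound used to pick which child to descend into."""
        if self.visits == 0:
            return math.inf  # always try unvisited children first
        exploit = self.total_reward / self.visits
        explore = c * math.sqrt(math.log(self.parent.visits) / self.visits)
        return exploit + explore


def search(
    root_candidate: str,
    propose: Callable[[str], List[str]],
    score: Callable[[str], float],
    iterations: int = 50,
) -> Tuple[str, float]:
    """Generic MCTS loop: select, expand, evaluate with `score`, backpropagate."""
    root = Node(root_candidate)
    best, best_reward = root_candidate, score(root_candidate)
    for _ in range(iterations):
        # Selection: follow the highest-UCT child down to a leaf.
        node = root
        while node.children:
            node = max(node.children, key=Node.uct)
        # Expansion: only expand leaves that were already evaluated once.
        if node.visits > 0:
            node.children = [Node(c, parent=node) for c in propose(node.candidate)]
            if node.children:
                node = node.children[0]
        # Evaluation: the external scorer plays the role of the reward model.
        reward = score(node.candidate)
        if reward > best_reward:
            best, best_reward = node.candidate, reward
        # Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.total_reward += reward
            node = node.parent
    return best, best_reward


# --- Toy stand-ins (assumptions for illustration only) ---
TARGET = "CCO"            # pretend this is the structure implied by the spectra
FRAGMENTS = ["C", "O", "N"]  # pretend these come from a substructure knowledge base


def toy_propose(candidate: str) -> List[str]:
    """Extend a partial candidate by one fragment, up to the target length."""
    if len(candidate) >= len(TARGET):
        return []
    return [candidate + frag for frag in FRAGMENTS]


def toy_score(candidate: str) -> float:
    """Fraction of target positions that the candidate matches, in [0, 1]."""
    matches = sum(a == b for a, b in zip(candidate, TARGET))
    return matches / len(TARGET)


best, reward = search("", toy_propose, toy_score, iterations=200)
print(best, reward)
```

The key design point the sketch illustrates is that the reward comes from a separate scoring function rather than from the model's own judgment of its answer, so the search statistics depend only on how well each candidate agrees with the evidence.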

🛠️ Environment

The environment setup is detailed in the environment.yml file.

🚀 QuickStart

  • The implementation of K-MSE is in the k-mse folder.
  • The baseline implementation is in the baseline folder.

🌻 Acknowledgement

We gratefully acknowledge the use of code from the following projects: MolPuzzle and MCTSr.

📝 Citation

Please cite this work as follows:

@inproceedings{zhuang-etal-2025-boosting,
    title = "Boosting {LLM}{'}s Molecular Structure Elucidation with Knowledge Enhanced Tree Search Reasoning",
    author = "Zhuang, Xiang  and
      Wu, Bin  and
      Cui, Jiyu  and
      Feng, Kehua  and
      Li, Xiaotong  and
      Xing, Huabin  and
      Ding, Keyan  and
      Zhang, Qiang  and
      Chen, Huajun",
    editor = "Che, Wanxiang  and
      Nabende, Joyce  and
      Shutova, Ekaterina  and
      Pilehvar, Mohammad Taher",
    booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.acl-long.1100/",
    pages = "22561--22576",
    ISBN = "979-8-89176-251-0",
    abstract = "Molecular structure elucidation involves deducing a molecule{'}s structure from various types of spectral data, which is crucial in chemical experimental analysis. While large language models (LLMs) have shown remarkable proficiency in analyzing and reasoning through complex tasks, they still encounter substantial challenges in molecular structure elucidation. We identify that these challenges largely stem from LLMs' limited grasp of specialized chemical knowledge. In this work, we introduce a Knowledge-enhanced reasoning framework for Molecular Structure Elucidation (K-MSE), leveraging Monte Carlo Tree Search for test-time scaling as a plugin. Specifically, we construct an external molecular substructure knowledge base to extend the LLMs' coverage of the chemical structure space. Furthermore, we design a specialized molecule-spectrum scorer to act as a reward model for the reasoning process, addressing the issue of inaccurate solution evaluation in LLMs. Experimental results show that our approach significantly boosts performance, particularly gaining more than 20{\%} improvement on both GPT-4o-mini and GPT-4o."
}
