# Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines
This repository contains the models and datasets used in the paper "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines".
## Models
The `ckpt` folder contains the 16 LoRA adapters fine-tuned for this research:
- 6 Basic Executors
- 3 Executor Composers
- 7 Aligners
The base model used for fine-tuning all of the above is LLaMA 3.1-8B.
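Since the checkpoints are standard LoRA adapters, one plausible way to apply them is to load the base model and attach an adapter with Hugging Face `peft`. This is a hedged sketch, not the repository's documented workflow: the base-model identifier and the adapter folder name (`add_executor`) are assumptions; check the actual folder names under `ckpt` and the GitHub page for the supported usage.

```python
def adapter_path(name: str) -> str:
    """Build the on-disk path of a LoRA adapter inside the ckpt folder."""
    return f"ckpt/{name}"

if __name__ == "__main__":
    # Heavy imports kept inside the entry point so the sketch can be read
    # without transformers/peft installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    # Assumed Hub identifier for the LLaMA 3.1-8B base model named above.
    BASE_MODEL = "meta-llama/Llama-3.1-8B"

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    base = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

    # Attach one adapter; "add_executor" is an illustrative name, not
    # necessarily a folder that exists in this repository.
    model = PeftModel.from_pretrained(base, adapter_path("add_executor"))
```

Loading a different executor, composer, or aligner would then only require swapping the adapter name passed to `adapter_path`.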
## Datasets
The datasets used for evaluating all models can be found in the `datasets/raw` folder.
## Usage
Please refer to the project's GitHub page for usage details.
## Citation
If you use CAEF for your research, please cite our paper:
```bibtex
@misc{lai2024executing,
      title={Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines},
      author={Junyu Lai and Jiahe Xu and Yao Yang and Yunpeng Huang and Chun Cao and Jingwei Xu},
      year={2024},
      eprint={2410.07896},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2410.07896},
}
```