Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines

This repository contains the models and datasets used in the paper "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines".

Models

The ckpt folder contains 16 LoRA adapters that were fine-tuned for this research:

  • 6 Basic Executors
  • 3 Executor Composers
  • 7 Aligners

The base model used for fine-tuning all of the above is LLaMA 3.1-8B.
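As a rough sketch of how one of these adapters might be attached to the base model with the transformers and peft libraries (the adapter subfolder name ckpt/add_executor and the base checkpoint id meta-llama/Llama-3.1-8B are assumptions, not names taken from this repository):

# Minimal sketch: load the base model, then attach one LoRA adapter from ckpt.
# "meta-llama/Llama-3.1-8B" and "ckpt/add_executor" are assumed names; replace
# them with the actual base checkpoint and adapter subfolder you want to use.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "meta-llama/Llama-3.1-8B"

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
base_model = AutoModelForCausalLM.from_pretrained(base_model_id, device_map="auto")

# Load the LoRA weights on top of the frozen base model for inference.
model = PeftModel.from_pretrained(base_model, "ckpt/add_executor")
model.eval()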

Datasets

The datasets used for evaluating all models can be found in the datasets/raw folder.
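As a rough sketch of how an evaluation file might be read (the file name add_test.jsonl and the JSON Lines format are assumptions; inspect datasets/raw for the actual file names and layout):

import json

# Hypothetical file name and format; check datasets/raw for the real files.
samples = []
with open("datasets/raw/add_test.jsonl", "r", encoding="utf-8") as f:
    for line in f:
        samples.append(json.loads(line))

print(len(samples), "evaluation examples loaded")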

Usage

Please refer to the GitHub page for details.

Citation

If you use CAEF for your research, please cite our paper:

@misc{lai2024executing,
      title={Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines}, 
      author={Junyu Lai and Jiahe Xu and Yao Yang and Yunpeng Huang and Chun Cao and Jingwei Xu},
      year={2024},
      eprint={2410.07896},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2410.07896}, 
}