Introduction
RAG-Instruct is a method for generating diverse, high-quality RAG instruction data. It synthesizes instruction datasets from any source corpus using two techniques:
- Five RAG paradigms, which represent diverse query-document relationships to enhance model generalization across tasks.
- Instruction simulation, which enriches instruction diversity and quality by utilizing the strengths of existing instruction datasets.
Using this approach, we constructed the RAG-Instruct dataset, which covers a wide range of RAG scenarios and tasks.
RAG-Instruct-Llama3-3B is trained on this data. It improves on the Llama3.2-3B baseline on all eleven benchmarks in the table below, with gains ranging from 2.2 points (PQA) to 17.3 points (2WIKI).
Model | WQA (acc) | PQA (acc) | TQA (acc) | OBQA (EM) | Pub (EM) | ARC (EM) | 2WIKI (acc) | HotP (acc) | MSQ (acc) | CFQA (EM) | PubMed (EM) |
---|---|---|---|---|---|---|---|---|---|---|---|
Llama3.2-3B | 58.7 | 61.8 | 69.7 | 77.0 | 55.0 | 66.8 | 55.6 | 40.2 | 13.2 | 46.8 | 70.3 |
Llama3.2-3B + RAG-Instruct | 65.3 | 64.0 | 77.0 | 81.2 | 66.4 | 73.0 | 72.9 | 52.7 | 25.0 | 50.3 | 72.6 |
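To make the size of the improvement easier to read off, the short sketch below computes the per-benchmark gain from the two rows of the table. The scores are copied from the table; the variable names are illustrative, and the mean is only a rough summary because it averages accuracy and exact-match scores together.

```python
# Scores copied from the table above (acc or EM, in percent).
benchmarks = ["WQA", "PQA", "TQA", "OBQA", "Pub", "ARC",
              "2WIKI", "HotP", "MSQ", "CFQA", "PubMed"]
base = [58.7, 61.8, 69.7, 77.0, 55.0, 66.8, 55.6, 40.2, 13.2, 46.8, 70.3]
tuned = [65.3, 64.0, 77.0, 81.2, 66.4, 73.0, 72.9, 52.7, 25.0, 50.3, 72.6]

# Absolute gain in points for each benchmark, rounded to one decimal
# to avoid floating-point noise in the differences.
gains = {name: round(t - b, 1) for name, b, t in zip(benchmarks, base, tuned)}

# Rough summary: mixes acc and EM, so treat it as indicative only.
mean_gain = sum(gains.values()) / len(gains)
largest = max(gains, key=gains.get)
smallest = min(gains, key=gains.get)

for name, gain in gains.items():
    print(f"{name:>6}: +{gain:.1f}")
print(f"mean gain: +{mean_gain:.2f} points")
print(f"largest: {largest} (+{gains[largest]:.1f}), smallest: {smallest} (+{gains[smallest]:.1f})")
```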
Usage
You can deploy the model with tools such as vLLM or SGLang, or run inference directly with Transformers:
from transformers import AutoModelForCausalLM, AutoTokenizer
# Load the model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "FreedomIntelligence/RAG-Instruct-Llama3-3B",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("FreedomIntelligence/RAG-Instruct-Llama3-3B")
# Example input
input_text = """### Paragraph:
[1] structure is at risk from new development...
[2] as Customs and Excise stores...
[3] Powis Street is partly underway...
...
### Instruction:
Which organization is currently using a building in Woolwich that holds historical importance?
"""
# Tokenize and prepare input
messages = [{"role": "user", "content": input_text}]
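# The chat template already inserts the special tokens, so do not add them a second time
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)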
# Generate output
outputs = model.generate(**inputs, max_new_tokens=2048)
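# Decode only the newly generated tokens, not the echoed prompt
prompt_length = inputs["input_ids"].shape[-1]
print(tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True))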
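When the passages come from a retriever rather than a hard-coded string, a small helper can assemble the input in the same layout as the example above. This is a minimal sketch: `build_rag_prompt` is a hypothetical name, not part of the released code, and it assumes the model expects exactly the `### Paragraph:` / `### Instruction:` layout shown in the example input.

```python
from typing import Sequence


def build_rag_prompt(paragraphs: Sequence[str], instruction: str) -> str:
    """Format retrieved passages and a question like the example input above.

    Hypothetical helper: passages are numbered [1], [2], ... in the order
    given, followed by the instruction under its own header.
    """
    numbered = [f"[{i}] {text.strip()}" for i, text in enumerate(paragraphs, start=1)]
    return (
        "### Paragraph:\n"
        + "\n".join(numbered)
        + "\n### Instruction:\n"
        + instruction.strip()
        + "\n"
    )
```

The returned string can be used as `input_text` in the snippet above, for example `input_text = build_rag_prompt(retrieved_passages, question)`, where `retrieved_passages` and `question` are placeholders for your own retriever output and user query.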
Citation
@misc{liu2024raginstructboostingllmsdiverse,
title={RAG-Instruct: Boosting LLMs with Diverse Retrieval-Augmented Instructions},
author={Wanlong Liu and Junying Chen and Ke Ji and Li Zhou and Wenyu Chen and Benyou Wang},
year={2024},
eprint={2501.00353},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2501.00353},
}
Model tree for FreedomIntelligence/RAG-Instruct-Llama3-3B
Base model
meta-llama/Llama-3.2-3B