ChemGPT2-QA-72B is fine-tuned from the Qwen2-72B-Instruct model. The training data, ChemGPT-2.0-Data, is open source and available at https://huggingface.co/datasets/ALmonster/ChemGPT-2.0-Data. We evaluated the model on the three chemistry subjects of C-Eval and compared it with GPT-3.5 and GPT-4. The results are as follows:
C-Eval
Models | college_chemistry | high_school_chemistry | middle_school_chemistry | AVG |
---|---|---|---|---|
GPT-3.5 | 0.397 | 0.529 | 0.714 | 0.547 |
GPT-4 | 0.594 | 0.558 | 0.811 | 0.654 |
ChemGPT2-QA-72B | 0.710 | 0.936 | 0.995 | 0.880 |
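The AVG column appears to be the unweighted mean of the three subject scores. The short sketch below recomputes it from the per-subject values in the table; the function name `macro_average` is only illustrative.

```python
# Per-subject accuracies copied from the table above
# (college, high school, middle school chemistry).
scores = {
    "GPT-3.5": [0.397, 0.529, 0.714],
    "GPT-4": [0.594, 0.558, 0.811],
    "ChemGPT2-QA-72B": [0.71, 0.936, 0.995],
}


def macro_average(values):
    """Unweighted mean of the per-subject accuracies."""
    return sum(values) / len(values)


# Round to three decimals, matching the precision of the subject columns.
averages = {name: round(macro_average(vals), 3) for name, vals in scores.items()}
for name, avg in averages.items():
    print(f"{name}: {avg:.3f}")
```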
Quickstart
The following code snippet uses apply_chat_template to show how to load the tokenizer and model and how to generate content.
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained(
    "ALmonster/ChemGPT2-QA-72B",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("ALmonster/ChemGPT2-QA-72B")

prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=512
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
vLLM
We recommend deploying the model on 4 A100 GPUs. Start the vLLM server with the following command in a terminal:
python -m vllm.entrypoints.openai.api_server --served-model-name chemgpt --model path/to/chemgpt --gpu-memory-utilization 0.98 --tensor-parallel-size 4 --port 6000
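As a rough sanity check on the four-GPU recommendation, the sketch below estimates how the weights divide across the tensor-parallel group. Several inputs are assumptions that this card does not state: a nominal 72 billion parameters (taken from the model name), 2 bytes per parameter (bf16 weights), and the 80 GB variant of the A100. Only the GPU count and the memory utilization come from the command above.

```python
# Assumptions (not stated in this card): nominal parameter count,
# bf16 weights, and the 80 GB A100 variant.
NUM_PARAMS = 72e9
BYTES_PER_PARAM = 2
GPU_MEMORY_GB = 80

# Values taken from the vLLM command above.
NUM_GPUS = 4
GPU_MEMORY_UTILIZATION = 0.98

weights_gb = NUM_PARAMS * BYTES_PER_PARAM / 1e9
weights_per_gpu_gb = weights_gb / NUM_GPUS
budget_per_gpu_gb = GPU_MEMORY_GB * GPU_MEMORY_UTILIZATION
# Whatever is left after the weights is what vLLM can use for the
# KV cache and activations on each GPU.
headroom_per_gpu_gb = budget_per_gpu_gb - weights_per_gpu_gb

print(f"weights in total:  {weights_gb:.1f} GB")
print(f"weights per GPU:   {weights_per_gpu_gb:.1f} GB")
print(f"budget per GPU:    {budget_per_gpu_gb:.1f} GB")
print(f"headroom per GPU:  {headroom_per_gpu_gb:.1f} GB")
```

Under these assumptions the weights alone exceed a single 80 GB card, which is consistent with sharding the model across several GPUs.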
Then you can query the server from the client side with the following code:
import json

import requests


def general_chemgpt_stream(inputs, history):
    """Stream the model's reply and append both turns to history."""
    url = 'http://localhost:6000/v1/chat/completions'
    history += [{"role": "user", "content": inputs}]
    headers = {"User-Agent": "vLLM Client"}
    pload = {
        "model": "chemgpt",  # must match --served-model-name
        "stream": True,
        "messages": history
    }
    response = requests.post(url,
                             headers=headers,
                             json=pload,
                             stream=True)
    assistant_reply = ''
    for chunk in response.iter_lines(chunk_size=1,
                                     decode_unicode=False,
                                     delimiter=b"\n"):
        if not chunk:
            continue
        # Each event line looks like 'data: {...}'; drop the 'data: ' prefix.
        string_data = chunk.decode("utf-8")[6:]
        if string_data == '[DONE]':
            break  # end-of-stream marker
        json_data = json.loads(string_data)
        # The first chunk carries only the role, so "content" may be absent.
        delta_content = json_data["choices"][0]["delta"].get("content")
        if delta_content:
            assistant_reply += delta_content
            yield delta_content
    # Record the full reply so the next call continues the conversation.
    history += [{
        "role": "assistant",
        "content": assistant_reply,
        "tool_calls": []
    }]


inputs = 'Give me a brief introduction to NaOH.'
history_chem = []
for response_text in general_chemgpt_stream(inputs, history_chem):
    print(response_text, end='')
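The client above strips the first six characters of each streamed line, which corresponds to the `data: ` prefix of a server-sent event. The sketch below isolates that parsing step so it can be tried without a running server. It assumes the OpenAI-style chunk layout (`choices[0].delta`); the helper name `parse_stream_line` and the sample lines are hand-written for illustration, not captured server output.

```python
import json


def parse_stream_line(line: bytes):
    """Return the text fragment carried by one streamed line, or None."""
    text = line.decode("utf-8").strip()
    if not text.startswith("data:"):
        return None  # blank separator lines carry no payload
    payload = text[len("data:"):].strip()
    if payload == "[DONE]":
        return None  # end-of-stream marker
    delta = json.loads(payload)["choices"][0]["delta"]
    # The first chunk usually carries only the role, so content may be absent.
    return delta.get("content")


# Hand-written sample lines in the assumed chunk format.
sample_lines = [
    b'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    b'',
    b'data: {"choices": [{"index": 0, "delta": {"content": "NaOH is "}}]}',
    b'data: {"choices": [{"index": 0, "delta": {"content": "a strong base."}}]}',
    b'data: [DONE]',
]

reply = "".join(frag for frag in map(parse_stream_line, sample_lines) if frag)
print(reply)
```

Checking for the `data:` prefix explicitly, rather than slicing a fixed number of characters, keeps the parser from failing on blank lines or on the final `[DONE]` marker.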