Taiwan Words Translator 繁體中文台灣化翻譯器 by LLMs
https://github.com/SuJiaKuan/llm_tw_word
The model rewrites Chinese text so that vocabulary typical of mainland China is replaced with the equivalent terms used in Taiwan. Example:
- Input:
這個軟件的質量真高啊
- Output:
這個軟體的品質真高啊
This Model
This model is fine-tuned from TinyLlama/TinyLlama-1.1B-Chat-v1.0 using instruction fine-tuning. The training data is drawn from MBZUAI/Bactrian-X and labeled automatically with 繁化姬.
How to use
You can follow the example usage below, or see here to learn how to integrate the model into a Python class.
import torch
from transformers import pipeline
SYSTEM_PROMPT = """\
對於輸入內容的中文文字,請將中國用語轉成台灣的用語,其他非中文文字或非中國用語都維持不變。
範例:
Input: ```這個視頻的質量真高啊```
Output: ```這個影片的品質真高啊```\
"""
text_trad = "這個軟件的質量真高啊"
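# Use a distinct name so the imported `pipeline` factory is not shadowed.
pipe = pipeline(
    "text-generation",
    model="feabries/TaiwanWordTranslator-v0.1",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Wrap the input text in triple backticks, matching the example in the system prompt.
prompt = "Input: ```{}```".format(text_trad)
messages = [{
    "role": "system",
    "content": SYSTEM_PROMPT,
}, {
    "role": "user",
    "content": prompt,
}]

# Render the chat messages into the model's prompt format as a plain string.
input_text = pipe.tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Greedy decoding keeps the output deterministic.
outputs = pipe(
    input_text,
    do_sample=False,
    max_new_tokens=2048,
)

print(outputs[0]["generated_text"])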
# <|system|>
# 對於輸入內容的中文文字,請將中國用語轉成台灣的用語,其他非中文文字或非中國用語都維持不變。
#
# 範例:
# Input: ```這個視頻的質量真高啊```
# Output: ```這個影片的品質真高啊```</s>
# <|user|>
# Input: ```這個軟件的質量真高啊```</s>
# <|assistant|>
# Output: ```這個軟體的品質真高啊```
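The generated text shown above contains the full prompt followed by the assistant turn, so the translated sentence still has to be pulled out of it. The following is a minimal sketch of one way to do that, assuming the output always follows the `Output:` plus triple-backtick layout shown above. The helper name `extract_translation` and the constant `FENCE` are hypothetical and are not part of the model or its repository.

```python
import re

# Three backticks, built programmatically so this snippet nests cleanly in Markdown.
FENCE = "`" * 3

# Matches "Output: <fence>...<fence>" and captures the text between the fences.
_OUTPUT_RE = re.compile(r"Output:\s*" + FENCE + r"(.*?)" + FENCE, re.DOTALL)


def extract_translation(generated_text: str) -> str:
    """Return the text inside the Output fence of the assistant turn.

    Hypothetical helper: it assumes the "<|assistant|>" marker and the
    fenced "Output:" layout seen in the example output above.
    """
    # Keep only what follows the last assistant marker, so the worked
    # example inside the system prompt is ignored.
    assistant_part = generated_text.rsplit("<|assistant|>", 1)[-1]
    matches = _OUTPUT_RE.findall(assistant_part)
    if not matches:
        raise ValueError("no fenced Output block found in the assistant turn")
    return matches[-1].strip()


# A shortened copy of the example output, rebuilt with FENCE for the demo.
sample = (
    "<|system|>\n範例:\nInput: " + FENCE + "這個視頻的質量真高啊" + FENCE + "\n"
    "Output: " + FENCE + "這個影片的品質真高啊" + FENCE + "</s>\n"
    "<|user|>\nInput: " + FENCE + "這個軟件的質量真高啊" + FENCE + "</s>\n"
    "<|assistant|>\nOutput: " + FENCE + "這個軟體的品質真高啊" + FENCE
)

print(extract_translation(sample))  # 這個軟體的品質真高啊
```

With the pipeline example, this would be called as `extract_translation(outputs[0]["generated_text"])`. Passing `return_full_text=False` in the pipeline call is another way to drop the prompt from the result; the fence still needs to be stripped in that case.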