RWKV EagleX 7B v2 Model

!Important!: This model is not meant to be used with the Hugging Face transformers library.

The following is the raw representation of the EagleX 7B Chat V0.5 model, for use with our own set of inference libraries and tools.

This IS the chat/instruct tune (up to 100k context length)!!

Chat Format

The chat format is identical to ChatML; the only difference is that the <|im_start|> and <|im_end|> special tokens have been replaced with the otherwise unused ASCII control codes "\x16" and "\x17" respectively. This was done to ensure complete compatibility with the standard RWKV World tokenizer.

Example:

\x16system
You are Eagle, a very smart, creative and helpful personal assistant. Assist the user in any way you can with detailed and smart responses.\x17
\x16user
I want to cheer up my friend who's having a rough day. Can you suggest a couple short and sweet text messages to go with a kitten gif?\x17
\x16assistant
Sure, I can help you with that! Here are a couple of suggestions for text messages:

"You're the cutest thing ever! I hope you have a great day!"

"I know you're having a tough day, but you're amazing. Hang in there!"

"Thinking of you and sending you lots of love and light. You're amazing just the way you are."\x17
\x16user
Have you got anything even more sympathetic?\x17
\x16assistant
Sure, here are a couple of additional suggestions:

"I'm so sorry you're having a tough day. You're an amazing person, and I'm so proud of you."

"You're a wonderful person, and you're going through a tough time right now. Just remember that you're strong and capable, and you can get through this."

"I'm here for you, and I'm going to do everything I can to support you. You're amazing, and you're going to get through this."

"I'm so sorry you're having a tough day. You're an amazing person, and I'm so proud of you. You're going to get through this, and you're going to be even stronger and happier in the future."\x17

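The format above can also be assembled programmatically. The following is a minimal sketch and not part of any RWKV library: the helper names `build_prompt` and `extract_reply` are hypothetical, and the newline placement is inferred from the example above.

```python
IM_START = "\x16"  # stands in for ChatML's <|im_start|>
IM_END = "\x17"    # stands in for ChatML's <|im_end|>


def build_prompt(messages):
    """Render (role, content) pairs and open an assistant turn for generation."""
    turns = [f"{IM_START}{role}\n{content}{IM_END}\n" for role, content in messages]
    return "".join(turns) + f"{IM_START}assistant\n"


def extract_reply(generated):
    """Keep only the text before the first end-of-turn marker, if any."""
    return generated.split(IM_END, 1)[0].strip()


messages = [
    ("system", "You are Eagle, a helpful personal assistant."),
    ("user", "Suggest a short message to cheer up a friend."),
]
prompt = build_prompt(messages)
print(repr(prompt))
```

The prompt ends with an open assistant turn so that the model continues from there; `extract_reply` then trims the generated text at the first "\x17".
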
Quickstart with the rwkv pip package

https://pypi.org/project/rwkv/


from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS

# download models: https://huggingface.co/BlinkDL
model = RWKV(model='/fsx/BlinkDL/HF-MODEL/rwkv-4-pile-169m/RWKV-4-Pile-169M-20220807-8023', strategy='cpu fp32')
pipeline = PIPELINE(model, "20B_tokenizer.json") # 20B_tokenizer.json is in https://github.com/BlinkDL/ChatRWKV
# for RWKV "world" models (including this one), use pipeline = PIPELINE(model, "rwkv_vocab_v20230424") instead

ctx = "\nIn a shocking finding, scientists discovered a herd of dragons living in a remote, previously unexplored valley, in Tibet. Even more surprising to the researchers was the fact that the dragons spoke perfect Chinese."
print(ctx, end='')

def my_print(s):
    print(s, end='', flush=True)

# For alpha_frequency and alpha_presence, see "Frequency and presence penalties":
# https://platform.openai.com/docs/api-reference/parameter-details

args = PIPELINE_ARGS(temperature = 1.0, top_p = 0.7, top_k = 100, # top_k = 0 disables top-k filtering
                     alpha_frequency = 0.25,
                     alpha_presence = 0.25,
                     alpha_decay = 0.996, # gradually decay the penalty
                     token_ban = [0], # ban the generation of some tokens
                     token_stop = [], # stop generation whenever you see any token here
                     chunk_len = 256) # split input into chunks to save VRAM (shorter -> slower)

pipeline.generate(ctx, token_count=200, args=args, callback=my_print)
print('\n')

out, state = model.forward([187, 510, 1563, 310, 247], None)
print(out.detach().cpu().numpy())                   # get logits
out, state = model.forward([187, 510], None)
out, state = model.forward([1563], state)           # RNN has state (use deepcopy to clone states)
out, state = model.forward([310, 247], state)
print(out.detach().cpu().numpy())                   # same result as above
print('\n')
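
The last few lines of the quickstart rely on the model being stateful: feeding a sequence in chunks while carrying the state forward gives the same output as feeding it all at once. The sketch below illustrates that property with a toy stand-in. `ToyRNN` is hypothetical and is not the RWKV architecture; it assumes the state may be updated in place, which is why a snapshot is taken with `deepcopy` before branching.

```python
import copy


class ToyRNN:
    """Hypothetical stand-in for a stateful model; not the RWKV architecture."""

    def forward(self, tokens, state):
        # State is a one-element list so that it is mutable and updated in place.
        state = [0] if state is None else state
        for t in tokens:
            state[0] = (state[0] * 31 + t) % 1_000_003
        return state[0], state


toy = ToyRNN()

# Whole sequence in one call.
out_full, _ = toy.forward([187, 510, 1563, 310, 247], None)

# Same sequence in chunks, carrying the state forward.
_, state = toy.forward([187, 510], None)
_, state = toy.forward([1563], state)
snapshot = copy.deepcopy(state)  # clone before the state is updated in place
out_chunked, state = toy.forward([310, 247], state)

# Branch from the snapshot; it was unaffected by the call above.
out_branch, _ = toy.forward([310, 247], snapshot)

print(out_full == out_chunked == out_branch)
```

Without the `deepcopy`, `snapshot` would be the same object as `state` and would already include the last two tokens by the time it was reused.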

Ramblings

Several new techniques were used to build the instruct dataset including the following:

  • Smart packing of the instruct pairs (to improve long context multi turn conversation)
  • Smart grouping of different context lengths and data categories/priorities (to improve training efficiency)
  • Variable context length training (courtesy of https://github.com/RWKV/RWKV-infctx-trainer)
  • A bunch of synthetic data to increase long context usage and reasoning (to be released soon...)

Acknowledgement

We are grateful for the help and support from the following key groups:

  • Recursal.ai team for financing the GPU resources, and managing the training of this model - you can run the Eagle line of RWKV models on their cloud / on-premise platform today.
  • Dataset built and model finetuned by @m8than
  • EleutherAI for their support, especially in the v5/v6 Eagle/Finch paper
  • Linux Foundation AI & Data group for supporting and hosting the RWKV project