- Moral Foundations of Large Language Models
  Paper • 2310.15337 • Published • 1
- Specific versus General Principles for Constitutional AI
  Paper • 2310.13798 • Published • 2
- Contrastive Preference Learning: Learning from Human Feedback without RL
  Paper • 2310.13639 • Published • 24
- RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
  Paper • 2309.00267 • Published • 47
Collections including paper arXiv:2310.20587

- Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning
  Paper • 2310.20587 • Published • 16
- SELF: Language-Driven Self-Evolution for Large Language Model
  Paper • 2310.00533 • Published • 2
- Bigger, Better, Faster: Human-level Atari with human-level efficiency
  Paper • 2305.19452 • Published • 4
- DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
  Paper • 2408.08152 • Published • 53

- Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning
  Paper • 2310.20587 • Published • 16
- MedAlpaca -- An Open-Source Collection of Medical Conversational AI Models and Training Data
  Paper • 2304.08247 • Published • 2
- S-LoRA: Serving Thousands of Concurrent LoRA Adapters
  Paper • 2311.03285 • Published • 28
- WavLLM: Towards Robust and Adaptive Speech Large Language Model
  Paper • 2404.00656 • Published • 11

- Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning
  Paper • 2310.20587 • Published • 16
- SELF: Language-Driven Self-Evolution for Large Language Model
  Paper • 2310.00533 • Published • 2
- QLoRA: Efficient Finetuning of Quantized LLMs
  Paper • 2305.14314 • Published • 47
- QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
  Paper • 2309.14717 • Published • 44

- Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning
  Paper • 2310.20587 • Published • 16
- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 105
- LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement
  Paper • 2403.15042 • Published • 26
- LIMA: Less Is More for Alignment
  Paper • 2305.11206 • Published • 21

- Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning
  Paper • 2310.20587 • Published • 16
- JoMA: Demystifying Multilayer Transformers via JOint Dynamics of MLP and Attention
  Paper • 2310.00535 • Published • 2
- Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla
  Paper • 2307.09458 • Published • 10
- The Impact of Depth and Width on Transformer Language Model Generalization
  Paper • 2310.19956 • Published • 9

- Bootstrap Your Own Skills: Learning to Solve New Tasks with Large Language Model Guidance
  Paper • 2310.10021 • Published • 2
- Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning
  Paper • 2310.20587 • Published • 16
- Discovering Adaptable Symbolic Algorithms from Scratch
  Paper • 2307.16890 • Published • 6
- DragAPart: Learning a Part-Level Motion Prior for Articulated Objects
  Paper • 2403.15382 • Published • 10

- An Interdisciplinary Comparison of Sequence Modeling Methods for Next-Element Prediction
  Paper • 1811.00062 • Published • 2
- mT5: A massively multilingual pre-trained text-to-text transformer
  Paper • 2010.11934 • Published • 4
- Bootstrap Your Own Skills: Learning to Solve New Tasks with Large Language Model Guidance
  Paper • 2310.10021 • Published • 2
- Gemma: Open Models Based on Gemini Research and Technology
  Paper • 2403.08295 • Published • 47