Jie Fu

bigaidream

https://bigaidream.github.io/

AI & ML interests

LLM, Reinforcement Learning, System-2 Deep Learning (Reasoning, Planning), Automatic Theorem Proving, AI Safety

bigaidream's activity

upvoted a paper 3 months ago

PositionID: LLMs can Control Lengths, Copy and Paste with Explicit Positional Awareness

Paper • 2410.07035 • Published Oct 9, 2024 • 17

upvoted a paper 5 months ago

Layerwise Recurrent Router for Mixture-of-Experts

Paper • 2408.06793 • Published Aug 13, 2024 • 32

upvoted 4 papers 7 months ago

A Closer Look into Mixture-of-Experts in Large Language Models

Paper • 2406.18219 • Published Jun 26, 2024 • 16

Unlocking Continual Learning Abilities in Language Models

Paper • 2406.17245 • Published Jun 25, 2024 • 29

Efficient Continual Pre-training by Mitigating the Stability Gap

Paper • 2406.14833 • Published Jun 21, 2024 • 20

VCR: Visual Caption Restoration

Paper • 2406.06462 • Published Jun 10, 2024 • 10

upvoted 2 papers 8 months ago

LoGAH: Predicting 774-Million-Parameter Transformers using Graph HyperNetworks with 1/100 Parameters

Paper • 2405.16287 • Published May 25, 2024 • 10

Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training

Paper • 2405.15319 • Published May 24, 2024 • 26

upvoted 3 papers 9 months ago

Long-context LLMs Struggle with Long In-context Learning

Paper • 2404.02060 • Published Apr 2, 2024 • 36

COIG-CQIA: Quality is All You Need for Chinese Instruction Fine-tuning

Paper • 2403.18058 • Published Mar 26, 2024 • 4

CodeEditorBench: Evaluating Code Editing Capability of Large Language Models

Paper • 2404.03543 • Published Apr 4, 2024 • 16

upvoted 5 papers 11 months ago

Think Before You Act: Decision Transformers with Internal Working Memory

Paper • 2305.16338 • Published May 24, 2023 • 3

ChatMusician: Understanding and Generating Music Intrinsically with LLM

Paper • 2402.16153 • Published Feb 25, 2024 • 57

StructLM: Towards Building Generalist Models for Structured Knowledge Grounding

Paper • 2402.16671 • Published Feb 26, 2024 • 27

OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement

Paper • 2402.14658 • Published Feb 22, 2024 • 82

AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling

Paper • 2402.12226 • Published Feb 19, 2024 • 41

upvoted 2 papers 12 months ago

CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark

Paper • 2401.11944 • Published Jan 22, 2024 • 27

E^2-LLM: Efficient and Extreme Length Extension of Large Language Models

Paper • 2401.06951 • Published Jan 13, 2024 • 25

upvoted a paper about 1 year ago

MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

Paper • 2311.16502 • Published Nov 27, 2023 • 35