Category: Foundation Models | Hyacehila's Blog

Hyacehila's Blog

HOME
ARCHIVES
ME
PROJECT
ABOUT
- FOOTPRINTS
- FRIENDS
- CV

Murmur
Categories
Tags

Foundation Models

2026 (10 posts)

JoyAI-VL-Interaction: From Chat Back to Continuous Interaction
Bad Is Good: Why DeepSeek Did Not Use an n-Gram Structure
Reading Model States Through Newline Tokens: A Note from Word Salad Chopper
MineCLIP, Visual Signals, and Reward Design
Reward and Training Loops in Real Agents: From Data Governance to Online RL
Agentic RL: Why the Training Loop Matters More Than the Algorithm
Reward Hacking: When Optimizers Reverse-Search the Reward Signal
The Evolution of Reward Design: From RLHF to RLVR
Reinforcement Learning in LLM Alignment: From Reward Signals to Advantage Estimation
Parameter-Efficient Fine-Tuning (PEFT): From Adapter to LoRA

© 2025 - 2026 Hyacehila

103 posts in total

POWERED BY Hexo THEME Redefine v2.9.0
