
🏢 Dept. of CSE & School of AI & MoE Key Lab of AI, Shanghai Jiao Tong University

What Rotary Position Embedding Can Tell Us: Identifying Query and Key Weights Corresponding to Basic Syntactic or High-level Semantic Information
·1978 words·10 mins
Natural Language Processing Large Language Models 🏢 Dept. of CSE & School of AI & MoE Key Lab of AI, Shanghai Jiao Tong University
LLM fine-tuning made easy! This paper shows how analyzing the angles of query and key weight vectors under Rotary Position Embedding (RoPE) identifies weights tied to basic syntactic versus high-level semantic information, enabling fine-tuning with fewer trainable parameters and improved efficiency.
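The core signal here is geometric: RoPE rotates consecutive dimension pairs of the query/key projections together, so the angle between the two weight vectors in each pair is a rotation-invariant quantity one can inspect. The following is a minimal illustrative sketch (not the paper's actual procedure); `W_q`, `d_head`, and `d_model` are hypothetical placeholders.

```python
import numpy as np

# Hypothetical query projection for one attention head:
# rows are head dimensions (paired by RoPE), columns are model dimensions.
rng = np.random.default_rng(0)
d_head, d_model = 64, 512
W_q = rng.standard_normal((d_head, d_model))

# RoPE rotates dimension pairs (2i, 2i+1) jointly, so the angle between the
# two weight vectors of each pair is structure that survives the rotation.
pair_angles = []
for i in range(0, d_head, 2):
    u, v = W_q[i], W_q[i + 1]
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    pair_angles.append(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

print(np.round(pair_angles, 1))  # one angle per RoPE frequency pair
```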
Unveiling The Matthew Effect Across Channels: Assessing Layer Width Sufficiency via Weight Norm Variance
·2478 words·12 mins
Machine Learning Deep Learning 🏢 Dept. of CSE & School of AI & MoE Key Lab of AI, Shanghai Jiao Tong University
Neural network efficiency is improved by analyzing weight norm variance across channels to identify optimal layer widths, resulting in reduced parameters and boosted performance.
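The statistic in question is easy to picture: compute the norm of each output channel's weight vector and look at how spread out those norms are across the layer. A minimal sketch of that proxy is below; the random `W` and the "low variance suggests excess width" reading are assumptions for illustration, not the paper's exact criterion.

```python
import numpy as np

# Hypothetical weight matrix of a linear layer: (out_channels, in_channels).
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 512))

# Per-output-channel L2 norms and their variance across channels; a small
# variance is read here as a rough signal that the layer's width may exceed
# what the task needs, motivating a narrower layer.
channel_norms = np.linalg.norm(W, axis=1)
print("mean channel norm:    ", channel_norms.mean())
print("variance of channel norms:", channel_norms.var())
```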
FlexPlanner: Flexible 3D Floorplanning via Deep Reinforcement Learning in Hybrid Action Space with Multi-Modality Representation
·3516 words·17 mins
AI Generated Machine Learning Reinforcement Learning 🏢 Dept. of CSE & School of AI & MoE Key Lab of AI, Shanghai Jiao Tong University
FlexPlanner: Deep reinforcement learning solves flexible 3D floorplanning, improving wirelength and alignment significantly.
Boundary Matters: A Bi-Level Active Finetuning Method
·2351 words·12 mins
Computer Vision Active Learning 🏢 Dept. of CSE & School of AI & MoE Key Lab of AI, Shanghai Jiao Tong University
The Bi-Level Active Finetuning framework (BiLAF) improves sample selection for efficient model finetuning. Unlike existing methods, BiLAF incorporates both global diversity and local decision boundary information when choosing which samples to annotate.
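To make the bi-level idea concrete, here is a hedged toy sketch of combining a global diversity step (clustering) with a local boundary-oriented step (preferring points nearly equidistant from two cluster centers). Everything here, including `feats`, `budget`, and the margin-based boundary proxy, is a hypothetical illustration rather than BiLAF's actual selection rule.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical features from a pretrained encoder and an annotation budget.
rng = np.random.default_rng(0)
feats = rng.standard_normal((1000, 128))
budget = 20

# Global step: cluster the feature space so selections cover diverse regions.
km = KMeans(n_clusters=budget, n_init=10, random_state=0).fit(feats)

# Local step (illustrative proxy): within each cluster, prefer points whose
# two nearest centers are almost equidistant, i.e. points near a boundary.
dists = km.transform(feats)                 # distances to all cluster centers
top2 = np.sort(dists, axis=1)[:, :2]
margin = top2[:, 1] - top2[:, 0]            # small margin = near a boundary

selected = []
for c in range(budget):
    idx = np.where(km.labels_ == c)[0]
    selected.append(int(idx[np.argmin(margin[idx])]))
print(sorted(selected))
```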