
🏢 Dept. of CSE & School of AI & MoE Key Lab of AI, Shanghai Jiao Tong University

What Rotary Position Embedding Can Tell Us: Identifying Query and Key Weights Corresponding to Basic Syntactic or High-level Semantic Information
·1978 words·10 mins
Natural Language Processing Large Language Models 🏢 Dept. of CSE & School of AI & MoE Key Lab of AI, Shanghai Jiao Tong University
LLM fine-tuning made easy! This paper shows how analyzing the angles of query and key weight vectors under Rotary Position Embedding (RoPE) identifies weights tied to basic syntactic versus high-level semantic information, enabling fine-tuning with fewer trainable parameters and improved efficiency.
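The core signal here is geometric: RoPE rotates consecutive dimension pairs of the query/key projections together, so the angle between the two weight vectors in each pair is a rotation-invariant quantity one can inspect. The following is a minimal illustrative sketch (not the paper's actual procedure); `W_q`, `d_head`, and `d_model` are hypothetical placeholders.

```python
import numpy as np

# Hypothetical query projection for one attention head:
# rows are head dimensions (paired by RoPE), columns are model dimensions.
rng = np.random.default_rng(0)
d_head, d_model = 64, 512
W_q = rng.standard_normal((d_head, d_model))

# RoPE rotates dimension pairs (2i, 2i+1) jointly, so the angle between the
# two weight vectors of each pair is structure that survives the rotation.
pair_angles = []
for i in range(0, d_head, 2):
    u, v = W_q[i], W_q[i + 1]
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    pair_angles.append(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

print(np.round(pair_angles, 1))  # one angle per RoPE frequency pair
```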
Unveiling The Matthew Effect Across Channels: Assessing Layer Width Sufficiency via Weight Norm Variance
·2478 words·12 mins
Machine Learning Deep Learning 🏢 Dept. of CSE & School of AI & MoE Key Lab of AI, Shanghai Jiao Tong University
Neural network efficiency is improved by analyzing weight norm variance across channels to identify optimal layer widths, resulting in reduced parameters and boosted performance.
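The statistic in question is easy to picture: compute the norm of each output channel's weight vector and look at how spread out those norms are across the layer. A minimal sketch of that proxy is below; the random `W` and the "low variance suggests excess width" reading are assumptions for illustration, not the paper's exact criterion.

```python
import numpy as np

# Hypothetical weight matrix of a linear layer: (out_channels, in_channels).
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 512))

# Per-output-channel L2 norms and their variance across channels; a small
# variance is read here as a rough signal that the layer's width may exceed
# what the task needs, motivating a narrower layer.
channel_norms = np.linalg.norm(W, axis=1)
print("mean channel norm:    ", channel_norms.mean())
print("variance of channel norms:", channel_norms.var())
```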
FlexPlanner: Flexible 3D Floorplanning via Deep Reinforcement Learning in Hybrid Action Space with Multi-Modality Representation
·3516 words·17 mins
AI Generated Machine Learning Reinforcement Learning 🏢 Dept. of CSE & School of AI & MoE Key Lab of AI, Shanghai Jiao Tong University
FlexPlanner: Deep reinforcement learning solves flexible 3D floorplanning, improving wirelength and alignment significantly.
Boundary Matters: A Bi-Level Active Finetuning Method
·2351 words·12 mins
Computer Vision Active Learning 🏢 Dept. of CSE & School of AI & MoE Key Lab of AI, Shanghai Jiao Tong University
The Bi-Level Active Finetuning framework (BiLAF) improves sample selection for efficient model finetuning. Unlike existing methods, BiLAF incorporates both global diversity and local decision boundary information when choosing which samples to annotate.
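To make the bi-level idea concrete, here is a hedged toy sketch of combining a global diversity step (clustering) with a local boundary-oriented step (preferring points nearly equidistant from two cluster centers). Everything here, including `feats`, `budget`, and the margin-based boundary proxy, is a hypothetical illustration rather than BiLAF's actual selection rule.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical features from a pretrained encoder and an annotation budget.
rng = np.random.default_rng(0)
feats = rng.standard_normal((1000, 128))
budget = 20

# Global step: cluster the feature space so selections cover diverse regions.
km = KMeans(n_clusters=budget, n_init=10, random_state=0).fit(feats)

# Local step (illustrative proxy): within each cluster, prefer points whose
# two nearest centers are almost equidistant, i.e. points near a boundary.
dists = km.transform(feats)                 # distances to all cluster centers
top2 = np.sort(dists, axis=1)[:, :2]
margin = top2[:, 1] - top2[:, 0]            # small margin = near a boundary

selected = []
for c in range(budget):
    idx = np.where(km.labels_ == c)[0]
    selected.append(int(idx[np.argmin(margin[idx])]))
print(sorted(selected))
```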