🏢 Peking University
MaVEn: An Effective Multi-granularity Hybrid Visual Encoding Framework for Multimodal Large Language Model
·1886 words·9 mins·
Multimodal Learning
Vision-Language Models
🏢 Peking University
MaVEn: A novel multi-granularity hybrid visual encoding framework significantly boosts MLLM’s multi-image reasoning capabilities by combining discrete and continuous visual representations.
LSH-MoE: Communication-efficient MoE Training via Locality-Sensitive Hashing
·2125 words·10 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Peking University
LSH-MoE accelerates Mixture-of-Experts training by 1.28x-2.2x via Locality-Sensitive Hashing, significantly reducing communication costs.
Long-Range Feedback Spiking Network Captures Dynamic and Static Representations of the Visual Cortex under Movie Stimuli
·2020 words·10 mins·
Computer Vision
Video Understanding
🏢 Peking University
Long-range feedback spiking network (LoRaFB-SNet) surpasses other models in capturing dynamic and static visual cortical representations under movie stimuli, advancing our understanding of visual syst…
LM-HT SNN: Enhancing the Performance of SNN to ANN Counterpart through Learnable Multi-hierarchical Threshold Model
·1826 words·9 mins·
Machine Learning
Deep Learning
🏢 Peking University
LM-HT SNN: A learnable multi-hierarchical threshold model dramatically improves SNN performance, achieving near-ANN accuracy through dynamic current regulation and seamless ANN-SNN conversion.
Learning to Balance Altruism and Self-interest Based on Empathy in Mixed-Motive Games
·2604 words·13 mins·
Machine Learning
Reinforcement Learning
🏢 Peking University
AI agents learn to balance helpfulness and self-preservation using empathy to gauge social relationships and guide reward sharing.
Learning from Pattern Completion: Self-supervised Controllable Generation
·3650 words·18 mins·
AI Generated
Computer Vision
Image Generation
🏢 Peking University
Self-Supervised Controllable Generation (SCG) framework achieves brain-like associative generation by using a modular autoencoder with equivariance constraints and a self-supervised pattern completion…
Key-Grid: Unsupervised 3D Keypoints Detection using Grid Heatmap Features
·2426 words·12 mins·
Computer Vision
3D Vision
🏢 Peking University
Key-Grid: An unsupervised 3D keypoint detector achieving state-of-the-art semantic consistency and accuracy for both rigid and deformable objects using novel grid heatmap features.
Infinite-Dimensional Feature Interaction
·1877 words·9 mins·
Computer Vision
Image Classification
🏢 Peking University
InfiNet achieves state-of-the-art results by enabling feature interaction in an infinite-dimensional space using RBF kernels, surpassing models limited to finite-dimensional interactions.
Improving Generalization and Convergence by Enhancing Implicit Regularization
·2134 words·11 mins·
Machine Learning
Deep Learning
🏢 Peking University
IRE framework expedites the discovery of flat minima in deep learning, enhancing generalization and convergence. By decoupling the dynamics of flat and sharp directions, IRE boosts sharpness reduction…
HonestLLM: Toward an Honest and Helpful Large Language Model
·3514 words·17 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Peking University
HonestLLM boosts LLM honesty and helpfulness by 65.3% (Llama3-8b) and 124.7% (Mistral-7b) using training-free and fine-tuning methods, establishing principles and a new dataset (HONESET) for honesty eva…
HiCoM: Hierarchical Coherent Motion for Dynamic Streamable Scenes with 3D Gaussian Splatting
·2356 words·12 mins·
Computer Vision
3D Vision
🏢 Peking University
HiCoM, a novel framework, achieves high-fidelity streamable dynamic scene reconstruction by using a hierarchical coherent motion mechanism and parallel processing to significantly reduce training time…
GS-Hider: Hiding Messages into 3D Gaussian Splatting
·2889 words·14 mins·
Computer Vision
3D Vision
🏢 Peking University
GS-Hider: A novel framework secures 3D Gaussian Splatting by embedding messages in a coupled, secured feature attribute, enabling invisible data hiding and accurate extraction.
GraphMorph: Tubular Structure Extraction by Morphing Predicted Graphs
·2370 words·12 mins·
Computer Vision
Image Segmentation
🏢 Peking University
GraphMorph extracts tubular structures by morphing predicted graphs for superior topological accuracy.
GarmentLab: A Unified Simulation and Benchmark for Garment Manipulation
·2482 words·12 mins·
AI Applications
Robotics
🏢 Peking University
GarmentLab: A new benchmark and simulation platform tackles garment manipulation challenges by offering realistic simulations, diverse assets, and tasks bridging the sim-to-real gap for more robust AI…
Functional Gradient Flows for Constrained Sampling
·3022 words·15 mins·
AI Generated
Machine Learning
Deep Learning
🏢 Peking University
A new functional gradient flow method (CFG) efficiently samples from constrained probability distributions via a novel boundary condition for gradient flows, achieving prov…
Fight Back Against Jailbreaking via Prompt Adversarial Tuning
·2100 words·10 mins·
Natural Language Processing
Large Language Models
🏢 Peking University
Prompt Adversarial Tuning (PAT) defends against LLM jailbreaking by training a protective prompt prefix. PAT uses adversarial and benign prompts to optimize this prefix, significantly reducing succes…
Exploring Molecular Pretraining Model at Scale
·2151 words·11 mins·
AI Generated
Machine Learning
Self-Supervised Learning
🏢 Peking University
Uni-Mol2, a groundbreaking 1.1B parameter molecular pretraining model, reveals power-law scaling in molecular representation learning, achieving significant performance improvements on downstream task…
Expert-level protocol translation for self-driving labs
·2789 words·14 mins·
loading
·
loading
AI Applications
Manufacturing
🏢 Peking University
This research introduces a novel, automated protocol translation framework for self-driving labs, tackling the challenge of converting human-readable experimental protocols into machine-interpretable …
EnOF-SNN: Training Accurate Spiking Neural Networks via Enhancing the Output Feature
·1417 words·7 mins·
Machine Learning
Deep Learning
🏢 Peking University
EnOF-SNN boosts spiking neural network (SNN) accuracy by enhancing output feature representation using a novel knowledge distillation method and ReLU activation, outperforming current state-of-the-art…
EGODE: An Event-attended Graph ODE Framework for Modeling Rigid Dynamics
·1865 words·9 mins·
AI Applications
Robotics
🏢 Peking University
EGODE, a novel framework, leverages coupled graph ODEs and an event module to accurately model continuous and instantaneous changes in rigid body dynamics, outperforming existing methods.