🏢 Peking University

MaVEn: An Effective Multi-granularity Hybrid Visual Encoding Framework for Multimodal Large Language Model

26 September 2024·1886 words·9 mins

Multimodal Learning Vision-Language Models 🏢 Peking University

MaVEn: A novel multi-granularity hybrid visual encoding framework significantly boosts MLLM’s multi-image reasoning capabilities by combining discrete and continuous visual representations.

LSH-MoE: Communication-efficient MoE Training via Locality-Sensitive Hashing

26 September 2024·2125 words·10 mins

AI Generated Natural Language Processing Large Language Models 🏢 Peking University

LSH-MoE accelerates Mixture-of-Experts training by 1.28x-2.2x via Locality-Sensitive Hashing, significantly reducing communication costs.

Long-Range Feedback Spiking Network Captures Dynamic and Static Representations of the Visual Cortex under Movie Stimuli

26 September 2024·2020 words·10 mins

Computer Vision Video Understanding 🏢 Peking University

Long-range feedback spiking network (LoRaFB-SNet) surpasses other models in capturing dynamic and static visual cortical representations under movie stimuli, advancing our understanding of visual syst…

LM-HT SNN: Enhancing the Performance of SNN to ANN Counterpart through Learnable Multi-hierarchical Threshold Model

26 September 2024·1826 words·9 mins

Machine Learning Deep Learning 🏢 Peking University

LM-HT SNN: A learnable multi-hierarchical threshold model dramatically improves SNN performance, achieving near-ANN accuracy through dynamic current regulation and seamless ANN-SNN conversion.

Learning to Balance Altruism and Self-interest Based on Empathy in Mixed-Motive Games

26 September 2024·2604 words·13 mins

Machine Learning Reinforcement Learning 🏢 Peking University

AI agents learn to balance helpfulness and self-preservation using empathy to gauge social relationships and guide reward sharing.

Learning from Pattern Completion: Self-supervised Controllable Generation

26 September 2024·3650 words·18 mins

AI Generated Computer Vision Image Generation 🏢 Peking University

Self-Supervised Controllable Generation (SCG) framework achieves brain-like associative generation by using a modular autoencoder with equivariance constraints and a self-supervised pattern completion…

Key-Grid: Unsupervised 3D Keypoints Detection using Grid Heatmap Features

26 September 2024·2426 words·12 mins

Computer Vision 3D Vision 🏢 Peking University

Key-Grid: An unsupervised 3D keypoint detector achieving state-of-the-art semantic consistency and accuracy for both rigid and deformable objects using novel grid heatmap features.

Infinite-Dimensional Feature Interaction

26 September 2024·1877 words·9 mins

Computer Vision Image Classification 🏢 Peking University

InfiNet achieves state-of-the-art results by enabling feature interaction in an infinite-dimensional space using RBF kernels, surpassing models limited to finite-dimensional interactions.

Improving Generalization and Convergence by Enhancing Implicit Regularization

26 September 2024·2134 words·11 mins

Machine Learning Deep Learning 🏢 Peking University

IRE framework expedites the discovery of flat minima in deep learning, enhancing generalization and convergence. By decoupling the dynamics of flat and sharp directions, IRE boosts sharpness reduction…

HonestLLM: Toward an Honest and Helpful Large Language Model

26 September 2024·3514 words·17 mins

AI Generated Natural Language Processing Large Language Models 🏢 Peking University

HonestLLM boosts LLM honesty and helpfulness by 65.3% (Llama3-8b) and 124.7% (Mistral-7b) using training-free and fine-tuning methods, establishing principles and a new dataset (HONESET) for honesty eva…

HiCoM: Hierarchical Coherent Motion for Dynamic Streamable Scenes with 3D Gaussian Splatting

26 September 2024·2356 words·12 mins

Computer Vision 3D Vision 🏢 Peking University

HiCoM, a novel framework, achieves high-fidelity streamable dynamic scene reconstruction by using a hierarchical coherent motion mechanism and parallel processing to significantly reduce training time…

GS-Hider: Hiding Messages into 3D Gaussian Splatting

26 September 2024·2889 words·14 mins

Computer Vision 3D Vision 🏢 Peking University

GS-Hider: A novel framework secures 3D Gaussian Splatting by embedding messages in a coupled, secured feature attribute, enabling invisible data hiding and accurate extraction.

GraphMorph: Tubular Structure Extraction by Morphing Predicted Graphs

26 September 2024·2370 words·12 mins

Computer Vision Image Segmentation 🏢 Peking University

GraphMorph: A framework that extracts tubular structures by morphing predicted graphs, improving topological accuracy.

GarmentLab: A Unified Simulation and Benchmark for Garment Manipulation

26 September 2024·2482 words·12 mins

AI Applications Robotics 🏢 Peking University

GarmentLab: A new benchmark and simulation platform tackles garment manipulation challenges by offering realistic simulations, diverse assets, and tasks bridging the sim-to-real gap for more robust AI…

Functional Gradient Flows for Constrained Sampling

26 September 2024·3022 words·15 mins

AI Generated Machine Learning Deep Learning 🏢 Peking University

A new functional gradient flow method (CFG) efficiently samples from constrained probability distributions via a novel boundary condition for gradient flows, achieving prov…

Fight Back Against Jailbreaking via Prompt Adversarial Tuning

26 September 2024·2100 words·10 mins

Natural Language Processing Large Language Models 🏢 Peking University

Prompt Adversarial Tuning (PAT) defends against LLM jailbreaking by training a protective prompt prefix. PAT uses adversarial and benign prompts to optimize this prefix, significantly reducing succes…

Exploring Molecular Pretraining Model at Scale

26 September 2024·2151 words·11 mins

AI Generated Machine Learning Self-Supervised Learning 🏢 Peking University

Uni-Mol2, a groundbreaking 1.1B parameter molecular pretraining model, reveals power-law scaling in molecular representation learning, achieving significant performance improvements on downstream task…

Expert-level protocol translation for self-driving labs

26 September 2024·2789 words·14 mins

AI Applications Manufacturing 🏢 Peking University

This research introduces a novel, automated protocol translation framework for self-driving labs, tackling the challenge of converting human-readable experimental protocols into machine-interpretable …

EnOF-SNN: Training Accurate Spiking Neural Networks via Enhancing the Output Feature

26 September 2024·1417 words·7 mins

Machine Learning Deep Learning 🏢 Peking University

EnOF-SNN boosts spiking neural network (SNN) accuracy by enhancing output feature representation using a novel knowledge distillation method and ReLU activation, outperforming current state-of-the-art…

EGODE: An Event-attended Graph ODE Framework for Modeling Rigid Dynamics

26 September 2024·1865 words·9 mins

AI Applications Robotics 🏢 Peking University

EGODE, a novel framework, leverages coupled graph ODEs and an event module to accurately model continuous and instantaneous changes in rigid body dynamics, outperforming existing methods.