🏢 Peking University

SpikeReveal: Unlocking Temporal Sequences from Real Blurry Inputs with Spike Streams

26 September 2024·2631 words·13 mins

Image Generation 🏢 Peking University

SpikeReveal: Self-supervised learning unlocks sharp video sequences from blurry, real-world spike camera data, overcoming limitations of prior supervised approaches.

Spatio-Temporal Interactive Learning for Efficient Image Reconstruction of Spiking Cameras

26 September 2024·2395 words·12 mins

Computer Vision Image Generation 🏢 Peking University

STIR: A novel spatio-temporal network reconstructs high-quality images from spiking camera data by jointly refining motion and intensity information for efficient and accurate high-speed imaging.

SPARKLE: A Unified Single-Loop Primal-Dual Framework for Decentralized Bilevel Optimization

26 September 2024·1927 words·10 mins

Machine Learning Meta Learning 🏢 Peking University

SPARKLE: A single-loop primal-dual framework unifies decentralized bilevel optimization, enabling flexible heterogeneity-correction and mixed update strategies for improved convergence.

SMART: Towards Pre-trained Missing-Aware Model for Patient Health Status Prediction

26 September 2024·2281 words·11 mins

AI Generated AI Applications Healthcare 🏢 Peking University

SMART: a novel self-supervised model tackles missing EHR data, improving patient health status prediction via missing-aware attention and robust pre-training.

SILENCE: Protecting privacy in offloaded speech understanding on resource-constrained devices

26 September 2024·2275 words·11 mins

Natural Language Processing Speech Recognition 🏢 Peking University

SILENCE, a novel lightweight system, protects user privacy in offloaded speech understanding on resource-constrained devices by selectively masking short-term audio details without impacting long-term…

Sharing Key Semantics in Transformer Makes Efficient Image Restoration

26 September 2024·3184 words·15 mins

Computer Vision Image Restoration 🏢 Peking University

SemanIR boosts image restoration efficiency by cleverly sharing key semantic information within Transformer layers, achieving state-of-the-art results across multiple tasks.

SfPUEL: Shape from Polarization under Unknown Environment Light

26 September 2024·2725 words·13 mins

Computer Vision 3D Vision 🏢 Peking University

SfPUEL: A novel end-to-end SfP method achieves robust single-shot surface normal estimation under diverse lighting, integrating PS priors and material segmentation.

Separation and Bias of Deep Equilibrium Models on Expressivity and Learning Dynamics

26 September 2024·2192 words·11 mins

AI Generated AI Theory Optimization 🏢 Peking University

Deep Equilibrium Models (DEQs) outperform standard neural networks but remain poorly understood theoretically. This paper provides general separation results showing DEQ’s superior expressivity and character…

SemFlow: Binding Semantic Segmentation and Image Synthesis via Rectified Flow

26 September 2024·2658 words·13 mins

AI Generated Computer Vision Image Generation 🏢 Peking University

SemFlow: A unified framework uses rectified flow to seamlessly bridge semantic segmentation and image synthesis, achieving competitive results and offering reversible image-mask transformations.

SegVol: Universal and Interactive Volumetric Medical Image Segmentation

26 September 2024·2947 words·14 mins

Image Segmentation 🏢 Peking University

SegVol: A universal, interactive 3D medical image segmentation model achieving state-of-the-art performance across diverse anatomical categories.

Seek Commonality but Preserve Differences: Dissected Dynamics Modeling for Multi-modal Visual RL

26 September 2024·2815 words·14 mins

Machine Learning Reinforcement Learning 🏢 Peking University

Dissected Dynamics Modeling (DDM) excels at multi-modal visual reinforcement learning by cleverly separating and integrating common and unique features across different sensory inputs for more accurat…

Scalable Constrained Policy Optimization for Safe Multi-agent Reinforcement Learning

26 September 2024·1664 words·8 mins

AI Generated Machine Learning Reinforcement Learning 🏢 Peking University

Scalable MAPPO-L: Decentralized training with local interactions ensures safe, high-reward multi-agent systems, even with limited communication.

Robust and Faster Zeroth-Order Minimax Optimization: Complexity and Applications

26 September 2024·1498 words·8 mins

AI Generated AI Theory Optimization 🏢 Peking University

ZO-GDEGA: A unified algorithm achieves faster, more robust zeroth-order minimax optimization with lower complexity and weaker conditions, solving stochastic nonconvex-concave problems.

RoboMamba: Efficient Vision-Language-Action Model for Robotic Reasoning and Manipulation

26 September 2024·1948 words·10 mins

Multimodal Learning Vision-Language Models 🏢 Peking University

RoboMamba: a novel robotic VLA model efficiently combines reasoning and action, achieving high speeds and accuracy while requiring minimal fine-tuning.

Richelieu: Self-Evolving LLM-Based Agents for AI Diplomacy

26 September 2024·2475 words·12 mins

AI Generated Natural Language Processing Large Language Models 🏢 Peking University

Richelieu: a self-evolving LLM-based AI agent masters Diplomacy, a complex game requiring strategic planning and negotiation, without human data, by integrating self-play for continuous improvement.

Revisiting Adversarial Patches for Designing Camera-Agnostic Attacks against Person Detection

26 September 2024·1724 words·9 mins

Computer Vision Object Detection 🏢 Peking University

Researchers developed Camera-Agnostic Patch (CAP) attacks, improving adversarial patch reliability by simulating camera image processing in attacks against person detectors.

ReVideo: Remake a Video with Motion and Content Control

26 September 2024·2423 words·12 mins

Computer Vision Video Understanding 🏢 Peking University

ReVideo enables precise local video editing by independently controlling content and motion, overcoming limitations of existing methods and paving the way for advanced video manipulation.

Retrieval-Augmented Diffusion Models for Time Series Forecasting

26 September 2024·2360 words·12 mins

AI Generated Machine Learning Deep Learning 🏢 Peking University

Boosting time series forecasting accuracy, Retrieval-Augmented Diffusion Models (RATD) leverage relevant historical data to guide the diffusion process, overcoming limitations of existing models and d…

Rethinking Model-based, Policy-based, and Value-based Reinforcement Learning via the Lens of Representation Complexity

26 September 2024·1788 words·9 mins

Machine Learning Reinforcement Learning 🏢 Peking University

Reinforcement learning paradigms exhibit a representation complexity hierarchy: models are the easiest to approximate, policies are harder, and value functions are the hardest.

ReEvo: Large Language Models as Hyper-Heuristics with Reflective Evolution

26 September 2024·3978 words·19 mins

AI Theory Optimization 🏢 Peking University

ReEvo, a novel integration of evolutionary search and LLM reflections, generates state-of-the-art heuristics for combinatorial optimization problems, demonstrating superior sample efficiency.