Posters

2024

Hallo3D: Multi-Modal Hallucination Detection and Mitigation for Consistent 3D Content Generation
·2871 words·14 mins
AI Generated Computer Vision 3D Vision 🏢 Chinese Academy of Sciences
Hallo3D: a tuning-free method resolving 3D generation hallucinations via multi-modal inconsistency detection and mitigation for consistent 3D content.
HairFastGAN: Realistic and Robust Hair Transfer with a Fast Encoder-Based Approach
·2844 words·14 mins
Computer Vision Image Generation 🏢 HSE University
HairFastGAN achieves realistic and robust hairstyle transfer in near real-time using a novel encoder-based approach, significantly outperforming optimization-based methods.
HairDiffusion: Vivid Multi-Colored Hair Editing via Latent Diffusion
·3966 words·19 mins
AI Generated Computer Vision Image Generation 🏢 Shenzhen University
HairDiffusion uses latent diffusion models and a multi-stage blending technique to achieve vivid, multi-colored hair editing in images, preserving other facial features.
GVKF: Gaussian Voxel Kernel Functions for Highly Efficient Surface Reconstruction in Open Scenes
·2497 words·12 mins
Computer Vision 3D Vision 🏢 Hong Kong University of Science and Technology
GVKF: A novel method achieves highly efficient and accurate 3D surface reconstruction in open scenes by integrating fast 3D Gaussian splatting with continuous scene representation using kernel regres…
Guiding Neural Collapse: Optimising Towards the Nearest Simplex Equiangular Tight Frame
·3208 words·16 mins
Machine Learning Deep Learning 🏢 Australian National University
Researchers devised a novel method to accelerate neural network training by guiding the optimization process toward a Simplex Equiangular Tight Frame, exploiting the Neural Collapse phenomenon to enha…
Guided Trajectory Generation with Diffusion Models for Offline Model-based Optimization
·3001 words·15 mins
Machine Learning Optimization 🏢 Korea Advanced Institute of Science and Technology (KAIST)
GTG, a novel conditional generative modeling approach, leverages diffusion models to generate high-scoring design trajectories for offline model-based optimization, outperforming existing methods on b…
GUIDE: Real-Time Human-Shaped Agents
·2015 words·10 mins
Machine Learning Reinforcement Learning 🏢 Duke University
GUIDE: Real-time human-shaped AI agents achieve up to 30% higher success rates using continuous human feedback, boosted by a parallel training model that mimics human input for continued improvement.
GuardT2I: Defending Text-to-Image Models from Adversarial Prompts
·3130 words·15 mins
AI Generated Multimodal Learning Vision-Language Models 🏢 Tsinghua University
GuardT2I: A novel framework defends text-to-image models against adversarial prompts by translating latent guidance embeddings into natural language, enabling effective adversarial prompt detection wi…
GTBench: Uncovering the Strategic Reasoning Capabilities of LLMs via Game-Theoretic Evaluations
·2898 words·14 mins
Natural Language Processing Large Language Models 🏢 Drexel University
GTBench reveals LLMs’ strategic-reasoning weaknesses via game-theoretic evaluations: they perform well in probabilistic scenarios but struggle in deterministic ones, and code pretraining helps.
GTA: Generative Trajectory Augmentation with Guidance for Offline Reinforcement Learning
·3982 words·19 mins
Machine Learning Reinforcement Learning 🏢 KAIST
Generative Trajectory Augmentation (GTA) significantly boosts offline reinforcement learning by generating high-reward trajectories using a conditional diffusion model, enhancing algorithm performance…
GSGAN: Adversarial Learning for Hierarchical Generation of 3D Gaussian Splats
·2282 words·11 mins
Computer Vision 3D Vision 🏢 Sungkyunkwan University
GSGAN introduces a hierarchical 3D Gaussian representation for faster, high-quality 3D model generation in GANs, achieving 100x speed improvement over existing methods.
GSDF: 3DGS Meets SDF for Improved Neural Rendering and Reconstruction
·2215 words·11 mins
Computer Vision 3D Vision 🏢 Shanghai Artificial Intelligence Laboratory
GSDF: A novel dual-branch neural scene representation elegantly resolves the rendering-reconstruction trade-off by synergistically combining 3D Gaussian Splatting and Signed Distance Fields via mutual…
GS-Hider: Hiding Messages into 3D Gaussian Splatting
·2889 words·14 mins
Computer Vision 3D Vision 🏢 Peking University
GS-Hider: A novel framework secures 3D Gaussian Splatting by embedding messages in a coupled, secured feature attribute, enabling invisible data hiding and accurate extraction.
Group-wise oracle-efficient algorithms for online multi-group learning
·316 words·2 mins
AI Theory Fairness 🏢 Columbia University
Oracle-efficient algorithms conquer online multi-group learning, achieving sublinear regret even with massive, overlapping groups, paving the way for fair and efficient large-scale online systems.
Group Robust Preference Optimization in Reward-free RLHF
·2045 words·10 mins
Natural Language Processing Large Language Models 🏢 University College London (UCL)
Group Robust Preference Optimization (GRPO) enhances reward-free RLHF by aligning LLMs to diverse group preferences, maximizing worst-case performance, and significantly improving fairness.
Group and Shuffle: Efficient Structured Orthogonal Parametrization
·2149 words·11 mins
AI Generated Machine Learning Deep Learning 🏢 HSE University
Group-and-Shuffle (GS) matrices enable efficient structured orthogonal parametrization, improving parameter and computational efficiency in orthogonal fine-tuning for deep learning.
GrounDiT: Grounding Diffusion Transformers via Noisy Patch Transplantation
·2589 words·13 mins
Multimodal Learning Vision-Language Models 🏢 KAIST
GrounDiT: Training-free spatial grounding for text-to-image generation using Diffusion Transformers and a novel noisy patch transplantation technique for precise object placement.
Grounding Multimodal Large Language Models in Actions
·3629 words·18 mins
AI Generated Multimodal Learning Embodied AI 🏢 Apple
Researchers unveil a unified architecture for grounding multimodal large language models in actions, showing superior performance with learned tokenization for continuous actions and semantic alignment …
Grounded Answers for Multi-agent Decision-making Problem through Generative World Model
·2428 words·12 mins
Machine Learning Reinforcement Learning 🏢 National Key Laboratory of Human-Machine Hybrid Augmented Intelligence
Generative world models enhance multi-agent decision-making by simulating trial-and-error learning, improving answer accuracy and explainability.
Grokking of Implicit Reasoning in Transformers: A Mechanistic Journey to the Edge of Generalization
·2486 words·12 mins
Natural Language Processing Large Language Models 🏢 The Ohio State University
Transformers can learn implicit reasoning through ‘grokking’, achieving high accuracy on composition and comparison tasks; however, generalization varies across reasoning types.