Posters
2024
Taming Heavy-Tailed Losses in Adversarial Bandits and the Best-of-Both-Worlds Setting
·418 words·2 mins
Machine Learning
Reinforcement Learning
🏢 Virginia Tech
This paper proposes novel algorithms achieving near-optimal regret in adversarial and logarithmic regret in stochastic multi-armed bandit settings with heavy-tailed losses, relaxing strong assumptions…
Taming Generative Diffusion Prior for Universal Blind Image Restoration
·4450 words·21 mins
AI Generated
Computer Vision
Image Generation
🏢 Fudan University
BIR-D tames generative diffusion models for universal blind image restoration, dynamically updating parameters to handle various complex degradations without assuming a specific degradation model type.
Taming Diffusion Prior for Image Super-Resolution with Domain Shift SDEs
·2142 words·11 mins
Computer Vision
Image Generation
🏢 Advanced Micro Devices Inc.
DoSSR: A novel SR model boosts efficiency by 5-7x, achieving state-of-the-art performance with only 5 sampling steps by cleverly integrating a domain shift equation into pretrained diffusion models.
Taming Cross-Domain Representation Variance in Federated Prototype Learning with Heterogeneous Data Domains
·3467 words·17 mins
AI Generated
Machine Learning
Federated Learning
🏢 University of Florida
FedPLVM tames cross-domain variance in federated prototype learning using dual-level clustering and an α-sparsity loss, achieving superior performance.
Taming 'data-hungry' reinforcement learning? Stability in continuous state-action spaces
·358 words·2 mins
Machine Learning
Reinforcement Learning
🏢 New York University
Reinforcement learning achieves unprecedentedly fast convergence rates in continuous state-action spaces by leveraging novel stability properties of Markov Decision Processes.
Talking Heads: Understanding Inter-Layer Communication in Transformer Language Models
·3768 words·18 mins
Natural Language Processing
Large Language Models
🏢 Brown University
Transformer Language Models’ (LMs) sensitivity to seemingly arbitrary prompt changes is explained by identifying low-rank communication channels between layers. By decomposing attention heads, resear…
Take A Shortcut Back: Mitigating the Gradient Vanishing for Training Spiking Neural Networks
·1272 words·6 mins
Machine Learning
Deep Learning
🏢 Peking University
Shortcut back-propagation and an evolutionary training framework overcome vanishing gradients in spiking neural networks, drastically improving training and achieving state-of-the-art accuracy.
TAIA: Large Language Models are Out-of-Distribution Data Learners
·2712 words·13 mins
Natural Language Processing
Large Language Models
🏢 Fudan University
LLMs struggle on downstream tasks when fine-tuned on mismatched data. TAIA, a novel inference-time method, solves this by selectively using only attention parameters during inference after training all parameter…
Tactile DreamFusion: Exploiting Tactile Sensing for 3D Generation
·2449 words·12 mins
Multimodal Learning
Vision-Language Models
🏢 Carnegie Mellon University
Tactile DreamFusion: High-resolution tactile sensing enhances 3D generation, creating realistic geometric details previously unattainable.
Tackling Uncertain Correspondences for Multi-Modal Entity Alignment
·1671 words·8 mins
Multimodal Learning
Vision-Language Models
🏢 Hong Kong University of Science and Technology
TMEA: A novel approach significantly boosts multi-modal entity alignment accuracy by effectively handling uncertain correspondences between modalities, improving data integration for diverse knowledge…
TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy
·2422 words·12 mins
Multimodal Learning
Vision-Language Models
🏢 University of Science and Technology of China
TabPedia, a novel large vision-language model, achieves superior visual table understanding by seamlessly integrating diverse tasks through a concept synergy mechanism, and introduces a new benchmark.
TableRAG: Million-Token Table Understanding with Language Models
·2446 words·12 mins
Natural Language Processing
Question Answering
🏢 National Taiwan University
TableRAG, a novel Retrieval-Augmented Generation framework, achieves state-of-the-art performance in large-scale table understanding by efficiently integrating schema and cell retrieval with language …
TabEBM: A Tabular Data Augmentation Method with Distinct Class-Specific Energy-Based Models
·8456 words·40 mins
AI Generated
Machine Learning
Generative Models
🏢 University of Cambridge
TabEBM: Class-specific EBMs boost tabular data augmentation, improving classification accuracy, especially on small datasets, by generating high-quality synthetic data.
T2V-Turbo: Breaking the Quality Bottleneck of Video Consistency Model with Mixed Reward Feedback
·3387 words·16 mins
Multimodal Learning
Vision-Language Models
🏢 UC Santa Barbara
T2V-Turbo breaks the quality bottleneck of video consistency models by integrating mixed reward feedback during consistency distillation, enabling high-quality video generation with significantly fast…
Synthetic Programming Elicitation for Text-to-Code in Very Low-Resource Programming and Formal Languages
·1817 words·9 mins
AI Generated
Natural Language Processing
Large Language Models
🏢 UC Berkeley
LLMs struggle with very low-resource programming languages. SPEAC, a novel synthetic programming elicitation and compilation approach, uses an intermediate language to enable LLMs to generate syntact…
Synthesize, Partition, then Adapt: Eliciting Diverse Samples from Foundation Models
·1594 words·8 mins
Natural Language Processing
Large Language Models
🏢 University of Texas at Austin
The Synthesize-Partition-Adapt (SPA) framework leverages synthetic data to generate diverse, high-quality responses from foundation models, enriching the user experience.
Synergistic Dual Spatial-aware Generation of Image-to-text and Text-to-image
·2896 words·14 mins
Multimodal Learning
Vision-Language Models
🏢 Tianjin University
Synergistic Dual Spatial-aware Generation boosts image-to-text and text-to-image accuracy using a novel 3D scene graph and dual learning framework.
SyncVIS: Synchronized Video Instance Segmentation
·2160 words·11 mins
Computer Vision
Video Understanding
🏢 University of Hong Kong
SyncVIS: A new framework for video instance segmentation achieves state-of-the-art results by synchronously modeling video and frame-level information, overcoming limitations of asynchronous approache…
SyncTweedies: A General Generative Framework Based on Synchronized Diffusions
·4065 words·20 mins
AI Generated
Computer Vision
Image Generation
🏢 KAIST
SyncTweedies: a zero-shot diffusion synchronization framework generates diverse visual content (images, panoramas, 3D textures) by synchronizing multiple diffusion processes without fine-tuning, demon…
Synatra: Turning Indirect Knowledge into Direct Demonstrations for Digital Agents at Scale
·2845 words·14 mins
Natural Language Processing
Large Language Models
🏢 Carnegie Mellon University
Synatra synthesizes high-quality digital agent training data from online tutorials and web pages, significantly improving agent performance on complex web-based tasks at a fraction of the cost of huma…