Posters
2024
Segment, Shuffle, and Stitch: A Simple Layer for Improving Time-Series Representations
·3043 words·15 mins·
Machine Learning
Representation Learning
Queen's University
Boost time-series model accuracy with Segment, Shuffle, and Stitch (S3)! This simple layer shuffles data segments to enhance representation learning, improving classification, forecasting, and anomaly detection.
Segment Anything without Supervision
·1959 words·10 mins·
Computer Vision
Image Segmentation
UC Berkeley
Unsupervised SAM (UnSAM) achieves competitive image segmentation results without human annotation, surpassing previous unsupervised methods and even improving supervised SAM’s accuracy.
Segment Any Change
·2244 words·11 mins·
Computer Vision
Image Segmentation
Stanford University
AnyChange achieves zero-shot image change detection by adapting the Segment Anything Model (SAM) via a training-free bitemporal latent matching method, significantly outperforming previous state-of-the-art methods.
SEEV: Synthesis with Efficient Exact Verification for ReLU Neural Barrier Functions
·1687 words·8 mins·
AI Theory
Safety
Washington University in St. Louis
SEEV framework efficiently verifies ReLU neural barrier functions by reducing activation regions and using tight over-approximations, significantly improving verification efficiency without sacrificing…
Seek Commonality but Preserve Differences: Dissected Dynamics Modeling for Multi-modal Visual RL
·2815 words·14 mins·
Machine Learning
Reinforcement Learning
Peking University
Dissected Dynamics Modeling (DDM) excels at multi-modal visual reinforcement learning by cleverly separating and integrating common and unique features across different sensory inputs for more accurate…
Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment
·3346 words·16 mins·
AI Generated
Multimodal Learning
Vision-Language Models
ByteDance Inc.
Boosting vision-language model performance, Contrastive ALignment (CAL) prioritizes visually correlated text tokens during training via a simple, computationally efficient re-weighting strategy, significantly…
Seeing Beyond the Crop: Using Language Priors for Out-of-Bounding Box Keypoint Prediction
·2045 words·10 mins·
Multimodal Learning
Vision-Language Models
University of Waterloo
TokenCLIPose leverages language priors to predict human keypoints beyond bounding boxes, significantly improving pose estimation accuracy on ice hockey, lacrosse, and CrowdPose datasets.
Secret Collusion among AI Agents: Multi-Agent Deception via Steganography
·5189 words·25 mins·
AI Generated
AI Theory
Safety
UC Berkeley
AI agents can secretly collude using steganography, hiding their interactions from oversight. This research formalizes this threat, analyzes LLMs’ capabilities, and proposes mitigation strategies.
SearchLVLMs: A Plug-and-Play Framework for Augmenting Large Vision-Language Models by Searching Up-to-Date Internet Knowledge
·2158 words·11 mins·
Multimodal Learning
Vision-Language Models
Shanghai AI Laboratory
SearchLVLMs: A plug-and-play framework efficiently augments large vision-language models with up-to-date internet knowledge via hierarchical filtering, significantly improving accuracy on visual question answering.
Searching for Efficient Linear Layers over a Continuous Space of Structured Matrices
·2763 words·13 mins·
AI Generated
Machine Learning
Deep Learning
New York University
Revolutionizing large neural networks, this paper introduces a continuous parameterization of structured matrices, discovering that full-rank structures without parameter sharing achieve optimal scaling.
Search for Efficient Large Language Models
·2477 words·12 mins·
Natural Language Processing
Large Language Models
Northeastern University
Training-free architecture search finds optimal subnets in LLMs, boosting inference speed and slashing memory needs without retraining.
SE(3)-bi-equivariant Transformers for Point Cloud Assembly
·3085 words·15 mins·
AI Generated
Computer Vision
3D Vision
University of Gothenburg
SE(3)-bi-equivariant Transformers (BITR) revolutionize point cloud assembly by guaranteeing robust alignment even with non-overlapping clouds, thanks to their unique equivariance properties.
SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training
·2596 words·13 mins·
Natural Language Processing
Large Language Models
Indiana University
SDP4Bit achieves up to 4.08x speedup in LLM training by quantizing weight differences and gradients to ~4 bits, maintaining accuracy.
SCube: Instant Large-Scale Scene Reconstruction using VoxSplats
·3116 words·15 mins·
Computer Vision
3D Vision
University of Toronto
SCube: Instant large-scale 3D scene reconstruction from sparse images using VoxSplats, a novel 3D Gaussian splat representation.
SCOREQ: Speech Quality Assessment with Contrastive Regression
·2555 words·12 mins·
Speech and Audio
Speech Quality Assessment
University College Dublin
SCOREQ: a novel triplet loss contrastive regression approach for superior speech quality prediction, addressing generalization issues in no-reference metrics.
Score-Optimal Diffusion Schedules
·2200 words·11 mins·
Machine Learning
Deep Learning
University of Oxford
Researchers developed a novel algorithm to automatically find optimal schedules for denoising diffusion models (DDMs), significantly improving sample quality and efficiency without manual parameter tuning.
Score-based generative models are provably robust: an uncertainty quantification perspective
·293 words·2 mins·
AI Theory
Robustness
Université Côte d'Azur
Score-based generative models are provably robust to multiple error sources, as shown via a novel Wasserstein uncertainty propagation theorem.
Score-based 3D molecule generation with neural fields
·4106 words·20 mins·
AI Generated
Machine Learning
Deep Learning
Prescient Design
FuncMol: A new neural field model generates 3D molecules efficiently, outperforming existing methods by achieving an order of magnitude faster sampling speed.
Score Distillation via Reparametrized DDIM
·4128 words·20 mins·
Computer Vision
Image Generation
MIT
Researchers improved 3D shape generation from 2D diffusion models by showing that existing Score Distillation Sampling is a reparameterized version of DDIM and fixing its high-variance noise issue via…
Schur Nets: exploiting local structure for equivariance in higher order graph neural networks
·1825 words·9 mins·
AI Theory
Representation Learning
University of Chicago
Schur Nets boost higher-order GNNs by efficiently exploiting local graph structure for automorphism equivariance, achieving improved performance without the computational burden of traditional methods.