🏢 ETH Zurich

WildGaussians: 3D Gaussian Splatting in the Wild
· 2601 words · 13 mins
AI Generated Computer Vision 3D Vision 🏢 ETH Zurich
WildGaussians enhances 3D Gaussian splatting for real-time rendering of photorealistic 3D scenes from in-the-wild images featuring occlusions and appearance changes.
When to Sense and Control? A Time-adaptive Approach for Continuous-Time RL
· 2003 words · 10 mins
AI Generated Machine Learning Reinforcement Learning 🏢 ETH Zurich
TACOS: A novel time-adaptive RL framework drastically reduces interactions in continuous-time systems while improving performance, offering both model-free and model-based algorithms.
Weight decay induces low-rank attention layers
· 1731 words · 9 mins
Machine Learning Deep Learning 🏢 ETH Zurich
Weight decay in deep learning surprisingly induces low-rank attention layers, potentially harming performance but offering optimization strategies for large language models.
Unity by Diversity: Improved Representation Learning for Multimodal VAEs
· 3037 words · 15 mins
Multimodal Learning Multimodal Generation 🏢 ETH Zurich
MMVM VAE enhances multimodal data analysis by using a soft constraint to guide each modality’s latent representation toward a shared aggregate, improving latent representation learning and missing data imputation.
UniSDF: Unifying Neural Representations for High-Fidelity 3D Reconstruction of Complex Scenes with Reflections
· 3538 words · 17 mins
AI Generated Computer Vision 3D Vision 🏢 ETH Zurich
UniSDF unifies neural representations to reconstruct complex scenes with reflections, achieving state-of-the-art performance by blending camera-view and reflected-view radiance fields.
UniBias: Unveiling and Mitigating LLM Bias through Internal Attention and FFN Manipulation
· 2138 words · 11 mins
Natural Language Processing Large Language Models 🏢 ETH Zurich
UniBias unveils and mitigates LLM bias by identifying and eliminating biased internal components (FFN vectors and attention heads), significantly improving in-context learning performance and robustness.
Understanding the Differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks
· 1735 words · 9 mins
Natural Language Processing Large Language Models 🏢 ETH Zurich
A unifying framework reveals hidden connections among attention, recurrent, and state-space models, pointing the way to more efficient foundation models.
Understanding and Minimising Outlier Features in Transformer Training
· 5007 words · 24 mins
Natural Language Processing Large Language Models 🏢 ETH Zurich
New methods minimize outlier features in transformer training, improving quantization and efficiency without sacrificing convergence speed.
Transductive Active Learning: Theory and Applications
· 3403 words · 16 mins
Machine Learning Active Learning 🏢 ETH Zurich
This paper introduces transductive active learning, proving its efficiency in minimizing uncertainty and achieving state-of-the-art results in neural network fine-tuning and safe Bayesian optimization.
Testably Learning Polynomial Threshold Functions
· 248 words · 2 mins
AI Generated AI Theory Generalization 🏢 ETH Zurich
An efficient algorithm for testably learning polynomial threshold functions is achieved, matching the best guarantees of agnostic learning and settling a key open problem in robust machine learning.
SWT-Bench: Testing and Validating Real-World Bug-Fixes with Code Agents
· 3127 words · 15 mins
AI Generated Natural Language Processing Large Language Models 🏢 ETH Zurich
SWT-Bench, a new benchmark, reveals that LLMs excel at generating tests for real-world bug fixes, surpassing dedicated test generation systems and significantly improving code-fix precision.
Super Consistency of Neural Network Landscapes and Learning Rate Transfer
· 3859 words · 19 mins
Machine Learning Deep Learning 🏢 ETH Zurich
Hyperparameter transfer across vastly different neural network sizes is explained by a newly discovered property of loss landscapes called ‘Super Consistency’.
Stochastic Concept Bottleneck Models
· 2532 words · 12 mins
AI Generated AI Theory Interpretability 🏢 ETH Zurich
Stochastic Concept Bottleneck Models (SCBMs) revolutionize interpretable ML by efficiently modeling concept dependencies, drastically improving intervention effectiveness and enabling CLIP-based concepts.
SPEAR: Exact Gradient Inversion of Batches in Federated Learning
· 2907 words · 14 mins
Machine Learning Federated Learning 🏢 ETH Zurich
SPEAR, a novel algorithm, precisely reconstructs entire data batches from gradients in federated learning, defying previous limitations and enhancing privacy risk assessment.
Safe Time-Varying Optimization based on Gaussian Processes with Spatio-Temporal Kernel
· 1841 words · 9 mins
AI Applications Robotics 🏢 ETH Zurich
TVSAFEOPT: Safe time-varying optimization using spatio-temporal kernels ensures safety while tracking time-varying reward and safety functions, providing optimality guarantees in stationary settings.
Robust Mixture Learning when Outliers Overwhelm Small Groups
· 2570 words · 13 mins
AI Generated AI Theory Robustness 🏢 ETH Zurich
Outlier-robust mixture learning gets order-optimal error guarantees, even when outliers massively outnumber small groups, via a novel meta-algorithm leveraging mixture structure.
Recurrent neural networks: vanishing and exploding gradients are not the end of the story
· 2602 words · 13 mins
AI Theory Optimization 🏢 ETH Zurich
Recurrent neural networks struggle with long-term memory due to a newly identified ‘curse of memory’: parameter sensitivity grows as memory lengthens. This work provides insights into RNN optimization.
QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs
· 3782 words · 18 mins
AI Generated Natural Language Processing Large Language Models 🏢 ETH Zurich
QuaRot enables end-to-end 4-bit LLM inference by rotating the model to remove activation outliers, quantizing weights, activations, and the KV cache with minimal accuracy loss.
Private Edge Density Estimation for Random Graphs: Optimal, Efficient and Robust
· 261 words · 2 mins
AI Theory Privacy 🏢 ETH Zurich
This paper delivers a groundbreaking polynomial-time algorithm for optimally estimating edge density in random graphs while ensuring node privacy and robustness against data corruption.
Poseidon: Efficient Foundation Models for PDEs
· 9448 words · 45 mins
AI Theory Representation Learning 🏢 ETH Zurich
POSEIDON: a novel foundation model for PDEs achieves significant gains in accuracy and sample efficiency, generalizing well to unseen physics.