🏢 University of Amsterdam

When Your AIs Deceive You: Challenges of Partial Observability in Reinforcement Learning from Human Feedback
·2699 words·13 mins
Machine Learning Reinforcement Learning 🏢 University of Amsterdam
RLHF typically assumes that human evaluators fully observe the environment. When their feedback is based on partial observations, the resulting policies can behave deceptively (deceptive inflation and overjustification).
SPO: Sequential Monte Carlo Policy Optimisation
·3026 words·15 mins
Machine Learning Reinforcement Learning 🏢 University of Amsterdam
SPO is a model-based RL algorithm that uses parallelisable Sequential Monte Carlo planning for efficient and robust policy improvement in both discrete and continuous environments.
Neural Flow Diffusion Models: Learnable Forward Process for Improved Diffusion Modelling
·1763 words·9 mins
Machine Learning Deep Learning 🏢 University of Amsterdam
Neural Flow Diffusion Models (NFDM) introduce a learnable forward process into diffusion modelling, yielding state-of-the-art likelihoods and versatile generative dynamics.
FewViewGS: Gaussian Splatting with Few View Matching and Multi-stage Training
·2255 words·11 mins
Computer Vision 3D Vision 🏢 University of Amsterdam
FewViewGS: A method for high-quality novel view synthesis from sparse images, using a multi-stage training scheme and a new locality-preserving regularization for 3D Gaussians.