🏢 Indian Institute of Technology Kanpur

Sample-Efficient Constrained Reinforcement Learning with General Parameterization
·263 words·2 mins
Machine Learning Reinforcement Learning 🏢 Indian Institute of Technology Kanpur
The Accelerated Primal-Dual Natural Policy Gradient (PD-ANPG) algorithm attains the theoretical lower bound on sample complexity for solving general parameterized CMDPs, improving on the state of the art by a factor…
COLD: Causal reasOning in cLosed Daily activities
·3472 words·17 mins
AI Generated Natural Language Processing Large Language Models 🏢 Indian Institute of Technology Kanpur
The COLD framework rigorously evaluates LLMs’ causal reasoning in everyday scenarios using 9 million causal queries derived from human-generated scripts of daily activities.