🏢 Shanghai University of Finance and Economics

Two-way Deconfounder for Off-policy Evaluation in Causal Reinforcement Learning

26 September 2024·1675 words·8 mins

Machine Learning · Reinforcement Learning · 🏢 Shanghai University of Finance and Economics

Two-way Deconfounder tackles off-policy evaluation challenges by introducing a novel two-way unmeasured confounding assumption and a neural-network-based deconfounder, achieving consistent policy valu…

Safe and Sparse Newton Method for Entropic-Regularized Optimal Transport

26 September 2024·2040 words·10 mins

AI Generated · AI Theory · Optimization · 🏢 Shanghai University of Finance and Economics

A novel safe and sparse Newton method (SSNS) for entropic-regularized optimal transport provides strict error control, avoids singularity, needs no hyperparameter tuning, and offers rigorous convergence a…

Faster Accelerated First-order Methods for Convex Optimization with Strongly Convex Function Constraints

26 September 2024·1492 words·8 mins

AI Theory · Optimization · 🏢 Shanghai University of Finance and Economics

Faster primal-dual algorithms achieve order-optimal complexity for convex optimization with strongly convex constraints, improving convergence rates and solving large-scale problems efficiently.

Cherry on Top: Parameter Heterogeneity and Quantization in Large Language Models

26 September 2024·1676 words·8 mins

Natural Language Processing · Large Language Models · 🏢 Shanghai University of Finance and Economics

CherryQ, a novel quantization method, leverages parameter heterogeneity in LLMs to achieve superior performance by selectively quantizing less critical parameters while preserving essential ones.