
🏢 EPFL

Why the Metric Backbone Preserves Community Structure
·2073 words·10 mins·
AI Theory Optimization 🏢 EPFL
Metric backbone graph sparsification surprisingly preserves community structure, offering an efficient and robust method for analyzing large networks.
Why Do We Need Weight Decay in Modern Deep Learning?
·3285 words·16 mins·
AI Theory Optimization 🏢 EPFL
Weight decay’s role in modern deep learning is surprisingly multifaceted: rather than acting solely as a regularizer, it shapes optimization dynamics, improving generalization and training stability.
SuperDeepFool: a new fast and accurate minimal adversarial attack
·4315 words·21 mins·
AI Generated AI Theory Robustness 🏢 EPFL
SuperDeepFool: a fast, accurate algorithm for generating minimal adversarial perturbations, significantly improving the robustness evaluation and adversarial training of deep learning models.
SGD vs GD: Rank Deficiency in Linear Networks
·381 words·2 mins·
AI Theory Optimization 🏢 EPFL
SGD surprisingly diminishes network rank, unlike GD, due to a repulsive force between eigenvalues, offering insights into deep learning generalization.
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations
·4063 words·20 mins·
Large Language Models 🏢 EPFL
Constant learning rate with cooldown replaces the cosine schedule in LLM training, enabling cost-effective scaling experiments beyond fixed training durations.
SAMPa: Sharpness-aware Minimization Parallelized
·2453 words·12 mins·
Machine Learning Optimization 🏢 EPFL
SAMPa parallelizes the two gradient computations in Sharpness-Aware Minimization (SAM), achieving a 2x speedup and improved generalization.
Revisiting Ensembling in One-Shot Federated Learning
·3849 words·19 mins·
AI Generated Machine Learning Federated Learning 🏢 EPFL
FENS: a novel federated ensembling scheme that boosts one-shot federated learning accuracy to near iterative FL levels, while maintaining low communication costs.
Local to Global: Learning Dynamics and Effect of Initialization for Transformers
·2433 words·12 mins·
AI Generated Natural Language Processing Text Generation 🏢 EPFL
Transformers’ learning dynamics depend heavily on initialization and the Markovian properties of the data, leading to convergence to either global or local minima; the paper proves this, offers initialization guidelines, and …
Implicit Bias of Mirror Flow on Separable Data
·1523 words·8 mins·
AI Theory Optimization 🏢 EPFL
Mirror flow’s implicit bias on separable data is formally characterized, revealing convergence towards a maximum-margin classifier determined by the potential’s ‘horizon function’.
Graph Edit Distance with General Costs Using Neural Set Divergence
·3177 words·15 mins·
Machine Learning Deep Learning 🏢 EPFL
GRAPHEDX, a novel neural network, accurately estimates graph edit distance with varying operation costs, outperforming existing methods.
Generative Modelling of Structurally Constrained Graphs
·5840 words·28 mins·
AI Generated AI Applications Healthcare 🏢 EPFL
ConStruct: Generating realistic graphs with guaranteed structural properties via constrained diffusion.
Fine-Tuning Personalization in Federated Learning to Mitigate Adversarial Clients
·1780 words·9 mins·
Machine Learning Federated Learning 🏢 EPFL
Fine-tuning personalization in federated learning mitigates adversarial clients; the optimal level of collaboration depends on data heterogeneity and the fraction of adversaries.
DenseFormer: Enhancing Information Flow in Transformers via Depth Weighted Averaging
·3222 words·16 mins·
Natural Language Processing Large Language Models 🏢 EPFL
DenseFormer adds a depth-weighted averaging step to transformers, improving data efficiency and outperforming baselines in memory usage and inference time without increasing model size.
CoBo: Collaborative Learning via Bilevel Optimization
·1628 words·8 mins·
Machine Learning Federated Learning 🏢 EPFL
CoBo: a novel bilevel optimization algorithm for collaborative learning that surpasses existing methods by efficiently selecting helpful clients, delivering superior performance and scalability.