1. Intro
2. Statistical Machine Translation
3. Motivation
4. Grid Search
5. Method Overview
6. Common Regularization
7. Objective Function
8. Proximal Gradient Methods
9. Experiments: 5-gram Language Modeling
10. 5-gram Perplexity
11. Behavior During Training
12. Key Takeaways
13. Optimal Hyperparameters Not Universal
14. Auto-Sizing Transformer Layers
15. PyTorch Implementation
16. Beam Search
17. Perceptron Tuning
18. Experiment: Tuned Reward
19. Questions?
Description:
Explore neural network hyperparameter optimization for machine translation in this 52-minute conference talk by Kenton Murray, a PhD candidate at the University of Notre Dame. Dive into methods for improving hyperparameter selection without extensive grid searches, focusing on techniques that learn optimal parameters during the training process. Examine common regularization techniques, objective functions, and proximal gradient methods. Analyze experiments in 5-gram language modeling and auto-sizing transformer layers. Discover key takeaways about the non-universality of optimal hyperparameters and the potential of perceptron tuning for beam search. Gain insights into implementing these techniques in PyTorch and their application to low-resource and morphologically rich language pairs.
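As a rough illustration of the proximal gradient idea the talk covers, the sketch below shows an auto-sizing-style update in PyTorch: an ordinary gradient step on the task loss, followed by a proximal step for a row-wise group-L2 (l2,1) regularizer that can drive entire rows of a weight matrix exactly to zero. This is a minimal sketch under assumed settings; the helper name `prox_group_l2`, the layer dimensions, and the regularizer strength are illustrative and not taken from the talk.

```python
import torch


def prox_group_l2(weight: torch.Tensor, strength: float) -> torch.Tensor:
    """Proximal step for a row-wise group-L2 (l2,1) regularizer.

    Each row of the weight matrix is one group; rows whose L2 norm
    falls below `strength` are set exactly to zero, so the corresponding
    units can be pruned, shrinking the effective layer size.
    """
    row_norms = weight.norm(dim=1, keepdim=True).clamp_min(1e-12)
    scale = (1.0 - strength / row_norms).clamp_min(0.0)
    return weight * scale


# Hypothetical training step: SGD on the task loss, then the proximal
# step applied to one layer's weight matrix (dimensions are arbitrary).
layer = torch.nn.Linear(512, 2048)
optimizer = torch.optim.SGD(layer.parameters(), lr=0.1)
reg_lambda = 1e-3  # assumed regularizer strength

x = torch.randn(32, 512)
target = torch.randn(32, 2048)

loss = torch.nn.functional.mse_loss(layer(x), target)
optimizer.zero_grad()
loss.backward()
optimizer.step()

with torch.no_grad():
    # Proximal step uses lr * lambda, as in standard proximal gradient.
    layer.weight.copy_(prox_group_l2(layer.weight, 0.1 * reg_lambda))

# Rows driven exactly to zero correspond to units that could be removed.
print((layer.weight.norm(dim=1) == 0).sum().item(), "zeroed rows")
```

Because the proximal operator produces exact zeros (unlike plain weight decay), repeating this step during training lets the regularizer, rather than a grid search, decide how many units a layer keeps.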

Learning Neural Network Hyperparameters for Machine Translation - 2019

Center for Language & Speech Processing (CLSP), JHU