Resource summary
Trans-dimensional Random
Fields for Language Modeling
- Introduction
- Language modeling (LM)
- Joint probability of words
- Dominant
- Conditional approach
- Represent joint
probability in
terms of
conditionals
- Alternatives
- Random field
(RF)
- Used in
- Whole-sentence
maximum entropy
(WSME LMs)
- Is a Markov Random Field
- ?
- Challenge in Fitting
- Evaluating
gradient of log
likelihood
- ?
- Approximate
- Sampling methods used
- Gibbs
- ?
- Independence
Metropolis-Hastings
- ?
- Importance sampling
- ?
- Cannot work
efficiently with
complex
high-dimensional
distributions
- Empirical results
- Not satisfactory
- Poorly fitted to the data
- Exact
- Requires
high-dimensional
integration
- ?
- Poorly fitted to the data
- Potential benefits
- Naturally express
sentence level
phenomena
- ?
- Integrate features
from a variety of
knowledge sources
- ?
- Crucial for
- Computational linguistics
- Speech recognition
- Information retrieval
- Etc.
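
The contrast between the conditional approach and the random-field approach in this branch can be written out as a short worked formula. This is a generic sketch, not the paper's exact notation: the symbols `x`, `l`, `\lambda`, `f`, and `Z` are assumptions introduced here, and a whole-sentence maximum entropy model may additionally include a prior distribution that is omitted below. The last line shows why evaluating the gradient of the log-likelihood is hard: it contains an expectation under the model, which sums over all possible sentences.

```latex
% Conditional approach: chain-rule factorization of a sentence x = (x_1, ..., x_l)
p(x_1, \dots, x_l) = \prod_{i=1}^{l} p(x_i \mid x_1, \dots, x_{i-1})

% Random-field approach: one unnormalized score for the whole sentence
p(x; \lambda) = \frac{1}{Z(\lambda)} \exp\!\big(\lambda^{\top} f(x)\big),
\qquad
Z(\lambda) = \sum_{x'} \exp\!\big(\lambda^{\top} f(x')\big)

% Gradient of the average log-likelihood over training sentences x^{(1)}, ..., x^{(n)}
\frac{\partial}{\partial \lambda} \frac{1}{n} \sum_{j=1}^{n} \log p(x^{(j)}; \lambda)
  = \frac{1}{n} \sum_{j=1}^{n} f(x^{(j)}) - \mathbb{E}_{p(\cdot;\lambda)}\big[f(x)\big]
```

Sampling methods such as those listed above replace the model expectation with an average over drawn samples; exact evaluation would require the full sum inside `Z`.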
- Research
- Revisit
- Random field
(RF)
- Innovations
- Propose
- Trans-dimensional
RF model (TDRF)
- Idea
- Take
account of
empirical
distributions
of lengths
- ?
- Makes it possible to develop
- Markov Chain Monte
Carlo technique
- Trans-dimensional
mixture sampling
- Develop
- Training Algorithm
- Using
- Trans-dimensional
mixture sampling
- ?
- Stochastic
Approximation
(SA) framework
- ?
- Additional innovations
- Estimation of
diagonal
elements of the
Hessian matrix
- Estimated during SA
iterations to rescale
the gradient
- Improves convergence
- Word classing
- ?
- Accelerate sampling
- Improve the
smoothing
behavior
- Sharing statistical
strength between
similar words
- Using multiple CPUs
- Parallelize training of RF model
- Fitting to the data
- Simultaneously update
- Model parameters
- Normalizing constants
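
The idea of a stochastic approximation (SA) update whose gradient is rescaled by an estimate of the diagonal of the Hessian can be illustrated on a toy problem. The sketch below is not the paper's training algorithm: it fits a tiny random field over four states, draws exact samples from the current model as a stand-in for trans-dimensional mixture sampling, and ignores normalizing constants and sentence lengths. The names `model_probs` and `fit_sa`, the step-size schedule, and the smoothing factor are all assumptions chosen for illustration.

```python
import math
import random


def model_probs(lam):
    """Toy random field over states 0..K with p(x) proportional to exp(score(x)).

    State 0 has score 0; state k >= 1 has score lam[k - 1].
    """
    scores = [0.0] + list(lam)
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    z = sum(weights)
    return [w / z for w in weights]


def fit_sa(target, iters=3000, batch=50, seed=0):
    """Fit lam by stochastic approximation with a diagonally rescaled gradient.

    Features are indicators of states 1..K, so the empirical feature means
    are target[1:], and the Hessian diagonal of the negative log-likelihood
    is the variance of each indicator under the model.
    """
    rng = random.Random(seed)
    dim = len(target) - 1
    states = list(range(dim + 1))
    emp_mean = target[1:]
    lam = [0.0] * dim
    diag = [0.25] * dim  # running estimate of the Hessian diagonal
    for t in range(1, iters + 1):
        # Assumption: exact sampling replaces the MCMC step of a real trainer.
        xs = rng.choices(states, weights=model_probs(lam), k=batch)
        gamma = (t + 10) ** -0.8  # decaying SA step size (illustrative choice)
        for k in range(dim):
            mean_k = xs.count(k + 1) / batch  # sampled feature mean
            # Smooth the per-batch variance of the indicator feature.
            diag[k] = 0.95 * diag[k] + 0.05 * mean_k * (1.0 - mean_k)
            # Noisy gradient (empirical minus sampled), rescaled by the diagonal.
            lam[k] += gamma * (emp_mean[k] - mean_k) / max(diag[k], 0.01)
    return lam


if __name__ == "__main__":
    target = [0.4, 0.3, 0.2, 0.1]
    print([round(p, 3) for p in model_probs(fit_sa(target))])
```

Dividing by the running variance gives each parameter a step size matched to its own curvature, which is the intuition behind using the Hessian diagonal to improve convergence; rare features would otherwise move much more slowly than frequent ones.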
- Experiments
- Wall Street
Journal 92 data
- 1000-best lists
- Oracle WER is 3.4%
- Kaldi toolkit,
DNN acoustic
model
- More details in the paper
- Comparison
- TDRF
- Performance
- As good as
- Recurrent neural networks
- Computational
- More efficient than
- Recurrent neural networks
- Computing
sentence
probability
- Recurrent neural networks
- ?