MXNet Model API

MXNet's API
The model API in mxnet is not really a separate API; it is just a wrapper around ndarray that makes it easier to use.

Training a model
To train a model you follow two steps: first construct the network with a symbol, then call model.FeedForward.create to build a model from it. The code below creates a two-layer neural network.
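The original snippet is truncated here, so the following is a minimal sketch of such a network with the old mxnet.model API; train_iter is a placeholder DataIter (e.g. built with mx.io.NDArrayIter) and the layer sizes are arbitrary.

    import mxnet as mx

    # configure a two layer neural network
    data = mx.symbol.Variable('data')
    fc1  = mx.symbol.FullyConnected(data, name='fc1', num_hidden=128)
    act1 = mx.symbol.Activation(fc1, name='relu1', act_type='relu')
    fc2  = mx.symbol.FullyConnected(act1, name='fc2', num_hidden=10)
    net  = mx.symbol.SoftmaxOutput(fc2, name='softmax')

    # create and train a model in one call; train_iter is a placeholder DataIter
    model = mx.model.FeedForward.create(
        symbol=net,
        X=train_iter,
        num_epoch=10,
        learning_rate=0.01)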
Saving a model and checkpointing
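A minimal sketch of saving, loading, and periodic checkpointing with the old FeedForward API; the prefix 'mymodel', the iterator train_iter, and the symbol net are placeholder names.

    # save the trained model: writes mymodel-symbol.json and mymodel-0010.params
    model.save('mymodel', epoch=10)

    # load it back later for prediction
    loaded = mx.model.FeedForward.load('mymodel', 10)

    # checkpoint automatically at the end of every epoch during training
    model = mx.model.FeedForward.create(
        symbol=net,
        X=train_iter,
        num_epoch=10,
        epoch_end_callback=mx.callback.do_checkpoint('mymodel'))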
Using multiple devices
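With FeedForward, data-parallel training across devices comes down to passing a list of contexts. A sketch, assuming two GPUs are available (fall back to mx.cpu() otherwise):

    # train on two GPUs by passing a list of contexts
    model = mx.model.FeedForward.create(
        symbol=net,
        ctx=[mx.gpu(0), mx.gpu(1)],
        X=train_iter,
        num_epoch=10)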
Model API
The mxnet.model module
mxnet.model.BatchEndParam
    alias of BatchEndParams
mxnet.model.save_checkpoint(prefix, epoch, symbol, arg_params, aux_params)
    Checkpoint the model data into file.
    Parameters:
        - prefix (str) – Prefix of model name.
        - epoch (int) – Epoch number of the model being saved.
        - symbol (Symbol) – The symbol to save.
        - arg_params (dict of str to NDArray) – The net's weight parameters.
        - aux_params (dict of str to NDArray) – The net's auxiliary states.
    Notes: prefix-symbol.json will be saved for the symbol; prefix-epoch.params will be saved for the parameters.
mxnet.model.load_checkpoint(prefix, epoch)
    Load model checkpoint from file.
    Parameters:
        - prefix (str) – Prefix of model name.
        - epoch (int) – Epoch number of the model we would like to load.
    Returns: (symbol, arg_params, aux_params) – the symbol and the parameter dictionaries saved by save_checkpoint.
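A quick round trip with these two functions, assuming a symbol net and parameter dicts from a trained model (the prefix 'mymodel' is arbitrary):

    # writes mymodel-symbol.json and mymodel-0005.params
    mx.model.save_checkpoint('mymodel', 5, net, arg_params, aux_params)

    # read them back
    sym, arg_params, aux_params = mx.model.load_checkpoint('mymodel', 5)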
class mxnet.model.FeedForward(symbol, ctx=None, num_epoch=None, epoch_size=None, optimizer='sgd', initializer=<mxnet.initializer.Uniform object>, numpy_batch_size=128, arg_params=None, aux_params=None, allow_extra_params=False, begin_epoch=0, **kwargs)
    Model class of MXNet for training and predicting feedforward nets. This class is designed for a single-data, single-output supervised network.
    predict(X, num_batch=None, return_data=False, reset=True)
        Run the prediction; always uses only one device.
        Parameters:
            - X (mxnet.DataIter) – The data to predict on.
            - num_batch (int or None) – The number of batches to run. Goes through all batches if None.
        Returns: y – The predicted values of the output.
        Return type: numpy.ndarray, or a list of numpy.ndarray if the network has multiple outputs.
    score(X, eval_metric='acc', num_batch=None, batch_end_callback=None, reset=True)
        Run the model on X and calculate the score with eval_metric.
        Parameters:
            - X (mxnet.DataIter) – The data to evaluate on.
            - eval_metric (metric.metric) – The metric for calculating the score.
            - num_batch (int or None) – The number of batches to run. Goes through all batches if None.
        Returns: s – The final score.
        Return type: float
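A small usage sketch for these two methods, assuming val_iter is a validation mx.io.DataIter and model is a trained FeedForward instance:

    # class probabilities for the whole validation set as a numpy array
    probs = model.predict(val_iter)

    # accuracy over the validation set
    acc = model.score(val_iter, eval_metric='acc')
    print('validation accuracy: %f' % acc)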
    fit(X, y=None, eval_data=None, eval_metric='acc', epoch_end_callback=None, batch_end_callback=None, kvstore='local', logger=None, work_load_list=None, monitor=None, eval_batch_end_callback=None)
        Fit the model.
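A sketch of fit with evaluation data and progress logging; net, train_iter, val_iter and the prefix 'mymodel' are placeholders, and the callbacks come from mx.callback:

    model = mx.model.FeedForward(symbol=net, ctx=mx.cpu(), num_epoch=10)
    model.fit(
        X=train_iter,
        eval_data=val_iter,                                   # evaluated at the end of every epoch
        eval_metric='acc',
        batch_end_callback=mx.callback.Speedometer(100, 50),  # batch_size=100, log every 50 batches
        epoch_end_callback=mx.callback.do_checkpoint('mymodel'))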
    save(prefix, epoch=None)
        Checkpoint the model into file. You can also use pickle to do the job if you only work in Python. The advantage of load/save is that the file format is language agnostic: a file saved using save can be loaded by other language bindings of mxnet. You also get the benefit of being able to load/save directly from cloud storage (S3, HDFS).
        Parameters: prefix (str) – Prefix of model name.
        Notes: prefix-symbol.json will be saved for the symbol; prefix-epoch.params will be saved for the parameters.
    static load(prefix, epoch, ctx=None, **kwargs)
        Load model checkpoint from file.
        Parameters:
            - prefix (str) – Prefix of model name.
            - epoch (int) – Epoch number of the model we would like to load.
            - ctx (Context or list of Context, optional) – The device context of training and prediction.
            - kwargs (dict) – Other parameters for the model, including num_epoch, optimizer and numpy_batch_size.
        Returns: model – The loaded model that can be used for prediction.
        Return type: FeedForward
    create(symbol, X, y=None, ctx=None, num_epoch=None, epoch_size=None, optimizer='sgd', initializer=<mxnet.initializer.Uniform object>, eval_data=None, eval_metric='acc', epoch_end_callback=None, batch_end_callback=None, kvstore='local', logger=None, work_load_list=None, eval_batch_end_callback=None, **kwargs)
        Functional style to create a model. This function is more consistent with functional languages such as R, where mutation is not allowed.
The APIs below are less commonly used.

Initializer API reference
class mxnet.initializer.Initializer
    Base class for Initializer.

    __call__(name, arr)
        Override the () function to do initialization.
        Parameters:
            - name (str) – Name of the corresponding ndarray.
            - arr (NDArray) – The ndarray to be initialized.

class mxnet.initializer.Load(param, default_init=None, verbose=False)
    Initialize by loading pretrained parameters from a file or dict.
    Parameters:
        - param (str or dict of str->NDArray) – Param file or dict mapping name to NDArray.
        - default_init (Initializer) – Default initializer used when a name is not found in param.
        - verbose (bool) – Log the source when initializing.

class mxnet.initializer.Mixed(patterns, initializers)
    Initialize with mixed initializers.
    Parameters:
        - patterns (list of str) – List of regular expression patterns to match parameter names.
        - initializers (list of Initializer) – List of Initializers corresponding to the patterns.

class mxnet.initializer.Uniform(scale=0.07)
    Initialize the weight with uniform [-scale, scale].
    Parameters: scale (float, optional) – The scale of the uniform distribution.

class mxnet.initializer.Normal(sigma=0.01)
    Initialize the weight with normal(0, sigma).
    Parameters: sigma (float, optional) – Standard deviation of the Gaussian distribution.

class mxnet.initializer.Orthogonal(scale=1.414, rand_type='uniform')
    Initialize the weight as an orthogonal matrix.
    Parameters:
        - scale (float, optional) – Scaling factor of the weight.
        - rand_type (string, optional) – Use "uniform" or "normal" random numbers to initialize the weight.
    Reference: Exact solutions to the nonlinear dynamics of learning in deep linear neural networks, arXiv preprint.

class mxnet.initializer.Xavier(rnd_type='uniform', factor_type='avg', magnitude=3)
    Initialize the weight with Xavier or a similar initialization scheme.
    Parameters:
        - rnd_type (str, optional) – Use "gaussian" or "uniform" to initialize.
        - factor_type (str, optional) – Use "avg", "in", or "out" to initialize.
        - magnitude (float, optional) – Scale of the random number range.
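These initializers are typically passed straight to the model; a sketch, reusing the placeholder net and train_iter from above:

    model = mx.model.FeedForward.create(
        symbol=net,
        X=train_iter,
        num_epoch=10,
        initializer=mx.initializer.Xavier(rnd_type='gaussian', factor_type='in', magnitude=2))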
Online evaluation metric module.
mxnet.metric.check_label_shapes(labels, preds, shape=0)
    Check whether the two arrays are the same size.
class mxnet.metric.EvalMetric(name, num=None)
    Base class of all evaluation metrics.

    update(label, pred)
        Update the internal evaluation.
        Parameters:
            - labels (list of NDArray) – The labels of the data.
            - preds (list of NDArray) – Predicted values.

    reset()
        Clear the internal statistics to the initial state.

    get()
        Get the current evaluation result.
        Returns:
            - name (str) – Name of the metric.
            - value (float) – Value of the evaluation.

    get_name_value()
        Get zipped name and value pairs.
class mxnet.metric.CompositeEvalMetric(**kwargs)
    Manage multiple evaluation metrics.

    add(metric)
        Add a child metric.

    get_metric(index)
        Get a child metric.
class mxnet.metric.Accuracy
    Calculate accuracy.

class mxnet.metric.TopKAccuracy(**kwargs)
    Calculate top-k prediction accuracy.

class mxnet.metric.F1
    Calculate the F1 score of a binary classification problem.

class mxnet.metric.MAE
    Calculate Mean Absolute Error loss.

class mxnet.metric.MSE
    Calculate Mean Squared Error loss.

class mxnet.metric.RMSE
    Calculate Root Mean Squared Error loss.

class mxnet.metric.CrossEntropy
    Calculate Cross Entropy loss.

class mxnet.metric.Torch
    Dummy metric for torch criterions.
class mxnet.metric.CustomMetric(feval, name=None, allow_extra_outputs=False)
    Custom evaluation metric that takes an NDArray function.
    Parameters:
        - feval (callable(label, pred)) – Customized evaluation function.
        - name (str, optional) – The name of the metric.
        - allow_extra_outputs (bool) – If true, the prediction outputs can have extra outputs. This is useful in RNNs, where the states are also produced in the outputs for forwarding.
mxnet.metric.np(numpy_feval, name=None, allow_extra_outputs=False)
    Create a customized metric from a numpy function.
    Parameters:
        - numpy_feval (callable(label, pred)) – Customized evaluation function.
        - name (str, optional) – The name of the metric.
        - allow_extra_outputs (bool) – If true, the prediction outputs can have extra outputs. This is useful in RNNs, where the states are also produced in the outputs for forwarding.
mxnet.metric.create(metric, **kwargs)
    Create an evaluation metric.
    Parameters: metric (str or callable) – The name of the metric, or a function providing statistics given pred and label NDArrays.
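A sketch of creating metrics by name and from a custom numpy function; the RMSE-style helper below is only illustrative, and val_iter/model are the placeholders used earlier:

    import numpy as np

    # built-in metric by name
    acc = mx.metric.create('acc')

    # custom metric from a numpy function: label and pred arrive as numpy arrays
    def my_rmse(label, pred):
        return np.sqrt(((label - pred.reshape(label.shape)) ** 2).mean())

    rmse = mx.metric.np(my_rmse, name='my_rmse')

    # either can be passed to fit/score via eval_metric
    model.score(val_iter, eval_metric=rmse)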
Common Optimization algorithms with regularizations.
class mxnet.optimizer.Optimizer(rescale_grad=1.0, param_idx2name=None, wd=0.0, clip_gradient=None, learning_rate=0.01, lr_scheduler=None, sym=None)
    Base class of all optimizers.

    static register(klass)
        Register an optimizer to the optimizer factory.

    static create_optimizer(name, rescale_grad=1, **kwargs)
        Create an optimizer with the specified name.
        Parameters:
            - name (str) – Name of the required optimizer. Should be the name of a subclass of Optimizer. Case insensitive.
            - rescale_grad (float) – Rescaling factor on the gradient.
            - kwargs (dict) – Parameters for the optimizer.
        Returns: opt – The resulting optimizer.
        Return type: Optimizer

    create_state(index, weight)
        Create additional optimizer state such as momentum. Override in implementations.

    update(index, weight, grad, state)
        Update the parameters. Override in implementations.

    set_lr_scale(args_lrscale)
        Deprecated. Use set_lr_mult instead.

    set_lr_mult(args_lr_mult)
        Set an individual learning rate multiplier for parameters.
        Parameters: args_lr_mult (dict of string/int to float) – Set the lr multiplier for a name/index to a float. Setting the multiplier by index is supported for backward compatibility, but we recommend using name and symbol.

    set_wd_mult(args_wd_mult)
        Set an individual weight decay multiplier for parameters. By default the wd multiplier is 0 for all params whose name doesn't end with _weight, if param_idx2name is provided.
        Parameters: args_wd_mult (dict of string/int to float) – Set the wd multiplier for a name/index to a float. Setting the multiplier by index is supported for backward compatibility, but we recommend using name and symbol.
mxnet.optimizer.register(klass)
    Register an optimizer to the optimizer factory.
class mxnet.optimizer.SGD(momentum=0.0, **kwargs)
    A very simple SGD optimizer with momentum and weight regularization.
    Parameters:
        - learning_rate (float, optional) – Learning rate of SGD.
        - momentum (float, optional) – Momentum value.
        - wd (float, optional) – L2 regularization coefficient added to all the weights.
        - rescale_grad (float, optional) – Rescaling factor of the gradient.
        - clip_gradient (float, optional) – Clip the gradient to the range [-clip_gradient, clip_gradient].
        - param_idx2name (dict of string/int to float, optional) – Special treatment of weight decay for parameters ending with bias, gamma, and beta.

    create_state(index, weight)
        Create additional optimizer state such as momentum.
        Parameters: weight (NDArray) – The weight data.
class mxnet.optimizer.NAG(**kwargs)
    SGD with Nesterov momentum. It is implemented according to https://github.com/torch/optim/blob/master/sgd.lua
class mxnet.optimizer.SGLD(**kwargs)
    Stochastic Langevin Dynamics updater to sample from a distribution.
    Parameters:
        - learning_rate (float, optional) – Learning rate of SGD.
        - wd (float, optional) – L2 regularization coefficient added to all the weights.
        - rescale_grad (float, optional) – Rescaling factor of the gradient.
        - clip_gradient (float, optional) – Clip the gradient to the range [-clip_gradient, clip_gradient].
        - param_idx2name (dict of string/int to float, optional) – Special treatment of weight decay for parameters ending with bias, gamma, and beta.

    create_state(index, weight)
        Create additional optimizer state such as momentum.
        Parameters: weight (NDArray) – The weight data.
class mxnet.optimizer.ccSGD(momentum=0.0, **kwargs)
    A very simple SGD optimizer with momentum and weight regularization. Implemented in C++.
    Parameters:
        - learning_rate (float, optional) – Learning rate of SGD.
        - momentum (float, optional) – Momentum value.
        - wd (float, optional) – L2 regularization coefficient added to all the weights.
        - rescale_grad (float, optional) – Rescaling factor of the gradient.
        - clip_gradient (float, optional) – Clip the gradient to the range [-clip_gradient, clip_gradient].
class mxnet.optimizer.Adam(learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-08, decay_factor=0.99999999, **kwargs)
    Adam optimizer as described in [King2014].
    [King2014] Diederik Kingma, Jimmy Ba. Adam: A Method for Stochastic Optimization. http://arxiv.org/abs/1412.6980. The code in this class was adapted from https://github.com/mila-udem/blocks/blob/master/blocks/algorithms/__init__.py#L765
    Parameters:
        - learning_rate (float, optional) – Step size. Default value is 0.002.
        - beta1 (float, optional) – Exponential decay rate for the first moment estimates. Default value is 0.9.
        - beta2 (float, optional) – Exponential decay rate for the second moment estimates. Default value is 0.999.
        - epsilon (float, optional) – Default value is 1e-8.
        - decay_factor (float, optional) – Default value is 1 - 1e-8.
        - wd (float, optional) – L2 regularization coefficient added to all the weights.
        - rescale_grad (float, optional) – Rescaling factor of the gradient.
        - clip_gradient (float, optional) – Clip the gradient to the range [-clip_gradient, clip_gradient].

    create_state(index, weight)
        Create additional optimizer state: mean, variance.
        Parameters: weight (NDArray) – The weight data.
class mxnet.optimizer.AdaGrad(eps=1e-07, **kwargs)
    AdaGrad optimizer of Duchi et al., 2011.
    This code follows the version in http://arxiv.org/pdf/1212.5701v1.pdf Eq(5) by Matthew D. Zeiler, 2012. AdaGrad will help the network converge faster in some cases.
    Parameters:
        - learning_rate (float, optional) – Step size. Default value is 0.05.
        - wd (float, optional) – L2 regularization coefficient added to all the weights.
        - rescale_grad (float, optional) – Rescaling factor of the gradient.
        - eps (float, optional) – A small float number to keep the update numerically stable. Default value is 1e-7.
        - clip_gradient (float, optional) – Clip the gradient to the range [-clip_gradient, clip_gradient].
class mxnet.optimizer.RMSProp(gamma1=0.95, gamma2=0.9, **kwargs)
    RMSProp optimizer of Tieleman & Hinton, 2012.
    This code follows the version in http://arxiv.org/pdf/1308.0850v5.pdf Eq(38) - Eq(45) by Alex Graves, 2013.
    Parameters:
        - learning_rate (float, optional) – Step size. Default value is 0.002.
        - gamma1 (float, optional) – Decay factor of the moving average for gradient and gradient^2. Default value is 0.95.
        - gamma2 (float, optional) – "Momentum" factor. Default value is 0.9.
        - wd (float, optional) – L2 regularization coefficient added to all the weights.
        - rescale_grad (float, optional) – Rescaling factor of the gradient.
        - clip_gradient (float, optional) – Clip the gradient to the range [-clip_gradient, clip_gradient].

    create_state(index, weight)
        Create additional optimizer state: mean, variance.
        Parameters: weight (NDArray) – The weight data.
class mxnet.optimizer.AdaDelta(rho=0.9, epsilon=1e-05, **kwargs)
    AdaDelta optimizer as described in Zeiler, M. D. (2012), ADADELTA: An adaptive learning rate method. http://arxiv.org/abs/1212.5701
    Parameters:
        - rho (float) – Decay rate for both squared gradients and delta x.
        - epsilon (float) – The constant as described in the paper.
        - wd (float) – L2 regularization coefficient added to all the weights.
        - rescale_grad (float, optional) – Rescaling factor of the gradient.
        - clip_gradient (float, optional) – Clip the gradient to the range [-clip_gradient, clip_gradient].
class mxnet.optimizer.Test(**kwargs)
    For test use.

    create_state(index, weight)
        Create a state to duplicate the weight.

    update(index, weight, grad, state)
        Performs w += rescale_grad * grad.
mxnet.optimizer.create(name, rescale_grad=1, **kwargs)
    Create an optimizer with the specified name.
    Parameters:
        - name (str) – Name of the required optimizer. Should be the name of a subclass of Optimizer. Case insensitive.
        - rescale_grad (float) – Rescaling factor on the gradient.
        - kwargs (dict) – Parameters for the optimizer.
    Returns: opt – The resulting optimizer.
    Return type: Optimizer
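A sketch of building an optimizer by name and handing it to the model; the hyperparameter values are only illustrative, and net/train_iter are the placeholders used earlier:

    adam = mx.optimizer.create('adam', learning_rate=0.001, wd=1e-4)

    model = mx.model.FeedForward.create(
        symbol=net,
        X=train_iter,
        num_epoch=10,
        optimizer=adam)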
mxnet.optimizer.get_updater(optimizer)
    Return a closure of the updater needed for kvstore.
    Parameters: optimizer (Optimizer) – The optimizer.
    Returns: updater – The closure of the updater.
    Return type: function
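For completeness, a sketch of driving an optimizer by hand through its updater closure; the weight and gradient are placeholder NDArrays for a single parameter with index 0:

    opt = mx.optimizer.create('sgd', learning_rate=0.1)
    updater = mx.optimizer.get_updater(opt)

    weight = mx.nd.ones((3,))
    grad = mx.nd.ones((3,))

    # applies one SGD step to weight in place
    updater(0, grad, weight)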