What are people churning all these frameworks out for? Not that I'm fond of [https://www.tensorflow.org tf] either.

==Basics==
The reason to not just use numpy: it supposedly handles CPU/GPU placement on its own(?), and even parallelizes on its own(?). [http://mxnet.io/tutorials/basic/ndarray.html] <del>Doesn't tf do that too?</del>

{{c|broadcast}}[http://mxnet.io/tutorials/basic/ndarray.html#broadcast]: seems to be something like {{c|rep}}.
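A minimal sketch of what broadcasting looks like on NDArrays (my own example, not from the tutorial):
<pre>import mxnet as mx

a = mx.nd.array([[0], [1], [2]])   # shape (3, 1)
b = a.broadcast_to((3, 2))         # the size-1 axis is repeated, rep-style
print(b.asnumpy())
# [[ 0.  0.]
#  [ 1.  1.]
#  [ 2.  2.]]</pre>
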
Instead of {{c|pickle.dump}}, you can use {{c|mx.nd.load}} and {{c|mx.nd.save}}. [http://mxnet.io/tutorials/basic/ndarray.html#serialize-from-to-distributed-filesystems]
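A minimal save/load sketch (the file name here is made up; per the link above the path can also be a distributed-filesystem URI):
<pre>import mxnet as mx

a = mx.nd.ones((2, 3))
b = mx.nd.zeros((2, 3))
mx.nd.save('arrays.ndarray', [a, b])   # a dict of NDArrays works too
a2, b2 = mx.nd.load('arrays.ndarray')</pre>
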
While explaining the symbolic API[http://mxnet.io/tutorials/basic/symbol.html], the tutorial mentions an advantage I hadn't thought of: if the graph is built up front, MXNet knows in advance which outputs will be needed, so it doesn't have to keep every intermediate value around. This saves memory.<br>
Related: symbolic programming is apparently also called declarative programming, and its opposite is called imperative programming. <del>From the word alone, I had no idea what "imperative programming" was even supposed to mean.</del> Examples of declarative programming: regular expressions, SQL.

Following the manual, I hit an error caused by {{c|graphviz}}:
 ExecutableNotFound: failed to execute ['dot', '-Tsvg'], make sure the Graphviz executables are on your systems' PATH
I just gave up since I'm on a Mac. Supposedly {{c|brew}} fixes it, but I'll skip that and use a remote Linux box instead. It has to be installed both system-wide and via pip (on Ubuntu, that means doing both {{c|apt ~}} and {{c|pip install ~}}).

Instead of getting the output via bind → forward, you can also call eval directly.
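For example (a small sketch of my own):
<pre>import mxnet as mx

a = mx.sym.Variable('a')
b = mx.sym.Variable('b')
c = a + b
# eval binds and runs forward in one call; it returns a list of output NDArrays
out = c.eval(ctx=mx.cpu(), a=mx.nd.ones((2, 3)), b=mx.nd.ones((2, 3)))
print(out[0].asnumpy())</pre>
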
<u>{{c|tojson()}}</u>
<pre>print(c.tojson())
c.save('symbol-c.json')
c2 = mx.sym.load('symbol-c.json')</pre>
Type cast:
<pre>a = mx.sym.Variable('data')
b = mx.sym.cast(data=a, dtype='float16')</pre>

===Module===
====Creation====
<pre>mod = mx.mod.Module(symbol=net,
                    context=mx.cpu(),
                    data_names=['data'],
                    label_names=['softmax_label'])</pre>
====Intermediate-level Interface====
First, roughly:
<pre>mx.test_utils.download
np.genfromtxt
...                # split into data and labels
mx.io.NDArrayIter  # split into batches
...                # build the net
mod = mx.mod.Module(symbol=net, context=mx.cpu(), ...)</pre>
and then:
<pre># memory alloc
mod.bind(data_shapes=train_iter.provide_data, label_shapes=train_iter.provide_label)

mod.init_params(initializer=mx.init.Uniform(scale=.1))
mod.init_optimizer(optimizer='sgd', optimizer_params=(('learning_rate', 0.1), ))
metric = mx.metric.create('acc')

for epoch in range(5):
    train_iter.reset()
    metric.reset()
    for batch in train_iter:
        mod.forward(batch, is_train=True)       # (1)
        mod.update_metric(metric, batch.label)  # (2)
        mod.backward()                          # (3)
        mod.update()
    print('Epoch %d, Training %s' % (epoch, metric.get()))</pre>
* I wondered where {{c|batch.label}} in (2) comes from; it shows up in the next section ([http://mxnet.io/tutorials/basic/data.html iterators]).
** There is also [http://mxnet.io/api/python/io.html#mxnet.io.DataBatch API documentation].
* Because (2) tells {{c|mod}} about the metric, calling just {{c|backward}} in (3) is enough to compute the gradients.
* {{c|metric.create}} supports quite a few metrics; see the [http://mxnet.io/api/python/metric.html mxnet.metric API]. (A small sketch follows this list.)
** Available: {{c|Accuracy, TopKAccuracy, F1, Perplexity, MAE, MSE, RMSE, CrossEntropy, Loss, Torch, Caffe<ref>{{c|Loss, Torch, Caffe}} are dummy metrics.</ref>, CustomMetric, np<ref>A custom metric that takes numpy arrays as input.</ref>}}.
** Abbreviations like {{c|acc}} seem to exist but also seem undocumented: {{c|acc}} (accuracy), {{c|top_k_acc}} (top-k accuracy), and {{c|ce}} (CrossEntropy) work.[http://mxnet.io/tutorials/basic/module.html#predict-and-evaluate]
* {{c|is_train}} in (1)'s {{c|forward}} appears to be undocumented. I'll just always pass {{c|True}} when training; the default is {{c|None}}. [http://mxnet.io/api/python/module.html?highlight=module.for#mxnet.module.BaseModule.forward] [https://github.com/dmlc/mxnet/issues/1822]
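The sketch mentioned above, assuming the abbreviations behave as the tutorial suggests:
<pre>import mxnet as mx

metric = mx.metric.create('acc')                 # same as mx.metric.Accuracy()
ce     = mx.metric.create('ce')                  # CrossEntropy
top5   = mx.metric.create('top_k_acc', top_k=5)  # TopKAccuracy(top_k=5)</pre>
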
====High-level Interface====
<pre>train_iter.reset()
mod = mx.mod.Module(symbol=net, context=mx.cpu(), data_names=['data'], label_names=['softmax_label'])
mod.fit(train_iter,
        eval_data=val_iter,
        optimizer='sgd',
        optimizer_params={'learning_rate':0.1},
        eval_metric='acc',
        num_epoch=8)</pre>
<pre>y = mod.predict(val_iter)</pre>
To run just the evaluation, without the prediction results:
 score = mod.score(val_iter, ['mse', 'acc'])
====Save and Load====
Setting up a checkpoint:
<pre>model_prefix = 'mx_mlp'
checkpoint = mx.callback.do_checkpoint(model_prefix)
mod = mx.mod.Module(symbol=net)
mod.fit(train_iter, num_epoch=5, epoch_end_callback=checkpoint)</pre>
Loading:
 sym, arg_params, aux_params = mx.model.load_checkpoint(model_prefix, 3)  # 3: epoch
Loading and then assigning the parameters again:
 mod.set_params(arg_params, aux_params)
Simply resume training:
<pre>mod = mx.mod.Module(symbol=sym)
mod.fit(train_iter,
        num_epoch=8,
        arg_params=arg_params,
        aux_params=aux_params,
        begin_epoch=3)</pre>

===Iterators - Loading data===
* A data iterator returns a [http://mxnet.io/api/python/io.html#mxnet.io.DataBatch {{c|DataBatch}}].
* Reading from a csv file: [http://mxnet.io/api/python/io.html#mxnet.io.CSVIter {{c|CSVIter}}] (a sketch follows this list).
* Custom iterators are also supported.
* Using {{c|mx.image}} requires OpenCV (not the cv2 Python package).
* {{c|ImageRecordIter}} or {{c|ImageIter}} can be used to turn image files into training data; a {{c|RecordIO}} file has to exist beforehand.
* Iterators are used not only for training but also when computing scores:
<pre>eval_iter = mx.io.NDArrayIter(eval_data, eval_label, batch_size, shuffle=False)
model.score(eval_iter, metric)</pre>
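The {{c|CSVIter}} sketch mentioned above (the file names and shapes here are made up):
<pre>import mxnet as mx

# hypothetical files: each row of data.csv is one flattened 3-element sample,
# each row of label.csv the matching scalar label
train_iter = mx.io.CSVIter(data_csv='data.csv', data_shape=(3,),
                           label_csv='label.csv', label_shape=(1,),
                           batch_size=30)
for batch in train_iter:
    print(batch.data[0].shape, batch.label[0].shape)</pre>
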
==Training and Inference==

===Linear Regression===
[http://mxnet.io/tutorials/python/linear-regression.html#linear-regression Original]
<br>Full source:
<pre>import mxnet as mx
import numpy as np

# Training data
train_data = np.random.uniform(0, 1, [100, 2])
train_label = np.array([train_data[i][0] + 2 * train_data[i][1] for i in range(100)])
batch_size = 1

# Evaluation data
eval_data = np.array([[7,2],[6,10],[12,2]])
eval_label = np.array([11,26,16])

train_iter = mx.io.NDArrayIter(train_data, train_label, batch_size, shuffle=True, label_name='lin_reg_label')  # (1)
eval_iter = mx.io.NDArrayIter(eval_data, eval_label, batch_size, shuffle=False)

X = mx.sym.Variable('data')
Y = mx.symbol.Variable('lin_reg_label')  # (2)
fully_connected_layer = mx.sym.FullyConnected(data=X, name='fc1', num_hidden=1)
lro = mx.sym.LinearRegressionOutput(data=fully_connected_layer, label=Y, name="lro")  # (3)

model = mx.mod.Module(
    symbol=lro,
    data_names=['data'],
    label_names=['lin_reg_label']  # network structure (4)
)

mx.viz.plot_network(symbol=lro)

# (5)
model.fit(train_iter, eval_iter,
          optimizer_params={'learning_rate':0.005, 'momentum': 0.9},
          num_epoch=1000,
          batch_end_callback=mx.callback.Speedometer(batch_size, 2))

model.predict(eval_iter).asnumpy()

metric = mx.metric.MSE()
model.score(eval_iter, metric)

eval_data = np.array([[7,2],[6,10],[12,2]])
eval_label = np.array([11.1,26.1,16.1])  # adding 0.1 to each of the values
eval_iter = mx.io.NDArrayIter(eval_data, eval_label, batch_size, shuffle=False)
model.score(eval_iter, metric)</pre>

(3) {{c|mx.sym.LinearRegressionOutput}} computes the l2 loss.<br>
<i>The name {{c|lin_reg_label}} given in (1) and (2) must match the name used in {{c|NDArrayIter}}:</i> <code>train_iter = mx.io.NDArrayIter(..., label_name='<span style="color:red">lin_reg_label</span>')</code> <i>In other words, the names on the input side have to agree. The name in (4) has to match as well, so the same name ends up appearing three times.</i>

(5) {{c|model.fit}} can take <code>[http://mxnet.io/api/python/callback.html?highlight=ck.speedometer#mxnet.callback.Speedometer mx.callback.Speedometer]</code> as its {{c|batch_end_callback}}.
*<code>Speedometer(batch_size, frequent=50, auto_reset=True)</code>: log every 50 batches and reset after logging. The following has to be done beforehand for the output to show up on stdout:
<pre>import logging
logging.getLogger().setLevel(logging.DEBUG)</pre>
Running it:
<pre>>>> # Print training speed and evaluation metrics every ten batches. Batch size is one.
>>> module.fit(iterator, num_epoch=n_epoch,
...            batch_end_callback=mx.callback.Speedometer(1, 10))
Epoch[0] Batch [10] Speed: 1910.41 samples/sec Train-accuracy=0.200000
Epoch[0] Batch [20] Speed: 1764.83 samples/sec Train-accuracy=0.400000
Epoch[0] Batch [30] Speed: 1740.59 samples/sec Train-accuracy=0.500000</pre>

===Handwritten Digit Recognition===
[http://mxnet.io/tutorials/python/mnist.html#handwritten-digit-recognition Original]
<div class="toccolours mw-collapsible mw-collapsed">
====full src====
<div class=mw-collapsible-content>
<pre>
import mxnet as mx
mnist = mx.test_utils.get_mnist()

batch_size = 100
train_iter = mx.io.NDArrayIter(mnist['train_data'], mnist['train_label'], batch_size, shuffle=True)
val_iter = mx.io.NDArrayIter(mnist['test_data'], mnist['test_label'], batch_size)

data = mx.sym.var('data')
# Flatten the data from 4-D shape into 2-D (batch_size, num_channel*width*height)
data = mx.sym.flatten(data=data)

# The first fully-connected layer and the corresponding activation function
fc1 = mx.sym.FullyConnected(data=data, num_hidden=128)
act1 = mx.sym.Activation(data=fc1, act_type="relu")

# The second fully-connected layer and the corresponding activation function
fc2 = mx.sym.FullyConnected(data=act1, num_hidden=64)
act2 = mx.sym.Activation(data=fc2, act_type="relu")

# MNIST has 10 classes
fc3 = mx.sym.FullyConnected(data=act2, num_hidden=10)
# Softmax with cross entropy loss
mlp = mx.sym.SoftmaxOutput(data=fc3, name='softmax')

import logging
logging.getLogger().setLevel(logging.DEBUG)  # logging to stdout
# create a trainable module on CPU
mlp_model = mx.mod.Module(symbol=mlp, context=mx.cpu())
mlp_model.fit(train_iter,  # train data
              eval_data=val_iter,  # validation data
              optimizer='sgd',  # use SGD to train
              optimizer_params={'learning_rate':0.1},  # use fixed learning rate
              eval_metric='acc',  # report accuracy during training
              batch_end_callback=mx.callback.Speedometer(batch_size, 100),  # output progress for each 100 data batches
              num_epoch=10)  # train for at most 10 dataset passes

test_iter = mx.io.NDArrayIter(mnist['test_data'], None, batch_size)
prob = mlp_model.predict(test_iter)
assert prob.shape == (10000, 10)

test_iter = mx.io.NDArrayIter(mnist['test_data'], mnist['test_label'], batch_size)
# predict accuracy of mlp
acc = mx.metric.Accuracy()
mlp_model.score(test_iter, acc)
print(acc)
assert acc.get()[1] > 0.96

data = mx.sym.var('data')
# first conv layer
conv1 = mx.sym.Convolution(data=data, kernel=(5,5), num_filter=20)
tanh1 = mx.sym.Activation(data=conv1, act_type="tanh")
pool1 = mx.sym.Pooling(data=tanh1, pool_type="max", kernel=(2,2), stride=(2,2))
# second conv layer
conv2 = mx.sym.Convolution(data=pool1, kernel=(5,5), num_filter=50)
tanh2 = mx.sym.Activation(data=conv2, act_type="tanh")
pool2 = mx.sym.Pooling(data=tanh2, pool_type="max", kernel=(2,2), stride=(2,2))
# first fullc layer
flatten = mx.sym.flatten(data=pool2)
fc1 = mx.symbol.FullyConnected(data=flatten, num_hidden=500)
tanh3 = mx.sym.Activation(data=fc1, act_type="tanh")
# second fullc
fc2 = mx.sym.FullyConnected(data=tanh3, num_hidden=10)
# softmax loss
lenet = mx.sym.SoftmaxOutput(data=fc2, name='softmax')

# create a trainable module (the original tutorial runs this on GPU 0)
lenet_model = mx.mod.Module(symbol=lenet, context=mx.cpu())
# train with the same settings as above
lenet_model.fit(train_iter,
                eval_data=val_iter,
                optimizer='sgd',
                optimizer_params={'learning_rate':0.1},
                eval_metric='acc',
                batch_end_callback=mx.callback.Speedometer(batch_size, 100),
                num_epoch=10)

test_iter = mx.io.NDArrayIter(mnist['test_data'], None, batch_size)
prob = lenet_model.predict(test_iter)
test_iter = mx.io.NDArrayIter(mnist['test_data'], mnist['test_label'], batch_size)
# predict accuracy for lenet
acc = mx.metric.Accuracy()
lenet_model.score(test_iter, acc)
print(acc)
assert acc.get()[1] > 0.98
</pre></div></div>

====Loading data====
<pre>mx.test_utils.get_mnist()</pre>
Everything after this is trivial.

===Predict with pre-trained models===
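A minimal sketch of my own along the lines of the predict-image tutorial linked under etc below; it assumes the resnet-18 checkpoint files (resnet-18-symbol.json, resnet-18-0000.params) have already been downloaded:
<pre>from collections import namedtuple
import mxnet as mx

# hypothetical prefix/epoch for the downloaded checkpoint
sym, arg_params, aux_params = mx.model.load_checkpoint('resnet-18', 0)
mod = mx.mod.Module(symbol=sym, context=mx.cpu(), label_names=None)
mod.bind(for_training=False, data_shapes=[('data', (1, 3, 224, 224))])
mod.set_params(arg_params, aux_params, allow_missing=True)

Batch = namedtuple('Batch', ['data'])
img = mx.random.uniform(0, 1, (1, 3, 224, 224))  # stand-in for a preprocessed image
mod.forward(Batch([img]))
prob = mod.get_outputs()[0].asnumpy()
print(prob.argmax())</pre>
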
===Large Scale Image Classification===

==etc==
* batch normalization example (a hypothetical call follows the code)
<pre>def ConvFactory(data, num_filter, kernel,
                stride=(1,1), pad=(0, 0), name=None, suffix=''):
    conv = mx.sym.Convolution(data=data,
                              num_filter=num_filter,
                              kernel=kernel,
                              stride=stride,
                              pad=pad,
                              name='conv_%s%s' % (name, suffix))
    bn = mx.sym.BatchNorm(data=conv, name='bn_%s%s' % (name, suffix))
    act = mx.sym.Activation(data=bn, act_type='relu', name='relu_%s%s' % (name, suffix))
    return act</pre>
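The hypothetical call to {{c|ConvFactory}} above (the layer name is made up):
<pre>import mxnet as mx

data = mx.sym.Variable('data')
# conv, then batch norm, then relu, with 64 3x3 filters
out = ConvFactory(data, num_filter=64, kernel=(3, 3), name='stage1')</pre>
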
* feature extraction
** [https://github.com/dmlc/mxnet/blob/master/docs/tutorials/python/predict_image.md Predict with pre-trained models]
** [http://mxnet.io/how_to/finetune.html#train How do I fine-tune pre-trained models to a new dataset?]