Using TensorFlow's High-Level Estimator API (3): A Custom RNN Model

In the previous article we covered how to build a CNN model with Estimator. This article is about building an RNN model with Estimator, corresponding to Section 7.2 of 《TensorFlow机器学习项目实战》, univariate time series prediction. This should be the last article in this series.

In fact, TensorFlow 1.4 ships with an updated time series library, and using it would be much more convenient. But since we are working from 《TensorFlow机器学习项目实战》, we keep the code as close to the book as possible, and building the model by hand is also a better way to learn about RNNs and LSTMs.
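For reference only, here is a minimal sketch of what that route might look like. It assumes the library in question is tf.contrib.timeseries from TF 1.4 and uses a synthetic sine series instead of our electricity data, so treat it as an illustration rather than part of this article's code:

import numpy as np
import tensorflow as tf

# A synthetic univariate series standing in for the electricity load
times = np.arange(1000)
values = np.sin(2 * np.pi * times / 100).astype(np.float32)
data = {
    tf.contrib.timeseries.TrainEvalFeatures.TIMES: times,
    tf.contrib.timeseries.TrainEvalFeatures.VALUES: values,
}
reader = tf.contrib.timeseries.NumpyReader(data)
train_input_fn = tf.contrib.timeseries.RandomWindowInputFn(
    reader, batch_size=16, window_size=40)
# Autoregressive model; input_window_size + output_window_size must equal window_size
ar = tf.contrib.timeseries.ARRegressor(
    periodicities=100, input_window_size=30, output_window_size=10, num_features=1)
ar.train(input_fn=train_input_fn, steps=1000)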

Reference [1] has an updated sine-wave prediction example, but it still uses tf.contrib.learn and wraps the current Estimator with SKCompat, an approach the TensorFlow documentation says will be removed. So we use the latest Estimator API directly. The differences are actually small; the biggest one is the input function, which we already covered in detail in the previous two articles, so this part of the code should be easy to follow. A rough comparison of the two styles is sketched below.
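To make that concrete, here is a rough sketch contrasting the two styles on a toy linear regression; the toy data and the canned LinearRegressor are stand-ins of my own, not the code from [1]:

import numpy as np
import tensorflow as tf

x_toy = np.random.rand(20, 1).astype(np.float32)
y_toy = 2.0 * x_toy[:, 0] + 1.0

# Current interface: an Estimator is always fed through an input_fn
feature_columns = [tf.feature_column.numeric_column('x', shape=[1])]
toy_regressor = tf.estimator.LinearRegressor(feature_columns=feature_columns)
toy_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={'x': x_toy}, y=y_toy, batch_size=5, num_epochs=2, shuffle=True)
toy_regressor.train(input_fn=toy_input_fn)

# Deprecated style used in [1]: SKCompat wraps a contrib.learn estimator so that
# fit() takes numpy arrays directly, scikit-learn style (left commented out here;
# some_contrib_learn_estimator is a placeholder):
# wrapped = tf.contrib.learn.SKCompat(some_contrib_learn_estimator)
# wrapped.fit(x_toy, y_toy, batch_size=5, steps=10)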

The other main reference is the official TensorFlow example [2]. That example does text classification; our main task is to turn its classification problem into a regression problem, which is not difficult. A sketch of that change follows.
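Here is a minimal sketch of the change, using placeholder shapes rather than the real pipeline from [2]: the classification version ends in a dense layer with one unit per class plus a softmax cross-entropy loss, while the regression version ends in a single-unit dense layer plus a mean squared error loss.

import tensorflow as tf

n_classes = 15                                  # placeholder class count
rnn_output = tf.zeros([32, 10])                 # stand-in for the last RNN output, [batch, hidden]
class_labels = tf.zeros([32], dtype=tf.int64)   # integer class ids
target_values = tf.zeros([32, 1])               # real-valued regression targets

# Classification head, roughly as in the text classification example [2]
logits = tf.layers.dense(rnn_output, n_classes)
classification_loss = tf.losses.sparse_softmax_cross_entropy(
    labels=class_labels, logits=logits)
predicted_classes = tf.argmax(logits, axis=1)

# Regression head, which is what the load-forecasting model below uses
predictions = tf.layers.dense(rnn_output, 1)
regression_loss = tf.losses.mean_squared_error(target_values, predictions)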

As in the previous two articles, I change the author's code as little as possible while making sure it runs well.


The complete code is as follows:

import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.contrib import rnn
from matplotlib import pyplot as plt


TIMESTEPS = 5              # length of the input window fed to the LSTM
DENSE_LAYERS = None        # carried over from the book; not used below
TRAINING_STEPS = 10000     # carried over from the book; not used below
BATCH_SIZE = 128           # carried over from the book; the input_fn sets its own batch_size
PRINT_STEPS = TRAINING_STEPS / 100  # carried over from the book; not used below
learning_rate = 0.03
X_FEATURE = 'x'            # key of the feature dict built by the input_fn




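# The model_fn passed to tf.estimator.Estimator below. The Estimator inspects the
# function's argument names and passes features / labels / mode / params by name,
# so this argument order works even though the conventional order is
# (features, labels, mode, params). The function must return a
# tf.estimator.EstimatorSpec for PREDICT, TRAIN and EVAL.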
def lstm_model(mode, features, labels, params):
    def lstm_cells():
        # For simplicity I just stack two LSTM layers, each with TIMESTEPS hidden units
        return [rnn.BasicLSTMCell(TIMESTEPS, state_is_tuple=True) for _ in range(2)]


    def dnn_layers(input_layers, layers):
        # The book optionally adds fully connected layers here; we simply pass the input through
        return input_layers

    def _lstm_model(features, mode, labels):
        stacked_lstm = rnn.MultiRNNCell(lstm_cells(), state_is_tuple=True)
        X = features[X_FEATURE]
        # static_rnn expects a Python list of TIMESTEPS tensors of shape [batch, 1],
        # so split the [batch, TIMESTEPS, 1] input along the time axis
        x_ = tf.unstack(X, num=TIMESTEPS, axis=1)
        output, final_state = rnn.static_rnn(stacked_lstm, x_, dtype=tf.float32)
        # keep only the output of the last time step
        output = dnn_layers(output[-1], [])

        
        # Project the last LSTM output down to a single predicted load value
        predictions = tf.layers.dense(output, 1)
        # I originally wanted to use the built-in learn.models.linear_regression,
        # but it kept failing at predict time:
        # return learn.models.linear_regression(output, y)

        # Provide an estimator spec for `ModeKeys.PREDICT`.
        if mode == tf.estimator.ModeKeys.PREDICT:
            return tf.estimator.EstimatorSpec(
                mode=mode,
                predictions={"elec": predictions})
    
        # Calculate loss using mean squared error
        loss = tf.losses.mean_squared_error(labels, predictions)
        # Plain gradient descent is used as the optimizer here
        optimizer = tf.train.GradientDescentOptimizer(
            learning_rate=learning_rate)
        train_op = optimizer.minimize(
            loss=loss, global_step=tf.train.get_global_step())

        # Report root mean squared error as an additional eval metric
        eval_metric_ops = {
            "rmse": tf.metrics.root_mean_squared_error(
                tf.cast(labels, tf.float32), predictions)
        }

        # Provide an estimator spec for `ModeKeys.EVAL` and `ModeKeys.TRAIN` modes.
        return tf.estimator.EstimatorSpec(
            mode=mode,
            loss=loss,
            train_op=train_op,
            eval_metric_ops=eval_metric_ops)

    return _lstm_model(features, mode, labels)


df = pd.read_csv("data/elec_load.csv", error_bad_lines=False)


print(df.describe())
# Rescale the raw load values; the constants are carried over from the book's preprocessing
array = (df.values - 147.0) / 339.0


listX = []
listy = []
X={}
y={}

for i in range(0, len(array) - TIMESTEPS):
    listX.append(array[i:i + TIMESTEPS].reshape([TIMESTEPS, 1]))
    # The book's author seems to have misread Python slicing here: the end index after
    # the colon is exclusive, so array[i:i+TIMESTEPS] is exactly TIMESTEPS values and
    # array[i+TIMESTEPS] is the next value, which becomes the label
    listy.append(array[i + TIMESTEPS])

arrayX = np.array(listX).astype(np.float32)
arrayy = np.array(listy).astype(np.float32)
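# Quick sanity check of the windowing above (toy numbers, not the real data):
# if array were [[0], [1], [2], [3], [4], [5]] with TIMESTEPS == 5, the single sample
# would be X = [[0], [1], [2], [3], [4]] with shape [TIMESTEPS, 1] and y = [5].
# So arrayX has shape [len(array) - TIMESTEPS, TIMESTEPS, 1] and
# arrayy has shape [len(array) - TIMESTEPS, 1].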


X['train']=arrayX[0:12000]
X['test']=arrayX[12000:13000]
X['val']=arrayX[13000:14000]

y['train']=arrayy[0:12000]
y['test']=arrayy[12000:13000]
y['val']=arrayy[13000:14000]


train_input_fn = tf.estimator.inputs.numpy_input_fn(
      x={X_FEATURE: X['train']},
      y=y['train'],
      batch_size=100,
      num_epochs=30,
      shuffle=True)
test_input_fn = tf.estimator.inputs.numpy_input_fn(
      x={X_FEATURE: X['test']},
      # y=y['test'],  # labels are not needed for predict()
      num_epochs=1,
      batch_size=100,
      shuffle=False)


eva_input_fn = tf.estimator.inputs.numpy_input_fn(
      x={X_FEATURE: X['val']},
      y=y['val'],
      num_epochs=1,
      shuffle=False)

regressor = tf.estimator.Estimator(model_fn=lstm_model)

# With no steps argument, train() runs until the input_fn is exhausted (30 epochs here)
regressor.train(input_fn=train_input_fn)

score = regressor.evaluate(input_fn=eva_input_fn)
print('RMSE: {0:f}'.format(score['rmse']))



# predict() returns a generator of per-example dicts, keyed as in the PREDICT EstimatorSpec
predictions = regressor.predict(input_fn=test_input_fn)


predicted = np.zeros(y['test'].shape, dtype=np.float32)
for i, p in enumerate(predictions):
    predicted[i] = p['elec']


plt.subplot()
plot_predicted, = plt.plot(predicted, label='predicted')

plot_test, = plt.plot(y['test'], label='test')
plt.legend(handles=[plot_predicted, plot_test])
plt.show()


The result is shown in the plot; discussion is welcome.

References

[1] LSTM-Time-Series-Analysis-using-Tensorflow

[2] Character-level Convolutional Networks for Text Classification

[3] Time series interface