Goal: reach an accuracy of 0.99 within 20 epochs.
Epoch 1/20
2021-12-22 12:21:24.605430: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
80/80 [==============================] - 1s 2ms/step - loss: 0.2506 - accuracy: 0.4859 - val_loss: 0.2500 - val_accuracy: 0.4920
Epoch 2/20
80/80 [==============================] - 0s 783us/step - loss: 0.2500 - accuracy: 0.5015 - val_loss: 0.2500 - val_accuracy: 0.4920
Epoch 3/20
80/80 [==============================] - 0s 757us/step - loss: 0.2500 - accuracy: 0.5015 - val_loss: 0.2500 - val_accuracy: 0.4920
Epoch 4/20
80/80 [==============================] - 0s 770us/step - loss: 0.2500 - accuracy: 0.5015 - val_loss: 0.2500 - val_accuracy: 0.4920
Epoch 5/20
80/80 [==============================] - 0s 783us/step - loss: 0.2500 - accuracy: 0.5015 - val_loss: 0.2500 - val_accuracy: 0.4920
The loss function is MSE (mean squared error), because both the labels and the predictions lie in [0, 1].
[latex]\operatorname{MSE}=\frac{1}{n}\sum_{i=1}^n(Y_i-\hat{Y}_i)^2[/latex]
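As a small worked example of the formula, the sketch below computes MSE in plain Python for binary labels. The helper name `mse` and the sample labels are made up for illustration. It shows that a model which always predicts 0.5 scores exactly 0.25, which is the value the stalled training logs in this post hover around.

```python
def mse(labels, predictions):
    """Mean squared error between two equal-length sequences."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    return sum((y - p) ** 2 for y, p in zip(labels, predictions)) / len(labels)


# Illustrative binary labels (not taken from the post's data).
labels = [0, 1, 1, 0, 1, 0]

# A model that always answers 0.5 is off by 0.5 on every example.
constant_guess = [0.5] * len(labels)
loss_constant = mse(labels, constant_guess)

# A perfect model reproduces the labels exactly.
perfect = [float(y) for y in labels]
loss_perfect = mse(labels, perfect)

print(loss_constant, loss_perfect)
```

A loss that sits at 0.25 with accuracy near 0.5 is therefore consistent with a network that outputs roughly 0.5 for every input, that is, one that has not learned the comparison.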
import tensorflow as tf

if __name__ == '__main__':
    size = 10000
    x = tf.random.uniform((size, 2), 0.1, 0.5)
    label = tf.cast(x[:, 0] > x[:, 1], tf.int32)
    validationSize = 2000
    trainingSet = tf.data.Dataset.from_tensor_slices((x[validationSize:], label[validationSize:]))
    validationSet = tf.data.Dataset.from_tensor_slices((x[:validationSize], label[:validationSize]))
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Dense(2, input_shape=(2,), activation='elu'))
    model.add(tf.keras.layers.Dense(4, activation='elu'))
    model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
    model.compile(optimizer='adam', loss='mse', metrics=['binary_accuracy'])
    assert len(trainingSet.element_spec) == 2, 'fit stipulates that if x is a dataset, its element must have two elements, feature and label, though it is ok to have None as label.'
    assert len(trainingSet.element_spec) == 2, 'Each example in dataset must be a 2-tuple.'
    assert len(trainingSet.element_spec[0].shape) > 0, 'In each example, the first element itself must be a list.'
    model.fit(trainingSet.batch(100), epochs=20, validation_data=validationSet.batch(100))
Epoch 20/20
80/80 [==============================] - 0s 884us/step - loss: 0.0351 - accuracy: 0.9952 - val_loss: 0.0343 - val_accuracy: 0.9915
Epoch 1/20
2021-12-22 13:03:47.884242: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
80/80 [==============================] - 0s 2ms/step - loss: 0.2514 - binary_accuracy: 0.4849 - val_loss: 0.2498 - val_binary_accuracy: 0.5150
Epoch 2/20
80/80 [==============================] - 0s 745us/step - loss: 0.2500 - binary_accuracy: 0.4966 - val_loss: 0.2503 - val_binary_accuracy: 0.4850
Epoch 3/20
80/80 [==============================] - 0s 732us/step - loss: 0.2498 - binary_accuracy: 0.5151 - val_loss: 0.2505 - val_binary_accuracy: 0.4850
Epoch 4/20
80/80 [==============================] - 0s 745us/step - loss: 0.2498 - binary_accuracy: 0.5151 - val_loss: 0.2506 - val_binary_accuracy: 0.4850
Epoch 5/20
80/80 [==============================] - 0s 732us/step - loss: 0.2498 - binary_accuracy: 0.5151 - val_loss: 0.2506 - val_binary_accuracy: 0.4850
Epoch 6/20
80/80 [==============================] - 0s 745us/step - loss: 0.2498 - binary_accuracy: 0.5151 - val_loss: 0.2506 - val_binary_accuracy: 0.4850
Epoch 7/20
80/80 [==============================] - 0s 720us/step - loss: 0.2498 - binary_accuracy: 0.5151 - val_loss: 0.2506 - val_binary_accuracy: 0.4850
Epoch 8/20
80/80 [==============================] - 0s 732us/step - loss: 0.2498 - binary_accuracy: 0.5151 - val_loss: 0.2506 - val_binary_accuracy: 0.4850
Epoch 9/20
80/80 [==============================] - 0s 770us/step - loss: 0.2498 - binary_accuracy: 0.5151 - val_loss: 0.2506 - val_binary_accuracy: 0.4850
Epoch 10/20
80/80 [==============================] - 0s 757us/step - loss: 0.2498 - binary_accuracy: 0.5151 - val_loss: 0.2506 - val_binary_accuracy: 0.4850
import tensorflow as tf

if __name__ == '__main__':
    size = 10000
    x = tf.random.uniform((size, 2), 0.1, 5000)
    label = tf.cast(x[:, 0] > x[:, 1], tf.int32)
    validationSize = 2000
    trainingSet = tf.data.Dataset.from_tensor_slices((x[validationSize:], label[validationSize:]))
    validationSet = tf.data.Dataset.from_tensor_slices((x[:validationSize], label[:validationSize]))
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Dense(2, input_shape=(2,)))
    model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
    model.compile(optimizer='adam', loss='mse', metrics=['binary_accuracy'])
    assert len(trainingSet.element_spec) == 2, 'fit stipulates that if x is a dataset, its element must have two elements, feature and label, though it is ok to have None as label.'
    assert len(trainingSet.element_spec) == 2, 'Each example in dataset must be a 2-tuple.'
    assert len(trainingSet.element_spec[0].shape) > 0, 'In each example, the first element itself must be a list.'
    model.fit(trainingSet.batch(100), epochs=20, validation_data=validationSet.batch(100))
Feature Engineering: dividing the numbers
[[2483.0488 , 4344.6797 ], [ 572.3247 , 2752.0952 ], [3007.2236 , 3750.931 ], ... ]
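This trick requires that the second column never be 0. Note that our input range starts at 0.1. In an extreme case such as raw inputs in [0.1, 5000], the quotient can still be large (possibly even larger than the inputs), but in most cases dividing shrinks the input. In the code, the division is implemented as a Layer that can be placed directly inside the neural network, but doing so makes training slightly slower. The code therefore pulls Division out and applies it as a preprocessing step. The accuracy reaches 0.99 in nearly every run.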
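The idea behind the division trick can be checked without any neural network. The sketch below is a plain-Python illustration, not the author's code: the seed, the sample count, and the variable names are assumptions. It draws pairs from [0.1, 5000] and confirms that the label `a > b` is the same as asking whether the single feature `a / b` exceeds 1, which is why one ratio column can stand in for the two raw columns.

```python
import random

random.seed(0)  # arbitrary seed, chosen only to make the sketch repeatable

# Pairs drawn from the same range as the post's harder experiment.
rows = [(random.uniform(0.1, 5000), random.uniform(0.1, 5000)) for _ in range(1000)]

# Original label: is the first number larger than the second?
labels = [int(a > b) for a, b in rows]

# Engineered feature: the ratio of the two columns (b is never 0 here).
ratios = [a / b for a, b in rows]

# The same label, recovered from the ratio alone with a fixed threshold of 1.
labels_from_ratio = [int(r > 1.0) for r in ratios]

print(labels == labels_from_ratio)
```

After the division, the decision boundary is a constant threshold on one feature instead of a line through a plane whose coordinates span several orders of magnitude.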
from typing import Tuple

import tensorflow as tf


class Division(tf.keras.layers.Layer):
    def __init__(self, dividends: Tuple[int, int], divisor: int, removeDivisor=True, **kwargs):
        """
        :param dividends: a tuple of two elements, begin and end, indicating the half-open column range [begin, end) to be divided.
        :param divisor: index of the column to divide by; it must lie outside the dividend range.
        :param removeDivisor: if True, drop the divisor column from the output.
        """
        super().__init__(**kwargs)
        # The divisor column must not be one of the dividend columns.
        if dividends[0] <= divisor < dividends[1]:
            raise ValueError('divisor cannot be in dividends')
        self.removeDivisor = removeDivisor
        self.divisor = divisor
        self.dividends = dividends

    def call(self, inputs, *args, **kwargs):
        # Split the columns into the parts before, inside, and after the dividend range.
        before = inputs[:, 0: self.dividends[0]]
        middle = inputs[:, self.dividends[0]: self.dividends[1]] / tf.expand_dims(inputs[:, self.divisor], -1)
        after = inputs[:, self.dividends[1]:]
        inputs = tf.concat([before, middle, after], axis=-1)
        if self.removeDivisor:
            # Drop the divisor column, which is still at its original index.
            before = inputs[:, 0:self.divisor]
            after = inputs[:, self.divisor + 1:]
            r = tf.concat([before, after], axis=-1)
            assert inputs.shape[1] == r.shape[1] + 1, 'The divisor column is removed. The result tensor should have one less columns.'
            inputs = r
        return inputs
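To make the column bookkeeping of the layer easier to follow, here is a dependency-free sketch of the same operation on a single row. The function `divide_columns` is a hypothetical helper written for this illustration, not part of the post's `nn` module. It assumes the dividend range is half-open, `[begin, end)`, and that the divisor column must lie outside it, as in the call `Division((1, 3), 0)` used in this post.

```python
from typing import List, Sequence, Tuple


def divide_columns(row: Sequence[float], dividends: Tuple[int, int], divisor: int,
                   remove_divisor: bool = True) -> List[float]:
    """Divide row[begin:end] by row[divisor]; optionally drop the divisor column."""
    begin, end = dividends
    if begin <= divisor < end:
        raise ValueError("divisor cannot be in dividends")
    if row[divisor] == 0:
        raise ZeroDivisionError("the divisor column must not be 0")
    # Divide only the columns inside [begin, end); copy the others unchanged.
    out = [v / row[divisor] if begin <= i < end else v for i, v in enumerate(row)]
    if remove_divisor:
        del out[divisor]  # the result has one column fewer
    return out


row = [2.0, 4.0, 6.0]
reduced = divide_columns(row, (1, 3), 0)                      # columns 1 and 2 divided by column 0
kept = divide_columns(row, (1, 3), 0, remove_divisor=False)   # same, but column 0 is kept
print(reduced, kept)
```

With `(1, 3)` as the dividend range and column 0 as the divisor, a row `(a, b, c)` becomes `(b / a, c / a)`, so a three-column input reaches the network as two ratio features.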
import tensorflow as tf

if __name__ == '__main__':
    size = 10000
    x = tf.random.uniform((size, 2), 0.1, 5000)
    label = tf.cast(x[:, 0] > x[:, 1], tf.int32)
    validationSize = 2000
    trainingSet = tf.data.Dataset.from_tensor_slices((x[validationSize:], label[validationSize:]))
    validationSet = tf.data.Dataset.from_tensor_slices((x[:validationSize], label[:validationSize]))
    model = tf.keras.Sequential()
    # model.add(nn.Division((0, 1), 1))
    model.add(tf.keras.layers.Dense(2, activation='elu'))
    model.add(tf.keras.layers.Dense(4, activation='elu'))
    model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
    model.compile(optimizer='adam', loss='mse', metrics=['binary_accuracy'], run_eagerly=True)
    assert len(trainingSet.element_spec) == 2, 'fit stipulates that if x is a dataset, its element must have two elements, feature and label, though it is ok to have None as label.'
    assert len(trainingSet.element_spec) == 2, 'Each example in dataset must be a 2-tuple.'
    assert len(trainingSet.element_spec[0].shape) > 0, 'In each example, the first element itself must be a list.'
    model.fit(trainingSet.batch(100), epochs=20, validation_data=validationSet.batch(100))
import numpy as np
import tensorflow as tf

import nn  # the author's module that defines Division


def f2(a, b, c):
    if a > b:
        return 1
    elif a > c:
        return 2
    return 0


size = 10000
x = np.random.uniform(0.1, 5000, (size, 3))
# apply_along_axis passes each row as a 1-D array, so unpack it into f2's three arguments.
label = np.apply_along_axis(lambda t: f2(*t), 1, x)
x = tf.constant(x)
label = tf.constant(label)
# Preprocessing: divide columns 1 and 2 by column 0, then drop column 0.
division = nn.Division((1, 3), 0)
x = division(x)
validationSize = 2000
trainingSet = tf.data.Dataset.from_tensor_slices((x[validationSize:], label[validationSize:]))
validationSet = tf.data.Dataset.from_tensor_slices((x[:validationSize], label[:validationSize]))
model = tf.keras.Sequential()
# model.add(nn.Division((1, 3), 0))
model.add(tf.keras.layers.Dense(4, activation='elu'))
model.add(tf.keras.layers.Dense(12, activation='elu'))
model.add(tf.keras.layers.Dense(12, activation='elu'))
model.add(tf.keras.layers.Dense(12, activation='elu'))
model.add(tf.keras.layers.Dense(12, activation='elu'))
model.add(tf.keras.layers.Dense(3, activation='softmax'))
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
assert len(trainingSet.element_spec) == 2, 'fit stipulates that if x is a dataset, its element must have two elements, feature and label, though it is ok to have None as label.'
assert len(trainingSet.element_spec) == 2, 'Each example in dataset must be a 2-tuple.'
assert len(trainingSet.element_spec[0].shape) > 0, 'In each example, the first element itself must be a list.'
model.fit(trainingSet.batch(100), epochs=20, validation_data=validationSet.batch(100))
# Save the model so that it can be reloaded later with load_model('three').
model.save('three')
import tensorflow as tf
import numpy as np
import nn


def g(t):
    # d is a distractor column: it never influences the label.
    (a, b, c, d) = t
    if a > b:
        return 1
    elif a > c:
        return 2
    return 0


if __name__ == '__main__':
    size = 10000
    x = np.random.uniform(0.1, 5000, (size, 4))
    label = np.apply_along_axis(g, 1, x)
    x = tf.constant(x)
    label = tf.constant(label)
    division = nn.Division((1, 4), 0)
    x = division(x)
    validationSize = 2000
    trainingSet = tf.data.Dataset.from_tensor_slices((x[validationSize:], label[validationSize:]))
    validationSet = tf.data.Dataset.from_tensor_slices((x[:validationSize], label[:validationSize]))
    # Load the previously saved three-input model and freeze its weights.
    three = tf.keras.models.load_model('three')
    three.trainable = False
    model = tf.keras.Sequential()
    # model.add(nn.Division((1, 3), 0))
    # New trainable layers that reduce the three ratio columns to the two the saved model expects.
    model.add(tf.keras.layers.Dense(8, activation='elu'))
    model.add(tf.keras.layers.Dense(8, activation='elu'))
    model.add(tf.keras.layers.Dense(2, activation='elu'))
    # Feed the filtered features into the frozen saved model, which produces the 3-class output.
    model.add(three)
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
    assert len(trainingSet.element_spec) == 2, 'fit stipulates that if x is a dataset, its element must have two elements, feature and label, though it is ok to have None as label.'
    assert len(trainingSet.element_spec) == 2, 'Each example in dataset must be a 2-tuple.'
    assert len(trainingSet.element_spec[0].shape) > 0, 'In each example, the first element itself must be a list.'
    model.fit(trainingSet.batch(100), epochs=20, validation_data=validationSet.batch(100))
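The code ends by saving the model. Now load that model, transfer-learning style, and create a new model that first filters out the distractor column and then passes the result to the saved model. Because of the transfer learning, the accuracy cannot exceed the original 0.99, so only 0.97 is required here.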
def g(t):
    (a, b, c) = t
    if a > 1.1 * b:
        return 1
    elif a > 1.2 * c:
        return 2
    return 0
According to Andreas Madsen and Alexander Rosenberg Johansen, "Neural Arithmetic Units", ICLR 2020 (retrieved 2021-12-24), it is difficult for neural networks to perform the four basic arithmetic operations.
