This project uses a text convolutional neural network and the MovieLens dataset to build a movie recommender.

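Recommender systems are everywhere in everyday web applications: online shopping, online bookstores, news apps, social networks, music sites, movie sites, and so on. Wherever there are people, there are recommendations. These systems personalize content based on a person's own preferences and on the habits of people with similar tastes. Open a news app, for example, and the front page differs from one user to the next because the content is personalized.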

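This is obviously useful. In today's information explosion there are many ways to get information, and people no longer spend most of their time figuring out where to find it; they spend it sifting through a flood of information for what interests them. This is the information overload problem, and recommender systems emerged to address it.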

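Collaborative filtering is one of the most widely used techniques in recommender systems. It collects each user's history, preferences, and similar information, computes the similarity to other users, and uses the ratings of similar users to predict how much the target user will like a particular item. Its strength is that it can recommend items the user has never browsed. Its weakness is the cold-start problem: a new user has no interaction records or stated preferences, so the model cannot find similar users or items.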
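To make the user-based idea concrete, here is a minimal sketch in plain Python. The toy ratings, the user names, and the helper names `cosine_similarity` and `predict` are illustrative assumptions, not part of this project. It predicts a rating as the similarity-weighted average of other users' ratings and returns `None` for a user with no history, which is the cold-start case.

```python
from math import sqrt

# Hypothetical toy ratings: user -> {item: rating}. Not taken from MovieLens.
ratings = {
    "alice": {"m1": 5, "m2": 3, "m3": 4},
    "bob":   {"m1": 4, "m2": 2, "m3": 5, "m4": 4},
    "carol": {"m1": 1, "m2": 5, "m4": 2},
}

def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def predict(user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    mine = ratings.get(user, {})  # a brand-new user has no ratings at all
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        sim = cosine_similarity(mine, theirs)
        num += sim * theirs[item]
        den += abs(sim)
    return num / den if den else None  # None: no similar user could be found

print(predict("alice", "m4"))     # a value between carol's 2 and bob's 4
print(predict("new_user", "m1"))  # cold start: nothing to compare against
```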

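A common way to mitigate cold start is to ask newly registered users to first pick the topics, groups, products, personality traits, favorite music genres, and so on that interest them, as Douban FM does: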

[Figure: CNN-based movie recommender system]

Downloading the dataset

Run the code below to download the dataset.

import pandas as pd

from sklearn.model_selection import train_test_split

import numpy as np

from collections import Counter

import tensorflow as tf

import os

import pickle

import re

from tensorflow.python.ops import math_ops

from urllib.request import urlretrieve

from os.path import isfile, isdir

from tqdm import tqdm

import zipfile

import hashlib

import shutil

def _unzip(save_path, _, database_name, data_path):

    """

    Unzip the archive.

    :param save_path: Path of the zip file

    :param database_name: Name of database

    :param data_path: Path to extract to

    :param _: HACK - Used to have the same interface as _ungzip

    """

    print('Extracting {}...'.format(database_name))

    with zipfile.ZipFile(save_path) as zf:

        zf.extractall(data_path)

def download_extract(database_name, data_path):

    """

    Download and extract the data.

    :param database_name: Database name

    :param data_path: Directory to download into and extract under

    """

    DATASET_ML1M = 'ml-1m'

    if database_name == DATASET_ML1M:

        url = 'http://files.grouplens.org/datasets/movielens/ml-1m.zip'

        hash_code = 'c4d9eecfca2ab87c1945afe126590906'

        extract_path = os.path.join(data_path, 'ml-1m')

        save_path = os.path.join(data_path, 'ml-1m.zip')

        extract_fn = _unzip

    if os.path.exists(extract_path):

        print('Found {} Data'.format(database_name))

        return

    if not os.path.exists(data_path):

        os.makedirs(data_path)

    if not os.path.exists(save_path):

        with DLProgress(unit='B', unit_scale=True, miniters=1, desc='Downloading {}'.format(database_name)) as pbar:

            urlretrieve(

                url,

                save_path,

                pbar.hook)

    with open(save_path, 'rb') as f:

        file_hash = hashlib.md5(f.read()).hexdigest()

    assert file_hash == hash_code, '{} file is corrupted.  Remove the file and try again.'.format(save_path)

    os.makedirs(extract_path)

    try:

        extract_fn(save_path, extract_path, database_name, data_path)

    except Exception as err:

        shutil.rmtree(extract_path)  # Remove extraction folder if there is an error

        raise err

    print('Done.')

    # Remove compressed data

#     os.remove(save_path)

class DLProgress(tqdm):

    """

    Progress bar shown while downloading

    """

    last_block = 0

    def hook(self, block_num=1, block_size=1, total_size=None):

        """

        A hook function that will be called once on establishment of the network connection and

        once after each block read thereafter.

        :param block_num: A count of blocks transferred so far

        :param block_size: Block size in bytes

        :param total_size: The total size of the file. This may be -1 on older FTP servers which do not return

                            a file size in response to a retrieval request.

        """

        self.total = total_size

        self.update((block_num - self.last_block) * block_size)

        self.last_block = block_num

data_dir = './'

download_extract('ml-1m', data_dir)

Extracting ml-1m...

Done.
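The checksum step in `download_extract` reads the whole archive into memory before hashing. A possible alternative, sketched here under the assumption that memory use matters, hashes the file in chunks. `md5_of_file` and `chunk_size` are hypothetical names, not part of the original code.

```python
import hashlib

def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b""
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

With the archive downloaded as above, `md5_of_file('./ml-1m.zip')` should equal the `hash_code` value checked in `download_extract`.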

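A first look at the data

This project uses the MovieLens 1M dataset, which contains about 1 million ratings from roughly 6,000 users on nearly 4,000 movies.

The dataset is split into three files:

User data: users.dat
Movie data: movies.dat
Rating data: ratings.dat

User data

User ID
Gender
Age
Occupation ID
Zip code

Record format: UserID::Gender::Age::Occupation::Zip-code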

Gender is denoted by a "M" for male and "F" for female
Age is chosen from the following ranges:
- 1: "Under 18"
- 18: "18-24"
- 25: "25-34"
- 35: "35-44"
- 45: "45-49"
- 50: "50-55"
- 56: "56+"
Occupation is chosen from the following choices:
- 0: "other" or not specified
- 1: "academic/educator"
- 2: "artist"
- 3: "clerical/admin"
- 4: "college/grad student"
- 5: "customer service"
- 6: "doctor/health care"
- 7: "executive/managerial"
- 8: "farmer"
- 9: "homemaker"
- 10: "K-12 student"
- 11: "lawyer"
- 12: "programmer"
- 13: "retired"
- 14: "sales/marketing"
- 15: "scientist"
- 16: "self-employed"
- 17: "technician/engineer"
- 18: "tradesman/craftsman"
- 19: "unemployed"
- 20: "writer"
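As an illustration of the record format, the sketch below parses one line of users.dat by hand and decodes its age code. `User`, `parse_user`, and `AGE_GROUPS` are names made up for this example; the project itself loads the file with pandas.

```python
from typing import NamedTuple

# Age codes as listed in the dataset description above
AGE_GROUPS = {1: "Under 18", 18: "18-24", 25: "25-34", 35: "35-44",
              45: "45-49", 50: "50-55", 56: "56+"}

class User(NamedTuple):
    user_id: int
    gender: str
    age_code: int
    occupation_id: int
    zip_code: str  # kept as text so leading zeros survive

def parse_user(line):
    """Split one '::'-separated users.dat record into typed fields."""
    user_id, gender, age, occupation, zip_code = line.rstrip("\n").split("::")
    return User(int(user_id), gender, int(age), int(occupation), zip_code)

user = parse_user("4::M::45::7::02460")
print(user, AGE_GROUPS[user.age_code])
```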

users_title = ['UserID', 'Gender', 'Age', 'OccupationID', 'Zip-code']

users = pd.read_table('./ml-1m/users.dat', sep='::', header=None, names=users_title, engine = 'python')

users.head()

	UserID	Gender	Age	OccupationID	Zip-code
0	1	F	1	10	48067
1	2	M	56	16	70072
2	3	M	25	15	55117
3	4	M	45	7	02460
4	5	M	25	20	55455

As you can see, UserID, Gender, Age, and Occupation are all categorical fields. The zip code field is not used.

Movie data

Movie ID
Title
Genres

Record format: MovieID::Title::Genres

Titles are identical to titles provided by the IMDB (including year of release)
Genres are pipe-separated and are selected from the following genres:
- Action
- Adventure
- Animation
- Children's
- Comedy
- Crime
- Documentary
- Drama
- Fantasy
- Film-Noir
- Horror
- Musical
- Mystery
- Romance
- Sci-Fi
- Thriller
- War
- Western

movies_title = ['MovieID', 'Title', 'Genres']

movies = pd.read_table('./ml-1m/movies.dat', sep='::', header=None, names=movies_title, engine = 'python')

movies.head()

	MovieID	Title	Genres
0	1	Toy Story (1995)	Animation|Children's|Comedy
1	2	Jumanji (1995)	Adventure|Children's|Fantasy
2	3	Grumpier Old Men (1995)	Comedy|Romance
3	4	Waiting to Exhale (1995)	Comedy|Drama
4	5	Father of the Bride Part II (1995)	Comedy

MovieID is a categorical field, Title is text, and Genres is also a categorical field.

Rating data

User ID
Movie ID
Rating
Timestamp

Record format: UserID::MovieID::Rating::Timestamp

UserIDs range between 1 and 6040
MovieIDs range between 1 and 3952
Ratings are made on a 5-star scale (whole-star ratings only)
Timestamp is represented in seconds since the epoch as returned by time(2)
Each user has at least 20 ratings
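A small sketch of reading one rating record, assuming the `::` separator shown above. `parse_rating` is a hypothetical helper; it also converts the epoch timestamp to a UTC datetime, although this project ignores that field.

```python
from datetime import datetime, timezone

def parse_rating(line):
    """Parse 'UserID::MovieID::Rating::Timestamp' into typed values."""
    user_id, movie_id, rating, ts = line.strip().split("::")
    rated_at = datetime.fromtimestamp(int(ts), tz=timezone.utc)
    return int(user_id), int(movie_id), int(rating), rated_at

row = parse_rating("1::1193::5::978300760")
print(row[3].isoformat())  # 2000-12-31T22:12:40+00:00
```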

ratings_title = ['UserID','MovieID', 'Rating', 'timestamps']

ratings = pd.read_table('./ml-1m/ratings.dat', sep='::', header=None, names=ratings_title, engine = 'python')

ratings.head()

	UserID	MovieID	Rating	timestamps
0	1	1193	5	978300760
1	1	661	3	978302109
2	1	914	3	978301968
3	1	3408	4	978300275
4	1	2355	5	978824291

The Rating field is the target we want to learn. The timestamp field is not used.

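Data preprocessing

UserID, Occupation, and MovieID are left unchanged.
Gender: 'F' and 'M' are converted to 0 and 1.
Age: the seven age codes are converted to consecutive integers 0 to 6.
Genres: a categorical field that must be converted to numbers. First build a dictionary from genre strings to integers, then convert each movie's Genres field to a list of integers, since some movies combine several genres.
Title: handled the same way as Genres. First build a dictionary from words to integers, then convert each title to a list of integers. The year in the title is also removed.
Genres and Title must be padded to a uniform length so the neural network can process them easily. Empty positions are filled with the integer that corresponds to '<PAD>'.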
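The steps above can be sketched in plain Python on two movies from the table shown earlier. This is a simplified illustration, not the project's `load_data`: the helper names are hypothetical, the vocabulary is sorted with '<PAD>' fixed at index 0, and the target lengths (4 and 5) are shortened for readability, whereas the real code pads titles to 15 and genres to 18.

```python
import re

PAD = "<PAD>"
YEAR = re.compile(r"^(.*)\((\d+)\)$")  # same pattern the project uses

def strip_year(title):
    """Drop a trailing '(1995)'-style year from a title."""
    m = YEAR.match(title)
    return m.group(1).strip() if m else title

def build_vocab(token_lists):
    """Map every token to an integer, reserving index 0 for padding."""
    tokens = sorted({tok for toks in token_lists for tok in toks})
    return {tok: i for i, tok in enumerate([PAD] + tokens)}

def encode(tokens, vocab, length):
    """Convert tokens to ids, then truncate or pad to a fixed length."""
    ids = [vocab[tok] for tok in tokens][:length]
    return ids + [vocab[PAD]] * (length - len(ids))

titles = ["Toy Story (1995)", "Jumanji (1995)"]
genres = ["Animation|Children's|Comedy", "Adventure|Children's|Fantasy"]

title_tokens = [strip_year(t).split() for t in titles]
genre_tokens = [g.split("|") for g in genres]
title_vocab = build_vocab(title_tokens)
genre_vocab = build_vocab(genre_tokens)

encoded_titles = [encode(t, title_vocab, 4) for t in title_tokens]
encoded_genres = [encode(g, genre_vocab, 5) for g in genre_tokens]
print(encoded_titles)  # [[3, 2, 0, 0], [1, 0, 0, 0]]
print(encoded_genres)  # [[2, 3, 4, 0, 0], [1, 3, 5, 0, 0]]
```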

Implementing the preprocessing

def load_data():

    """

    Load the dataset from files

    """

    # Read the user data

    users_title = ['UserID', 'Gender', 'Age', 'JobID', 'Zip-code']

    users = pd.read_table('./ml-1m/users.dat', sep='::', header=None, names=users_title, engine = 'python')

    users = users.filter(regex='UserID|Gender|Age|JobID')

    users_orig = users.values

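    # Map gender and age in the user data to integers (set order is arbitrary, so the age-to-index mapping is not sorted)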

    gender_map = {'F':0, 'M':1}

    users['Gender'] = users['Gender'].map(gender_map)

    age_map = {val:ii for ii,val in enumerate(set(users['Age']))}

    users['Age'] = users['Age'].map(age_map)

    # Read the movie data

    movies_title = ['MovieID', 'Title', 'Genres']

    movies = pd.read_table('./ml-1m/movies.dat', sep='::', header=None, names=movies_title, engine = 'python')

    movies_orig = movies.values

    # Strip the year from Title

    pattern = re.compile(r'^(.*)\((\d+)\)$')

    title_map = {val:pattern.match(val).group(1) for ii,val in enumerate(set(movies['Title']))}

    movies['Title'] = movies['Title'].map(title_map)

    # Build the genre-to-integer dictionary

    genres_set = set()

    for val in movies['Genres'].str.split('|'):

        genres_set.update(val)

    genres_set.add('<PAD>')

    genres2int = {val:ii for ii, val in enumerate(genres_set)}

    # Convert genres to equal-length integer lists of length 18

    genres_map = {val:[genres2int[row] for row in val.split('|')] for ii,val in enumerate(set(movies['Genres']))}

    for key in genres_map:

        for cnt in range(max(genres2int.values()) - len(genres_map[key])):

            genres_map[key].insert(len(genres_map[key]) + cnt,genres2int['<PAD>'])

    movies['Genres'] = movies['Genres'].map(genres_map)

    # Build the title-word-to-integer dictionary

    title_set = set()

    for val in movies['Title'].str.split():

        title_set.update(val)

    title_set.add('<PAD>')

    title2int = {val:ii for ii, val in enumerate(title_set)}

    # Convert titles to equal-length integer lists of length 15

    title_count = 15

    title_map = {val:[title2int[row] for row in val.split()] for ii,val in enumerate(set(movies['Title']))}

    for key in title_map:

        for cnt in range(title_count - len(title_map[key])):

            title_map[key].insert(len(title_map[key]) + cnt,title2int['<PAD>'])

    movies['Title'] = movies['Title'].map(title_map)

    # Read the rating data

    ratings_title = ['UserID','MovieID', 'ratings', 'timestamps']

    ratings = pd.read_table('./ml-1m/ratings.dat', sep='::', header=None, names=ratings_title, engine = 'python')

    ratings = ratings.filter(regex='UserID|MovieID|ratings')

    # Merge the three tables

    data = pd.merge(pd.merge(ratings, users), movies)

    # Split the data into X and y tables

    target_fields = ['ratings']

    features_pd, targets_pd = data.drop(target_fields, axis=1), data[target_fields]

    features = features_pd.values

    targets_values = targets_pd.values

    return title_count, title_set, genres2int, features, targets_values, ratings, users, movies, data, movies_orig, users_orig

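Loading the data and saving it locally

title_count: length of the Title field (15)
title_set: set of words in the titles
genres2int: dictionary from genre to integer
features: the input X
targets_values: the learning target y
ratings: Pandas object for the rating data
users: Pandas object for the user data
movies: Pandas object for the movie data
data: Pandas object combining the three datasets
movies_orig: the raw movie data without preprocessing
users_orig: the raw user data without preprocessing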

# Load the data

title_count, title_set, genres2int, features, targets_values, ratings, users, movies, data, movies_orig, users_orig = load_data()

# Save to a file

pickle.dump((title_count, title_set, genres2int, features, targets_values, ratings, users, movies, data, movies_orig, users_orig), open('preprocess.p', 'wb'))

The preprocessed data

users.head()

	UserID	Gender	Age	JobID
0	1	0	0	10
1	2	1	5	16
2	3	1	6	15
3	4	1	2	7
4	5	1	6	20

movies.head()

	MovieID	Title	Genres
0	1	[310, 2184, 634, 634, 634, 634, 634, 634, 634,...	[0, 18, 7, 17, 17, 17, 17, 17, 17, 17, 17, 17,...
1	2	[1182, 634, 634, 634, 634, 634, 634, 634, 634,...	[3, 18, 8, 17, 17, 17, 17, 17, 17, 17, 17, 17,...
2	3	[5011, 4744, 2629, 634, 634, 634, 634, 634, 63...	[7, 9, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,...
3	4	[4095, 1535, 1886, 634, 634, 634, 634, 634, 63...	[7, 5, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,...
4	5	[3563, 1725, 3790, 3727, 838, 343, 634, 634, 6...	[7, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17...

movies.values[0]

array([1,

       list([310, 2184, 634, 634, 634, 634, 634, 634, 634, 634, 634, 634, 634, 634, 634]),

       list([0, 18, 7, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17])],

      dtype=object)

Reading the data back from disk

title_count, title_set, genres2int, features, targets_values, ratings, users, movies, data, movies_orig, users_orig = pickle.load(open('preprocess.p', mode='rb'))

Model design

[Figure: CNN-based movie recommender system]

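Looking at the field types in the dataset, we find that some are categorical. The usual treatment is one-hot encoding, but fields like UserID and MovieID would become extremely sparse and the input dimension would balloon. We want to avoid that; after all, my little laptop is no match for the big companies that routinely handle inputs with hundreds of millions of dimensions :)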

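So during preprocessing these fields were converted to integers, which we use as indices into embedding matrices. The first layer of the network is an embedding layer with matrices of shape (N, 32) and (N, 16).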
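Why an index lookup can replace one-hot encoding: multiplying a one-hot row vector by a weight matrix just selects one row of that matrix. The sketch below checks this with toy sizes; `one_hot`, `matvec`, and the random table are illustrative assumptions.

```python
import random

random.seed(0)
num_ids, embed_dim = 6, 4  # toy sizes; the article uses (N, 32) and (N, 16)
embedding = [[random.uniform(-1, 1) for _ in range(embed_dim)]
             for _ in range(num_ids)]

def one_hot(index, size):
    """Vector of zeros with a single 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def matvec(vec, matrix):
    """Row vector times matrix."""
    return [sum(v * row[j] for v, row in zip(vec, matrix))
            for j in range(len(matrix[0]))]

user_id = 3
via_one_hot = matvec(one_hot(user_id, num_ids), embedding)  # O(N * dim) work
via_lookup = embedding[user_id]                             # O(dim) work

# Both routes give the same embedding vector
print(all(abs(a - b) < 1e-12 for a, b in zip(via_one_hot, via_lookup)))
```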

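Movie genres need one extra step. A movie can have several genres, so the embedding lookup returns an (n, 32) matrix, one row per genre. We sum this matrix over its rows to get a (1, 32) vector.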
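A minimal sketch of the combine step, using a hand-written 4-dimensional table instead of the learned (19, 32) matrix. The `combine` helper and its `mean` branch are assumptions for illustration; the project only implements the sum. Note also that row 3 is set to zeros here so padding has no effect, whereas in the project the '<PAD>' id has its own trainable row that is included in the sum.

```python
# Hypothetical genre embedding table, one row per genre id
genre_embeddings = [
    [0.1, 0.2, 0.3, 0.4],   # id 0
    [1.0, 0.0, -1.0, 0.5],  # id 1
    [0.0, 0.5, 0.5, 0.0],   # id 2
    [0.0, 0.0, 0.0, 0.0],   # id 3, used as padding in this sketch
]

def combine(ids, table, combiner="sum"):
    """Look up one row per id, then reduce the rows to a single vector."""
    rows = [table[i] for i in ids]             # shape (n, embed_dim)
    summed = [sum(col) for col in zip(*rows)]  # shape (embed_dim,)
    if combiner == "mean":
        return [s / len(rows) for s in summed]
    return summed

# A movie with genres 0 and 1, padded to length 4 with id 3
movie_vector = combine([0, 1, 3, 3], genre_embeddings)
print(movie_vector)
```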

Movie titles are handled differently: instead of a recurrent neural network, a text convolutional network is used, as explained below.

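After the features are looked up from the embedding layers, each is passed through a fully connected layer, and the outputs are passed through a second fully connected layer, yielding two (1, 200) feature vectors: the user feature and the movie feature.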

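Our goal is to train the user features and movie features, which are used later to make recommendations. Once we have these two features, we can fit the ratings in any way we like. I tried two approaches. The first, shown in the figure above, takes the dot product of the two feature vectors and regresses the result onto the true rating with an MSE loss, since this is essentially a regression problem. The second feeds both features into another fully connected layer that outputs a single value, which is likewise regressed onto the true rating with an MSE loss.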

In practice, after 5 epochs the MSE loss is around 0.8 with the second approach and around 1 with the first.
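The first approach amounts to a per-example dot product followed by a mean squared error. Here is a sketch with 3-dimensional toy features in place of the 200-dimensional ones; the numbers and the helper names `dot` and `mse` are made up for illustration.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def mse(targets, predictions):
    """Mean squared error over a batch."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)

# One row per (user, movie) pair in the batch
user_features = [[0.5, 1.0, -0.5], [1.0, 1.0, 1.0]]
movie_features = [[2.0, 3.0, 0.0], [1.0, 0.5, 0.5]]
targets = [5, 3]  # true ratings

predictions = [dot(u, m) for u, m in zip(user_features, movie_features)]
print(predictions)                # [4.0, 2.0]
print(mse(targets, predictions))  # 1.0
```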

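Text convolutional network

The network looks like this:

[Figure: text convolutional network architecture]

The figure is from Yoon Kim's paper: Convolutional Neural Networks for Sentence Classification

For an article on applying convolutional neural networks to text, I recommend reading Understanding Convolutional Neural Networks for NLP.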

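The first layer is a word embedding layer: an embedding matrix made up of the embedding vector of each word. The next layer convolves over the embedding matrix with several kernels of different sizes (window sizes), where the window size is the number of words each convolution covers. This differs from image convolution, which typically uses 2x2, 3x3, or 5x5 kernels. A text kernel must span the full embedding vector of each word, so its size is (number of words, embedding dimension), covering for example 3, 4, or 5 words at a time. The third layer is max pooling, whose outputs are concatenated into one long vector. Finally, dropout is applied for regularization, giving the feature of the movie title.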
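The convolution and pooling steps can be sketched without any framework. The example below is an illustration under simplifying assumptions: a 5-word sentence with 2-dimensional embeddings, one hand-picked kernel per window size, and hypothetical helper names. Each kernel spans the full embedding width, slides over the words, and max pooling keeps one value per kernel.

```python
def conv_over_words(sentence, kernel, bias=0.0):
    """Slide a (window, embed_dim) kernel over a (words, embed_dim) matrix."""
    window = len(kernel)
    outputs = []
    for start in range(len(sentence) - window + 1):  # 'VALID' padding
        total = bias
        for k_row, w_row in zip(kernel, sentence[start:start + window]):
            total += sum(k * w for k, w in zip(k_row, w_row))
        outputs.append(max(total, 0.0))  # ReLU
    return outputs

def max_over_time(feature_map):
    """Keep the strongest response of a kernel anywhere in the sentence."""
    return max(feature_map)

# 5 words, embedding dimension 2 (toy values)
sentence = [[1, 0], [0, 1], [1, 1], [0, 0], [2, 1]]
kernel_2 = [[1, 0], [0, 1]]          # covers 2 words at a time
kernel_3 = [[1, 0], [1, 0], [1, 0]]  # covers 3 words at a time

fm2 = conv_over_words(sentence, kernel_2)  # length 5 - 2 + 1 = 4
fm3 = conv_over_words(sentence, kernel_3)  # length 5 - 3 + 1 = 3
title_feature = [max_over_time(fm2), max_over_time(fm3)]
print(fm2, fm3, title_feature)
```

In the project, each window size has 8 kernels rather than one, so the pooled vector has one entry per (window size, kernel) pair.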

Helper functions

import tensorflow as tf

import os

import pickle

def save_params(params):

    """

    Save the parameters to a file

    """

    pickle.dump(params, open('params.p', 'wb'))

def load_params():

    """

    Load the parameters from a file

    """

    return pickle.load(open('params.p', mode='rb'))

Implementation

# Embedding dimension

embed_dim = 32

# Number of user IDs

uid_max = max(features.take(0,1)) + 1 # 6040 + 1 = 6041

# Number of genders

gender_max = max(features.take(2,1)) + 1 # 1 + 1 = 2

# Number of age categories

age_max = max(features.take(3,1)) + 1 # 6 + 1 = 7

# Number of occupations

job_max = max(features.take(4,1)) + 1 # 20 + 1 = 21

# Number of movie IDs

movie_id_max = max(features.take(1,1)) + 1 # 3952 + 1 = 3953

# Number of movie genres

movie_categories_max = max(genres2int.values()) + 1 # 18 + 1 = 19

# Number of words in movie titles

movie_title_max = len(title_set) # 5216

# How to combine the genre embedding vectors; "mean" was considered but is not implemented

combiner = "sum"

# Movie title length

sentences_size = title_count # = 15

# Sliding windows for the text convolution: 2, 3, 4, and 5 words

window_sizes = {2, 3, 4, 5}

# Number of text convolution kernels

filter_num = 8

# Dictionary from movie ID to row index; movie IDs do not match row indices in the dataset, e.g. the movie ID in row 5 is not necessarily 5

movieid2idx = {val[0]:i for i, val in enumerate(movies.values)}

Hyperparameters

# Number of Epochs

num_epochs = 5

# Batch Size

batch_size = 256

dropout_keep = 0.5

# Learning Rate

learning_rate = 0.0001

# Show stats for every n number of batches

show_every_n_batches = 20

save_dir = './save'

Inputs

Define the input placeholders

def get_inputs():

    uid = tf.placeholder(tf.int32, [None, 1], name="uid")

    user_gender = tf.placeholder(tf.int32, [None, 1], name="user_gender")

    user_age = tf.placeholder(tf.int32, [None, 1], name="user_age")

    user_job = tf.placeholder(tf.int32, [None, 1], name="user_job")

    movie_id = tf.placeholder(tf.int32, [None, 1], name="movie_id")

    movie_categories = tf.placeholder(tf.int32, [None, 18], name="movie_categories")

    movie_titles = tf.placeholder(tf.int32, [None, 15], name="movie_titles")

    targets = tf.placeholder(tf.int32, [None, 1], name="targets")

    LearningRate = tf.placeholder(tf.float32, name = "LearningRate")

    dropout_keep_prob = tf.placeholder(tf.float32, name = "dropout_keep_prob")

    return uid, user_gender, user_age, user_job, movie_id, movie_categories, movie_titles, targets, LearningRate, dropout_keep_prob

Building the neural network

Define the user embedding matrices

def get_user_embedding(uid, user_gender, user_age, user_job):

    with tf.name_scope("user_embedding"):

        uid_embed_matrix = tf.Variable(tf.random_uniform([uid_max, embed_dim], -1, 1), name = "uid_embed_matrix")

        uid_embed_layer = tf.nn.embedding_lookup(uid_embed_matrix, uid, name = "uid_embed_layer")

        gender_embed_matrix = tf.Variable(tf.random_uniform([gender_max, embed_dim // 2], -1, 1), name= "gender_embed_matrix")

        gender_embed_layer = tf.nn.embedding_lookup(gender_embed_matrix, user_gender, name = "gender_embed_layer")

        age_embed_matrix = tf.Variable(tf.random_uniform([age_max, embed_dim // 2], -1, 1), name="age_embed_matrix")

        age_embed_layer = tf.nn.embedding_lookup(age_embed_matrix, user_age, name="age_embed_layer")

        job_embed_matrix = tf.Variable(tf.random_uniform([job_max, embed_dim // 2], -1, 1), name = "job_embed_matrix")

        job_embed_layer = tf.nn.embedding_lookup(job_embed_matrix, user_job, name = "job_embed_layer")

    return uid_embed_layer, gender_embed_layer, age_embed_layer, job_embed_layer

Pass the user embeddings through fully connected layers to produce the user feature

def get_user_feature_layer(uid_embed_layer, gender_embed_layer, age_embed_layer, job_embed_layer):

    with tf.name_scope("user_fc"):

        # First fully connected layer

        uid_fc_layer = tf.layers.dense(uid_embed_layer, embed_dim, name = "uid_fc_layer", activation=tf.nn.relu)

        gender_fc_layer = tf.layers.dense(gender_embed_layer, embed_dim, name = "gender_fc_layer", activation=tf.nn.relu)

        age_fc_layer = tf.layers.dense(age_embed_layer, embed_dim, name ="age_fc_layer", activation=tf.nn.relu)

        job_fc_layer = tf.layers.dense(job_embed_layer, embed_dim, name = "job_fc_layer", activation=tf.nn.relu)

        # Second fully connected layer

        user_combine_layer = tf.concat([uid_fc_layer, gender_fc_layer, age_fc_layer, job_fc_layer], 2)  #(?, 1, 128)

        user_combine_layer = tf.contrib.layers.fully_connected(user_combine_layer, 200, tf.tanh)  #(?, 1, 200)

        user_combine_layer_flat = tf.reshape(user_combine_layer, [-1, 200])

    return user_combine_layer, user_combine_layer_flat

Define the movie ID embedding matrix

def get_movie_id_embed_layer(movie_id):

    with tf.name_scope("movie_embedding"):

        movie_id_embed_matrix = tf.Variable(tf.random_uniform([movie_id_max, embed_dim], -1, 1), name = "movie_id_embed_matrix")

        movie_id_embed_layer = tf.nn.embedding_lookup(movie_id_embed_matrix, movie_id, name = "movie_id_embed_layer")

    return movie_id_embed_layer

Sum the embedding vectors of a movie's genres

def get_movie_categories_layers(movie_categories):

    with tf.name_scope("movie_categories_layers"):

        movie_categories_embed_matrix = tf.Variable(tf.random_uniform([movie_categories_max, embed_dim], -1, 1), name = "movie_categories_embed_matrix")

        movie_categories_embed_layer = tf.nn.embedding_lookup(movie_categories_embed_matrix, movie_categories, name = "movie_categories_embed_layer")

        if combiner == "sum":

            movie_categories_embed_layer = tf.reduce_sum(movie_categories_embed_layer, axis=1, keep_dims=True)

    #     elif combiner == "mean":

    return movie_categories_embed_layer

Text convolutional network for the movie title

def get_movie_cnn_layer(movie_titles):

    # Look up the embedding vector of each word in the movie title

    with tf.name_scope("movie_embedding"):

        movie_title_embed_matrix = tf.Variable(tf.random_uniform([movie_title_max, embed_dim], -1, 1), name = "movie_title_embed_matrix")

        movie_title_embed_layer = tf.nn.embedding_lookup(movie_title_embed_matrix, movie_titles, name = "movie_title_embed_layer")

        movie_title_embed_layer_expand = tf.expand_dims(movie_title_embed_layer, -1)

    # Convolve the text embeddings with kernels of different sizes, then max-pool

    pool_layer_lst = []

    for window_size in window_sizes:

        with tf.name_scope("movie_txt_conv_maxpool_{}".format(window_size)):

            filter_weights = tf.Variable(tf.truncated_normal([window_size, embed_dim, 1, filter_num],stddev=0.1),name = "filter_weights")

            filter_bias = tf.Variable(tf.constant(0.1, shape=[filter_num]), name="filter_bias")

            conv_layer = tf.nn.conv2d(movie_title_embed_layer_expand, filter_weights, [1,1,1,1], padding="VALID", name="conv_layer")

            relu_layer = tf.nn.relu(tf.nn.bias_add(conv_layer,filter_bias), name ="relu_layer")

            maxpool_layer = tf.nn.max_pool(relu_layer, [1,sentences_size - window_size + 1 ,1,1], [1,1,1,1], padding="VALID", name="maxpool_layer")

            pool_layer_lst.append(maxpool_layer)

    # Dropout layer

    with tf.name_scope("pool_dropout"):

        pool_layer = tf.concat(pool_layer_lst, 3, name ="pool_layer")

        max_num = len(window_sizes) * filter_num

        pool_layer_flat = tf.reshape(pool_layer , [-1, 1, max_num], name = "pool_layer_flat")

        dropout_layer = tf.nn.dropout(pool_layer_flat, dropout_keep_prob, name = "dropout_layer")

    return pool_layer_flat, dropout_layer
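As a sanity check on the tensor shapes in `get_movie_cnn_layer`, the arithmetic for 'VALID' convolution and pooling can be worked through by hand. `valid_output_shape` is a hypothetical helper, and the constants are copied from the settings above.

```python
sentences_size = 15   # words per title
embed_dim = 32        # embedding width
filter_num = 8        # kernels per window size
window_sizes = [2, 3, 4, 5]

def valid_output_shape(height, width, k_h, k_w, channels):
    """Output (height, width, channels) of a stride-1 'VALID' window op."""
    return (height - k_h + 1, width - k_w + 1, channels)

conv_shapes, pooled_shapes = {}, {}
for w in window_sizes:
    # The kernel covers w words and the full embedding width
    conv = valid_output_shape(sentences_size, embed_dim, w, embed_dim, filter_num)
    # Max pooling spans the whole remaining height, leaving one value per kernel
    pooled = valid_output_shape(conv[0], conv[1], conv[0], 1, filter_num)
    conv_shapes[w], pooled_shapes[w] = conv, pooled
    print(w, conv, pooled)

# Concatenating the pooled outputs along the channel axis
title_feature_size = sum(shape[2] for shape in pooled_shapes.values())
print(title_feature_size)  # 32
```

The result, 32, matches `max_num = len(window_sizes) * filter_num` in the function. Because `window_sizes` is a set in the original code, the order in which the pooled outputs are concatenated follows set iteration order; a list would make that order explicit.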

Pass the movie layers through fully connected layers together

def get_movie_feature_layer(movie_id_embed_layer, movie_categories_embed_layer, dropout_layer):

    with tf.name_scope("movie_fc"):

        # First fully connected layer

        movie_id_fc_layer = tf.layers.dense(movie_id_embed_layer, embed_dim, name = "movie_id_fc_layer", activation=tf.nn.relu)

        movie_categories_fc_layer = tf.layers.dense(movie_categories_embed_layer, embed_dim, name = "movie_categories_fc_layer", activation=tf.nn.relu)

        # Second fully connected layer

        movie_combine_layer = tf.concat([movie_id_fc_layer, movie_categories_fc_layer, dropout_layer], 2)  #(?, 1, 96)

        movie_combine_layer = tf.contrib.layers.fully_connected(movie_combine_layer, 200, tf.tanh)  #(?, 1, 200)

        movie_combine_layer_flat = tf.reshape(movie_combine_layer, [-1, 200])

    return movie_combine_layer, movie_combine_layer_flat

Building the computation graph

tf.reset_default_graph()

train_graph = tf.Graph()

with train_graph.as_default():

    # Get the input placeholders

    uid, user_gender, user_age, user_job, movie_id, movie_categories, movie_titles, targets, lr, dropout_keep_prob = get_inputs()

    # Get the four user embedding vectors

    uid_embed_layer, gender_embed_layer, age_embed_layer, job_embed_layer = get_user_embedding(uid, user_gender, user_age, user_job)

    # Get the user feature

    user_combine_layer, user_combine_layer_flat = get_user_feature_layer(uid_embed_layer, gender_embed_layer, age_embed_layer, job_embed_layer)

    # Get the movie ID embedding vector

    movie_id_embed_layer = get_movie_id_embed_layer(movie_id)

    # Get the movie genre embedding vector

    movie_categories_embed_layer = get_movie_categories_layers(movie_categories)

    # Get the movie title feature vector

    pool_layer_flat, dropout_layer = get_movie_cnn_layer(movie_titles)

    # Get the movie feature

    movie_combine_layer, movie_combine_layer_flat = get_movie_feature_layer(movie_id_embed_layer,

                                                                            movie_categories_embed_layer,

                                                                            dropout_layer)

    # Compute the rating. Note that the two approaches give the inference tensor different names; the tensor is later fetched by name when making recommendations

    with tf.name_scope("inference"):

        # Approach: feed the user and movie features into a fully connected layer that outputs a single value

#         inference_layer = tf.concat([user_combine_layer_flat, movie_combine_layer_flat], 1)  #(?, 400)

#         inference = tf.layers.dense(inference_layer, 1,

#                                     kernel_initializer=tf.truncated_normal_initializer(stddev=0.01),

#                                     kernel_regularizer=tf.nn.l2_loss, name="inference")

        # Approach: simply take the dot product of the user and movie features to get a predicted rating

#        inference = tf.matmul(user_combine_layer_flat, tf.transpose(movie_combine_layer_flat))

        inference = tf.reduce_sum(user_combine_layer_flat * movie_combine_layer_flat, axis=1)

        inference = tf.expand_dims(inference, axis=1)

    with tf.name_scope("loss"):

        # MSE loss: regress the computed value onto the rating

        cost = tf.losses.mean_squared_error(targets, inference )

        loss = tf.reduce_mean(cost)

    # Optimize the loss

#     train_op = tf.train.AdamOptimizer(lr).minimize(loss)  #cost

    global_step = tf.Variable(0, name="global_step", trainable=False)

    optimizer = tf.train.AdamOptimizer(lr)

    gradients = optimizer.compute_gradients(loss)  #cost

    train_op = optimizer.apply_gradients(gradients, global_step=global_step)

WARNING:tensorflow:From <ipython-input-20-559a1ee9ce9e>:6: calling reduce_sum (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.

Instructions for updating:

keep_dims is deprecated, use keepdims instead

inference

<tf.Tensor 'inference/ExpandDims:0' shape=(?, 1) dtype=float32>

Getting batches

def get_batches(Xs, ys, batch_size):

    for start in range(0, len(Xs), batch_size):

        end = min(start + batch_size, len(Xs))

        yield Xs[start:end], ys[start:end]
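A quick usage sketch of `get_batches` on toy lists, to show that the generator yields a shorter final batch. The function body is copied from above; the toy data is an assumption.

```python
def get_batches(Xs, ys, batch_size):
    for start in range(0, len(Xs), batch_size):
        end = min(start + batch_size, len(Xs))
        yield Xs[start:end], ys[start:end]

Xs = list(range(10))
ys = [x * 2 for x in Xs]

batches = list(get_batches(Xs, ys, 4))
sizes = [len(x) for x, _ in batches]
full_batches = len(Xs) // 4  # number of complete batches

print(sizes)         # [4, 4, 2]
print(full_batches)  # 2
```

The training loop below runs `len(train_X) // batch_size` iterations, so that shorter final batch is never consumed, which is why the fixed `[batch_size, 1]` reshapes work.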

Training the network

%matplotlib inline

%config InlineBackend.figure_format = 'retina'

import matplotlib.pyplot as plt

import time

import datetime

losses = {'train':[], 'test':[]}

with tf.Session(graph=train_graph) as sess:

    # Collect data for TensorBoard

    # Keep track of gradient values and sparsity

    grad_summaries = []

    for g, v in gradients:

        if g is not None:

            grad_hist_summary = tf.summary.histogram("{}/grad/hist".format(v.name.replace(':', '_')), g)

            sparsity_summary = tf.summary.scalar("{}/grad/sparsity".format(v.name.replace(':', '_')), tf.nn.zero_fraction(g))

            grad_summaries.append(grad_hist_summary)

            grad_summaries.append(sparsity_summary)

    grad_summaries_merged = tf.summary.merge(grad_summaries)

    # Output directory for models and summaries

    timestamp = str(int(time.time()))

    out_dir = os.path.abspath(os.path.join(os.path.curdir, "runs", timestamp))

    print("Writing to {}\n".format(out_dir))

    # Summaries for loss and accuracy

    loss_summary = tf.summary.scalar("loss", loss)

    # Train Summaries

    train_summary_op = tf.summary.merge([loss_summary, grad_summaries_merged])

    train_summary_dir = os.path.join(out_dir, "summaries", "train")

    train_summary_writer = tf.summary.FileWriter(train_summary_dir, sess.graph)

    # Inference summaries

    inference_summary_op = tf.summary.merge([loss_summary])

    inference_summary_dir = os.path.join(out_dir, "summaries", "inference")

    inference_summary_writer = tf.summary.FileWriter(inference_summary_dir, sess.graph)

    sess.run(tf.global_variables_initializer())

    saver = tf.train.Saver()

    for epoch_i in range(num_epochs):

        # Split the dataset into training and test sets; random_state = 0 fixes the seed, so every epoch uses the same split

        train_X,test_X, train_y, test_y = train_test_split(features,

                                                           targets_values,

                                                           test_size = 0.2,

                                                           random_state = 0)  

        train_batches = get_batches(train_X, train_y, batch_size)

        test_batches = get_batches(test_X, test_y, batch_size)

        # Training iterations; record the training loss

        for batch_i in range(len(train_X) // batch_size):

            x, y = next(train_batches)

            categories = np.zeros([batch_size, 18])

            for i in range(batch_size):

                categories[i] = x.take(6,1)[i]

            titles = np.zeros([batch_size, sentences_size])

            for i in range(batch_size):

                titles[i] = x.take(5,1)[i]

            feed = {

                uid: np.reshape(x.take(0,1), [batch_size, 1]),

                user_gender: np.reshape(x.take(2,1), [batch_size, 1]),

                user_age: np.reshape(x.take(3,1), [batch_size, 1]),

                user_job: np.reshape(x.take(4,1), [batch_size, 1]),

                movie_id: np.reshape(x.take(1,1), [batch_size, 1]),

                movie_categories: categories,  #x.take(6,1)

                movie_titles: titles,  #x.take(5,1)

                targets: np.reshape(y, [batch_size, 1]),

                dropout_keep_prob: dropout_keep, #dropout_keep

                lr: learning_rate}

            step, train_loss, summaries, _ = sess.run([global_step, loss, train_summary_op, train_op], feed)  #cost

            losses['train'].append(train_loss)

            train_summary_writer.add_summary(summaries, step)  #

            # Show every <show_every_n_batches> batches

            if (epoch_i * (len(train_X) // batch_size) + batch_i) % show_every_n_batches == 0:

                time_str = datetime.datetime.now().isoformat()

                print('{}: Epoch {:>3} Batch {:>4}/{}   train_loss = {:.3f}'.format(

                    time_str,

                    epoch_i,

                    batch_i,

                    (len(train_X) // batch_size),

                    train_loss))

        # Iterations over the test data

        for batch_i  in range(len(test_X) // batch_size):

            x, y = next(test_batches)

            categories = np.zeros([batch_size, 18])

            for i in range(batch_size):

                categories[i] = x.take(6,1)[i]

            titles = np.zeros([batch_size, sentences_size])

            for i in range(batch_size):

                titles[i] = x.take(5,1)[i]

            feed = {

                uid: np.reshape(x.take(0,1), [batch_size, 1]),

                user_gender: np.reshape(x.take(2,1), [batch_size, 1]),

                user_age: np.reshape(x.take(3,1), [batch_size, 1]),

                user_job: np.reshape(x.take(4,1), [batch_size, 1]),

                movie_id: np.reshape(x.take(1,1), [batch_size, 1]),

                movie_categories: categories,  #x.take(6,1)

                movie_titles: titles,  #x.take(5,1)

                targets: np.reshape(y, [batch_size, 1]),

                dropout_keep_prob: 1,

                lr: learning_rate}

            step, test_loss, summaries = sess.run([global_step, loss, inference_summary_op], feed)  #cost

            # Record the test loss

            losses['test'].append(test_loss)

            inference_summary_writer.add_summary(summaries, step)  #

            time_str = datetime.datetime.now().isoformat()

            if (epoch_i * (len(test_X) // batch_size) + batch_i) % show_every_n_batches == 0:

                print('{}: Epoch {:>3} Batch {:>4}/{}   test_loss = {:.3f}'.format(

                    time_str,

                    epoch_i,

                    batch_i,

                    (len(test_X) // batch_size),

                    test_loss))

    # Save Model

    saver.save(sess, save_dir)  #, global_step=epoch_i

    print('Model Trained and Saved')

Writing to F:\jupyter\work\movie_recommender-master\runs\1554780412

2019-04-09T11:26:53.633627: Epoch   0 Batch    0/3125   train_loss = 8.810

2019-04-09T11:26:54.052240: Epoch   0 Batch   20/3125   train_loss = 3.457

2019-04-09T11:26:54.466181: Epoch   0 Batch   40/3125   train_loss = 2.563

2019-04-09T11:26:54.890814: Epoch   0 Batch   60/3125   train_loss = 1.962

2019-04-09T11:26:55.315803: Epoch   0 Batch   80/3125   train_loss = 1.852

2019-04-09T11:26:55.730125: Epoch   0 Batch  100/3125   train_loss = 1.826

2019-04-09T11:26:56.146734: Epoch   0 Batch  120/3125   train_loss = 1.781

2019-04-09T11:26:56.559145: Epoch   0 Batch  140/3125   train_loss = 1.630

2019-04-09T11:26:56.971689: Epoch   0 Batch  160/3125   train_loss = 1.652

2019-04-09T11:26:57.394125: Epoch   0 Batch  180/3125   train_loss = 1.361

2019-04-09T11:26:57.810824: Epoch   0 Batch  200/3125   train_loss = 1.715

2019-04-09T11:26:58.227455: Epoch   0 Batch  220/3125   train_loss = 1.430

2019-04-09T11:26:58.643714: Epoch   0 Batch  240/3125   train_loss = 1.342

2019-04-09T11:26:59.056816: Epoch   0 Batch  260/3125   train_loss = 1.512

2019-04-09T11:26:59.468409: Epoch   0 Batch  280/3125   train_loss = 1.678

2019-04-09T11:26:59.882126: Epoch   0 Batch  300/3125   train_loss = 1.482

2019-04-09T11:27:00.294685: Epoch   0 Batch  320/3125   train_loss = 1.463

2019-04-09T11:27:00.826546: Epoch   0 Batch  340/3125   train_loss = 1.333

2019-04-09T11:27:01.239302: Epoch   0 Batch  360/3125   train_loss = 1.318

2019-04-09T11:27:01.652219: Epoch   0 Batch  380/3125   train_loss = 1.253

2019-04-09T11:27:02.067588: Epoch   0 Batch  400/3125   train_loss = 1.155

2019-04-09T11:27:02.483490: Epoch   0 Batch  420/3125   train_loss = 1.341

2019-04-09T11:27:02.892079: Epoch   0 Batch  440/3125   train_loss = 1.429

2019-04-09T11:27:03.305331: Epoch   0 Batch  460/3125   train_loss = 1.315

2019-04-09T11:27:03.721028: Epoch   0 Batch  480/3125   train_loss = 1.351

2019-04-09T11:27:04.130622: Epoch   0 Batch  500/3125   train_loss = 1.043

2019-04-09T11:27:04.549775: Epoch   0 Batch  520/3125   train_loss = 1.340

2019-04-09T11:27:04.963936: Epoch   0 Batch  540/3125   train_loss = 1.258

2019-04-09T11:27:05.378772: Epoch   0 Batch  560/3125   train_loss = 1.474

2019-04-09T11:27:05.790245: Epoch   0 Batch  580/3125   train_loss = 1.399

2019-04-09T11:27:06.202342: Epoch   0 Batch  600/3125   train_loss = 1.374

2019-04-09T11:27:06.616239: Epoch   0 Batch  620/3125   train_loss = 1.429

2019-04-09T11:27:07.027259: Epoch   0 Batch  640/3125   train_loss = 1.346

2019-04-09T11:27:07.443480: Epoch   0 Batch  660/3125   train_loss = 1.377

2019-04-09T11:27:07.857450: Epoch   0 Batch  680/3125   train_loss = 1.191

2019-04-09T11:27:08.269326: Epoch   0 Batch  700/3125   train_loss = 1.302

2019-04-09T11:27:08.685203: Epoch   0 Batch  720/3125   train_loss = 1.171

2019-04-09T11:27:09.098769: Epoch   0 Batch  740/3125   train_loss = 1.403

2019-04-09T11:27:09.519383: Epoch   0 Batch  760/3125   train_loss = 1.369

2019-04-09T11:27:09.931100: Epoch   0 Batch  780/3125   train_loss = 1.402

2019-04-09T11:27:10.343018: Epoch   0 Batch  800/3125   train_loss = 1.250

2019-04-09T11:27:10.755994: Epoch   0 Batch  820/3125   train_loss = 1.292

2019-04-09T11:27:11.169596: Epoch   0 Batch  840/3125   train_loss = 1.215

2019-04-09T11:27:11.583017: Epoch   0 Batch  860/3125   train_loss = 1.201

2019-04-09T11:27:11.997121: Epoch   0 Batch  880/3125   train_loss = 1.189

2019-04-09T11:27:12.411392: Epoch   0 Batch  900/3125   train_loss = 1.240

2019-04-09T11:27:12.824492: Epoch   0 Batch  920/3125   train_loss = 1.220

2019-04-09T11:27:13.238173: Epoch   0 Batch  940/3125   train_loss = 1.414

2019-04-09T11:27:13.649014: Epoch   0 Batch  960/3125   train_loss = 1.332

2019-04-09T11:27:14.058947: Epoch   0 Batch  980/3125   train_loss = 1.345

2019-04-09T11:27:14.491861: Epoch   0 Batch 1000/3125   train_loss = 1.275

2019-04-09T11:27:14.920000: Epoch   0 Batch 1020/3125   train_loss = 1.341

2019-04-09T11:27:15.337096: Epoch   0 Batch 1040/3125   train_loss = 1.281

2019-04-09T11:27:15.760618: Epoch   0 Batch 1060/3125   train_loss = 1.478

2019-04-09T11:27:16.174406: Epoch   0 Batch 1080/3125   train_loss = 1.158

2019-04-09T11:27:16.591839: Epoch   0 Batch 1100/3125   train_loss = 1.268

2019-04-09T11:27:17.013498: Epoch   0 Batch 1120/3125   train_loss = 1.270

2019-04-09T11:27:17.438626: Epoch   0 Batch 1140/3125   train_loss = 1.280

2019-04-09T11:27:17.852226: Epoch   0 Batch 1160/3125   train_loss = 1.205

2019-04-09T11:27:18.273478: Epoch   0 Batch 1180/3125   train_loss = 1.274

2019-04-09T11:27:18.696339: Epoch   0 Batch 1200/3125   train_loss = 1.284

2019-04-09T11:27:19.117179: Epoch   0 Batch 1220/3125   train_loss = 1.155

2019-04-09T11:27:19.524543: Epoch   0 Batch 1240/3125   train_loss = 1.143

2019-04-09T11:27:19.938738: Epoch   0 Batch 1260/3125   train_loss = 1.247

2019-04-09T11:27:20.350656: Epoch   0 Batch 1280/3125   train_loss = 1.223

2019-04-09T11:27:20.761388: Epoch   0 Batch 1300/3125   train_loss = 1.267

2019-04-09T11:27:21.177496: Epoch   0 Batch 1320/3125   train_loss = 1.183

2019-04-09T11:27:21.590091: Epoch   0 Batch 1340/3125   train_loss = 1.047

2019-04-09T11:27:22.004788: Epoch   0 Batch 1360/3125   train_loss = 1.149

2019-04-09T11:27:22.414416: Epoch   0 Batch 1380/3125   train_loss = 1.114

2019-04-09T11:27:22.827015: Epoch   0 Batch 1400/3125   train_loss = 1.282

2019-04-09T11:27:23.236719: Epoch   0 Batch 1420/3125   train_loss = 1.256

2019-04-09T11:27:23.645758: Epoch   0 Batch 1440/3125   train_loss = 1.174

2019-04-09T11:27:24.063386: Epoch   0 Batch 1460/3125   train_loss = 1.251

2019-04-09T11:27:24.477184: Epoch   0 Batch 1480/3125   train_loss = 1.180

2019-04-09T11:27:24.890286: Epoch   0 Batch 1500/3125   train_loss = 1.322

2019-04-09T11:27:25.300422: Epoch   0 Batch 1520/3125   train_loss = 1.277

2019-04-09T11:27:25.709640: Epoch   0 Batch 1540/3125   train_loss = 1.270

2019-04-09T11:27:26.122241: Epoch   0 Batch 1560/3125   train_loss = 1.122

2019-04-09T11:27:26.534862: Epoch   0 Batch 1580/3125   train_loss = 1.138

2019-04-09T11:27:26.947461: Epoch   0 Batch 1600/3125   train_loss = 1.274

2019-04-09T11:27:27.359900: Epoch   0 Batch 1620/3125   train_loss = 1.169

2019-04-09T11:27:27.769969: Epoch   0 Batch 1640/3125   train_loss = 1.235

2019-04-09T11:27:28.180519: Epoch   0 Batch 1660/3125   train_loss = 1.282

2019-04-09T11:27:28.592653: Epoch   0 Batch 1680/3125   train_loss = 1.174

2019-04-09T11:27:29.003519: Epoch   0 Batch 1700/3125   train_loss = 1.009

2019-04-09T11:27:29.414262: Epoch   0 Batch 1720/3125   train_loss = 1.149

2019-04-09T11:27:29.828869: Epoch   0 Batch 1740/3125   train_loss = 1.221

2019-04-09T11:27:30.238773: Epoch   0 Batch 1760/3125   train_loss = 1.288

2019-04-09T11:27:30.648342: Epoch   0 Batch 1780/3125   train_loss = 1.067

2019-04-09T11:27:31.188925: Epoch   0 Batch 1800/3125   train_loss = 1.196

2019-04-09T11:27:31.603231: Epoch   0 Batch 1820/3125   train_loss = 1.142

2019-04-09T11:27:32.010926: Epoch   0 Batch 1840/3125   train_loss = 1.256

2019-04-09T11:27:32.425741: Epoch   0 Batch 1860/3125   train_loss = 1.345

2019-04-09T11:27:32.839345: Epoch   0 Batch 1880/3125   train_loss = 1.215

2019-04-09T11:27:33.248900: Epoch   0 Batch 1900/3125   train_loss = 1.048

2019-04-09T11:27:33.663116: Epoch   0 Batch 1920/3125   train_loss = 1.211

2019-04-09T11:27:34.074400: Epoch   0 Batch 1940/3125   train_loss = 1.070

2019-04-09T11:27:34.484302: Epoch   0 Batch 1960/3125   train_loss = 1.131

2019-04-09T11:27:34.894396: Epoch   0 Batch 1980/3125   train_loss = 1.196

2019-04-09T11:27:35.306864: Epoch   0 Batch 2000/3125   train_loss = 1.347

2019-04-09T11:27:35.722043: Epoch   0 Batch 2020/3125   train_loss = 1.297

2019-04-09T11:27:36.135143: Epoch   0 Batch 2040/3125   train_loss = 1.180

2019-04-09T11:27:36.543475: Epoch   0 Batch 2060/3125   train_loss = 1.025

2019-04-09T11:27:36.953066: Epoch   0 Batch 2080/3125   train_loss = 1.265

2019-04-09T11:27:37.370478: Epoch   0 Batch 2100/3125   train_loss = 1.094

2019-04-09T11:27:37.782974: Epoch   0 Batch 2120/3125   train_loss = 1.069

2019-04-09T11:27:38.190560: Epoch   0 Batch 2140/3125   train_loss = 1.132

2019-04-09T11:27:38.604746: Epoch   0 Batch 2160/3125   train_loss = 1.122

2019-04-09T11:27:39.019245: Epoch   0 Batch 2180/3125   train_loss = 1.166

2019-04-09T11:27:39.431946: Epoch   0 Batch 2200/3125   train_loss = 1.137

2019-04-09T11:27:39.847258: Epoch   0 Batch 2220/3125   train_loss = 1.118

2019-04-09T11:27:40.256398: Epoch   0 Batch 2240/3125   train_loss = 1.011

2019-04-09T11:27:40.665478: Epoch   0 Batch 2260/3125   train_loss = 1.160

2019-04-09T11:27:41.078758: Epoch   0 Batch 2280/3125   train_loss = 1.164

2019-04-09T11:27:41.489744: Epoch   0 Batch 2300/3125   train_loss = 1.163

2019-04-09T11:27:41.901845: Epoch   0 Batch 2320/3125   train_loss = 1.288

2019-04-09T11:27:42.312713: Epoch   0 Batch 2340/3125   train_loss = 1.177

2019-04-09T11:27:42.725320: Epoch   0 Batch 2360/3125   train_loss = 1.130

2019-04-09T11:27:43.132848: Epoch   0 Batch 2380/3125   train_loss = 1.163

2019-04-09T11:27:43.541373: Epoch   0 Batch 2400/3125   train_loss = 1.231

2019-04-09T11:27:43.947189: Epoch   0 Batch 2420/3125   train_loss = 1.133

2019-04-09T11:27:44.355782: Epoch   0 Batch 2440/3125   train_loss = 1.272

2019-04-09T11:27:44.768420: Epoch   0 Batch 2460/3125   train_loss = 1.128

2019-04-09T11:27:45.177740: Epoch   0 Batch 2480/3125   train_loss = 1.184

2019-04-09T11:27:45.584471: Epoch   0 Batch 2500/3125   train_loss = 1.161

2019-04-09T11:27:45.993960: Epoch   0 Batch 2520/3125   train_loss = 1.055

2019-04-09T11:27:46.402164: Epoch   0 Batch 2540/3125   train_loss = 1.108

2019-04-09T11:27:46.812056: Epoch   0 Batch 2560/3125   train_loss = 0.977

2019-04-09T11:27:47.230169: Epoch   0 Batch 2580/3125   train_loss = 1.101

2019-04-09T11:27:47.639261: Epoch   0 Batch 2600/3125   train_loss = 1.141

2019-04-09T11:27:48.047294: Epoch   0 Batch 2620/3125   train_loss = 1.098

2019-04-09T11:27:48.457188: Epoch   0 Batch 2640/3125   train_loss = 1.096

2019-04-09T11:27:48.870683: Epoch   0 Batch 2660/3125   train_loss = 1.241

2019-04-09T11:27:49.282413: Epoch   0 Batch 2680/3125   train_loss = 1.001

2019-04-09T11:27:49.690957: Epoch   0 Batch 2700/3125   train_loss = 1.266

2019-04-09T11:27:50.103555: Epoch   0 Batch 2720/3125   train_loss = 1.158

2019-04-09T11:27:50.514897: Epoch   0 Batch 2740/3125   train_loss = 1.210

2019-04-09T11:27:50.924909: Epoch   0 Batch 2760/3125   train_loss = 1.234

2019-04-09T11:27:51.336251: Epoch   0 Batch 2780/3125   train_loss = 1.121

2019-04-09T11:27:51.748175: Epoch   0 Batch 2800/3125   train_loss = 1.377

2019-04-09T11:27:52.164028: Epoch   0 Batch 2820/3125   train_loss = 1.417

2019-04-09T11:27:52.583020: Epoch   0 Batch 2840/3125   train_loss = 1.146

2019-04-09T11:27:53.001214: Epoch   0 Batch 2860/3125   train_loss = 1.067

2019-04-09T11:27:53.413084: Epoch   0 Batch 2880/3125   train_loss = 1.160

2019-04-09T11:27:53.830194: Epoch   0 Batch 2900/3125   train_loss = 1.134

2019-04-09T11:27:54.242290: Epoch   0 Batch 2920/3125   train_loss = 1.188

2019-04-09T11:27:54.657395: Epoch   0 Batch 2940/3125   train_loss = 1.103

2019-04-09T11:27:55.066253: Epoch   0 Batch 2960/3125   train_loss = 1.222

2019-04-09T11:27:55.476481: Epoch   0 Batch 2980/3125   train_loss = 1.197

2019-04-09T11:27:55.891054: Epoch   0 Batch 3000/3125   train_loss = 1.123

2019-04-09T11:27:56.299092: Epoch   0 Batch 3020/3125   train_loss = 1.213

2019-04-09T11:27:56.709737: Epoch   0 Batch 3040/3125   train_loss = 1.128

2019-04-09T11:27:57.121834: Epoch   0 Batch 3060/3125   train_loss = 1.174

2019-04-09T11:27:57.537893: Epoch   0 Batch 3080/3125   train_loss = 1.253

2019-04-09T11:27:57.945981: Epoch   0 Batch 3100/3125   train_loss = 1.169

2019-04-09T11:27:58.355315: Epoch   0 Batch 3120/3125   train_loss = 1.011

2019-04-09T11:27:58.525868: Epoch   0 Batch    0/781   test_loss = 1.003

2019-04-09T11:27:58.655211: Epoch   0 Batch   20/781   test_loss = 1.118

2019-04-09T11:27:58.785057: Epoch   0 Batch   40/781   test_loss = 0.975

2019-04-09T11:27:58.914903: Epoch   0 Batch   60/781   test_loss = 1.317

2019-04-09T11:27:59.043746: Epoch   0 Batch   80/781   test_loss = 1.261

2019-04-09T11:27:59.172589: Epoch   0 Batch  100/781   test_loss = 1.333

2019-04-09T11:27:59.301431: Epoch   0 Batch  120/781   test_loss = 1.186

2019-04-09T11:27:59.429434: Epoch   0 Batch  140/781   test_loss = 1.192

2019-04-09T11:27:59.557775: Epoch   0 Batch  160/781   test_loss = 1.259

2019-04-09T11:27:59.685114: Epoch   0 Batch  180/781   test_loss = 1.189

2019-04-09T11:27:59.813455: Epoch   0 Batch  200/781   test_loss = 1.093

2019-04-09T11:27:59.939791: Epoch   0 Batch  220/781   test_loss = 0.963

2019-04-09T11:28:00.066629: Epoch   0 Batch  240/781   test_loss = 1.173

2019-04-09T11:28:00.194468: Epoch   0 Batch  260/781   test_loss = 1.160

2019-04-09T11:28:00.321306: Epoch   0 Batch  280/781   test_loss = 1.354

2019-04-09T11:28:00.448551: Epoch   0 Batch  300/781   test_loss = 1.140

2019-04-09T11:28:00.576892: Epoch   0 Batch  320/781   test_loss = 1.270

2019-04-09T11:28:00.705735: Epoch   0 Batch  340/781   test_loss = 0.836

2019-04-09T11:28:00.832572: Epoch   0 Batch  360/781   test_loss = 1.297

2019-04-09T11:28:00.961415: Epoch   0 Batch  380/781   test_loss = 1.141

2019-04-09T11:28:01.090257: Epoch   0 Batch  400/781   test_loss = 1.135

2019-04-09T11:28:01.217095: Epoch   0 Batch  420/781   test_loss = 0.986

2019-04-09T11:28:01.344936: Epoch   0 Batch  440/781   test_loss = 1.153

2019-04-09T11:28:01.472184: Epoch   0 Batch  460/781   test_loss = 1.084

2019-04-09T11:28:01.599021: Epoch   0 Batch  480/781   test_loss = 1.101

2019-04-09T11:28:01.726862: Epoch   0 Batch  500/781   test_loss = 0.917

2019-04-09T11:28:01.854702: Epoch   0 Batch  520/781   test_loss = 1.127

2019-04-09T11:28:01.980536: Epoch   0 Batch  540/781   test_loss = 1.025

2019-04-09T11:28:02.108377: Epoch   0 Batch  560/781   test_loss = 1.267

2019-04-09T11:28:02.235214: Epoch   0 Batch  580/781   test_loss = 1.131

2019-04-09T11:28:02.362552: Epoch   0 Batch  600/781   test_loss = 1.179

2019-04-09T11:28:02.490387: Epoch   0 Batch  620/781   test_loss = 1.140

2019-04-09T11:28:02.617224: Epoch   0 Batch  640/781   test_loss = 1.194

2019-04-09T11:28:02.744563: Epoch   0 Batch  660/781   test_loss = 1.135

2019-04-09T11:28:02.875411: Epoch   0 Batch  680/781   test_loss = 1.403

2019-04-09T11:28:03.002248: Epoch   0 Batch  700/781   test_loss = 1.109

2019-04-09T11:28:03.130089: Epoch   0 Batch  720/781   test_loss = 1.243

2019-04-09T11:28:03.256926: Epoch   0 Batch  740/781   test_loss = 1.118

2019-04-09T11:28:03.383769: Epoch   0 Batch  760/781   test_loss = 1.098

2019-04-09T11:28:03.510695: Epoch   0 Batch  780/781   test_loss = 1.155

2019-04-09T11:28:04.289124: Epoch   1 Batch   15/3125   train_loss = 1.266

2019-04-09T11:28:04.711410: Epoch   1 Batch   35/3125   train_loss = 1.142

2019-04-09T11:28:05.124010: Epoch   1 Batch   55/3125   train_loss = 1.165

2019-04-09T11:28:05.539135: Epoch   1 Batch   75/3125   train_loss = 1.079

2019-04-09T11:28:05.955033: Epoch   1 Batch   95/3125   train_loss = 0.929

2019-04-09T11:28:06.374924: Epoch   1 Batch  115/3125   train_loss = 1.166

2019-04-09T11:28:06.784549: Epoch   1 Batch  135/3125   train_loss = 1.015

2019-04-09T11:28:07.202663: Epoch   1 Batch  155/3125   train_loss = 1.129

2019-04-09T11:28:07.622296: Epoch   1 Batch  175/3125   train_loss = 1.051

2019-04-09T11:28:08.044004: Epoch   1 Batch  195/3125   train_loss = 1.215

2019-04-09T11:28:08.464873: Epoch   1 Batch  215/3125   train_loss = 1.127

2019-04-09T11:28:08.882758: Epoch   1 Batch  235/3125   train_loss = 1.092

2019-04-09T11:28:09.302399: Epoch   1 Batch  255/3125   train_loss = 1.211

2019-04-09T11:28:09.718143: Epoch   1 Batch  275/3125   train_loss = 1.005

2019-04-09T11:28:10.135755: Epoch   1 Batch  295/3125   train_loss = 0.973

2019-04-09T11:28:10.556105: Epoch   1 Batch  315/3125   train_loss = 1.039

2019-04-09T11:28:10.968219: Epoch   1 Batch  335/3125   train_loss = 0.990

2019-04-09T11:28:11.382497: Epoch   1 Batch  355/3125   train_loss = 1.110

2019-04-09T11:28:11.792475: Epoch   1 Batch  375/3125   train_loss = 1.187

2019-04-09T11:28:12.203571: Epoch   1 Batch  395/3125   train_loss = 1.056

2019-04-09T11:28:12.616848: Epoch   1 Batch  415/3125   train_loss = 1.314

2019-04-09T11:28:13.031510: Epoch   1 Batch  435/3125   train_loss = 1.136

2019-04-09T11:28:13.442848: Epoch   1 Batch  455/3125   train_loss = 1.054

2019-04-09T11:28:13.860246: Epoch   1 Batch  475/3125   train_loss = 1.144

2019-04-09T11:28:14.274154: Epoch   1 Batch  495/3125   train_loss = 1.056

2019-04-09T11:28:14.692507: Epoch   1 Batch  515/3125   train_loss = 1.161

2019-04-09T11:28:15.109092: Epoch   1 Batch  535/3125   train_loss = 1.140

2019-04-09T11:28:15.524725: Epoch   1 Batch  555/3125   train_loss = 1.257

2019-04-09T11:28:15.938088: Epoch   1 Batch  575/3125   train_loss = 1.070

2019-04-09T11:28:16.350862: Epoch   1 Batch  595/3125   train_loss = 1.285

2019-04-09T11:28:16.761759: Epoch   1 Batch  615/3125   train_loss = 1.101

2019-04-09T11:28:17.182378: Epoch   1 Batch  635/3125   train_loss = 1.138

2019-04-09T11:28:17.599235: Epoch   1 Batch  655/3125   train_loss = 1.057

2019-04-09T11:28:18.019362: Epoch   1 Batch  675/3125   train_loss = 0.876

2019-04-09T11:28:18.438108: Epoch   1 Batch  695/3125   train_loss = 1.045

2019-04-09T11:28:18.849900: Epoch   1 Batch  715/3125   train_loss = 1.098

2019-04-09T11:28:19.261195: Epoch   1 Batch  735/3125   train_loss = 0.914

2019-04-09T11:28:19.812365: Epoch   1 Batch  755/3125   train_loss = 1.162

2019-04-09T11:28:20.222217: Epoch   1 Batch  775/3125   train_loss = 0.998

2019-04-09T11:28:20.645987: Epoch   1 Batch  795/3125   train_loss = 1.218

2019-04-09T11:28:21.064302: Epoch   1 Batch  815/3125   train_loss = 1.102

2019-04-09T11:28:21.482799: Epoch   1 Batch  835/3125   train_loss = 1.071

2019-04-09T11:28:21.907954: Epoch   1 Batch  855/3125   train_loss = 1.297

2019-04-09T11:28:22.327483: Epoch   1 Batch  875/3125   train_loss = 1.248

2019-04-09T11:28:22.741550: Epoch   1 Batch  895/3125   train_loss = 1.080

2019-04-09T11:28:23.157659: Epoch   1 Batch  915/3125   train_loss = 1.059

2019-04-09T11:28:23.571202: Epoch   1 Batch  935/3125   train_loss = 1.163

2019-04-09T11:28:23.984586: Epoch   1 Batch  955/3125   train_loss = 1.102

2019-04-09T11:28:24.396511: Epoch   1 Batch  975/3125   train_loss = 1.100

2019-04-09T11:28:24.824835: Epoch   1 Batch  995/3125   train_loss = 0.890

2019-04-09T11:28:25.242948: Epoch   1 Batch 1015/3125   train_loss = 1.077

2019-04-09T11:28:25.659444: Epoch   1 Batch 1035/3125   train_loss = 1.090

2019-04-09T11:28:26.076601: Epoch   1 Batch 1055/3125   train_loss = 1.154

2019-04-09T11:28:26.489531: Epoch   1 Batch 1075/3125   train_loss = 1.004

2019-04-09T11:28:26.897455: Epoch   1 Batch 1095/3125   train_loss = 1.012

2019-04-09T11:28:27.320553: Epoch   1 Batch 1115/3125   train_loss = 1.165

2019-04-09T11:28:27.739517: Epoch   1 Batch 1135/3125   train_loss = 1.029

2019-04-09T11:28:28.156628: Epoch   1 Batch 1155/3125   train_loss = 1.117

2019-04-09T11:28:28.570595: Epoch   1 Batch 1175/3125   train_loss = 1.103

2019-04-09T11:28:28.980586: Epoch   1 Batch 1195/3125   train_loss = 1.250

2019-04-09T11:28:29.393619: Epoch   1 Batch 1215/3125   train_loss = 0.930

2019-04-09T11:28:29.809238: Epoch   1 Batch 1235/3125   train_loss = 1.077

2019-04-09T11:28:30.219331: Epoch   1 Batch 1255/3125   train_loss = 1.089

2019-04-09T11:28:30.627580: Epoch   1 Batch 1275/3125   train_loss = 1.000

2019-04-09T11:28:31.035136: Epoch   1 Batch 1295/3125   train_loss = 1.006

2019-04-09T11:28:31.448626: Epoch   1 Batch 1315/3125   train_loss = 1.210

2019-04-09T11:28:31.948769: Epoch   1 Batch 1335/3125   train_loss = 1.045

2019-04-09T11:28:32.356933: Epoch   1 Batch 1355/3125   train_loss = 1.058

2019-04-09T11:28:32.771030: Epoch   1 Batch 1375/3125   train_loss = 1.110

2019-04-09T11:28:33.184133: Epoch   1 Batch 1395/3125   train_loss = 1.008

2019-04-09T11:28:33.596132: Epoch   1 Batch 1415/3125   train_loss = 1.086

2019-04-09T11:28:34.007114: Epoch   1 Batch 1435/3125   train_loss = 1.221

2019-04-09T11:28:34.419967: Epoch   1 Batch 1455/3125   train_loss = 1.241

2019-04-09T11:28:34.829988: Epoch   1 Batch 1475/3125   train_loss = 1.154

2019-04-09T11:28:35.241458: Epoch   1 Batch 1495/3125   train_loss = 1.102

2019-04-09T11:28:35.650228: Epoch   1 Batch 1515/3125   train_loss = 0.990

2019-04-09T11:28:36.060708: Epoch   1 Batch 1535/3125   train_loss = 0.907

2019-04-09T11:28:36.472293: Epoch   1 Batch 1555/3125   train_loss = 1.079

2019-04-09T11:28:36.880701: Epoch   1 Batch 1575/3125   train_loss = 0.986

2019-04-09T11:28:37.298235: Epoch   1 Batch 1595/3125   train_loss = 1.052

2019-04-09T11:28:37.710706: Epoch   1 Batch 1615/3125   train_loss = 1.025

2019-04-09T11:28:38.118793: Epoch   1 Batch 1635/3125   train_loss = 1.146

2019-04-09T11:28:38.533452: Epoch   1 Batch 1655/3125   train_loss = 1.123

2019-04-09T11:28:38.948779: Epoch   1 Batch 1675/3125   train_loss = 0.976

2019-04-09T11:28:39.359489: Epoch   1 Batch 1695/3125   train_loss = 1.035

2019-04-09T11:28:39.766989: Epoch   1 Batch 1715/3125   train_loss = 0.945

2019-04-09T11:28:40.179589: Epoch   1 Batch 1735/3125   train_loss = 1.174

2019-04-09T11:28:40.590375: Epoch   1 Batch 1755/3125   train_loss = 1.027

2019-04-09T11:28:40.998865: Epoch   1 Batch 1775/3125   train_loss = 1.026

2019-04-09T11:28:41.408017: Epoch   1 Batch 1795/3125   train_loss = 0.981

2019-04-09T11:28:41.821620: Epoch   1 Batch 1815/3125   train_loss = 0.966

2019-04-09T11:28:42.229169: Epoch   1 Batch 1835/3125   train_loss = 1.074

2019-04-09T11:28:42.642918: Epoch   1 Batch 1855/3125   train_loss = 0.959

2019-04-09T11:28:43.154530: Epoch   1 Batch 1875/3125   train_loss = 1.213

2019-04-09T11:28:43.560385: Epoch   1 Batch 1895/3125   train_loss = 0.935

2019-04-09T11:28:43.974210: Epoch   1 Batch 1915/3125   train_loss = 0.973

2019-04-09T11:28:44.393618: Epoch   1 Batch 1935/3125   train_loss = 1.016

2019-04-09T11:28:44.808725: Epoch   1 Batch 1955/3125   train_loss = 1.006

2019-04-09T11:28:45.224542: Epoch   1 Batch 1975/3125   train_loss = 1.036

2019-04-09T11:28:45.638372: Epoch   1 Batch 1995/3125   train_loss = 1.130

2019-04-09T11:28:46.050876: Epoch   1 Batch 2015/3125   train_loss = 1.092

2019-04-09T11:28:46.466638: Epoch   1 Batch 2035/3125   train_loss = 1.163

2019-04-09T11:28:46.877782: Epoch   1 Batch 2055/3125   train_loss = 0.961

2019-04-09T11:28:47.297977: Epoch   1 Batch 2075/3125   train_loss = 1.154

2019-04-09T11:28:47.707362: Epoch   1 Batch 2095/3125   train_loss = 1.007

2019-04-09T11:28:48.119961: Epoch   1 Batch 2115/3125   train_loss = 1.150

2019-04-09T11:28:48.536958: Epoch   1 Batch 2135/3125   train_loss = 1.026

2019-04-09T11:28:48.955579: Epoch   1 Batch 2155/3125   train_loss = 1.008

2019-04-09T11:28:49.371992: Epoch   1 Batch 2175/3125   train_loss = 1.028

2019-04-09T11:28:49.785513: Epoch   1 Batch 2195/3125   train_loss = 1.013

2019-04-09T11:28:50.199116: Epoch   1 Batch 2215/3125   train_loss = 1.034

2019-04-09T11:28:50.609969: Epoch   1 Batch 2235/3125   train_loss = 1.184

2019-04-09T11:28:51.023581: Epoch   1 Batch 2255/3125   train_loss = 1.135

2019-04-09T11:28:51.436197: Epoch   1 Batch 2275/3125   train_loss = 0.936

2019-04-09T11:28:51.854318: Epoch   1 Batch 2295/3125   train_loss = 1.230

2019-04-09T11:28:52.266593: Epoch   1 Batch 2315/3125   train_loss = 1.180

2019-04-09T11:28:53.027310: Epoch   1 Batch 2335/3125   train_loss = 1.068

2019-04-09T11:28:53.443572: Epoch   1 Batch 2355/3125   train_loss = 1.021

2019-04-09T11:28:53.859233: Epoch   1 Batch 2375/3125   train_loss = 1.241

2019-04-09T11:28:54.268702: Epoch   1 Batch 2395/3125   train_loss = 1.022

2019-04-09T11:28:54.684586: Epoch   1 Batch 2415/3125   train_loss = 1.062

2019-04-09T11:28:55.104188: Epoch   1 Batch 2435/3125   train_loss = 0.978

2019-04-09T11:28:55.517661: Epoch   1 Batch 2455/3125   train_loss = 1.075

2019-04-09T11:28:55.940375: Epoch   1 Batch 2475/3125   train_loss = 0.997

2019-04-09T11:28:56.355446: Epoch   1 Batch 2495/3125   train_loss = 0.991

2019-04-09T11:28:56.767784: Epoch   1 Batch 2515/3125   train_loss = 1.057

2019-04-09T11:28:57.185487: Epoch   1 Batch 2535/3125   train_loss = 1.064

2019-04-09T11:28:57.599402: Epoch   1 Batch 2555/3125   train_loss = 0.883

2019-04-09T11:28:58.012436: Epoch   1 Batch 2575/3125   train_loss = 0.914

2019-04-09T11:28:58.427098: Epoch   1 Batch 2595/3125   train_loss = 0.934

2019-04-09T11:28:58.836389: Epoch   1 Batch 2615/3125   train_loss = 1.151

2019-04-09T11:28:59.262074: Epoch   1 Batch 2635/3125   train_loss = 1.017

2019-04-09T11:28:59.680762: Epoch   1 Batch 2655/3125   train_loss = 1.036

2019-04-09T11:29:00.094884: Epoch   1 Batch 2675/3125   train_loss = 0.960

2019-04-09T11:29:00.510614: Epoch   1 Batch 2695/3125   train_loss = 1.031

2019-04-09T11:29:00.925679: Epoch   1 Batch 2715/3125   train_loss = 1.011

2019-04-09T11:29:01.343105: Epoch   1 Batch 2735/3125   train_loss = 0.876

2019-04-09T11:29:01.762199: Epoch   1 Batch 2755/3125   train_loss = 1.087

2019-04-09T11:29:02.171790: Epoch   1 Batch 2775/3125   train_loss = 1.101

2019-04-09T11:29:02.585480: Epoch   1 Batch 2795/3125   train_loss = 1.064

2019-04-09T11:29:02.995887: Epoch   1 Batch 2815/3125   train_loss = 0.981

2019-04-09T11:29:03.414306: Epoch   1 Batch 2835/3125   train_loss = 1.123

2019-04-09T11:29:03.824405: Epoch   1 Batch 2855/3125   train_loss = 1.069

2019-04-09T11:29:04.236239: Epoch   1 Batch 2875/3125   train_loss = 1.006

2019-04-09T11:29:04.644747: Epoch   1 Batch 2895/3125   train_loss = 1.013

2019-04-09T11:29:05.058545: Epoch   1 Batch 2915/3125   train_loss = 0.985

2019-04-09T11:29:05.473539: Epoch   1 Batch 2935/3125   train_loss = 1.152

2019-04-09T11:29:05.881997: Epoch   1 Batch 2955/3125   train_loss = 1.015

2019-04-09T11:29:06.294405: Epoch   1 Batch 2975/3125   train_loss = 0.977

2019-04-09T11:29:06.707933: Epoch   1 Batch 2995/3125   train_loss = 0.928

2019-04-09T11:29:07.122537: Epoch   1 Batch 3015/3125   train_loss = 1.033

2019-04-09T11:29:07.534921: Epoch   1 Batch 3035/3125   train_loss = 1.097

2019-04-09T11:29:07.945410: Epoch   1 Batch 3055/3125   train_loss = 1.058

2019-04-09T11:29:08.355520: Epoch   1 Batch 3075/3125   train_loss = 1.009

2019-04-09T11:29:08.775390: Epoch   1 Batch 3095/3125   train_loss = 0.946

2019-04-09T11:29:09.190497: Epoch   1 Batch 3115/3125   train_loss = 0.919

2019-04-09T11:29:09.605177: Epoch   1 Batch   19/781   test_loss = 1.005

2019-04-09T11:29:09.737030: Epoch   1 Batch   39/781   test_loss = 0.844

2019-04-09T11:29:09.863600: Epoch   1 Batch   59/781   test_loss = 0.955

2019-04-09T11:29:09.991439: Epoch   1 Batch   79/781   test_loss = 0.980

2019-04-09T11:29:10.118778: Epoch   1 Batch   99/781   test_loss = 0.997

2019-04-09T11:29:10.246117: Epoch   1 Batch  119/781   test_loss = 0.996

2019-04-09T11:29:10.374962: Epoch   1 Batch  139/781   test_loss = 0.988

2019-04-09T11:29:10.503975: Epoch   1 Batch  159/781   test_loss = 0.970

2019-04-09T11:29:10.630812: Epoch   1 Batch  179/781   test_loss = 0.950

2019-04-09T11:29:10.758151: Epoch   1 Batch  199/781   test_loss = 0.939

2019-04-09T11:29:10.885992: Epoch   1 Batch  219/781   test_loss = 0.993

2019-04-09T11:29:11.014332: Epoch   1 Batch  239/781   test_loss = 1.237

2019-04-09T11:29:11.141671: Epoch   1 Batch  259/781   test_loss = 0.976

2019-04-09T11:29:11.270013: Epoch   1 Batch  279/781   test_loss = 1.069

2019-04-09T11:29:11.399713: Epoch   1 Batch  299/781   test_loss = 1.209

2019-04-09T11:29:11.531062: Epoch   1 Batch  319/781   test_loss = 0.913

2019-04-09T11:29:11.661408: Epoch   1 Batch  339/781   test_loss = 0.906

2019-04-09T11:29:11.787744: Epoch   1 Batch  359/781   test_loss = 0.924

2019-04-09T11:29:11.914581: Epoch   1 Batch  379/781   test_loss = 1.030

2019-04-09T11:29:12.043424: Epoch   1 Batch  399/781   test_loss = 0.912

2019-04-09T11:29:12.171264: Epoch   1 Batch  419/781   test_loss = 0.959

2019-04-09T11:29:12.300107: Epoch   1 Batch  439/781   test_loss = 1.026

2019-04-09T11:29:12.428123: Epoch   1 Batch  459/781   test_loss = 1.085

2019-04-09T11:29:12.553965: Epoch   1 Batch  479/781   test_loss = 1.054

2019-04-09T11:29:12.683302: Epoch   1 Batch  499/781   test_loss = 0.919

2019-04-09T11:29:12.810139: Epoch   1 Batch  519/781   test_loss = 1.083

2019-04-09T11:29:12.939483: Epoch   1 Batch  539/781   test_loss = 0.888

2019-04-09T11:29:13.066822: Epoch   1 Batch  559/781   test_loss = 1.165

2019-04-09T11:29:13.195164: Epoch   1 Batch  579/781   test_loss = 1.014

2019-04-09T11:29:13.321500: Epoch   1 Batch  599/781   test_loss = 0.975

2019-04-09T11:29:13.449045: Epoch   1 Batch  619/781   test_loss = 1.152

2019-04-09T11:29:13.578390: Epoch   1 Batch  639/781   test_loss = 0.881

2019-04-09T11:29:13.706229: Epoch   1 Batch  659/781   test_loss = 1.086

2019-04-09T11:29:13.834069: Epoch   1 Batch  679/781   test_loss = 1.149

2019-04-09T11:29:13.964416: Epoch   1 Batch  699/781   test_loss = 0.888

2019-04-09T11:29:14.094763: Epoch   1 Batch  719/781   test_loss = 0.940

2019-04-09T11:29:14.223606: Epoch   1 Batch  739/781   test_loss = 1.001

2019-04-09T11:29:14.350443: Epoch   1 Batch  759/781   test_loss = 0.925

2019-04-09T11:29:14.479091: Epoch   1 Batch  779/781   test_loss = 0.786

2019-04-09T11:29:15.169929: Epoch   2 Batch   10/3125   train_loss = 0.962

2019-04-09T11:29:15.585033: Epoch   2 Batch   30/3125   train_loss = 0.921

2019-04-09T11:29:16.090936: Epoch   2 Batch   50/3125   train_loss = 1.098

2019-04-09T11:29:16.504056: Epoch   2 Batch   70/3125   train_loss = 1.066

2019-04-09T11:29:16.916616: Epoch   2 Batch   90/3125   train_loss = 1.065

2019-04-09T11:29:17.335995: Epoch   2 Batch  110/3125   train_loss = 0.908

2019-04-09T11:29:17.744923: Epoch   2 Batch  130/3125   train_loss = 0.927

2019-04-09T11:29:18.156518: Epoch   2 Batch  150/3125   train_loss = 1.094

2019-04-09T11:29:18.572814: Epoch   2 Batch  170/3125   train_loss = 1.062

2019-04-09T11:29:18.979180: Epoch   2 Batch  190/3125   train_loss = 1.043

2019-04-09T11:29:19.392758: Epoch   2 Batch  210/3125   train_loss = 0.920

2019-04-09T11:29:19.806360: Epoch   2 Batch  230/3125   train_loss = 0.990

2019-04-09T11:29:20.213864: Epoch   2 Batch  250/3125   train_loss = 0.956

2019-04-09T11:29:20.624843: Epoch   2 Batch  270/3125   train_loss = 0.816

2019-04-09T11:29:21.034399: Epoch   2 Batch  290/3125   train_loss = 1.029

2019-04-09T11:29:21.450506: Epoch   2 Batch  310/3125   train_loss = 1.039

2019-04-09T11:29:21.860168: Epoch   2 Batch  330/3125   train_loss = 0.981

2019-04-09T11:29:22.268774: Epoch   2 Batch  350/3125   train_loss = 0.927

2019-04-09T11:29:22.681125: Epoch   2 Batch  370/3125   train_loss = 1.157

2019-04-09T11:29:23.092834: Epoch   2 Batch  390/3125   train_loss = 1.131

2019-04-09T11:29:23.503543: Epoch   2 Batch  410/3125   train_loss = 0.945

2019-04-09T11:29:23.913894: Epoch   2 Batch  430/3125   train_loss = 1.121

2019-04-09T11:29:24.324622: Epoch   2 Batch  450/3125   train_loss = 0.925

2019-04-09T11:29:24.740883: Epoch   2 Batch  470/3125   train_loss = 0.952

2019-04-09T11:29:25.150474: Epoch   2 Batch  490/3125   train_loss = 1.031

2019-04-09T11:29:25.566388: Epoch   2 Batch  510/3125   train_loss = 1.045

2019-04-09T11:29:25.981499: Epoch   2 Batch  530/3125   train_loss = 0.936

2019-04-09T11:29:26.427824: Epoch   2 Batch  550/3125   train_loss = 1.041

2019-04-09T11:29:26.844394: Epoch   2 Batch  570/3125   train_loss = 1.175

2019-04-09T11:29:27.262411: Epoch   2 Batch  590/3125   train_loss = 1.093

2019-04-09T11:29:27.677138: Epoch   2 Batch  610/3125   train_loss = 0.941

2019-04-09T11:29:28.088132: Epoch   2 Batch  630/3125   train_loss = 1.067

2019-04-09T11:29:28.504546: Epoch   2 Batch  650/3125   train_loss = 1.015

2019-04-09T11:29:28.919901: Epoch   2 Batch  670/3125   train_loss = 0.921

2019-04-09T11:29:29.332525: Epoch   2 Batch  690/3125   train_loss = 0.946

2019-04-09T11:29:29.752401: Epoch   2 Batch  710/3125   train_loss = 0.958

2019-04-09T11:29:30.169512: Epoch   2 Batch  730/3125   train_loss = 0.833

2019-04-09T11:29:30.581918: Epoch   2 Batch  750/3125   train_loss = 0.983

2019-04-09T11:29:30.990078: Epoch   2 Batch  770/3125   train_loss = 0.882

2019-04-09T11:29:31.401819: Epoch   2 Batch  790/3125   train_loss = 0.922

2019-04-09T11:29:31.821438: Epoch   2 Batch  810/3125   train_loss = 0.843

2019-04-09T11:29:32.231582: Epoch   2 Batch  830/3125   train_loss = 0.875

2019-04-09T11:29:32.646142: Epoch   2 Batch  850/3125   train_loss = 1.077

2019-04-09T11:29:33.064808: Epoch   2 Batch  870/3125   train_loss = 0.952

2019-04-09T11:29:33.477008: Epoch   2 Batch  890/3125   train_loss = 0.888

2019-04-09T11:29:33.887466: Epoch   2 Batch  910/3125   train_loss = 1.012

2019-04-09T11:29:34.298086: Epoch   2 Batch  930/3125   train_loss = 0.959

2019-04-09T11:29:34.715677: Epoch   2 Batch  950/3125   train_loss = 0.975

2019-04-09T11:29:35.130281: Epoch   2 Batch  970/3125   train_loss = 1.050

2019-04-09T11:29:35.544737: Epoch   2 Batch  990/3125   train_loss = 0.864

2019-04-09T11:29:35.958160: Epoch   2 Batch 1010/3125   train_loss = 1.084

2019-04-09T11:29:36.371777: Epoch   2 Batch 1030/3125   train_loss = 0.946

2019-04-09T11:29:36.780334: Epoch   2 Batch 1050/3125   train_loss = 1.009

2019-04-09T11:29:37.193936: Epoch   2 Batch 1070/3125   train_loss = 0.981

2019-04-09T11:29:37.603917: Epoch   2 Batch 1090/3125   train_loss = 1.081

2019-04-09T11:29:38.014688: Epoch   2 Batch 1110/3125   train_loss = 1.080

2019-04-09T11:29:38.435423: Epoch   2 Batch 1130/3125   train_loss = 0.920

2019-04-09T11:29:38.848851: Epoch   2 Batch 1150/3125   train_loss = 0.949

2019-04-09T11:29:39.260649: Epoch   2 Batch 1170/3125   train_loss = 0.944

2019-04-09T11:29:39.676982: Epoch   2 Batch 1190/3125   train_loss = 1.046

2019-04-09T11:29:40.089421: Epoch   2 Batch 1210/3125   train_loss = 0.873

2019-04-09T11:29:40.501075: Epoch   2 Batch 1230/3125   train_loss = 0.862

2019-04-09T11:29:40.912917: Epoch   2 Batch 1250/3125   train_loss = 0.963

2019-04-09T11:29:41.331306: Epoch   2 Batch 1270/3125   train_loss = 1.041

2019-04-09T11:29:41.745589: Epoch   2 Batch 1290/3125   train_loss = 0.935

2019-04-09T11:29:42.155682: Epoch   2 Batch 1310/3125   train_loss = 1.011

2019-04-09T11:29:42.565230: Epoch   2 Batch 1330/3125   train_loss = 1.089

2019-04-09T11:29:42.972821: Epoch   2 Batch 1350/3125   train_loss = 0.929

2019-04-09T11:29:43.384313: Epoch   2 Batch 1370/3125   train_loss = 0.871

2019-04-09T11:29:43.800679: Epoch   2 Batch 1390/3125   train_loss = 1.056

2019-04-09T11:29:44.212277: Epoch   2 Batch 1410/3125   train_loss = 0.956

2019-04-09T11:29:44.622595: Epoch   2 Batch 1430/3125   train_loss = 0.991

2019-04-09T11:29:45.030926: Epoch   2 Batch 1450/3125   train_loss = 1.019

2019-04-09T11:29:45.446118: Epoch   2 Batch 1470/3125   train_loss = 1.018

2019-04-09T11:29:45.858249: Epoch   2 Batch 1490/3125   train_loss = 1.025

2019-04-09T11:29:46.264877: Epoch   2 Batch 1510/3125   train_loss = 0.987

2019-04-09T11:29:46.680210: Epoch   2 Batch 1530/3125   train_loss = 1.077

2019-04-09T11:29:47.097122: Epoch   2 Batch 1550/3125   train_loss = 0.871

2019-04-09T11:29:47.505701: Epoch   2 Batch 1570/3125   train_loss = 0.963

2019-04-09T11:29:47.915740: Epoch   2 Batch 1590/3125   train_loss = 0.935

2019-04-09T11:29:48.325191: Epoch   2 Batch 1610/3125   train_loss = 1.024

2019-04-09T11:29:48.741050: Epoch   2 Batch 1630/3125   train_loss = 1.033

2019-04-09T11:29:49.303637: Epoch   2 Batch 1650/3125   train_loss = 0.892

2019-04-09T11:29:49.716688: Epoch   2 Batch 1670/3125   train_loss = 0.828

2019-04-09T11:29:50.127782: Epoch   2 Batch 1690/3125   train_loss = 0.886

2019-04-09T11:29:50.541466: Epoch   2 Batch 1710/3125   train_loss = 1.033

2019-04-09T11:29:50.952638: Epoch   2 Batch 1730/3125   train_loss = 0.990

2019-04-09T11:29:51.366254: Epoch   2 Batch 1750/3125   train_loss = 0.851

2019-04-09T11:29:51.779575: Epoch   2 Batch 1770/3125   train_loss = 1.130

2019-04-09T11:29:52.189667: Epoch   2 Batch 1790/3125   train_loss = 0.970

2019-04-09T11:29:52.600989: Epoch   2 Batch 1810/3125   train_loss = 1.004

2019-04-09T11:29:53.010986: Epoch   2 Batch 1830/3125   train_loss = 1.035

2019-04-09T11:29:53.428213: Epoch   2 Batch 1850/3125   train_loss = 0.935

2019-04-09T11:29:53.847839: Epoch   2 Batch 1870/3125   train_loss = 1.039

2019-04-09T11:29:54.260999: Epoch   2 Batch 1890/3125   train_loss = 0.822

2019-04-09T11:29:54.670587: Epoch   2 Batch 1910/3125   train_loss = 0.885

2019-04-09T11:29:55.079904: Epoch   2 Batch 1930/3125   train_loss = 1.038

2019-04-09T11:29:55.492941: Epoch   2 Batch 1950/3125   train_loss = 0.887

2019-04-09T11:29:55.909977: Epoch   2 Batch 1970/3125   train_loss = 0.998

2019-04-09T11:29:56.321669: Epoch   2 Batch 1990/3125   train_loss = 0.864

2019-04-09T11:29:56.731912: Epoch   2 Batch 2010/3125   train_loss = 0.792

2019-04-09T11:29:57.143008: Epoch   2 Batch 2030/3125   train_loss = 0.907

2019-04-09T11:29:57.555451: Epoch   2 Batch 2050/3125   train_loss = 0.952

2019-04-09T11:29:57.967763: Epoch   2 Batch 2070/3125   train_loss = 0.882

2019-04-09T11:29:58.396253: Epoch   2 Batch 2090/3125   train_loss = 0.831

2019-04-09T11:29:58.810290: Epoch   2 Batch 2110/3125   train_loss = 1.050

2019-04-09T11:29:59.220382: Epoch   2 Batch 2130/3125   train_loss = 0.973

2019-04-09T11:29:59.638177: Epoch   2 Batch 2150/3125   train_loss = 1.009

2019-04-09T11:30:00.054094: Epoch   2 Batch 2170/3125   train_loss = 0.862

2019-04-09T11:30:00.465054: Epoch   2 Batch 2190/3125   train_loss = 0.967

2019-04-09T11:30:00.875581: Epoch   2 Batch 2210/3125   train_loss = 0.950

2019-04-09T11:30:01.283669: Epoch   2 Batch 2230/3125   train_loss = 0.843

2019-04-09T11:30:01.702253: Epoch   2 Batch 2250/3125   train_loss = 0.933

2019-04-09T11:30:02.116357: Epoch   2 Batch 2270/3125   train_loss = 0.917

2019-04-09T11:30:02.530943: Epoch   2 Batch 2290/3125   train_loss = 0.856

2019-04-09T11:30:02.942953: Epoch   2 Batch 2310/3125   train_loss = 0.851

2019-04-09T11:30:03.360388: Epoch   2 Batch 2330/3125   train_loss = 1.097

2019-04-09T11:30:03.770799: Epoch   2 Batch 2350/3125   train_loss = 0.989

2019-04-09T11:30:04.189427: Epoch   2 Batch 2370/3125   train_loss = 0.886

2019-04-09T11:30:04.602910: Epoch   2 Batch 2390/3125   train_loss = 1.017

2019-04-09T11:30:05.013193: Epoch   2 Batch 2410/3125   train_loss = 1.025

2019-04-09T11:30:05.426136: Epoch   2 Batch 2430/3125   train_loss = 0.885

2019-04-09T11:30:05.834048: Epoch   2 Batch 2450/3125   train_loss = 0.968

2019-04-09T11:30:06.246209: Epoch   2 Batch 2470/3125   train_loss = 1.042

2019-04-09T11:30:06.661647: Epoch   2 Batch 2490/3125   train_loss = 1.003

2019-04-09T11:30:07.071296: Epoch   2 Batch 2510/3125   train_loss = 1.084

2019-04-09T11:30:07.490192: Epoch   2 Batch 2530/3125   train_loss = 0.793

2019-04-09T11:30:07.904515: Epoch   2 Batch 2550/3125   train_loss = 0.954

2019-04-09T11:30:08.315032: Epoch   2 Batch 2570/3125   train_loss = 0.957

2019-04-09T11:30:08.733158: Epoch   2 Batch 2590/3125   train_loss = 0.984

2019-04-09T11:30:09.146760: Epoch   2 Batch 2610/3125   train_loss = 1.043

2019-04-09T11:30:09.564414: Epoch   2 Batch 2630/3125   train_loss = 0.660

2019-04-09T11:30:09.977708: Epoch   2 Batch 2650/3125   train_loss = 0.913

2019-04-09T11:30:10.392227: Epoch   2 Batch 2670/3125   train_loss = 1.051

2019-04-09T11:30:10.803323: Epoch   2 Batch 2690/3125   train_loss = 0.980

2019-04-09T11:30:11.221892: Epoch   2 Batch 2710/3125   train_loss = 0.845

2019-04-09T11:30:11.636832: Epoch   2 Batch 2730/3125   train_loss = 1.067

2019-04-09T11:30:12.048855: Epoch   2 Batch 2750/3125   train_loss = 1.020

2019-04-09T11:30:12.466622: Epoch   2 Batch 2770/3125   train_loss = 0.894

2019-04-09T11:30:12.877228: Epoch   2 Batch 2790/3125   train_loss = 0.881

2019-04-09T11:30:13.292940: Epoch   2 Batch 2810/3125   train_loss = 0.958

2019-04-09T11:30:13.707370: Epoch   2 Batch 2830/3125   train_loss = 0.816

2019-04-09T11:30:14.115458: Epoch   2 Batch 2850/3125   train_loss = 1.005

2019-04-09T11:30:14.527402: Epoch   2 Batch 2870/3125   train_loss = 0.792

2019-04-09T11:30:14.941006: Epoch   2 Batch 2890/3125   train_loss = 0.779

2019-04-09T11:30:15.351115: Epoch   2 Batch 2910/3125   train_loss = 1.007

2019-04-09T11:30:15.761429: Epoch   2 Batch 2930/3125   train_loss = 0.813

2019-04-09T11:30:16.174529: Epoch   2 Batch 2950/3125   train_loss = 1.069

2019-04-09T11:30:16.592845: Epoch   2 Batch 2970/3125   train_loss = 0.993

2019-04-09T11:30:17.005062: Epoch   2 Batch 2990/3125   train_loss = 0.862

2019-04-09T11:30:17.425470: Epoch   2 Batch 3010/3125   train_loss = 0.936

2019-04-09T11:30:17.837640: Epoch   2 Batch 3030/3125   train_loss = 0.968

2019-04-09T11:30:18.248424: Epoch   2 Batch 3050/3125   train_loss = 0.980

2019-04-09T11:30:18.666115: Epoch   2 Batch 3070/3125   train_loss = 0.896

2019-04-09T11:30:19.074163: Epoch   2 Batch 3090/3125   train_loss = 0.774

2019-04-09T11:30:19.491628: Epoch   2 Batch 3110/3125   train_loss = 0.837

2019-04-09T11:30:19.895275: Epoch   2 Batch   18/781   test_loss = 0.808

2019-04-09T11:30:20.023969: Epoch   2 Batch   38/781   test_loss = 0.915

2019-04-09T11:30:20.152310: Epoch   2 Batch   58/781   test_loss = 0.851

2019-04-09T11:30:20.280151: Epoch   2 Batch   78/781   test_loss = 0.905

2019-04-09T11:30:20.408187: Epoch   2 Batch   98/781   test_loss = 0.903

2019-04-09T11:30:20.536028: Epoch   2 Batch  118/781   test_loss = 0.884

2019-04-09T11:30:20.663366: Epoch   2 Batch  138/781   test_loss = 1.000

2019-04-09T11:30:20.791206: Epoch   2 Batch  158/781   test_loss = 0.904

2019-04-09T11:30:20.918545: Epoch   2 Batch  178/781   test_loss = 0.785

2019-04-09T11:30:21.045884: Epoch   2 Batch  198/781   test_loss = 0.922

2019-04-09T11:30:21.177736: Epoch   2 Batch  218/781   test_loss = 0.997

2019-04-09T11:30:21.310087: Epoch   2 Batch  238/781   test_loss = 0.998

2019-04-09T11:30:21.437625: Epoch   2 Batch  258/781   test_loss = 0.959

2019-04-09T11:30:21.565465: Epoch   2 Batch  278/781   test_loss = 1.074

2019-04-09T11:30:21.692804: Epoch   2 Batch  298/781   test_loss = 0.915

2019-04-09T11:30:21.821646: Epoch   2 Batch  318/781   test_loss = 0.889

2019-04-09T11:30:21.952495: Epoch   2 Batch  338/781   test_loss = 0.941

2019-04-09T11:30:22.081338: Epoch   2 Batch  358/781   test_loss = 0.913

2019-04-09T11:30:22.210686: Epoch   2 Batch  378/781   test_loss = 0.890

2019-04-09T11:30:22.344036: Epoch   2 Batch  398/781   test_loss = 0.833

2019-04-09T11:30:22.471957: Epoch   2 Batch  418/781   test_loss = 0.941

2019-04-09T11:30:22.599296: Epoch   2 Batch  438/781   test_loss = 1.013

2019-04-09T11:30:22.728139: Epoch   2 Batch  458/781   test_loss = 0.919

2019-04-09T11:30:22.855992: Epoch   2 Batch  478/781   test_loss = 0.965

2019-04-09T11:30:22.982816: Epoch   2 Batch  498/781   test_loss = 0.813

2019-04-09T11:30:23.110155: Epoch   2 Batch  518/781   test_loss = 0.919

2019-04-09T11:30:23.238497: Epoch   2 Batch  538/781   test_loss = 0.795

2019-04-09T11:30:23.366838: Epoch   2 Batch  558/781   test_loss = 0.830

2019-04-09T11:30:23.495883: Epoch   2 Batch  578/781   test_loss = 0.915

2019-04-09T11:30:23.623225: Epoch   2 Batch  598/781   test_loss = 1.055

2019-04-09T11:30:23.751062: Epoch   2 Batch  618/781   test_loss = 0.850

2019-04-09T11:30:23.879905: Epoch   2 Batch  638/781   test_loss = 0.845

2019-04-09T11:30:24.007243: Epoch   2 Batch  658/781   test_loss = 1.026

2019-04-09T11:30:24.138091: Epoch   2 Batch  678/781   test_loss = 0.926

2019-04-09T11:30:24.266433: Epoch   2 Batch  698/781   test_loss = 0.875

2019-04-09T11:30:24.395604: Epoch   2 Batch  718/781   test_loss = 1.006

2019-04-09T11:30:24.523445: Epoch   2 Batch  738/781   test_loss = 0.850

2019-04-09T11:30:24.651786: Epoch   2 Batch  758/781   test_loss = 0.892

2019-04-09T11:30:24.779626: Epoch   2 Batch  778/781   test_loss = 0.913

2019-04-09T11:30:25.360700: Epoch   3 Batch    5/3125   train_loss = 0.900

2019-04-09T11:30:25.776594: Epoch   3 Batch   25/3125   train_loss = 0.995

2019-04-09T11:30:26.190195: Epoch   3 Batch   45/3125   train_loss = 0.823

2019-04-09T11:30:26.605221: Epoch   3 Batch   65/3125   train_loss = 0.936

2019-04-09T11:30:27.017575: Epoch   3 Batch   85/3125   train_loss = 0.811

2019-04-09T11:30:27.433325: Epoch   3 Batch  105/3125   train_loss = 0.735

2019-04-09T11:30:27.845489: Epoch   3 Batch  125/3125   train_loss = 0.883

2019-04-09T11:30:28.255902: Epoch   3 Batch  145/3125   train_loss = 0.946

2019-04-09T11:30:28.676186: Epoch   3 Batch  165/3125   train_loss = 0.907

2019-04-09T11:30:29.086028: Epoch   3 Batch  185/3125   train_loss = 0.843

2019-04-09T11:30:29.498049: Epoch   3 Batch  205/3125   train_loss = 0.782

2019-04-09T11:30:29.910137: Epoch   3 Batch  225/3125   train_loss = 0.818

2019-04-09T11:30:30.321717: Epoch   3 Batch  245/3125   train_loss = 1.094

2019-04-09T11:30:30.732822: Epoch   3 Batch  265/3125   train_loss = 0.907

2019-04-09T11:30:31.144919: Epoch   3 Batch  285/3125   train_loss = 0.899

2019-04-09T11:30:31.564878: Epoch   3 Batch  305/3125   train_loss = 0.886

2019-04-09T11:30:31.986450: Epoch   3 Batch  325/3125   train_loss = 0.900

2019-04-09T11:30:32.402943: Epoch   3 Batch  345/3125   train_loss = 0.966

2019-04-09T11:30:32.817756: Epoch   3 Batch  365/3125   train_loss = 0.897

2019-04-09T11:30:33.231358: Epoch   3 Batch  385/3125   train_loss = 0.854

2019-04-09T11:30:33.642523: Epoch   3 Batch  405/3125   train_loss = 0.854

2019-04-09T11:30:34.052009: Epoch   3 Batch  425/3125   train_loss = 0.950

2019-04-09T11:30:34.463651: Epoch   3 Batch  445/3125   train_loss = 0.963

2019-04-09T11:30:34.877612: Epoch   3 Batch  465/3125   train_loss = 0.840

2019-04-09T11:30:35.291041: Epoch   3 Batch  485/3125   train_loss = 1.043

2019-04-09T11:30:35.701510: Epoch   3 Batch  505/3125   train_loss = 0.820

2019-04-09T11:30:36.113107: Epoch   3 Batch  525/3125   train_loss = 0.977

2019-04-09T11:30:36.526067: Epoch   3 Batch  545/3125   train_loss = 0.785

2019-04-09T11:30:36.938504: Epoch   3 Batch  565/3125   train_loss = 1.138

2019-04-09T11:30:37.354627: Epoch   3 Batch  585/3125   train_loss = 0.877

2019-04-09T11:30:37.769480: Epoch   3 Batch  605/3125   train_loss = 0.865

2019-04-09T11:30:38.180576: Epoch   3 Batch  625/3125   train_loss = 0.931

2019-04-09T11:30:38.595414: Epoch   3 Batch  645/3125   train_loss = 1.007

2019-04-09T11:30:39.007112: Epoch   3 Batch  665/3125   train_loss = 0.960

2019-04-09T11:30:39.427161: Epoch   3 Batch  685/3125   train_loss = 0.908

2019-04-09T11:30:39.841768: Epoch   3 Batch  705/3125   train_loss = 1.001

2019-04-09T11:30:40.258352: Epoch   3 Batch  725/3125   train_loss = 0.888

2019-04-09T11:30:40.672977: Epoch   3 Batch  745/3125   train_loss = 0.834

2019-04-09T11:30:41.090307: Epoch   3 Batch  765/3125   train_loss = 0.864

2019-04-09T11:30:41.504196: Epoch   3 Batch  785/3125   train_loss = 1.046

2019-04-09T11:30:41.912423: Epoch   3 Batch  805/3125   train_loss = 0.816

2019-04-09T11:30:42.328090: Epoch   3 Batch  825/3125   train_loss = 0.904

2019-04-09T11:30:42.740677: Epoch   3 Batch  845/3125   train_loss = 0.932

2019-04-09T11:30:43.153777: Epoch   3 Batch  865/3125   train_loss = 1.004

2019-04-09T11:30:43.566946: Epoch   3 Batch  885/3125   train_loss = 0.968

2019-04-09T11:30:43.981050: Epoch   3 Batch  905/3125   train_loss = 0.998

2019-04-09T11:30:44.394270: Epoch   3 Batch  925/3125   train_loss = 0.896

2019-04-09T11:30:44.807669: Epoch   3 Batch  945/3125   train_loss = 0.978

2019-04-09T11:30:45.224278: Epoch   3 Batch  965/3125   train_loss = 0.731

2019-04-09T11:30:45.644716: Epoch   3 Batch  985/3125   train_loss = 1.003

2019-04-09T11:30:46.056218: Epoch   3 Batch 1005/3125   train_loss = 0.794

2019-04-09T11:30:46.465616: Epoch   3 Batch 1025/3125   train_loss = 0.879

2019-04-09T11:30:46.878718: Epoch   3 Batch 1045/3125   train_loss = 1.127

2019-04-09T11:30:47.297579: Epoch   3 Batch 1065/3125   train_loss = 0.875

2019-04-09T11:30:47.709534: Epoch   3 Batch 1085/3125   train_loss = 0.834

2019-04-09T11:30:48.125642: Epoch   3 Batch 1105/3125   train_loss = 0.842

2019-04-09T11:30:48.538103: Epoch   3 Batch 1125/3125   train_loss = 0.859

2019-04-09T11:30:48.952197: Epoch   3 Batch 1145/3125   train_loss = 0.905

2019-04-09T11:30:49.366261: Epoch   3 Batch 1165/3125   train_loss = 0.964

2019-04-09T11:30:49.774853: Epoch   3 Batch 1185/3125   train_loss = 0.869

2019-04-09T11:30:50.190392: Epoch   3 Batch 1205/3125   train_loss = 0.836

2019-04-09T11:30:50.605998: Epoch   3 Batch 1225/3125   train_loss = 1.002

2019-04-09T11:30:51.020181: Epoch   3 Batch 1245/3125   train_loss = 1.006

2019-04-09T11:30:51.434899: Epoch   3 Batch 1265/3125   train_loss = 0.896

2019-04-09T11:30:51.850872: Epoch   3 Batch 1285/3125   train_loss = 0.960

2019-04-09T11:30:52.265731: Epoch   3 Batch 1305/3125   train_loss = 0.802

2019-04-09T11:30:53.236710: Epoch   3 Batch 1325/3125   train_loss = 0.886

2019-04-09T11:30:53.650278: Epoch   3 Batch 1345/3125   train_loss = 0.928

2019-04-09T11:30:54.066153: Epoch   3 Batch 1365/3125   train_loss = 0.761

2019-04-09T11:30:54.481716: Epoch   3 Batch 1385/3125   train_loss = 0.779

2019-04-09T11:30:54.890807: Epoch   3 Batch 1405/3125   train_loss = 0.857

2019-04-09T11:30:55.303205: Epoch   3 Batch 1425/3125   train_loss = 1.106

2019-04-09T11:30:55.713796: Epoch   3 Batch 1445/3125   train_loss = 1.002

2019-04-09T11:30:56.127899: Epoch   3 Batch 1465/3125   train_loss = 0.887

2019-04-09T11:30:56.544126: Epoch   3 Batch 1485/3125   train_loss = 0.920

2019-04-09T11:30:56.952476: Epoch   3 Batch 1505/3125   train_loss = 0.745

2019-04-09T11:30:57.370433: Epoch   3 Batch 1525/3125   train_loss = 0.759

2019-04-09T11:30:57.781531: Epoch   3 Batch 1545/3125   train_loss = 0.843

2019-04-09T11:30:58.194632: Epoch   3 Batch 1565/3125   train_loss = 0.983

2019-04-09T11:30:58.613587: Epoch   3 Batch 1585/3125   train_loss = 0.827

2019-04-09T11:30:59.029585: Epoch   3 Batch 1605/3125   train_loss = 0.971

2019-04-09T11:30:59.443109: Epoch   3 Batch 1625/3125   train_loss = 0.950

2019-04-09T11:30:59.862969: Epoch   3 Batch 1645/3125   train_loss = 0.978

2019-04-09T11:31:00.280054: Epoch   3 Batch 1665/3125   train_loss = 0.916

2019-04-09T11:31:00.697972: Epoch   3 Batch 1685/3125   train_loss = 0.893

2019-04-09T11:31:01.120406: Epoch   3 Batch 1705/3125   train_loss = 0.883

2019-04-09T11:31:01.540523: Epoch   3 Batch 1725/3125   train_loss = 0.834

2019-04-09T11:31:01.957635: Epoch   3 Batch 1745/3125   train_loss = 0.775

2019-04-09T11:31:02.372311: Epoch   3 Batch 1765/3125   train_loss = 0.825

2019-04-09T11:31:02.786676: Epoch   3 Batch 1785/3125   train_loss = 1.015

2019-04-09T11:31:03.204288: Epoch   3 Batch 1805/3125   train_loss = 0.958

2019-04-09T11:31:03.616851: Epoch   3 Batch 1825/3125   train_loss = 1.031

2019-04-09T11:31:04.029497: Epoch   3 Batch 1845/3125   train_loss = 0.922

2019-04-09T11:31:04.442097: Epoch   3 Batch 1865/3125   train_loss = 0.753

2019-04-09T11:31:04.856887: Epoch   3 Batch 1885/3125   train_loss = 0.986

2019-04-09T11:31:05.271825: Epoch   3 Batch 1905/3125   train_loss = 0.799

2019-04-09T11:31:05.688152: Epoch   3 Batch 1925/3125   train_loss = 0.830

2019-04-09T11:31:06.097059: Epoch   3 Batch 1945/3125   train_loss = 0.865

2019-04-09T11:31:06.510931: Epoch   3 Batch 1965/3125   train_loss = 0.867

2019-04-09T11:31:06.924666: Epoch   3 Batch 1985/3125   train_loss = 0.840

2019-04-09T11:31:07.341276: Epoch   3 Batch 2005/3125   train_loss = 0.881

2019-04-09T11:31:07.755738: Epoch   3 Batch 2025/3125   train_loss = 0.951

2019-04-09T11:31:08.168337: Epoch   3 Batch 2045/3125   train_loss = 0.754

2019-04-09T11:31:08.583280: Epoch   3 Batch 2065/3125   train_loss = 0.727

2019-04-09T11:31:08.998421: Epoch   3 Batch 2085/3125   train_loss = 1.058

2019-04-09T11:31:09.415818: Epoch   3 Batch 2105/3125   train_loss = 0.891

2019-04-09T11:31:09.827917: Epoch   3 Batch 2125/3125   train_loss = 0.976

2019-04-09T11:31:10.237408: Epoch   3 Batch 2145/3125   train_loss = 1.002

2019-04-09T11:31:10.652222: Epoch   3 Batch 2165/3125   train_loss = 0.862

2019-04-09T11:31:11.061610: Epoch   3 Batch 2185/3125   train_loss = 0.948

2019-04-09T11:31:11.476691: Epoch   3 Batch 2205/3125   train_loss = 0.958

2019-04-09T11:31:11.893028: Epoch   3 Batch 2225/3125   train_loss = 0.811

2019-04-09T11:31:12.428069: Epoch   3 Batch 2245/3125   train_loss = 0.798

2019-04-09T11:31:12.840171: Epoch   3 Batch 2265/3125   train_loss = 0.896

2019-04-09T11:31:13.254127: Epoch   3 Batch 2285/3125   train_loss = 1.099

2019-04-09T11:31:13.671868: Epoch   3 Batch 2305/3125   train_loss = 0.812

2019-04-09T11:31:14.083559: Epoch   3 Batch 2325/3125   train_loss = 0.788

2019-04-09T11:31:14.499758: Epoch   3 Batch 2345/3125   train_loss = 0.885

2019-04-09T11:31:14.912859: Epoch   3 Batch 2365/3125   train_loss = 0.702

2019-04-09T11:31:15.331776: Epoch   3 Batch 2385/3125   train_loss = 0.915

2019-04-09T11:31:15.749019: Epoch   3 Batch 2405/3125   train_loss = 0.908

2019-04-09T11:31:16.161618: Epoch   3 Batch 2425/3125   train_loss = 0.875

2019-04-09T11:31:16.583581: Epoch   3 Batch 2445/3125   train_loss = 1.002

2019-04-09T11:31:17.000198: Epoch   3 Batch 2465/3125   train_loss = 0.748

2019-04-09T11:31:17.420234: Epoch   3 Batch 2485/3125   train_loss = 0.880

2019-04-09T11:31:17.834288: Epoch   3 Batch 2505/3125   train_loss = 0.852

2019-04-09T11:31:18.247812: Epoch   3 Batch 2525/3125   train_loss = 0.849

2019-04-09T11:31:18.663700: Epoch   3 Batch 2545/3125   train_loss = 1.010

2019-04-09T11:31:19.076134: Epoch   3 Batch 2565/3125   train_loss = 0.851

2019-04-09T11:31:19.490451: Epoch   3 Batch 2585/3125   train_loss = 0.768

2019-04-09T11:31:19.905388: Epoch   3 Batch 2605/3125   train_loss = 0.867

2019-04-09T11:31:20.318355: Epoch   3 Batch 2625/3125   train_loss = 1.004

2019-04-09T11:31:20.732786: Epoch   3 Batch 2645/3125   train_loss = 0.906

2019-04-09T11:31:21.146894: Epoch   3 Batch 2665/3125   train_loss = 0.984

2019-04-09T11:31:21.566102: Epoch   3 Batch 2685/3125   train_loss = 0.920

2019-04-09T11:31:21.981681: Epoch   3 Batch 2705/3125   train_loss = 0.784

2019-04-09T11:31:22.399609: Epoch   3 Batch 2725/3125   train_loss = 0.916

2019-04-09T11:31:22.817940: Epoch   3 Batch 2745/3125   train_loss = 0.925

2019-04-09T11:31:23.266133: Epoch   3 Batch 2765/3125   train_loss = 0.837

2019-04-09T11:31:23.679262: Epoch   3 Batch 2785/3125   train_loss = 0.935

2019-04-09T11:31:24.097862: Epoch   3 Batch 2805/3125   train_loss = 0.839

2019-04-09T11:31:24.511944: Epoch   3 Batch 2825/3125   train_loss = 0.844

2019-04-09T11:31:24.926787: Epoch   3 Batch 2845/3125   train_loss = 0.858

2019-04-09T11:31:25.347381: Epoch   3 Batch 2865/3125   train_loss = 0.853

2019-04-09T11:31:25.764592: Epoch   3 Batch 2885/3125   train_loss = 0.939

2019-04-09T11:31:26.184209: Epoch   3 Batch 2905/3125   train_loss = 0.969

2019-04-09T11:31:26.601925: Epoch   3 Batch 2925/3125   train_loss = 0.868

2019-04-09T11:31:27.016711: Epoch   3 Batch 2945/3125   train_loss = 0.900

2019-04-09T11:31:27.435058: Epoch   3 Batch 2965/3125   train_loss = 0.939

2019-04-09T11:31:27.848061: Epoch   3 Batch 2985/3125   train_loss = 0.843

2019-04-09T11:31:28.261955: Epoch   3 Batch 3005/3125   train_loss = 0.860

2019-04-09T11:31:28.677308: Epoch   3 Batch 3025/3125   train_loss = 0.917

2019-04-09T11:31:29.091668: Epoch   3 Batch 3045/3125   train_loss = 0.883

2019-04-09T11:31:29.505770: Epoch   3 Batch 3065/3125   train_loss = 0.864

2019-04-09T11:31:29.920149: Epoch   3 Batch 3085/3125   train_loss = 0.867

2019-04-09T11:31:30.335191: Epoch   3 Batch 3105/3125   train_loss = 0.929

2019-04-09T11:31:30.978022: Epoch   3 Batch   17/781   test_loss = 0.866

2019-04-09T11:31:31.112380: Epoch   3 Batch   37/781   test_loss = 0.868

2019-04-09T11:31:31.248741: Epoch   3 Batch   57/781   test_loss = 0.894

2019-04-09T11:31:31.387784: Epoch   3 Batch   77/781   test_loss = 0.898

2019-04-09T11:31:31.519144: Epoch   3 Batch   97/781   test_loss = 0.790

2019-04-09T11:31:31.648478: Epoch   3 Batch  117/781   test_loss = 0.950

2019-04-09T11:31:31.787347: Epoch   3 Batch  137/781   test_loss = 0.922

2019-04-09T11:31:31.934742: Epoch   3 Batch  157/781   test_loss = 0.919

2019-04-09T11:31:32.076115: Epoch   3 Batch  177/781   test_loss = 0.873

2019-04-09T11:31:32.206462: Epoch   3 Batch  197/781   test_loss = 0.928

2019-04-09T11:31:32.347500: Epoch   3 Batch  217/781   test_loss = 0.699

2019-04-09T11:31:32.483362: Epoch   3 Batch  237/781   test_loss = 0.752

2019-04-09T11:31:32.612205: Epoch   3 Batch  257/781   test_loss = 1.014

2019-04-09T11:31:32.754584: Epoch   3 Batch  277/781   test_loss = 0.979

2019-04-09T11:31:32.897965: Epoch   3 Batch  297/781   test_loss = 0.961

2019-04-09T11:31:33.031821: Epoch   3 Batch  317/781   test_loss = 1.030

2019-04-09T11:31:33.166680: Epoch   3 Batch  337/781   test_loss = 0.906

2019-04-09T11:31:33.308477: Epoch   3 Batch  357/781   test_loss = 0.883

2019-04-09T11:31:33.450355: Epoch   3 Batch  377/781   test_loss = 0.932

2019-04-09T11:31:33.580701: Epoch   3 Batch  397/781   test_loss = 0.918

2019-04-09T11:31:33.721075: Epoch   3 Batch  417/781   test_loss = 0.842

2019-04-09T11:31:33.859944: Epoch   3 Batch  437/781   test_loss = 0.808

2019-04-09T11:31:33.988286: Epoch   3 Batch  457/781   test_loss = 0.690

2019-04-09T11:31:34.116627: Epoch   3 Batch  477/781   test_loss = 0.923

2019-04-09T11:31:34.256500: Epoch   3 Batch  497/781   test_loss = 0.807

2019-04-09T11:31:34.394868: Epoch   3 Batch  517/781   test_loss = 0.805

2019-04-09T11:31:34.522207: Epoch   3 Batch  537/781   test_loss = 0.802

2019-04-09T11:31:34.650046: Epoch   3 Batch  557/781   test_loss = 1.050

2019-04-09T11:31:34.792425: Epoch   3 Batch  577/781   test_loss = 0.912

2019-04-09T11:31:34.930292: Epoch   3 Batch  597/781   test_loss = 0.875

2019-04-09T11:31:35.058634: Epoch   3 Batch  617/781   test_loss = 0.862

2019-04-09T11:31:35.184973: Epoch   3 Batch  637/781   test_loss = 0.781

2019-04-09T11:31:35.314815: Epoch   3 Batch  657/781   test_loss = 1.008

2019-04-09T11:31:35.444363: Epoch   3 Batch  677/781   test_loss = 0.931

2019-04-09T11:31:35.578721: Epoch   3 Batch  697/781   test_loss = 0.907

2019-04-09T11:31:35.712076: Epoch   3 Batch  717/781   test_loss = 0.812

2019-04-09T11:31:35.841921: Epoch   3 Batch  737/781   test_loss = 0.764

2019-04-09T11:31:35.983800: Epoch   3 Batch  757/781   test_loss = 1.099

2019-04-09T11:31:36.119660: Epoch   3 Batch  777/781   test_loss = 0.960

2019-04-09T11:31:36.666392: Epoch   4 Batch    0/3125   train_loss = 0.960

2019-04-09T11:31:37.108038: Epoch   4 Batch   20/3125   train_loss = 0.848

2019-04-09T11:31:37.523644: Epoch   4 Batch   40/3125   train_loss = 0.929

2019-04-09T11:31:37.940279: Epoch   4 Batch   60/3125   train_loss = 0.729

2019-04-09T11:31:38.360397: Epoch   4 Batch   80/3125   train_loss = 0.870

2019-04-09T11:31:38.783226: Epoch   4 Batch  100/3125   train_loss = 0.972

2019-04-09T11:31:39.208774: Epoch   4 Batch  120/3125   train_loss = 1.008

2019-04-09T11:31:39.670500: Epoch   4 Batch  140/3125   train_loss = 0.932

2019-04-09T11:31:40.130223: Epoch   4 Batch  160/3125   train_loss = 0.786

2019-04-09T11:31:40.578223: Epoch   4 Batch  180/3125   train_loss = 0.829

2019-04-09T11:31:40.994831: Epoch   4 Batch  200/3125   train_loss = 1.105

2019-04-09T11:31:41.423976: Epoch   4 Batch  220/3125   train_loss = 0.862

2019-04-09T11:31:41.847103: Epoch   4 Batch  240/3125   train_loss = 0.981

2019-04-09T11:31:42.273237: Epoch   4 Batch  260/3125   train_loss = 0.926

2019-04-09T11:31:42.696015: Epoch   4 Batch  280/3125   train_loss = 0.991

2019-04-09T11:31:43.118928: Epoch   4 Batch  300/3125   train_loss = 1.056

2019-04-09T11:31:43.543558: Epoch   4 Batch  320/3125   train_loss = 0.991

2019-04-09T11:31:43.963668: Epoch   4 Batch  340/3125   train_loss = 0.723

2019-04-09T11:31:44.405001: Epoch   4 Batch  360/3125   train_loss = 0.811

2019-04-09T11:31:44.837830: Epoch   4 Batch  380/3125   train_loss = 0.903

2019-04-09T11:31:45.256898: Epoch   4 Batch  400/3125   train_loss = 0.788

2019-04-09T11:31:45.684205: Epoch   4 Batch  420/3125   train_loss = 0.845

2019-04-09T11:31:46.114850: Epoch   4 Batch  440/3125   train_loss = 0.845

2019-04-09T11:31:46.554569: Epoch   4 Batch  460/3125   train_loss = 0.917

2019-04-09T11:31:46.990729: Epoch   4 Batch  480/3125   train_loss = 0.982

2019-04-09T11:31:47.417146: Epoch   4 Batch  500/3125   train_loss = 0.671

2019-04-09T11:31:47.851802: Epoch   4 Batch  520/3125   train_loss = 0.905

2019-04-09T11:31:48.283919: Epoch   4 Batch  540/3125   train_loss = 0.806

2019-04-09T11:31:48.718582: Epoch   4 Batch  560/3125   train_loss = 1.032

2019-04-09T11:31:49.138201: Epoch   4 Batch  580/3125   train_loss = 0.989

2019-04-09T11:31:49.559825: Epoch   4 Batch  600/3125   train_loss = 0.909

2019-04-09T11:31:49.989670: Epoch   4 Batch  620/3125   train_loss = 0.941

2019-04-09T11:31:50.406780: Epoch   4 Batch  640/3125   train_loss = 0.862

2019-04-09T11:31:50.859348: Epoch   4 Batch  660/3125   train_loss = 0.912

2019-04-09T11:31:51.275455: Epoch   4 Batch  680/3125   train_loss = 0.932

2019-04-09T11:31:51.691919: Epoch   4 Batch  700/3125   train_loss = 0.911

2019-04-09T11:31:52.107926: Epoch   4 Batch  720/3125   train_loss = 0.782

2019-04-09T11:31:52.527656: Epoch   4 Batch  740/3125   train_loss = 0.911

2019-04-09T11:31:52.969684: Epoch   4 Batch  760/3125   train_loss = 0.782

2019-04-09T11:31:53.409955: Epoch   4 Batch  780/3125   train_loss = 0.905

2019-04-09T11:31:53.832580: Epoch   4 Batch  800/3125   train_loss = 0.798

2019-04-09T11:31:54.247683: Epoch   4 Batch  820/3125   train_loss = 0.871

2019-04-09T11:31:54.668933: Epoch   4 Batch  840/3125   train_loss = 0.808

2019-04-09T11:31:55.088550: Epoch   4 Batch  860/3125   train_loss = 0.828

2019-04-09T11:31:55.506010: Epoch   4 Batch  880/3125   train_loss = 0.811

2019-04-09T11:31:55.953370: Epoch   4 Batch  900/3125   train_loss = 0.888

2019-04-09T11:31:56.475762: Epoch   4 Batch  920/3125   train_loss = 0.953

2019-04-09T11:31:56.895627: Epoch   4 Batch  940/3125   train_loss = 0.898

2019-04-09T11:31:57.314926: Epoch   4 Batch  960/3125   train_loss = 0.927

2019-04-09T11:31:57.736404: Epoch   4 Batch  980/3125   train_loss = 1.019

2019-04-09T11:31:58.155519: Epoch   4 Batch 1000/3125   train_loss = 0.972

2019-04-09T11:31:58.571659: Epoch   4 Batch 1020/3125   train_loss = 0.885

2019-04-09T11:31:58.987239: Epoch   4 Batch 1040/3125   train_loss = 0.766

2019-04-09T11:31:59.407857: Epoch   4 Batch 1060/3125   train_loss = 0.975

2019-04-09T11:31:59.827189: Epoch   4 Batch 1080/3125   train_loss = 0.890

2019-04-09T11:32:00.250485: Epoch   4 Batch 1100/3125   train_loss = 0.794

2019-04-09T11:32:00.665686: Epoch   4 Batch 1120/3125   train_loss = 0.830

2019-04-09T11:32:01.076280: Epoch   4 Batch 1140/3125   train_loss = 0.850

2019-04-09T11:32:01.495207: Epoch   4 Batch 1160/3125   train_loss = 0.826

2019-04-09T11:32:01.909009: Epoch   4 Batch 1180/3125   train_loss = 0.813

2019-04-09T11:32:02.325685: Epoch   4 Batch 1200/3125   train_loss = 1.011

2019-04-09T11:32:02.747689: Epoch   4 Batch 1220/3125   train_loss = 0.964

2019-04-09T11:32:03.171817: Epoch   4 Batch 1240/3125   train_loss = 0.782

2019-04-09T11:32:03.593569: Epoch   4 Batch 1260/3125   train_loss = 0.848

2019-04-09T11:32:04.011798: Epoch   4 Batch 1280/3125   train_loss = 0.908

2019-04-09T11:32:04.430913: Epoch   4 Batch 1300/3125   train_loss = 0.794

2019-04-09T11:32:04.846453: Epoch   4 Batch 1320/3125   train_loss = 0.872

2019-04-09T11:32:05.263562: Epoch   4 Batch 1340/3125   train_loss = 0.716

2019-04-09T11:32:05.679810: Epoch   4 Batch 1360/3125   train_loss = 0.847

2019-04-09T11:32:06.099427: Epoch   4 Batch 1380/3125   train_loss = 0.831

2019-04-09T11:32:06.515033: Epoch   4 Batch 1400/3125   train_loss = 0.932

2019-04-09T11:32:06.932977: Epoch   4 Batch 1420/3125   train_loss = 0.911

2019-04-09T11:32:07.349584: Epoch   4 Batch 1440/3125   train_loss = 0.767

2019-04-09T11:32:07.768391: Epoch   4 Batch 1460/3125   train_loss = 0.885

2019-04-09T11:32:08.186503: Epoch   4 Batch 1480/3125   train_loss = 0.855

2019-04-09T11:32:08.610562: Epoch   4 Batch 1500/3125   train_loss = 0.890

2019-04-09T11:32:09.027935: Epoch   4 Batch 1520/3125   train_loss = 0.807

2019-04-09T11:32:09.448052: Epoch   4 Batch 1540/3125   train_loss = 0.970

2019-04-09T11:32:09.864802: Epoch   4 Batch 1560/3125   train_loss = 0.786

2019-04-09T11:32:10.279906: Epoch   4 Batch 1580/3125   train_loss = 0.913

2019-04-09T11:32:10.694227: Epoch   4 Batch 1600/3125   train_loss = 0.830

2019-04-09T11:32:11.113843: Epoch   4 Batch 1620/3125   train_loss = 0.764

2019-04-09T11:32:11.535264: Epoch   4 Batch 1640/3125   train_loss = 0.948

2019-04-09T11:32:11.951873: Epoch   4 Batch 1660/3125   train_loss = 1.003

2019-04-09T11:32:12.368324: Epoch   4 Batch 1680/3125   train_loss = 0.899

2019-04-09T11:32:12.877578: Epoch   4 Batch 1700/3125   train_loss = 0.787

2019-04-09T11:32:13.293848: Epoch   4 Batch 1720/3125   train_loss = 0.872

2019-04-09T11:32:13.710885: Epoch   4 Batch 1740/3125   train_loss = 0.929

2019-04-09T11:32:14.120976: Epoch   4 Batch 1760/3125   train_loss = 0.887

2019-04-09T11:32:14.538451: Epoch   4 Batch 1780/3125   train_loss = 0.851

2019-04-09T11:32:14.959239: Epoch   4 Batch 1800/3125   train_loss = 0.820

2019-04-09T11:32:15.374844: Epoch   4 Batch 1820/3125   train_loss = 0.807

2019-04-09T11:32:15.787555: Epoch   4 Batch 1840/3125   train_loss = 0.903

2019-04-09T11:32:16.206090: Epoch   4 Batch 1860/3125   train_loss = 0.977

2019-04-09T11:32:16.620547: Epoch   4 Batch 1880/3125   train_loss = 0.887

2019-04-09T11:32:17.036185: Epoch   4 Batch 1900/3125   train_loss = 0.734

2019-04-09T11:32:17.454960: Epoch   4 Batch 1920/3125   train_loss = 0.883

2019-04-09T11:32:17.870896: Epoch   4 Batch 1940/3125   train_loss = 0.792

2019-04-09T11:32:18.287611: Epoch   4 Batch 1960/3125   train_loss = 0.756

2019-04-09T11:32:18.708944: Epoch   4 Batch 1980/3125   train_loss = 0.856

2019-04-09T11:32:19.124550: Epoch   4 Batch 2000/3125   train_loss = 0.989

2019-04-09T11:32:19.539524: Epoch   4 Batch 2020/3125   train_loss = 0.987

2019-04-09T11:32:19.955392: Epoch   4 Batch 2040/3125   train_loss = 0.793

2019-04-09T11:32:20.373002: Epoch   4 Batch 2060/3125   train_loss = 0.851

2019-04-09T11:32:20.788365: Epoch   4 Batch 2080/3125   train_loss = 0.980

2019-04-09T11:32:21.207642: Epoch   4 Batch 2100/3125   train_loss = 0.782

2019-04-09T11:32:21.628621: Epoch   4 Batch 2120/3125   train_loss = 0.808

2019-04-09T11:32:22.042255: Epoch   4 Batch 2140/3125   train_loss = 0.840

2019-04-09T11:32:22.456976: Epoch   4 Batch 2160/3125   train_loss = 0.829

2019-04-09T11:32:22.867969: Epoch   4 Batch 2180/3125   train_loss = 0.917

2019-04-09T11:32:23.281501: Epoch   4 Batch 2200/3125   train_loss = 0.803

2019-04-09T11:32:23.696260: Epoch   4 Batch 2220/3125   train_loss = 0.832

2019-04-09T11:32:24.112367: Epoch   4 Batch 2240/3125   train_loss = 0.797

2019-04-09T11:32:24.528127: Epoch   4 Batch 2260/3125   train_loss = 0.872

2019-04-09T11:32:24.944427: Epoch   4 Batch 2280/3125   train_loss = 0.880

2019-04-09T11:32:25.362539: Epoch   4 Batch 2300/3125   train_loss = 0.847

2019-04-09T11:32:25.776624: Epoch   4 Batch 2320/3125   train_loss = 0.908

2019-04-09T11:32:26.191315: Epoch   4 Batch 2340/3125   train_loss = 0.849

2019-04-09T11:32:26.607493: Epoch   4 Batch 2360/3125   train_loss = 0.881

2019-04-09T11:32:27.021723: Epoch   4 Batch 2380/3125   train_loss = 0.835

2019-04-09T11:32:27.440410: Epoch   4 Batch 2400/3125   train_loss = 0.915

2019-04-09T11:32:27.850694: Epoch   4 Batch 2420/3125   train_loss = 0.794

2019-04-09T11:32:28.265448: Epoch   4 Batch 2440/3125   train_loss = 0.800

2019-04-09T11:32:28.684222: Epoch   4 Batch 2460/3125   train_loss = 0.852

2019-04-09T11:32:29.103336: Epoch   4 Batch 2480/3125   train_loss = 0.954

2019-04-09T11:32:29.520448: Epoch   4 Batch 2500/3125   train_loss = 0.811

2019-04-09T11:32:29.941087: Epoch   4 Batch 2520/3125   train_loss = 0.885

2019-04-09T11:32:30.357195: Epoch   4 Batch 2540/3125   train_loss = 0.845

2019-04-09T11:32:30.780301: Epoch   4 Batch 2560/3125   train_loss = 0.665

2019-04-09T11:32:31.195065: Epoch   4 Batch 2580/3125   train_loss = 0.825

2019-04-09T11:32:31.604654: Epoch   4 Batch 2600/3125   train_loss = 0.868

2019-04-09T11:32:32.018648: Epoch   4 Batch 2620/3125   train_loss = 0.813

2019-04-09T11:32:32.435706: Epoch   4 Batch 2640/3125   train_loss = 0.826

2019-04-09T11:32:32.853230: Epoch   4 Batch 2660/3125   train_loss = 1.017

2019-04-09T11:32:33.270841: Epoch   4 Batch 2680/3125   train_loss = 0.769

2019-04-09T11:32:33.692010: Epoch   4 Batch 2700/3125   train_loss = 0.922

2019-04-09T11:32:34.136192: Epoch   4 Batch 2720/3125   train_loss = 0.796

2019-04-09T11:32:34.551979: Epoch   4 Batch 2740/3125   train_loss = 0.870

2019-04-09T11:32:34.968683: Epoch   4 Batch 2760/3125   train_loss = 0.799

2019-04-09T11:32:35.385795: Epoch   4 Batch 2780/3125   train_loss = 0.842

2019-04-09T11:32:35.803009: Epoch   4 Batch 2800/3125   train_loss = 1.050

2019-04-09T11:32:36.220554: Epoch   4 Batch 2820/3125   train_loss = 1.034

2019-04-09T11:32:36.638668: Epoch   4 Batch 2840/3125   train_loss = 0.822

2019-04-09T11:32:37.057100: Epoch   4 Batch 2860/3125   train_loss = 0.789

2019-04-09T11:32:37.477429: Epoch   4 Batch 2880/3125   train_loss = 0.858

2019-04-09T11:32:37.894122: Epoch   4 Batch 2900/3125   train_loss = 0.833

2019-04-09T11:32:38.309463: Epoch   4 Batch 2920/3125   train_loss = 0.849

2019-04-09T11:32:38.727701: Epoch   4 Batch 2940/3125   train_loss = 0.879

2019-04-09T11:32:39.142808: Epoch   4 Batch 2960/3125   train_loss = 0.877

2019-04-09T11:32:39.560118: Epoch   4 Batch 2980/3125   train_loss = 0.827

2019-04-09T11:32:39.978247: Epoch   4 Batch 3000/3125   train_loss = 0.920

2019-04-09T11:32:40.396863: Epoch   4 Batch 3020/3125   train_loss = 1.001

2019-04-09T11:32:40.812059: Epoch   4 Batch 3040/3125   train_loss = 0.956

2019-04-09T11:32:41.228167: Epoch   4 Batch 3060/3125   train_loss = 0.814

2019-04-09T11:32:41.643774: Epoch   4 Batch 3080/3125   train_loss = 1.017

2019-04-09T11:32:42.059833: Epoch   4 Batch 3100/3125   train_loss = 1.032

2019-04-09T11:32:42.478235: Epoch   4 Batch 3120/3125   train_loss = 0.816

2019-04-09T11:32:42.674176: Epoch   4 Batch   16/781   test_loss = 0.830

2019-04-09T11:32:42.806027: Epoch   4 Batch   36/781   test_loss = 0.903

2019-04-09T11:32:42.936875: Epoch   4 Batch   56/781   test_loss = 0.934

2019-04-09T11:32:43.067222: Epoch   4 Batch   76/781   test_loss = 0.974

2019-04-09T11:32:43.197569: Epoch   4 Batch   96/781   test_loss = 1.000

2019-04-09T11:32:43.326913: Epoch   4 Batch  116/781   test_loss = 0.887

2019-04-09T11:32:43.457535: Epoch   4 Batch  136/781   test_loss = 0.811

2019-04-09T11:32:43.588383: Epoch   4 Batch  156/781   test_loss = 0.876

2019-04-09T11:32:43.716224: Epoch   4 Batch  176/781   test_loss = 0.865

2019-04-09T11:32:43.846583: Epoch   4 Batch  196/781   test_loss = 0.786

2019-04-09T11:32:43.975413: Epoch   4 Batch  216/781   test_loss = 0.974

2019-04-09T11:32:44.105258: Epoch   4 Batch  236/781   test_loss = 0.793

2019-04-09T11:32:44.235605: Epoch   4 Batch  256/781   test_loss = 0.827

2019-04-09T11:32:44.367456: Epoch   4 Batch  276/781   test_loss = 1.097

2019-04-09T11:32:44.496146: Epoch   4 Batch  296/781   test_loss = 0.813

2019-04-09T11:32:44.625489: Epoch   4 Batch  316/781   test_loss = 0.820

2019-04-09T11:32:44.754834: Epoch   4 Batch  336/781   test_loss = 0.760

2019-04-09T11:32:44.884178: Epoch   4 Batch  356/781   test_loss = 0.885

2019-04-09T11:32:45.013021: Epoch   4 Batch  376/781   test_loss = 0.872

2019-04-09T11:32:45.141362: Epoch   4 Batch  396/781   test_loss = 0.807

2019-04-09T11:32:45.273213: Epoch   4 Batch  416/781   test_loss = 0.935

2019-04-09T11:32:45.402058: Epoch   4 Batch  436/781   test_loss = 0.955

2019-04-09T11:32:45.533244: Epoch   4 Batch  456/781   test_loss = 0.735

2019-04-09T11:32:45.666098: Epoch   4 Batch  476/781   test_loss = 0.931

2019-04-09T11:32:45.795442: Epoch   4 Batch  496/781   test_loss = 0.966

2019-04-09T11:32:45.925789: Epoch   4 Batch  516/781   test_loss = 0.760

2019-04-09T11:32:46.054130: Epoch   4 Batch  536/781   test_loss = 0.990

2019-04-09T11:32:46.183474: Epoch   4 Batch  556/781   test_loss = 0.868

2019-04-09T11:32:46.312818: Epoch   4 Batch  576/781   test_loss = 0.940

2019-04-09T11:32:46.441522: Epoch   4 Batch  596/781   test_loss = 0.959

2019-04-09T11:32:46.571869: Epoch   4 Batch  616/781   test_loss = 0.930

2019-04-09T11:32:46.702216: Epoch   4 Batch  636/781   test_loss = 0.809

2019-04-09T11:32:46.831560: Epoch   4 Batch  656/781   test_loss = 0.876

2019-04-09T11:32:46.960904: Epoch   4 Batch  676/781   test_loss = 1.057

2019-04-09T11:32:47.092254: Epoch   4 Batch  696/781   test_loss = 0.856

2019-04-09T11:32:47.222600: Epoch   4 Batch  716/781   test_loss = 0.852

2019-04-09T11:32:47.351944: Epoch   4 Batch  736/781   test_loss = 1.075

2019-04-09T11:32:47.480286: Epoch   4 Batch  756/781   test_loss = 0.809

2019-04-09T11:32:47.610131: Epoch   4 Batch  776/781   test_loss = 0.753

Model Trained and Saved

Viewing the results in TensorBoard

tensorboard --logdir=/PATH_TO_CODE/runs/1513402825/summaries/

Saving the parameters

Save save_dir; it is used later when generating predictions.

save_params((save_dir))

load_dir = load_params()

Plotting the training loss

plt.plot(losses['train'], label='Training loss')

plt.legend()

_ = plt.ylim()

Plotting the test loss

With more training iterations, the downward trend would be more apparent.

plt.plot(losses['test'], label='Test loss')

plt.legend()

_ = plt.ylim()
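The per-batch loss is noisy, so the trend is easier to read after smoothing. The following is only a minimal sketch and is not part of the original notebook: `moving_average` and its `window` parameter are hypothetical names, and the toy list stands in for `losses['test']`.

```python
import numpy as np

def moving_average(values, window=5):
    """Return the simple moving average of `values` over `window` points."""
    values = np.asarray(values, dtype=float)
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    kernel = np.ones(window) / window
    # mode="valid" keeps only positions where the window fully overlaps the data
    return np.convolve(values, kernel, mode="valid")

# Toy stand-in for losses['test']; real values come from the training loop
toy_losses = [1.0, 0.8, 1.2, 0.9, 0.7, 1.1, 0.6]
smoothed = moving_average(toy_losses, window=3)
```

The smoothed array can be passed to plt.plot in place of the raw list; a larger window gives a smoother but shorter curve.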

Getting the tensors

Use get_tensor_by_name() to fetch the tensors from loaded_graph; the recommendation functions below need them.

def get_tensors(loaded_graph):

    uid = loaded_graph.get_tensor_by_name("uid:0")

    user_gender = loaded_graph.get_tensor_by_name("user_gender:0")

    user_age = loaded_graph.get_tensor_by_name("user_age:0")

    user_job = loaded_graph.get_tensor_by_name("user_job:0")

    movie_id = loaded_graph.get_tensor_by_name("movie_id:0")

    movie_categories = loaded_graph.get_tensor_by_name("movie_categories:0")

    movie_titles = loaded_graph.get_tensor_by_name("movie_titles:0")

    targets = loaded_graph.get_tensor_by_name("targets:0")

    dropout_keep_prob = loaded_graph.get_tensor_by_name("dropout_keep_prob:0")

    lr = loaded_graph.get_tensor_by_name("LearningRate:0")

    # The two rating-prediction schemes use different tensor names to fetch inference

#     inference = loaded_graph.get_tensor_by_name("inference/inference/BiasAdd:0")

    inference = loaded_graph.get_tensor_by_name("inference/ExpandDims:0")  # Previously MatMul:0; the inference code changed, so this name had to change too (thanks to reader @清歌 for pointing this out)

    movie_combine_layer_flat = loaded_graph.get_tensor_by_name("movie_fc/Reshape:0")

    user_combine_layer_flat = loaded_graph.get_tensor_by_name("user_fc/Reshape:0")

    return uid, user_gender, user_age, user_job, movie_id, movie_categories, movie_titles, targets, lr, dropout_keep_prob, inference, movie_combine_layer_flat, user_combine_layer_flat

Rating a given user and movie

This step runs a forward pass through the network to compute the predicted rating.

def rating_movie(user_id_val, movie_id_val):

    loaded_graph = tf.Graph()  #

    with tf.Session(graph=loaded_graph) as sess:  #

        # Load saved model

        loader = tf.train.import_meta_graph(load_dir + '.meta')

        loader.restore(sess, load_dir)

        # Get Tensors from loaded model

        uid, user_gender, user_age, user_job, movie_id, movie_categories, movie_titles, targets, lr, dropout_keep_prob, inference,_, __ = get_tensors(loaded_graph)  #loaded_graph

        categories = np.zeros([1, 18])

        categories[0] = movies.values[movieid2idx[movie_id_val]][2]

        titles = np.zeros([1, sentences_size])

        titles[0] = movies.values[movieid2idx[movie_id_val]][1]

        feed = {

              uid: np.reshape(users.values[user_id_val-1][0], [1, 1]),

              user_gender: np.reshape(users.values[user_id_val-1][1], [1, 1]),

              user_age: np.reshape(users.values[user_id_val-1][2], [1, 1]),

              user_job: np.reshape(users.values[user_id_val-1][3], [1, 1]),

              movie_id: np.reshape(movies.values[movieid2idx[movie_id_val]][0], [1, 1]),

              movie_categories: categories,  #x.take(6,1)

              movie_titles: titles,  #x.take(5,1)

              dropout_keep_prob: 1}

        # Get Prediction

        inference_val = sess.run([inference], feed)  

        return (inference_val)

rating_movie(234, 1401)

INFO:tensorflow:Restoring parameters from ./save

[array([[3.1157281]], dtype=float32)]
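rating_movie looks up rows through movieid2idx rather than using the movie ID directly, because MovieLens movie IDs are not contiguous (in the recommendation output later in this post, movie ID 1148 sits at row 1132). Below is a minimal sketch of such an ID-to-row mapping. The `build_id_index` helper and the `toy_movies` table are illustrative assumptions, not the author's code.

```python
import numpy as np

def build_id_index(movie_ids):
    """Map each movie ID to its 0-based row position."""
    return {movie_id: row for row, movie_id in enumerate(movie_ids)}

# Hypothetical table with gaps in the ID column: [MovieID, Title]
toy_movies = np.array([[1, "A"], [2, "B"], [5, "C"], [9, "D"]], dtype=object)
toy_movieid2idx = build_id_index(toy_movies[:, 0])

# Row lookup by ID, mirroring movies.values[movieid2idx[movie_id_val]]
row = toy_movies[toy_movieid2idx[5]]
```

User IDs, by contrast, are indexed in rating_movie as user_id_val-1, which relies on them being contiguous and starting at 1.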

Generating the movie feature matrix

Combine the trained movie features into a movie feature matrix and save it locally.

loaded_graph = tf.Graph()  #

movie_matrics = []

with tf.Session(graph=loaded_graph) as sess:  #

    # Load saved model

    loader = tf.train.import_meta_graph(load_dir + '.meta')

    loader.restore(sess, load_dir)

    # Get Tensors from loaded model

    uid, user_gender, user_age, user_job, movie_id, movie_categories, movie_titles, targets, lr, dropout_keep_prob, _, movie_combine_layer_flat, __ = get_tensors(loaded_graph)  #loaded_graph

    for item in movies.values:

        categories = np.zeros([1, 18])

        categories[0] = item.take(2)

        titles = np.zeros([1, sentences_size])

        titles[0] = item.take(1)

        feed = {

            movie_id: np.reshape(item.take(0), [1, 1]),

            movie_categories: categories,  #x.take(6,1)

            movie_titles: titles,  #x.take(5,1)

            dropout_keep_prob: 1}

        movie_combine_layer_flat_val = sess.run([movie_combine_layer_flat], feed)

        movie_matrics.append(movie_combine_layer_flat_val)

pickle.dump((np.array(movie_matrics).reshape(-1, 200)), open('movie_matrics.p', 'wb'))

movie_matrics = pickle.load(open('movie_matrics.p', mode='rb'))

INFO:tensorflow:Restoring parameters from ./save

movie_matrics = pickle.load(open('movie_matrics.p', mode='rb'))
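Because sess.run is called with a one-element list, each loop iteration appends a list holding a single array, and reshape(-1, 200) is what flattens the nested result into one row per movie. The sketch below walks through that shape arithmetic with stand-in arrays. It assumes each fetched feature has shape (1, 200), as the reshape call implies; `fake_run` is a hypothetical substitute for the session call.

```python
import numpy as np

FEATURE_DIM = 200  # matches the reshape(-1, 200) call above

def fake_run(movie_index):
    """Stand-in for sess.run([tensor], feed): a list holding one (1, 200) array."""
    return [np.full((1, FEATURE_DIM), float(movie_index))]

collected = [fake_run(i) for i in range(3)]
stacked = np.array(collected)               # shape (3, 1, 1, 200)
matrix = stacked.reshape(-1, FEATURE_DIM)   # shape (3, 200): one row per movie
```

Row i of the resulting matrix corresponds to row i of movies.values, which is why later code indexes it with movieid2idx.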

Generating the user feature matrix

Combine the trained user features into a user feature matrix and save it locally.

loaded_graph = tf.Graph()  #

users_matrics = []

with tf.Session(graph=loaded_graph) as sess:  #

    # Load saved model

    loader = tf.train.import_meta_graph(load_dir + '.meta')

    loader.restore(sess, load_dir)

    # Get Tensors from loaded model

    uid, user_gender, user_age, user_job, movie_id, movie_categories, movie_titles, targets, lr, dropout_keep_prob, _, __,user_combine_layer_flat = get_tensors(loaded_graph)  #loaded_graph

    for item in users.values:

        feed = {

            uid: np.reshape(item.take(0), [1, 1]),

            user_gender: np.reshape(item.take(1), [1, 1]),

            user_age: np.reshape(item.take(2), [1, 1]),

            user_job: np.reshape(item.take(3), [1, 1]),

            dropout_keep_prob: 1}

        user_combine_layer_flat_val = sess.run([user_combine_layer_flat], feed)

        users_matrics.append(user_combine_layer_flat_val)

pickle.dump((np.array(users_matrics).reshape(-1, 200)), open('users_matrics.p', 'wb'))

users_matrics = pickle.load(open('users_matrics.p', mode='rb'))

INFO:tensorflow:Restoring parameters from ./save

users_matrics = pickle.load(open('users_matrics.p', mode='rb'))

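Recommending movies

Use the generated user feature matrix and movie feature matrix to recommend movies.

People who watched this movie also watched (liked) these movies

First pick the top_k users who like a given movie and take their user feature vectors.
Then compute these users' ratings for all movies.
Pick each user's highest-rated movie as a recommendation.
As before, random selection is added.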

import random

def recommend_other_favorite_movie(movie_id_val, top_k = 20):

    loaded_graph = tf.Graph()  #

    with tf.Session(graph=loaded_graph) as sess:  #

        # Load saved model

        loader = tf.train.import_meta_graph(load_dir + '.meta')

        loader.restore(sess, load_dir)

        probs_movie_embeddings = (movie_matrics[movieid2idx[movie_id_val]]).reshape([1, 200])

        probs_user_favorite_similarity = tf.matmul(probs_movie_embeddings, tf.transpose(users_matrics))

        favorite_user_id = np.argsort(probs_user_favorite_similarity.eval())[0][-top_k:]

    #     print(normalized_users_matrics.eval().shape)

    #     print(probs_user_favorite_similarity.eval()[0][favorite_user_id])

    #     print(favorite_user_id.shape)

        print("您看的电影是：{}".format(movies_orig[movieid2idx[movie_id_val]]))

        # argsort yields 0-based row indices into users_matrics, so they index users_orig directly

        print("喜欢看这个电影的人是：{}".format(users_orig[favorite_user_id]))

        probs_users_embeddings = (users_matrics[favorite_user_id]).reshape([-1, 200])

        probs_similarity = tf.matmul(probs_users_embeddings, tf.transpose(movie_matrics))

        sim = (probs_similarity.eval())

    #     results = (-sim[0]).argsort()[0:top_k]

    #     print(results)

    #     print(sim.shape)

    #     print(np.argmax(sim, 1))

        p = np.argmax(sim, 1)

        print("喜欢看这个电影的人还喜欢看：")

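        results = set()

        # Cap the target at the number of distinct candidates so the loop always terminates

        while len(results) < min(5, len(set(p))):

            c = p[random.randrange(top_k)]

            results.add(c)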

        for val in (results):

            print(val)

            print(movies_orig[val])

        return results

recommend_other_favorite_movie(1401, 20)

INFO:tensorflow:Restoring parameters from ./save

您看的电影是：[1401 'Ghosts of Mississippi (1996)' 'Drama']

喜欢看这个电影的人是：[[1568 'F' 1 10]

 [4814 'M' 18 14]

 [5217 'M' 25 17]

 [1745 'M' 45 0]

 [1763 'M' 35 7]

 [5861 'F' 50 1]

 [493 'M' 50 7]

 [3031 'M' 18 4]

 [2144 'M' 18 0]

 [1644 'M' 18 12]

 [3833 'M' 25 1]

 [5678 'M' 35 17]

 [1701 'F' 25 4]

 [3297 'M' 18 4]

 [4800 'M' 18 4]

 [1109 'M' 18 10]

 [2496 'M' 50 1]

 [100 'M' 35 17]

 [2154 'M' 25 12]

 [4085 'F' 25 6]]

喜欢看这个电影的人还喜欢看：

1132

[1148 'Wrong Trousers, The (1993)' 'Animation|Comedy']

1133

[1149 'JLG/JLG - autoportrait de décembre (1994)' 'Documentary|Drama']

847

[858 'Godfather, The (1972)' 'Action|Crime|Drama']

763

[773 'Touki Bouki (Journey of the Hyena) (1973)' 'Drama']

1950

[2019

 'Seven Samurai (The Magnificent Seven) (Shichinin no samurai) (1954)'

 'Action|Drama']

{763, 847, 1132, 1133, 1950}
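The same procedure can be sketched in plain NumPy, without a TensorFlow session, to make the individual steps easier to follow. This is a sketch under stated assumptions rather than the author's implementation: `recommend_from_fans` and its parameters are hypothetical, the random matrices stand in for movie_matrics and users_matrics, and it samples uniformly from the distinct candidate movies, whereas the code above draws repeatedly from p.

```python
import numpy as np

def recommend_from_fans(movie_row, movie_matrix, user_matrix,
                        top_k=20, n_results=5, rng=None):
    """Return up to n_results movie row indices favoured by the movie's top_k fans."""
    rng = np.random.default_rng() if rng is None else rng
    # Score every user against the chosen movie with a dot product
    user_scores = user_matrix @ movie_matrix[movie_row]
    fan_rows = np.argsort(user_scores)[-top_k:]
    # Score every movie for each fan, then keep each fan's best movie
    fan_movie_scores = user_matrix[fan_rows] @ movie_matrix.T
    best_per_fan = np.unique(np.argmax(fan_movie_scores, axis=1))
    # Sample without replacement, so there is no retry loop to get stuck in
    n_pick = min(n_results, len(best_per_fan))
    return set(rng.choice(best_per_fan, size=n_pick, replace=False).tolist())

# Toy stand-ins for the saved feature matrices (50 movies, 100 users, 200 dims)
rng = np.random.default_rng(0)
toy_movies = rng.normal(size=(50, 200))
toy_users = rng.normal(size=(100, 200))
picks = recommend_from_fans(3, toy_movies, toy_users, top_k=20, n_results=5, rng=rng)
```

The returned values are row indices into the movie matrix, not movie IDs, which matches how the output above prints an index such as 1132 next to the movie whose ID is 1148.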

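Conclusion

The above covers the commonly used recommendation features implemented here: the network is trained as a regression problem, and the resulting user and movie feature matrices are then used to make recommendations.

Further reading

If you are interested in personalized recommendation, the following materials are worth a look:

That's all for today; feedback is welcome!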
