Encog3Java-User.pdf Translation: Chapter 1, Regression, Classification & Clustering

Date: 2022-12-08 17:10:59

    Translator's note: I plan to start studying artificial intelligence, beginning by translating the English documentation for Encog, a Java AI framework. My ability is limited; this is posted only as a reference.

   Chapter 1
Regression, Classification & Clustering


• Classifying Data
• Regression Analysis of Data
• Clustering Data
• How Machine Learning Problems are Structured


While there are other models, regression, classification and clustering are the three primary ways that data is evaluated for machine learning problems. These three models are the most common and the focus of this book. The next sections will introduce you to classification, regression and clustering.


1.1 Data Classification
Classification attempts to determine what class the input data falls into. Classification is usually a supervised training operation, meaning the user provides data and expected results to the neural network. For data classification, the expected result is identification of the data class.


Supervised neural networks are always trained with known data. During training, the networks are evaluated on how well they classify known data. The hope is that the neural network, once trained, will be able to classify unknown data as well.


Fisher’s Iris Dataset is an example of classification. This is a dataset that contains measurements of Iris flowers. This is one of the most famous datasets and is often used to evaluate machine learning methods. The full dataset is available at the following URL.


http://www.heatonresearch.com/wiki/Iris_Data_Set


Below is a small sampling from the Iris data set.


"Sepal Length", "Sepal Width", "Petal Length", "Petal Width", "Species"
5.1, 3.5, 1.4, 0.2, "setosa"
4.9, 3.0, 1.4, 0.2, "setosa"
4.7, 3.2, 1.3, 0.2, "setosa"
...
7.0, 3.2, 4.7, 1.4, "versicolor"
6.4, 3.2, 4.5, 1.5, "versicolor"
6.9, 3.1, 4.9, 1.5, "versicolor"
...
6.3, 3.3, 6.0, 2.5, "virginica"
5.8, 2.7, 5.1, 1.9, "virginica"
7.1, 3.0, 5.9, 2.1, "virginica"
The above data is shown as a CSV file. CSV is a very common input format for a neural network. The first row is typically a definition for each of the columns in the file. As you can see, for each of the flowers there are five pieces of information provided.


• Sepal Length
• Sepal Width
• Petal Length
• Petal Width
• Species
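To make the row format concrete, the snippet below parses one Iris CSV line into its four numeric attributes and its class label. This is a minimal plain-Java sketch for illustration only; it is not Encog's own CSV-loading API, and the class name is hypothetical.

```java
// Illustrative sketch (not Encog's API): split one Iris CSV row into
// four numeric features and a species label.
public class IrisRow {
    public final double[] features = new double[4]; // sepal len/width, petal len/width
    public final String species;

    public IrisRow(String csvLine) {
        String[] parts = csvLine.split(",");
        for (int i = 0; i < 4; i++) {
            features[i] = Double.parseDouble(parts[i].trim());
        }
        // Strip the surrounding quotes from the species column.
        species = parts[4].trim().replace("\"", "");
    }

    public static void main(String[] args) {
        IrisRow row = new IrisRow("5.1, 3.5, 1.4, 0.2, \"setosa\"");
        System.out.println(row.species + " " + java.util.Arrays.toString(row.features));
    }
}
```

The four `features` values are what would feed the input neurons; the `species` string is the class that classification is expected to produce.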


For classification, the neural network is instructed that, given the sepal length/width and the petal length/width, the species of the flower can be determined. The species is the class.


A class is usually a non-numeric data attribute and as such, membership in the class must be well-defined. For the Iris data set, there are three different types of Iris. If a neural network is trained on three types of Iris, it cannot be expected to identify a rose. All members of the class must be known at the time of training.


1.2 Regression Analysis


In the last section, we learned how to use data to classify data. Often the desired output is not simply a class, but a number. Consider the calculation of an automobile’s miles per gallon (MPG). Provided data such as the engine size and car weight, the MPG for the specified car may be calculated.


Consider the following sample data for five cars:
"mpg", "cylinders", "displacement", "horsepower", "weight", "acceleration", "model year", "origin", "car name"
18.0, 8, 307.0, 130.0, 3504., 12.0, 70, 1, "chevrolet chevelle malibu"
15.0, 8, 350.0, 165.0, 3693., 11.5, 70, 1, "buick skylark 320"
18.0, 8, 318.0, 150.0, 3436., 11.0, 70, 1, "plymouth satellite"
16.0, 8, 304.0, 150.0, 3433., 12.0, 70, 1, "amc rebel sst"
17.0, 8, 302.0, 140.0, 3449., 10.5, 70, 1, "ford torino"
...
For more information, the entirety of this dataset may be found at:
http://www.heatonresearch.com/wiki/MPG_Data_Set


The idea of regression is to train the neural network with input data about the car. However, using regression, the network will not produce a class. The neural network is expected to provide the miles per gallon that the specified car would likely get.


It is also important to note that not every piece of data in the above file will be used. The columns “car name” and “origin” are not used. The name of a car has nothing to do with its fuel efficiency and is therefore excluded. Likewise, the origin does not contribute to this equation. The origin is a numeric value that specifies what geographic region the car was produced in. While some regions do focus on fuel efficiency, this piece of data is far too broad to be useful.


1.3 Clustering


Another common type of analysis is clustering. Unlike the previous two analysis types, clustering is typically unsupervised. Either of the datasets from the previous two sections could be used for clustering. The difference is that clustering analysis would not require the user to provide the species in the case of the Iris dataset, or the MPG number for the MPG dataset. The clustering algorithm is expected to place the data elements into clusters that correspond to the species or MPG.


For clustering, the machine learning method simply looks at the data and attempts to place that data into a number of clusters. The number of clusters expected must be defined ahead of time. If the number of clusters changes, the clustering machine learning method will need to be retrained.
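As a concrete illustration of unsupervised cluster assignment, the sketch below runs a tiny one-dimensional k-means over some MPG-like values, with the number of clusters k fixed up front and no labels provided. This is a generic toy illustration, not Encog's clustering API; Encog ships its own k-means support.

```java
import java.util.Arrays;

// Toy one-dimensional k-means: k is chosen ahead of time, and the
// algorithm groups the unlabeled values around k moving centers.
public class TinyKMeans {
    // Returns the cluster centers after a fixed number of refinement passes.
    public static double[] cluster(double[] data, int k, int passes) {
        double[] centers = Arrays.copyOf(data, k); // seed with the first k points
        for (int pass = 0; pass < passes; pass++) {
            double[] sum = new double[k];
            int[] count = new int[k];
            for (double x : data) {          // assign each point to its nearest center
                int best = nearest(centers, x);
                sum[best] += x;
                count[best]++;
            }
            for (int i = 0; i < k; i++) {    // move each center to its cluster mean
                if (count[i] > 0) centers[i] = sum[i] / count[i];
            }
        }
        return centers;
    }

    // Index of the center closest to x.
    public static int nearest(double[] centers, double x) {
        int best = 0;
        for (int i = 1; i < centers.length; i++) {
            if (Math.abs(x - centers[i]) < Math.abs(x - centers[best])) best = i;
        }
        return best;
    }

    public static void main(String[] args) {
        // MPG-like values: a low-mileage group and a high-mileage group.
        double[] mpg = {14.0, 15.0, 16.0, 31.0, 32.0, 33.0};
        System.out.println(Arrays.toString(cluster(mpg, 2, 10)));
        // Converges to centers 15.0 and 32.0: the two fuel-efficiency groups.
    }
}
```

Note how k must be supplied by the caller; if k changes, the whole clustering must be rerun, which mirrors the retraining point above.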


Clustering is very similar to classification, with its output being a cluster, which is similar to a class. However, clustering differs from regression as it does not provide a number. So if clustering were used with the MPG dataset, the output would need to be a cluster that the car falls into. Perhaps each cluster would specify the varying level of fuel efficiency for the vehicle. Perhaps the clusters would group the cars into clusters that demonstrated some relationship that had not yet been noticed.


1.4 Structuring a Neural Network


Now that the three major problem models for neural networks have been identified, it is time to examine how data is actually presented to the neural network. This section focuses mainly on how the neural network is structured to accept data items and provide output. The following chapter will detail how to normalize the data prior to being presented to the neural network.


Neural networks are typically layered with an input and output layer at minimum. There may also be hidden layers. Some neural network types are not broken up into any formal layers beyond the input and output layer. However, the input layer and output layer will always be present and may be incorporated in the same layer. We will now examine the input layer, output layer and hidden layers.


1.4.1 Understanding the Input Layer


The input layer is the first layer in a neural network. This layer, like all layers, contains a specific number of neurons. The neurons in a layer all contain similar properties. Typically, the input layer will have one neuron for each attribute that the neural network will use for classification, regression or clustering.


Consider the previous examples. The Iris dataset has four input neurons. These neurons represent the petal width/length and the sepal width/length. The MPG dataset has more input neurons. The number of input neurons does not always directly correspond to the number of attributes and some attributes will take more than one neuron to encode. This encoding process, called normalization, will be covered in the next chapter.


The number of neurons determines how a layer’s input is structured. For each input neuron, one double value is stored. For example, the following array could be used as input to a layer that contained five neurons.


double[] input = new double[5];


The input to a neural network is always an array of the type double. The size of this array directly corresponds to the number of neurons on the input layer. Encog uses the MLData interface to define classes that hold these arrays. The array above can be easily converted into an MLData object with the following line of code.


MLData data = new BasicMLData(input);


The MLData interface defines any “array like” data that may be presented to Encog. Input must always be presented to the neural network inside of a MLData object. The BasicMLData class implements the MLData interface. However, the BasicMLData class is not the only way to provide Encog with data. Other implementations of MLData are used for more specialized types of data.


The BasicMLData class simply provides a memory-based data holder for the neural network data. Once the neural network processes the input, a MLData-based class will be returned from the neural network’s output layer. The output layer is discussed in the next section.


1.4.2 Understanding the Output Layer


The output layer is the final layer in a neural network. This layer provides the output after all previous layers have processed the input. The output from the output layer is formatted very similarly to the data that was provided to the input layer. The neural network outputs an array of doubles.


The neural network wraps the output in a class based on the MLData interface. Most of the built-in neural network types return a BasicMLData class as the output. However, future and third-party neural network classes may return different classes based on other implementations of the MLData interface.


Neural networks are designed to accept input (an array of doubles) and then produce output (also an array of doubles). Determining how to structure the input data and attaching meaning to the output are the two main challenges of adapting a problem to a neural network. The real power of a neural network comes from its pattern recognition capabilities. The neural network should be able to produce the desired output even if the input has been slightly distorted.


Regression neural networks typically produce a single output neuron that provides the numeric value produced by the neural network. Multiple output neurons may exist if the same neural network is supposed to predict two or more numbers for the given inputs.


Classification produces one or more output neurons, depending on how the output class was encoded. There are several different ways to encode classes. This will be discussed in greater detail in the next chapter.


Clustering is set up similarly, as the output neurons identify which data belongs to what cluster.


1.4.3 Hidden Layers


As previously discussed, neural networks contain an input layer and an output layer. Sometimes the input layer and output layer are the same, but they are most often two separate layers. Additionally, other layers may exist between the input and output layers; these are called hidden layers. These hidden layers are simply inserted between the input and output layers. The hidden layers can also take on more complex structures.


The only purpose of the hidden layers is to allow the neural network to better produce the expected output for the given input. Neural network programming involves first defining the input and output layer neuron counts. Once it is determined how to translate the programming problem into the input and output neuron counts, it is time to define the hidden layers.


The hidden layers are very much a “black box.” The problem is defined in terms of the neuron counts for the hidden and output layers. How the neural network produces the correct output is performed in part by hidden layers. Once the structure of the input and output layers is defined, the hidden layer structure that optimally learns the problem must also be defined.


The challenge is to avoid creating a hidden structure that is either too complex or too simple. Too complex of a hidden structure will take too long to train. Too simple of a hidden structure will not learn the problem. A good starting point is a single hidden layer with a number of neurons equal to twice the input layer. Depending on this network’s performance, the hidden layer’s number of neurons is either increased or decreased.
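The sizing heuristic above can be sketched as a simple candidate generator: start at twice the input-layer size, then try nearby counts in both directions depending on performance. The helper below is hypothetical, written only to illustrate the search order; it is not part of Encog's API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the sizing heuristic: begin with a
// hidden layer of 2 * inputCount neurons, then widen the search up and down.
public class HiddenSizeSearch {
    public static List<Integer> candidates(int inputCount, int spread) {
        int start = 2 * inputCount;          // heuristic starting point
        List<Integer> order = new ArrayList<>();
        order.add(start);
        for (int d = 1; d <= spread; d++) {  // alternately try larger and smaller
            order.add(start + d);
            if (start - d >= 1) order.add(start - d);
        }
        return order;
    }

    public static void main(String[] args) {
        // For an Iris-style network with 4 inputs: try 8 first, then 9, 7, 10, 6.
        System.out.println(candidates(4, 2));
    }
}
```

In practice each candidate count would be trained and evaluated, keeping the smallest hidden layer that still learns the problem.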


Developers often wonder how many hidden layers to use. Some research has indicated that a second hidden layer is rarely of any value. Encog is an excellent way to perform a trial and error search for the most optimal hidden layer configuration. For more information see the following URL:


http://www.heatonresearch.com/wiki/Hidden_Layers


Some neural networks have no hidden layers, with the input layer directly connected to the output layer. Further, some neural networks have only a single layer in which the single layer is self-connected. These connections permit the network to learn. Contained in these connections, called synapses, are individual weight matrixes. These values are changed as the neural network learns. The next chapter delves more into weight matrixes.


1.5 Using a Neural Network


This section will detail how to structure a neural network for a very simple problem: to design a neural network that can function as an XOR operator. Learning the XOR operator is a frequent “first example” when demonstrating the architecture of a new neural network. Just as most new programming languages are first demonstrated with a program that simply displays “Hello World,” neural networks are frequently demonstrated with the XOR operator. Learning the XOR operator is sort of the “Hello World” application for neural networks.


1.5.1 The XOR Operator and Neural Networks


The XOR operator is one of the common Boolean logical operators. The other two are the AND and OR operators. For each of these logical operators, there are four different combinations. All possible combinations for each operator are shown below.


0 AND 0 = 0
1 AND 0 = 0
0 AND 1 = 0
1 AND 1 = 1


0 OR 0 = 0
1 OR 0 = 1
0 OR 1 = 1
1 OR 1 = 1


0 XOR 0 = 0
1 XOR 0 = 1
0 XOR 1 = 1
1 XOR 1 = 0
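The three truth tables above can be checked directly with Java's built-in bitwise operators on 0/1 values, which is a handy sanity reference when preparing the training data:

```java
// The AND, OR and XOR truth tables above, reproduced with Java's
// integer bitwise operators on 0/1 values.
public class BooleanTables {
    public static int and(int a, int b) { return a & b; }
    public static int or(int a, int b)  { return a | b; }
    public static int xor(int a, int b) { return a ^ b; }

    public static void main(String[] args) {
        for (int a = 0; a <= 1; a++) {
            for (int b = 0; b <= 1; b++) {
                System.out.println(a + " XOR " + b + " = " + xor(a, b));
            }
        }
    }
}
```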


1.5.2 Structuring a Neural Network for XOR


There are two inputs to the XOR operator and one output. The input and output layers will be structured accordingly. The input neurons are fed the following double values:


0.0 , 0.0
1.0 , 0.0
0.0 , 1.0
1.0 , 1.0
These values correspond to the inputs to the XOR operator, shown above. The one output neuron is expected to produce the following double values:


0.0
1.0
1.0
0.0
This is one way that the neural network can be structured. This method allows a simple feedforward neural network to learn the XOR operator. The feedforward neural network, also called a perceptron, is one of the first neural network architectures that we will learn.
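Before training is introduced, it may help to see why a hidden layer makes XOR learnable at all. The sketch below hand-wires a 2-2-1 network with step activations and hand-picked weights (purely illustrative; these weights are not produced by Encog training): one hidden neuron computes OR, the other AND, and the output fires when OR is on but AND is off, which is exactly XOR.

```java
// A 2-2-1 feedforward network with step activations and hand-picked
// weights (illustrative only; a trained network finds its own weights).
public class HandWiredXor {
    static double step(double x) { return x >= 0 ? 1.0 : 0.0; }

    public static double compute(double x1, double x2) {
        double h1 = step(x1 + x2 - 0.5);  // OR: fires if either input is 1
        double h2 = step(x1 + x2 - 1.5);  // AND: fires only if both inputs are 1
        return step(h1 - h2 - 0.5);       // OR but not AND: exactly XOR
    }

    public static void main(String[] args) {
        System.out.println(compute(0, 0) + " " + compute(1, 0) + " "
            + compute(0, 1) + " " + compute(1, 1));
    }
}
```

The constant terms (-0.5 and -1.5) play the role of the bias neurons discussed later in this chapter; without them, no setting of the input weights alone reproduces XOR.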


There are other ways that the XOR data could be presented to the neural network. Later in this book, two examples of recurrent neural networks will be explored including Elman and Jordan styles of neural networks. These methods would treat the XOR data as one long sequence, basically concatenating the truth table for XOR together, resulting in one long XOR sequence, such as:


0.0 , 0.0 , 0.0 ,
0.0 , 1.0 , 1.0 ,
1.0 , 0.0 , 1.0 ,
1.0 , 1.0 , 0.0
The line breaks are only for readability; the neural network treats XOR as a long sequence. By using the data above, the network has a single input neuron and a single output neuron. The input neuron is fed one value from the list above and the output neuron is expected to return the next value.


This shows that there are often multiple ways to model the data for a neural network. How the data is modeled will greatly influence the success of a neural network. If one particular model is not working, another should be considered. The next step is to format the XOR data for a feedforward neural network.


Because the XOR operator has two inputs and one output, the neural network follows suit. Additionally, the neural network has a single hidden layer with two neurons to help process the data. The choice of two neurons in the hidden layer is arbitrary and often comes down to trial and error. The XOR problem is simple and two hidden neurons are sufficient to solve it. A diagram for this network is shown in Figure 1.1.

There are four different types of neurons in the above network. These are summarized below:


• Input Neurons: I1, I2
• Output Neuron: O1
• Hidden Neurons: H1, H2
• Bias Neurons: B1, B2
The input, output and hidden neurons were discussed previously. The new neuron type seen in this diagram is the bias neuron. A bias neuron always outputs a value of 1 and never receives input from the previous layer. In a nutshell, bias neurons allow the neural network to learn patterns more effectively. They serve a similar function to the hidden neurons. Without bias neurons, it is very hard for the neural network to output a value of one when the input is zero. This is not so much a problem for XOR data, but it can be for other data sets. To read more about their exact function, visit the following URL:


http://www.heatonresearch.com/wiki/Bias
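The point about zero inputs can be seen with a single sigmoid neuron. With no bias, a zero input forces the weighted sum to zero and the output to sigmoid(0) = 0.5 no matter what the weight is; adding a bias term lets the neuron move its output anywhere in (0, 1). A small plain-Java illustration, independent of Encog:

```java
// One sigmoid neuron, with and without a bias term. With input 0 and no
// bias, the output is pinned at sigmoid(0) = 0.5 regardless of the weight.
public class BiasDemo {
    static double sigmoid(double x) { return 1.0 / (1.0 + Math.exp(-x)); }

    public static double neuron(double input, double weight, double bias) {
        return sigmoid(input * weight + bias);
    }

    public static void main(String[] args) {
        System.out.println(neuron(0.0, 3.0, 0.0));  // no bias: stuck at 0.5
        System.out.println(neuron(0.0, 3.0, 5.0));  // bias pushes the output toward 1
    }
}
```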
Now look at the code used to produce a neural network that solves the XOR operator. The complete code is included with the Encog examples and can be found at the following location. 


org.encog.examples.neural.xor.XORHelloWorld


The example begins by creating the neural network seen in Figure 1.1. The code needed to create this network is relatively simple:


BasicNetwork network = new BasicNetwork();
network.addLayer(new BasicLayer(null, true, 2));
network.addLayer(new BasicLayer(new ActivationSigmoid(), true, 3));
network.addLayer(new BasicLayer(new ActivationSigmoid(), false, 1));
network.getStructure().finalizeStructure();
network.reset();


In the above code, a BasicNetwork is being created. Three layers are added to this network. The first layer, which becomes the input layer, has two neurons. The hidden layer is added second and has three neurons. Lastly, the output layer is added and has a single neuron. Finally, the finalizeStructure method is called to inform the network that no more layers are to be added. The call to reset randomizes the weights in the connections between these layers.


Neural networks always begin with random weight values. A process called training refines these weights to values that will provide the desired output. Because neural networks always start with random values, very different results occur from two runs of the same program. Some random weights provide a better starting point than others. Sometimes random weights will be far enough off that the network will fail to learn. In this case, the weights should be randomized again and the process restarted.


You will also notice the ActivationSigmoid class in the above code. This specifies that the neural network should use the sigmoid activation function. Activation functions will be covered in Chapter 4. The activation functions are only placed on the hidden and output layers; the input layer does not have an activation function. If an activation function were specified for the input layer, it would have no effect.
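For reference, the sigmoid function is sigmoid(x) = 1 / (1 + e^(-x)). It squashes any weighted sum into the open interval (0, 1), which is also why the trained XOR outputs shown later in this chapter only approach 0.0 and 1.0 rather than hitting them exactly. A quick standalone check:

```java
// The sigmoid activation: maps any real input into (0, 1),
// with sigmoid(0) = 0.5 and sigmoid(-x) = 1 - sigmoid(x).
public class SigmoidCheck {
    public static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    public static void main(String[] args) {
        for (double x : new double[]{-10, -1, 0, 1, 10}) {
            System.out.println("sigmoid(" + x + ") = " + sigmoid(x));
        }
    }
}
```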


Each layer also specifies a boolean value. This boolean value specifies whether bias neurons are present on a layer. The output layer, as shown in Figure 1.1, does not have a bias neuron as the input and hidden layers do. This is because a bias neuron is only connected to the next layer. The output layer is the final layer, so there is no need for a bias neuron. If a bias neuron were specified on the output layer, it would have no effect.


These weights make up the long-term memory of the neural network. Some neural networks also contain context layers which give the neural network a short-term memory as well. The neural network learns by modifying these weight values. This is also true of the Elman and Jordan neural networks.


Now that the neural network has been created, it must be trained. Training is the process where the random weights are refined to produce output closer to the desired output. Training is discussed in the next section.


1.5.3 Training a Neural Network


To train the neural network, a MLDataSet object is constructed. This object contains the inputs and the expected outputs. To construct this object, two arrays are created. The first array will hold the input values for the XOR operator. The second array will hold the ideal outputs for each of the four corresponding input values. These will correspond to the possible values for XOR. To review, the four possible values are as follows:


0 XOR 0 = 0
1 XOR 0 = 1
0 XOR 1 = 1
1 XOR 1 = 0
First, construct an array to hold the four input values to the XOR operator using a two-dimensional double array. This array is as follows:


public static double XOR_INPUT[][] = {
    { 0.0, 0.0 },
    { 1.0, 0.0 },
    { 0.0, 1.0 },
    { 1.0, 1.0 } };
Likewise, an array must be created for the expected outputs for each of the input values. This array is as follows:


public static double XOR_IDEAL[][] = {
    { 0.0 },
    { 1.0 },
    { 1.0 },
    { 0.0 } };


Even though there is only one output value, a two-dimensional array must still be used to represent the output. If there is more than one output neuron, additional columns are added to the array.


Now that the two input arrays are constructed, a MLDataSet object must be created to hold the training set. This object is created as follows:


MLDataSet trainingSet = new BasicMLDataSet(XOR_INPUT, XOR_IDEAL);


Now that the training set has been created, the neural network can be trained. Training is the process where the neural network’s weights are adjusted to better produce the expected output. Training will continue for many iterations until the error rate of the network is below an acceptable level. First, a training object must be created. Encog supports many different types of training.


For this example, Resilient Propagation (RPROP) training is used. RPROP is perhaps the best general-purpose training algorithm supported by Encog. Other training techniques are provided as well, since certain problems are solved better with certain training techniques. The following code constructs a RPROP trainer:


MLTrain train = new ResilientPropagation(network, trainingSet);


All training classes implement the MLTrain interface. The RPROP algorithm is implemented by the ResilientPropagation class, which is constructed above. Once the trainer is constructed, the neural network should be trained. Training the neural network involves calling the iteration method on the MLTrain class until the error is below a specific value. The error is the degree to which the neural network output matches the desired output.


int epoch = 1;
do {
    train.iteration();
    System.out.println("Epoch #" + epoch + " Error: " + train.getError());
    epoch++;
} while (train.getError() > 0.01);
The above code loops through as many iterations, or epochs, as it takes to get the error rate for the neural network below 1%. Once the neural network has been trained, it is ready for use. The next section will explain how to use a neural network.
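The error figure the loop tests against can be illustrated with a plain mean squared error between the network's actual outputs and the ideal outputs. Encog trainers support several error functions, so the exact value a given trainer reports may differ; MSE is shown here only as a representative example.

```java
// Mean squared error between actual and ideal outputs: one common way
// to quantify how far a network is from its training targets.
public class MseExample {
    public static double mse(double[] actual, double[] ideal) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double diff = actual[i] - ideal[i];
            sum += diff * diff;
        }
        return sum / actual.length;
    }

    public static void main(String[] args) {
        // Outputs similar to the trained XOR results shown later in this chapter.
        double[] actual = {0.0028, 0.9904, 0.9837, 0.0012};
        double[] ideal  = {0.0, 1.0, 1.0, 0.0};
        System.out.println("MSE = " + mse(actual, ideal));
        System.out.println("Below the 0.01 threshold? " + (mse(actual, ideal) < 0.01));
    }
}
```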


1.5.4 Executing a Neural Network


Making use of the neural network involves calling the compute method on the BasicNetwork class. Here we loop through every training set value and display the output from the neural network:


System.out.println("Neural Network Results:");
for (MLDataPair pair : trainingSet) {
    final MLData output = network.compute(pair.getInput());
    System.out.println(pair.getInput().getData(0)
        + "," + pair.getInput().getData(1)
        + ", actual=" + output.getData(0)
        + ", ideal=" + pair.getIdeal().getData(0));
}
The compute method accepts an MLData class and also returns another MLData object. The returned object contains the output from the neural network, which is displayed to the user. When the program runs, the training results are displayed first. For each epoch, the current error rate is displayed.


Epoch #1 Error: 0.5604437512295236
Epoch #2 Error: 0.5056375155784316
Epoch #3 Error: 0.5026960720526166
Epoch #4 Error: 0.4907299498390594
...
Epoch #104 Error: 0.01017278345766472
Epoch #105 Error: 0.010557202078697751
Epoch #106 Error: 0.011034965164672806
Epoch #107 Error: 0.009682102808616387


Finally, the program displays the results from each of the training items as follows:


Neural Network Results:
0.0, 0.0, actual=0.002782538818034049, ideal=0.0
1.0, 0.0, actual=0.9903741937121177, ideal=1.0
0.0, 1.0, actual=0.9836807956566187, ideal=1.0
1.0, 1.0, actual=0.0011646072586172778, ideal=0.0


As you can see, the network has not been trained to give the exact results. This is normal. Because the network was trained to 1% error, each of the results will generally also be within 1% of the expected value. Because the neural network is initialized to random values, the final output will be different on a second run of the program.


Neural Network Results:
0.0, 0.0, actual=0.005489822214926685, ideal=0.0
1.0, 0.0, actual=0.985425090860287, ideal=1.0
0.0, 1.0, actual=0.9888064742994463, ideal=1.0
1.0, 1.0, actual=0.005923146369557053, ideal=0.0
The second run output is slightly different. This is normal. This is the first Encog example. All of the examples contained in this book are also included with the examples downloaded with Encog. For more information on how to download these examples and where this particular example is located, refer to Appendix A, “Installing Encog”.