An important part of implementing weight and bias updates for a neural network is the backpropagation algorithm. Specifically, backpropagation propagates the error backwards through the network to compute the partial derivatives of the objective (cost) function with respect to the weights w and biases b, so that w and b can then be updated by subtracting those partial derivatives (scaled by a learning rate). In the Python implementation of gradient descent I wrote about last time, the function backprop(x, y) implements the backpropagation algorithm. (Note: this code was not written from scratch by me; an implementation is available on GitHub at github.com/LCAIZJ/neural-networks-and-deep-learning.)
def backprop(self, x, y):
    # (assumes numpy has been imported as np at module level)
    nabla_b = [np.zeros(b.shape) for b in self.biases]
    nabla_w = [np.zeros(w.shape) for w in self.weights]
    # Forward pass: starting from the input x, compute the activation of each layer
    activation = x
    activations = [x]   # stores the activations of every layer, layer by layer
    zs = []             # stores the weighted inputs z of every layer
    for b, w in zip(self.biases, self.weights):
        z = np.dot(w, activation) + b
        zs.append(z)
        activation = sigmoid(z)
        activations.append(activation)
    # Compute the error (delta) of the output layer
    delta = self.cost_derivative(activations[-1], y) * sigmoid_prime(zs[-1])
    nabla_b[-1] = delta
    nabla_w[-1] = np.dot(delta, activations[-2].transpose())
    # Propagate the error backwards through the hidden layers
    for l in xrange(2, self.num_layers):
        z = zs[-l]
        sp = sigmoid_prime(z)
        delta = np.dot(self.weights[-l+1].transpose(), delta) * sp
        nabla_b[-l] = delta
        nabla_w[-l] = np.dot(delta, activations[-l-1].transpose())
    return (nabla_b, nabla_w)
Here the arguments x and y are a single training example: x is the input and y is the corresponding desired output.
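To connect this back to the update step mentioned at the beginning, below is a minimal sketch of how the (nabla_b, nabla_w) returned by backprop(x, y) might be applied to the network's parameters with a learning rate eta. The method name update_mini_batch and its exact signature are assumptions for illustration (the referenced repository contains its own version); only self.weights, self.biases and self.backprop are taken from the code above.

def update_mini_batch(self, mini_batch, eta):
    # Sketch (assumed, not copied from the repository): accumulate gradients over
    # a mini-batch, then take one gradient-descent step by subtracting the
    # averaged partial derivatives from the weights and biases.
    nabla_b = [np.zeros(b.shape) for b in self.biases]
    nabla_w = [np.zeros(w.shape) for w in self.weights]
    for x, y in mini_batch:   # each (x, y) is one training example
        delta_nabla_b, delta_nabla_w = self.backprop(x, y)
        nabla_b = [nb + dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
        nabla_w = [nw + dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
    self.weights = [w - (eta / len(mini_batch)) * nw
                    for w, nw in zip(self.weights, nabla_w)]
    self.biases = [b - (eta / len(mini_batch)) * nb
                   for b, nb in zip(self.biases, nabla_b)]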
def cost_derivative(self, output_activation, y):
    return (output_activation - y)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    return sigmoid(z) * (1 - sigmoid(z))
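As a quick sanity check (not part of the original code), the analytic derivative sigmoid_prime can be compared against a central finite difference; this assumes numpy is imported as np and the two sigmoid functions above are in scope.

import numpy as np

# Compare sigmoid_prime with a numerical central finite difference.
z = np.array([-2.0, 0.0, 1.5])
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
print(sigmoid_prime(z))                             # sigmoid_prime(0) is exactly 0.25
print(np.max(np.abs(sigmoid_prime(z) - numeric)))   # roughly 1e-10 or smaller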
That concludes this article.