Date: 2014-05-16

Machine Learning in Practice: Implementing Univariate Linear Regression

I. Algorithm Implementation


From the theory covered earlier, we know the formulas for solving linear regression with gradient descent:
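For reference, the standard univariate formulation can be written as follows. This is a sketch in the usual notation, which is an assumption here: m is the number of training examples, and (x^(i), y^(i)) is the i-th example. It matches what the computeCost function in this section evaluates.

```latex
\[
h_\theta(x) = \theta_0 + \theta_1 x
\]
\[
J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta\!\left(x^{(i)}\right) - y^{(i)} \right)^2
\]
```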


The gradient descent procedure for linear regression:
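In the usual notation (assumed here: learning rate \alpha, m training examples, and x_0^{(i)} = 1 for the intercept term), the update rule that the gradientDescent function in this section follows can be sketched as:

```latex
\[
\text{repeat until convergence:} \quad
\theta_j := \theta_j - \alpha \, \frac{1}{m} \sum_{i=1}^{m}
\left( h_\theta\!\left(x^{(i)}\right) - y^{(i)} \right) x_j^{(i)},
\qquad j = 0, 1
\]
```

Both components are updated simultaneously: every new \theta_j is computed from the old values of \theta_0 and \theta_1 before either is overwritten.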




Implementation:

The computeCost function:

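function J = computeCost(X, y, theta)
	% Cost J(theta) for linear regression: half the mean squared error.
	% X is m*(n+1) with a leading column of ones, y is m*1, theta is (n+1)*1.

	m = length(y); % number of training examples
	errors = X * theta - y; % m*1 vector of residuals h(x) - y
	J = (errors' * errors) / (2 * m);

end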
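A quick sanity check of the cost formula on data where the answer can be worked out by hand. This is a minimal sketch: the toy data and the name cost are illustrative assumptions, and the anonymous function simply restates the same formula so the snippet runs on its own.

```octave
% Inline restatement of the cost formula as an anonymous function
% (the name "cost" is illustrative, not part of the original code).
cost = @(X, y, theta) ((X * theta - y)' * (X * theta - y)) / (2 * length(y));

% Hypothetical toy data lying exactly on the line y = x.
X = [1 1; 1 2; 1 3];   % leading column of ones, then the single feature
y = [1; 2; 3];

J_zero = cost(X, y, [0; 0]);   % all predictions are 0, so J = (1 + 4 + 9) / 6
J_fit  = cost(X, y, [0; 1]);   % predictions equal y exactly, so J = 0

fprintf('J at theta = [0; 0]: %f\n', J_zero);
fprintf('J at theta = [0; 1]: %f\n', J_fit);
```

With theta = [0; 0] the residuals are just -y, which gives 14/6; with theta = [0; 1] the line passes through every point, which gives zero.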

The gradientDescent function:

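function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
% X is an m*(n+1) matrix (here n = 1: a column of ones plus one feature)
% y is m*1
% theta is an (n+1)*1 matrix (here two entries)
% alpha is a scalar, the learning rate
% num_iters is the number of iterations

	m = length(y); % number of training examples
	J_history = zeros(num_iters, 1); % value of the cost function after each iteration

	% The number of iterations is fixed in advance.
	for iter = 1:num_iters

		% Compute both updates from the current theta before assigning either,
		% so that theta(1) and theta(2) are updated simultaneously.
		temp1 = theta(1) - (alpha / m) * sum((X * theta - y).* X(:,1));
		temp2 = theta(2) - (alpha / m) * sum((X * theta - y).* X(:,2));
		theta(1) = temp1;
		theta(2) = temp2;
		J_history(iter) = computeCost(X, y, theta);

	end

end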
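The two per-component updates can also be written as a single matrix expression, which works for any number of features. The following is a sketch of that vectorized variant, not the code used in this article: the toy data, the learning rate and the iteration count are assumptions chosen so that the result is easy to check.

```octave
% Hypothetical toy data on the line y = x, so the minimizer is theta = [0; 1].
X = [1 1; 1 2; 1 3];   % leading column of ones, then the single feature
y = [1; 2; 3];
m = length(y);

theta = zeros(2, 1);   % start from the origin
alpha = 0.1;           % learning rate, chosen for this toy data
num_iters = 1500;
J_history = zeros(num_iters, 1);

for iter = 1:num_iters
	errors = X * theta - y;                        % m*1 residuals at the current theta
	theta = theta - (alpha / m) * (X' * errors);   % all components updated at once
	errors = X * theta - y;                        % residuals at the updated theta
	J_history(iter) = (errors' * errors) / (2 * m);
end

disp(theta');   % should be close to [0 1]
```

Because X' * errors is computed from the old theta in one step, the simultaneous-update requirement is met without temporary variables. Plotting J_history against the iteration number is a simple way to check that the cost keeps decreasing for the chosen alpha.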



II. Data Visualization


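The implementation above gives us the hypothesis h(x), but we also want to visualize the data:
(1) Plot the training set as a scatter plot together with the fitted line;
(2) Plot the 3D surface with J(theta) on the z-axis, theta0 on the x-axis and theta1 on the y-axis;
(3) Plot the contour map of the surface from (2).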
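Items (2) and (3) can be sketched by evaluating the cost on a grid of (theta0, theta1) values. The snippet below is only an illustration under stated assumptions: the toy data, the grid ranges and the contour levels are all hypothetical, and the cost formula is written inline so the snippet runs on its own.

```octave
% Hypothetical toy data on the line y = 1 + 2x.
x = [1; 2; 3; 4];
y = [3; 5; 7; 9];
m = length(y);
X = [ones(m, 1), x];   % design matrix with a leading column of ones

% Grid of parameter values to evaluate (ranges are illustrative).
theta0_vals = linspace(-10, 10, 101);
theta1_vals = linspace(-1, 4, 101);
J_vals = zeros(length(theta0_vals), length(theta1_vals));

for i = 1:length(theta0_vals)
	for j = 1:length(theta1_vals)
		t = [theta0_vals(i); theta1_vals(j)];
		errors = X * t - y;
		J_vals(i, j) = (errors' * errors) / (2 * m);
	end
end

% surf and contour expect rows to follow the y-axis (theta1), so transpose.
J_vals = J_vals';

figure;
surf(theta0_vals, theta1_vals, J_vals);
xlabel('\theta_0'); ylabel('\theta_1'); zlabel('J(\theta)');

figure;
% Logarithmically spaced levels show the shape near the minimum more clearly.
contour(theta0_vals, theta1_vals, J_vals, logspace(-2, 3, 20));
xlabel('\theta_0'); ylabel('\theta_1');
```

For this toy data the surface is a bowl whose lowest point sits near theta0 = 1, theta1 = 2, and the contour lines are ellipses centered on the same point.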



1. Scatter plot with the fitted line


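Description: ex1data1.txt contains two columns of data, one per dimension: the first column is X and the second is Y. Draw a scatter plot of the data in Octave. The data looks like this: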

6.1101,17.592

5.5277,9.1302

8.5186,13.662

7.0032,11.854

5.8598,6.8233

8.3829,11.886

........
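To experiment without the file at hand, the six sample rows shown above can be typed in directly. This is a minimal sketch: only those six rows are used, and the marker style and axis labels are illustrative choices.

```octave
% The six sample rows listed above, one row per training example.
data = [6.1101 17.592;
        5.5277 9.1302;
        8.5186 13.662;
        7.0032 11.854;
        5.8598 6.8233;
        8.3829 11.886];

x = data(:, 1);   % first column
y = data(:, 2);   % second column

figure;
plot(x, y, 'rx', 'MarkerSize', 10);   % red crosses, one per example
xlabel('X (first column)');
ylabel('Y (second column)');
```

Passing a marker-only format such as 'rx' to plot draws the points without connecting them, which is what makes this a scatter plot.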


Answer:
(1) data = load('ex1data1.txt');             % load the file into the variable data
(2) X = data( : , 1 ); Y = data( : , 2);     % assign the two columns to X and Y
(3) X = [ones(size(X,1),1), X];              % prepend a column of ones to X
(4)