Situation&Task:
Ran into a machine learning regression problem.
Action:
1. Tried the LinearRegression model, which involved some parameter tuning. Then read the LinearRegression source code and took brief notes.
2. Also tried RidgeCV (L2 regularization) and LassoCV (L1 regularization).
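As a quick illustration of the two regularized variants, here is a minimal sketch on synthetic data (the toy coefficients, noise level, and alpha grid are my own choices for illustration, not from the original experiment):

```python
import numpy as np
from sklearn.linear_model import RidgeCV, LassoCV

# Toy regression data: y = 3*x0 - 2*x1 + noise
rng = np.random.RandomState(0)
X = rng.randn(100, 2)
y = 3 * X[:, 0] - 2 * X[:, 1] + 0.1 * rng.randn(100)

# RidgeCV: L2 penalty, alpha picked by built-in cross-validation
ridge = RidgeCV(alphas=[0.1, 1.0, 10.0]).fit(X, y)

# LassoCV: L1 penalty, alpha picked by CV along a regularization path
lasso = LassoCV(cv=5).fit(X, y)

print(ridge.alpha_, ridge.coef_)
print(lasso.alpha_, lasso.coef_)
```

Both estimators expose the selected penalty strength as `alpha_`, so there is no separate grid-search step to write.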
Result&learned:
In LinearRegression, the fit function solves for the parameters w and b in closed form as a least-squares problem (often described as the normal-equation solution), not by iterative optimization. Concretely, in the dense case the solver is scipy.linalg.lstsq(X, y).
See the official scipy documentation for details; the key sentence there is: "Solves the equation a x = b by computing a vector x that minimizes the Euclidean 2-norm || b - a x ||^2."
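To see what that sentence means in practice, here is a standalone call to scipy.linalg.lstsq on a small overdetermined system (the toy numbers are made up; the data satisfy b = 4 + 2*x exactly, so the least-squares solution is exact):

```python
import numpy as np
from scipy import linalg

# 4 samples, 2 columns (an intercept column of ones, then x)
a = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
b = np.array([6.0, 8.0, 10.0, 12.0])  # b = 4 + 2*x

# lstsq returns a 4-tuple (solution, residues, rank, singular values) --
# the same tuple LinearRegression.fit unpacks into coef_, _residues,
# rank_ and singular_.
x, residues, rank, sv = linalg.lstsq(a, b)
print(x)  # ≈ [4., 2.]
```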
    def fit(self, X, y, sample_weight=None):
        """
        Fit linear model.

        Parameters
        ----------
        X : numpy array or sparse matrix of shape [n_samples, n_features]
            Training data

        y : numpy array of shape [n_samples, n_targets]
            Target values. Will be cast to X's dtype if necessary

        sample_weight : numpy array of shape [n_samples]
            Individual weights for each sample

            .. versionadded:: 0.17
               parameter *sample_weight* support to LinearRegression.

        Returns
        -------
        self : returns an instance of self.
        """
        n_jobs_ = self.n_jobs
        X, y = check_X_y(X, y, accept_sparse=['csr', 'csc', 'coo'],
                         y_numeric=True, multi_output=True)

        if sample_weight is not None and np.atleast_1d(sample_weight).ndim > 1:
            raise ValueError("Sample weights must be 1D array or scalar")

        X, y, X_offset, y_offset, X_scale = self._preprocess_data(
            X, y, fit_intercept=self.fit_intercept, normalize=self.normalize,
            copy=self.copy_X, sample_weight=sample_weight)

        if sample_weight is not None:
            # Sample weight can be implemented via a simple rescaling.
            X, y = _rescale_data(X, y, sample_weight)

        if sp.issparse(X):
            if y.ndim < 2:
                out = sparse_lsqr(X, y)
                self.coef_ = out[0]
                self._residues = out[3]
            else:
                # sparse_lstsq cannot handle y with shape (M, K)
                outs = Parallel(n_jobs=n_jobs_)(
                    delayed(sparse_lsqr)(X, y[:, j].ravel())
                    for j in range(y.shape[1]))
                self.coef_ = np.vstack(out[0] for out in outs)
                self._residues = np.vstack(out[3] for out in outs)
        else:
            self.coef_, self._residues, self.rank_, self.singular_ = \
                linalg.lstsq(X, y)  ## took me a long time to find -- the solver is here ##
            self.coef_ = self.coef_.T

        if y.ndim == 1:
            self.coef_ = np.ravel(self.coef_)
        self._set_intercept(X_offset, y_offset, X_scale)
        return self
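Putting the pieces together: for dense X with the default fit_intercept=True and no sample weights, the whole fit can be reproduced by hand. Center X and y (what _preprocess_data does), solve the least-squares problem with scipy.linalg.lstsq (the call the note found above), then recover the intercept as b = y_mean - X_mean · w (what _set_intercept does). A sketch on synthetic data, assuming those defaults:

```python
import numpy as np
from scipy import linalg
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(42)
X = rng.randn(50, 3)
y = X @ np.array([1.5, -2.0, 0.5]) + 4.0 + 0.01 * rng.randn(50)

# sklearn's closed-form fit
lr = LinearRegression().fit(X, y)

# The same computation by hand: center, solve least squares,
# then recover the intercept from the column means.
X_offset, y_offset = X.mean(axis=0), y.mean()
w, _, _, _ = linalg.lstsq(X - X_offset, y - y_offset)
b = y_offset - X_offset @ w

print(np.allclose(w, lr.coef_), np.allclose(b, lr.intercept_))
```

The two results should agree to numerical precision, which confirms that fit is a single direct solve rather than any iterative procedure.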