Linear Regression with machine learning methods

Time: 2022-05-05 02:57:02

Ha, it's English time. Let's spend a few minutes learning a simple machine learning example in a short passage.

Introduction

  • What is machine learning? You design methods that let a machine learn and improve by itself.
  • Starting from that idea, this passage introduces three methods for finding the optimal k and b of a linear regression (y = k*x + b).
  • We produce the data ourselves.
  1. Self-sufficient data generation
  2. Random Chosen Method
  3. Supervised Direction Method
  4. Gradient Descent Method
  5. Conclusion

Self-sufficient Data Generation

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import random

# produce data
age_with_fares = pd.DataFrame({"Fare":[263.0, 247.5208, 146.5208, 153.4625, 135.6333, 247.5208, 164.8667, 134.5, 135.6333, 153.4625, 134.5, 263.0, 211.5, 263.0, 151.55, 153.4625, 227.525, 211.3375, 211.3375],
"Age":[23.0, 24.0, 58.0, 58.0, 35.0, 50.0, 31.0, 40.0, 36.0, 38.0, 41.0, 24.0, 27.0, 64.0, 25.0, 40.0, 38.0, 29.0, 43.0]})
sub_fare = age_with_fares['Fare']
sub_age = age_with_fares['Age']

# show our data
plt.scatter(sub_age,sub_fare)
plt.show()

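# the model: a straight line with slope k and intercept b
def func(age, k, b): return k*age+b

# the loss: mean absolute error (L1) between the real and the estimated values
def loss(y,yhat): return np.mean(np.abs(y-yhat))
# Here we use only the absolute difference as the loss. Mean squared error (L2) and other losses are also possible.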
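
To make the difference between the L1 loss used here and the L2 loss mentioned in the comment concrete, here is a small sketch on made-up numbers. The names mae and mse and the toy arrays are assumptions for illustration only and are not part of the original notebook.

```python
import numpy as np

def mae(y, yhat):
    # mean absolute error (L1), the same formula as loss() above
    return np.mean(np.abs(y - yhat))

def mse(y, yhat):
    # mean squared error (L2)
    return np.mean((y - yhat) ** 2)

# Toy values, invented for this sketch.
y = np.array([100.0, 100.0, 100.0, 100.0])
yhat_even = np.array([110.0, 110.0, 110.0, 110.0])     # every point is off by 10
yhat_outlier = np.array([100.0, 100.0, 100.0, 140.0])  # one point is off by 40

print(mae(y, yhat_even), mae(y, yhat_outlier))  # 10.0 10.0
print(mse(y, yhat_even), mse(y, yhat_outlier))  # 100.0 400.0
```

Both predictions get the same L1 loss, while the L2 loss punishes the single large miss four times harder. That is why the L1 loss is less sensitive to outliers.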

Random Chosen Method

min_error_rate = float('inf')

loop_times = 10000
losses = []

def step(): return random.random() * 2 - 1
# random.random() gives a number between 0 and 1; (0,1)*2 -> (0,2); minus 1 -> (-1,1).
# Random generation plus looping is what drives the learning.

while loop_times > 0:
    k_hat = random.random() * 20 - 10
    b_hat = random.random() * 20 - 10
    estimated_fares = func(sub_age, k_hat, b_hat)
    error_rate = loss(y=sub_fare, yhat=estimated_fares)
    if error_rate < min_error_rate:  # this is where the self-supervision mechanism shows up
        min_error_rate = error_rate
        losses.append(error_rate)
        best_k = k_hat
        best_b = b_hat
    loop_times -= 1

plt.scatter(sub_age, sub_fare)
plt.plot(sub_age, func(sub_age, best_k, best_b), c = 'r')
plt.show()

Show the loss change

plt.plot(range(len(losses)), losses)
plt.show()

Explanation

  • The loss sometimes drops quickly and sometimes slowly, but it does decrease in the end.
  • One shortcoming: the Random Chosen method is inefficient, because it calls the random function a huge number of times.
  • Even after it finds a better pair of parameters, the next draw may well be a worse one, since each draw ignores what was found before.
  • The next part shows an improved method.
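
To get a feeling for how wasteful pure random search is, here is a minimal sketch that counts how many of the 10000 draws actually improve on the best loss so far. The function count_improvements and the fixed seed are assumptions added for this sketch. The data is the same as in the data generation part.

```python
import random

import numpy as np

# Same 19 (age, fare) pairs as in the data generation part.
ages = np.array([23.0, 24.0, 58.0, 58.0, 35.0, 50.0, 31.0, 40.0, 36.0, 38.0,
                 41.0, 24.0, 27.0, 64.0, 25.0, 40.0, 38.0, 29.0, 43.0])
fares = np.array([263.0, 247.5208, 146.5208, 153.4625, 135.6333, 247.5208,
                  164.8667, 134.5, 135.6333, 153.4625, 134.5, 263.0, 211.5,
                  263.0, 151.55, 153.4625, 227.525, 211.3375, 211.3375])

def count_improvements(trials, seed=0):
    """Run pure random search and count the draws that beat the best so far."""
    rng = random.Random(seed)  # seeded only so that the sketch is repeatable
    best = float("inf")
    improvements = 0
    for _ in range(trials):
        k = rng.random() * 20 - 10
        b = rng.random() * 20 - 10
        err = np.mean(np.abs(fares - (k * ages + b)))
        if err < best:
            best = err
            improvements += 1
    return improvements, best

improvements, best = count_improvements(10000)
print(improvements, "useful draws out of 10000, best loss:", best)
```

Only a small handful of draws are useful, and all the others are thrown away. This is the waste that the next method tries to reduce.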

Supervised Direction Method

change_directions = [
    (+1, -1),  # k increase, b decrease
    (+1, +1),
    (-1, -1),
    (-1, +1)
]
min_error_rate = float('inf')
loop_times = 10000
losses = []
best_direction = random.choice(change_directions)

# define the size of each change (the step)
def step(): return random.random()*2-1
# random.random() gives a number between 0 and 1; (0,1)*2 -> (0,2); minus 1 -> (-1,1)
# change_directions already applies +1 or -1 (the direction), so *2-1 could be dropped,
# but keeping *2-1 adds more choices

k_hat = random.random() * 20 - 10
b_hat = random.random() * 20 - 10
best_k, best_b = k_hat, b_hat

while loop_times > 0:
    k_delta_direction, b_delta_direction = best_direction or random.choice(change_directions)
    k_delta = k_delta_direction * step()
    b_delta = b_delta_direction * step()
    new_k = best_k + k_delta
    new_b = best_b + b_delta
    estimated_fares = func(sub_age, new_k, new_b)
    error_rate = loss(y=sub_fare, yhat=estimated_fares)
    #print(error_rate)
    if error_rate < min_error_rate:  # supervised learning: keep what worked
        min_error_rate = error_rate
        best_k, best_b = new_k, new_b
        best_direction = (k_delta_direction, b_delta_direction)
        #print(min_error_rate)
        #print("loop == {}".format(loop_times))
        losses.append(min_error_rate)
        #print("f(age) = {} * age + {}, with error rate: {}".format(best_k, best_b, error_rate))
    else:
        # the new direction must not be the same as the old one
        best_direction = random.choice(list(set(change_directions)-{(k_delta_direction, b_delta_direction)}))
    loop_times -= 1

# report the error rate that belongs to best_k and best_b, not the one from the last loop
print("f(age) = {} * age + {}, with error rate: {}".format(best_k, best_b, min_error_rate))
plt.scatter(sub_age, sub_fare)
plt.plot(sub_age, func(sub_age, best_k, best_b), c = 'r')
plt.show()

Show the loss change

plt.plot(range(len(losses)), losses)
plt.show()

Explanation

  • The Supervised Direction method (2nd method) is better than the Random Chosen method (1st method).
  • The 2nd method introduces a supervision mechanism, which changes the parameters k and b more efficiently.
  • But the 2nd method cannot refine the parameters at a finer scale, because its step size is random.
  • Besides, it has no way to locate the extremum of the loss, so it cannot find the optimal parameters effectively.

Gradient Descent Method

min_error_rate = float('inf')
loop_times = 10000
losses = []
learning_rate = 1e-1

# change_directions, best_direction, step and direction are not used by the gradient loop below
change_directions = [
    # (k, b)
    (+1, -1),  # k increase, b decrease
    (+1, +1),
    (-1, +1),
    (-1, -1)   # k decrease, b decrease
]

k_hat = random.random() * 20 - 10
b_hat = random.random() * 20 - 10

best_direction = None
def step(): return random.random() * 1
direction = random.choice(change_directions)

# partial derivative of the mean absolute error with respect to k
def derivate_k(y, yhat, x):
    abs_values = [1 if (y_i - yhat_i) > 0 else -1 for y_i, yhat_i in zip(y, yhat)]
    return np.mean([a * -x_i for a, x_i in zip(abs_values, x)])

# partial derivative of the mean absolute error with respect to b
def derivate_b(y, yhat):
    abs_values = [1 if (y_i - yhat_i) > 0 else -1 for y_i, yhat_i in zip(y, yhat)]
    return np.mean([a * -1 for a in abs_values])

while loop_times > 0:
    # move against the gradient, scaled by the learning rate
    k_delta = -1 * learning_rate * derivate_k(sub_fare, func(sub_age, k_hat, b_hat), sub_age)
    b_delta = -1 * learning_rate * derivate_b(sub_fare, func(sub_age, k_hat, b_hat))
    k_hat += k_delta
    b_hat += b_delta
    estimated_fares = func(sub_age, k_hat, b_hat)
    error_rate = loss(y=sub_fare, yhat=estimated_fares)
    #print('loop == {}'.format(loop_times))
    #print('f(age) = {} * age + {}, with error rate: {}'.format(k_hat, b_hat, error_rate))
    losses.append(error_rate)
    loop_times -= 1

print('f(age) = {} * age + {}, with error rate: {}'.format(k_hat, b_hat, error_rate))
plt.scatter(sub_age, sub_fare)
plt.plot(sub_age, func(sub_age, k_hat, b_hat), c = 'r')
plt.show()

Show the loss change

plt.plot(range(len(losses)), losses)
plt.show()

Explanation

  • To fit the objective function to discrete data, we use a loss function to measure how good the fit is.
  • Getting the minimum loss then becomes a problem of finding an extremum without constraints.
  • That is the idea behind gradient descent on the loss function.
  • The gradient points in the direction in which the directional derivative is largest, so moving against it lowers the loss fastest.
  • As the gradient approaches 0, the parameters approach a minimum of the loss and the fit gets better.
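
One way to check the derivative formulas used in the gradient descent part is to compare them with a numerical estimate. The sketch below does that for the mean absolute error. The names analytic_gradient and numeric_gradient, the test point (k, b) = (1, 5) and the step eps are assumptions made for this sketch. It uses np.sign instead of the 1/-1 list in the notebook; the two differ only when a residual is exactly 0.

```python
import numpy as np

# Same 19 (age, fare) pairs as in the data generation part.
ages = np.array([23.0, 24.0, 58.0, 58.0, 35.0, 50.0, 31.0, 40.0, 36.0, 38.0,
                 41.0, 24.0, 27.0, 64.0, 25.0, 40.0, 38.0, 29.0, 43.0])
fares = np.array([263.0, 247.5208, 146.5208, 153.4625, 135.6333, 247.5208,
                  164.8667, 134.5, 135.6333, 153.4625, 134.5, 263.0, 211.5,
                  263.0, 151.55, 153.4625, 227.525, 211.3375, 211.3375])

def mae_loss(k, b):
    # mean absolute error of the line k*age + b
    return np.mean(np.abs(fares - (k * ages + b)))

def analytic_gradient(k, b):
    # d|r|/dr = sign(r) with r = y - (k*x + b), so dr/dk = -x and dr/db = -1
    sign = np.sign(fares - (k * ages + b))
    return np.mean(-sign * ages), np.mean(-sign)

def numeric_gradient(k, b, eps=1e-6):
    # central finite differences
    dk = (mae_loss(k + eps, b) - mae_loss(k - eps, b)) / (2 * eps)
    db = (mae_loss(k, b + eps) - mae_loss(k, b - eps)) / (2 * eps)
    return dk, db

k, b = 1.0, 5.0  # arbitrary point chosen for the check
print("analytic:", analytic_gradient(k, b))
print("numeric: ", numeric_gradient(k, b))
```

If the two lines agree, the formulas in derivate_k and derivate_b are consistent with the loss. Note that the absolute value has no derivative where a residual is exactly 0, so the check should be done away from such points.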

Conclusion

  • Machine learning is a process in which the machine learns and improves using methods that we design.
  • A purely random search is usually not very efficient, but it becomes more efficient once we add a supervision mechanism.
  • Gradient descent is an efficient way to find the extremum and the optimal parameters.

Serious question for this article:

Why use machine learning methods instead of just writing down a y = k*x + b formula?

  • In some scenarios, a hand-written formula, however complicated, cannot meet real-world needs, for example the irrational factors in economic models.
  • When we have enough valid data, we can fit a regression or classification model with machine learning methods.
  • We can also evaluate the model on test data, which helps when applying the model in real life.
  • This is just an example, okay?
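
As a rough illustration of evaluating a model on test data, the sketch below holds out 5 of the 19 points, fits a line on the rest, and compares the error on both parts. This is not from the original notebook: the split size, the seed, and the use of np.polyfit (a least-squares fit, swapped in here only to keep the sketch short) are all assumptions.

```python
import numpy as np

# Same 19 (age, fare) pairs as in the data generation part.
ages = np.array([23.0, 24.0, 58.0, 58.0, 35.0, 50.0, 31.0, 40.0, 36.0, 38.0,
                 41.0, 24.0, 27.0, 64.0, 25.0, 40.0, 38.0, 29.0, 43.0])
fares = np.array([263.0, 247.5208, 146.5208, 153.4625, 135.6333, 247.5208,
                  164.8667, 134.5, 135.6333, 153.4625, 134.5, 263.0, 211.5,
                  263.0, 151.55, 153.4625, 227.525, 211.3375, 211.3375])

# Shuffle the indices and hold out 5 points as test data.
rng = np.random.default_rng(0)  # seeded only so that the sketch is repeatable
order = rng.permutation(len(ages))
test_idx, train_idx = order[:5], order[5:]

# Fit y = k*x + b on the training part only (least squares, not the L1 fit above).
k, b = np.polyfit(ages[train_idx], fares[train_idx], deg=1)

def mae(y, yhat):
    return np.mean(np.abs(y - yhat))

train_mae = mae(fares[train_idx], k * ages[train_idx] + b)
test_mae = mae(fares[test_idx], k * ages[test_idx] + b)
print("train MAE:", train_mae)
print("test MAE: ", test_mae)
```

With only 19 points the two numbers can differ a lot from one split to another, so this shows the procedure rather than a reliable score.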

Reference for this article: Jupyter Notebook
