So after I published my last blog post earlier today (the first installment of my Let’s learn together series), it was brought to my attention that the first part of the series (which is about linear regression) was missing something. And in fact, it was.

I had left out the calculation of the squared error. That was because I didn’t understand it, wanted some more time to look into it, and thought that the post would become too long if I included it. But I’ve looked into it since this morning and think that I can implement it now (lol no), which is why I’m writing this appendix.
So at the risk of this becoming really messy: Let’s get into it!

Disclaimer: this blog series is not really meant for people who want to learn AI themselves. It is there to document my learning process and my descent into the rabbit hole that is artificial intelligence. I will link all the resources I use to learn somewhere in my blog posts.

Now that we have a rough understanding of linear regression and know how to get a linear function that can approximate our points, we want to find out how accurate this function is. We can do this by drawing the blood of a freshly slaughtered deer in the shape of a pentagram on the floor…wait…wrong tutorial. Actually, finding the error of the function (or r²) is much easier than that:

$$r^2 = 1 - \frac{SE\hat{y}}{SE\bar{y}}$$

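Let’s decipher that. The SE, or squared error, is the sum of the squared values of $\hat{y}$ or $\bar{y}$, respectively. Note that $\hat{y} = f(i)-y_{dataset_i}$ and $\bar{y}=\overline{y_{dataset}}-y_{dataset_i}$, where $f(i)$ is the value our line predicts for the $i$-th point.
That means that (with $n$ being the number of points):

$$SE\hat{y} = \sum_{i=1}^{n} \left(f(i) - y_{dataset_i}\right)^2$$

and:

$$SE\bar{y} = \sum_{i=1}^{n} \left(\overline{y_{dataset}} - y_{dataset_i}\right)^2$$
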
To be honest with you, I had no idea what these formulae meant at first. I know that $Error = Prediction - Actual$ . I understand that. I just had no clue what the values in the sums mean and why I’m dividing $SE\hat{y}$ by $SE\bar{y}$ .
But then it hit me. Here, we’re basically just comparing two sums of squared errors. The first one is between our reference data and the function we got by calculating k and d (so Prediction is $f(i)$ and Actual is $y_{dataset_i}$ , which means that $Error_{linear function}=\hat{y}$ ). The second one is between our reference data and the average of our reference data (Actual is the same, Prediction is $\overline{y_{dataset}}$ , so $Error_{mean}=\bar{y}$ ). Dividing the first by the second tells us how much of the error of the “just guess the mean” approach is still left when we use our line instead, and subtracting that from 1 gives us r².
So if we draw this, we see two errors for a single point: the error between the mean and the value, and the error between our function and the value.
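To make the comparison a bit more concrete, here is a tiny worked example. The three points $(1, 1)$, $(2, 3)$ and $(3, 2)$ are made up for illustration and are not from the data set in this post. Their mean is $\overline{y_{dataset}} = 2$, and the least-squares line for them works out to $f(x) = 0.5x + 1$, so the predictions are $1.5$, $2$ and $2.5$:

$$SE\hat{y} = (1.5 - 1)^2 + (2 - 3)^2 + (2.5 - 2)^2 = 0.25 + 1 + 0.25 = 1.5$$

$$SE\bar{y} = (2 - 1)^2 + (2 - 3)^2 + (2 - 2)^2 = 1 + 1 + 0 = 2$$

$$r^2 = 1 - \frac{1.5}{2} = 0.25$$

So in this toy case the line only removes a quarter of the squared error that the mean alone would have. If the line went through every point, $SE\hat{y}$ would be 0 and r² would be 1. If the line were no better than the mean, both sums would be equal and r² would be 0.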

And why do we square those values? This is still a bit weird to me, but, apparently, this is so that a) negative and positive errors can’t cancel each other out and b) large errors (outliers) count much more heavily than small ones.
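Here’s a small sketch of what squaring does to a list of errors. The numbers are made up (they are not from the data set in this post), and the snippet assumes C# 9 or later so it can run as a top-level program without a `Main` method.

```csharp
using System;
using System.Linq;

// Made-up errors (prediction minus actual); not taken from the post's data set.
decimal[] balanced = { 2m, -2m, 3m, -3m };
decimal[] withOutlier = { 1m, 1m, 1m, 10m };

// Positive and negative errors cancel each other in a plain sum...
decimal plainSum = balanced.Sum();
// ...but not once every error is squared.
decimal squaredSum = balanced.Sum(e => e * e);

// Share of the total that the single large error (10) accounts for.
decimal plainShare = 10m / withOutlier.Sum();
decimal squaredShare = (10m * 10m) / withOutlier.Sum(e => e * e);

Console.WriteLine($"plain sum: {plainSum}, squared sum: {squaredSum}");
Console.WriteLine($"share of the large error: {plainShare:P0} plain, {squaredShare:P0} squared");
```

The plain sum of the first list is 0 even though every single prediction is off, while the squared sum is 26. In the second list, the one large error makes up 10/13 of the plain total but 100/103 of the squared total, so squaring makes big misses stand out rather than hiding them.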

Let’s try to write something like this in C#.

// Needs: using System; using System.Collections.Generic; using System.Linq;
static void Main(string[] args)
{
    // Reference data as (x, y) pairs.
    var points = new List<(decimal x, decimal y)>
    {
        (245, 1400),
        (312, 1600),
        (279, 1700),
        (308, 1875),
        (199, 1100),
        (219, 1550),
        (405, 2350),
        (324, 2450),
        (319, 1425),
        (255, 1700)
    };

    var result = GetLine(points);
    points.ForEach(a => Console.WriteLine($"x: {a.x}; y: {a.y}"));
    Console.WriteLine($"k = {result.k}");
    Console.WriteLine($"d = {result.d}");
    Console.WriteLine($"r² = {result.error.rSquared}");
    Console.WriteLine($"r = {result.error.r}");
}

// GetKValue and GetDValue are not defined in this appendix; they belong to the line-fitting code.
public static (Func<decimal, decimal> function, decimal k, decimal d, (decimal rSquared, decimal r) error) GetLine(List<(decimal x, decimal y)> points)
{
    var k = GetKValue(points);
    var d = GetDValue(k.k, k.xAverage, k.yAverage);
    // The fitted line f(x) = k * x + d.
    Func<decimal, decimal> function = i => k.k * i + d;
    var error = GetError(points, function, k.yAverage);
    return (function, k.k, d, error);
}

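private static (decimal rSquared, decimal r) GetError(List<(decimal x, decimal y)> points, Func<decimal, decimal> function, decimal yAverage)
{
    // Sum of squared differences between the predicted values (line) and the actual values (orig).
    decimal squaredError(List<decimal> orig, List<decimal> line)
    {
        return orig.Select((t, i) => (line[i] - t) * (line[i] - t)).Sum();
    }

    var yOrig = points.Select(a => a.y).ToList();
    var yLine = points.Select(a => function(a.x)).ToList();
    // The "mean line" predicts the average y for every point.
    var yMeanLine = points.Select(a => yAverage).ToList();
    // Note: this divides by zero if all y values are identical.
    var rSquared = 1 - (squaredError(yOrig, yLine) / squaredError(yOrig, yMeanLine));

    // Math.Sqrt never returns a negative number, so r loses its sign for a falling line.
    return (rSquared, (decimal)Math.Sqrt((double)rSquared));
}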
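The listing above calls `GetKValue` and `GetDValue`, which aren’t shown in this appendix, so it won’t compile on its own. Below is a self-contained sketch that should reproduce the same r² for the same data. `FitLine` and `RSquared` are hypothetical names, and `FitLine` is only an assumed stand-in that uses the usual least-squares formulas for slope and intercept, not necessarily how the original helpers work. It also gives r the sign of the slope, which is a small deviation from the listing above. It assumes C# 9 or later (top-level statements).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var points = new List<(decimal x, decimal y)>
{
    (245, 1400), (312, 1600), (279, 1700), (308, 1875), (199, 1100),
    (219, 1550), (405, 2350), (324, 2450), (319, 1425), (255, 1700)
};

var (k, d) = FitLine(points);
var rSquared = RSquared(points, x => k * x + d);
// Deviation from the listing above: r gets the sign of the slope.
var r = Math.Sign(k) * (decimal)Math.Sqrt((double)rSquared);

Console.WriteLine($"k = {k:F4}, d = {d:F4}");
Console.WriteLine($"r² = {rSquared:F4}, r = {r:F4}");

// Hypothetical stand-in for GetKValue/GetDValue: ordinary least-squares slope and intercept.
static (decimal k, decimal d) FitLine(List<(decimal x, decimal y)> pts)
{
    var xMean = pts.Average(p => p.x);
    var yMean = pts.Average(p => p.y);
    var slope = pts.Sum(p => (p.x - xMean) * (p.y - yMean))
              / pts.Sum(p => (p.x - xMean) * (p.x - xMean));
    return (slope, yMean - slope * xMean);
}

// r² = 1 - (squared error of the line) / (squared error of the mean).
static decimal RSquared(List<(decimal x, decimal y)> pts, Func<decimal, decimal> f)
{
    var yMean = pts.Average(p => p.y);
    var seLine = pts.Sum(p => (f(p.x) - p.y) * (f(p.x) - p.y));
    var seMean = pts.Sum(p => (yMean - p.y) * (yMean - p.y));
    return 1 - seLine / seMean;
}
```

For the ten points used in this post, r² should come out at roughly 0.5808. For data with a falling trend, this version would report a negative r, whereas a plain square root always reports a positive one.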


And again, let’s get out our trusty GeoGebra and check the result by clicking the “Show Statistics” button in the “Data Analysis” view.

And now let’s take a look at the result of our implementation. There we go! We came to the same result! I really don’t know if 0.5808 is a “good” result…but at least it’s the same as in GeoGebra.

That’s it from my side. I hope I didn’t forget anything, ’cause I’ll be really mad if I did.

P.S.: Also, starting with Part 2 of Let’s learn together: AI, I will switch to Python. I figured that using C# out of spite wasn’t my smartest idea.