Linearization – linear approximation of a nonlinear function

A nonlinear function can be approximated with a linear function around a certain operating point. In mathematics, linearization refers to the process of finding a linear approximation of a nonlinear function at a given point (x0, y0).

Nonlinear function with tangent line

For a given nonlinear function, its linear approximation at an operating point (x0, y0) is the tangent line to the function at that point.

Linearization – theoretical background

A line is defined by a linear equation as:

y = m·x + b   (1)

where:
m – the slope of the line
b – the vertical offset of the line

Line with slope and offset

The slope m of the line can be defined as the tangent of the angle (α) between the line and the horizontal axis:

m = tan(α) = dy/dx   (2)

where dy and dx are small variations in the coordinates along the line.

Another way of defining a line is by specifying its slope m and a point (x0, y0) through which the line passes. The equation of the line is then:

y − y0 = m·(x − x0)   (3)
Replacing equation (2) in (3) gives:

y − y0 = (dy/dx)·(x − x0)   (4)

Equation (4) translates into: for a given nonlinear function, its linear approximation at an operating point (x0, y0) depends on the derivative of the function at that point.

In order to get a general expression of the linear approximation, we’ll consider a function f(x) and a point a on the x-axis. The corresponding y-coordinate of the function is f(a).

Replacing x0 = a, y0 = f(a) and m = f′(a) in equation (3) gives:

y − f(a) = f′(a)·(x − a)   (5)

We can now write the general linear approximation L(x) of a nonlinear function f(x) at a point a as:

L(x) = f(a) + f′(a)·(x − a)   (6)
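As a quick check of this general formula, here is a minimal Python sketch that builds L(x) from f and a numerically estimated derivative. The central-difference step h is an arbitrary choice for illustration, not something from the text:

```python
import math

def linearize(f, a, h=1e-6):
    """Return L(x) = f(a) + f'(a)*(x - a), with f'(a) estimated
    by a central difference (numerical stand-in for the derivative)."""
    dfda = (f(a + h) - f(a - h)) / (2 * h)   # approximate slope f'(a)
    return lambda x: f(a) + dfda * (x - a)

# Example: linearize exp(x) at a = 0, which gives L(x) ≈ 1 + x.
L = linearize(math.exp, 0.0)
```

For an analytically known function you would of course use the exact derivative instead of a finite difference.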
Linearization – practical example

Let’s find a linear approximation of the function f(x) at the point a = 1.


The graphical representation of the function is:

Nonlinear function with linearization point

Step 1. Calculate f(a)


Step 2. Calculate the derivative of f(x)


Step 3. Calculate the slope of the linear approximation f'(a)


Step 4. Write the equation L(x) of the linear approximation

If we plot the linear approximation L(x) on the same graph, we get:

Nonlinear function with linearization point and linear approximation

As expected, the linear approximation L(x) at the point (a, f(a)) is tangent to the nonlinear function.

If we consider an interval close to our linearization point a, we can see that the results of the linear approximation are very close to those of the nonlinear function. For example, let’s plot both the nonlinear function f(x) and the linear approximation L(x) between 0.9 and 1.1.

Relative error between nonlinear function and linear approximation

If we calculate the relative error between the results of the nonlinear and linear functions, we’ll notice small errors, below 0.5 %. This means that we can use our linear approximation to predict the behavior of the nonlinear function, but only around the linearization point (a, f(a)).
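The relative-error claim can be checked numerically. Since the article’s actual f(x) is not reproduced here, the sketch below assumes f(x) = √x purely for demonstration, with its exact derivative f′(x) = 1/(2√x):

```python
import math

f = math.sqrt                    # assumed example function, not from the text
a = 1.0
L = lambda x: f(a) + 0.5 / math.sqrt(a) * (x - a)   # L(x) = f(a) + f'(a)(x - a)

# Relative error |f(x) - L(x)| / |f(x)| sampled over [0.9, 1.1]
xs = [0.9 + 0.01 * i for i in range(21)]
max_err = max(abs(f(x) - L(x)) / abs(f(x)) for x in xs)
```

For this choice of f, the maximum relative error on [0.9, 1.1] stays well below 0.5 %, consistent with the behavior described above.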

The Scilab instructions to plot the above graphical representations are:


Angle between two vectors

As a workaround, you can find the norm of the cross product using the CROSS function and the dot product using the DOT function, and then find the four-quadrant inverse tangent in degrees using the ‘atan2d’ function.

For example:

u = [1 2 0];
v = [1 0 0];
ThetaInDegrees = atan2d(norm(cross(u,v)),dot(u,v));

You can also divide the dot product of the two vectors (DOT function) by the product of their magnitudes (NORM function) to get the cosine of the angle between the two vectors, and then apply the inverse cosine. However, this approach loses accuracy for small angles, where the cosine is very close to 1.

For example:


u = [1 2 0];
v = [1 0 0];
CosTheta = dot(u,v)/(norm(u)*norm(v));
ThetaInDegrees = acosd(CosTheta);
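The same two formulas can be sketched in Python (plain math module, 3-D vectors only) to show that both agree on this example, while the atan2 form remains the more robust choice:

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(dot(u, u))

u = (1, 2, 0)
v = (1, 0, 0)

# Robust form: atan2(|u x v|, u . v), converted to degrees
theta_atan2 = math.degrees(math.atan2(norm(cross(u, v)), dot(u, v)))

# Cosine form: fine here, but loses accuracy near 0 and 180 degrees
theta_acos = math.degrees(math.acos(dot(u, v) / (norm(u) * norm(v))))
```

Both evaluate to roughly 63.43 degrees for these two vectors.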

Don’t invert that matrix

There is hardly ever a good reason to invert a matrix.

What do you do if you need to solve Ax = b, where A is an n×n matrix? Isn’t the solution x = A⁻¹b? Yes, theoretically. But that doesn’t mean you need to actually find A⁻¹. Solving the equation Ax = b is faster than finding A⁻¹. Books might write the problem as x = A⁻¹b, but that doesn’t mean they expect you to calculate it that way.

What if you have to solve Ax = b for a lot of different b’s? Surely then it’s worthwhile to find A⁻¹? No. The first time you solve Ax = b, you factor A and save that factorization. Then when you solve for the next b, the answer comes much faster. (Factorization takes O(n³) operations, but once the matrix is factored, solving Ax = b takes only O(n²) operations. Suppose n = 1,000. Then once you’ve solved Ax = b for one b, the equation can be solved for a new b 1,000 times faster than the first time. Buy one, get one free.)

What if, against this advice, you’ve already computed A⁻¹? Now you might as well use it, right? No, you’re still better off solving Ax = b than multiplying by A⁻¹, even if the computation of A⁻¹ came for free. Solving the system is more numerically accurate than performing the matrix multiplication.

It is common in applications to solve Ax = b even though there’s not enough memory to store A⁻¹. For example, suppose n = 1,000,000 and A has a special sparse structure (say it’s banded) so that all but a few million entries of A are zero. Then A can easily be stored in memory and Ax = b can be solved very quickly. But in general A⁻¹ would be dense: nearly all of its 1,000,000,000,000 entries would be non-zero. Storing A requires megabytes of memory; storing A⁻¹ would require terabytes.
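The factor-once, solve-many pattern can be sketched in pure Python. This is an illustrative Doolittle LU decomposition without pivoting, adequate for the small diagonally dominant example below but not a production solver:

```python
def lu_factor(A):
    """Doolittle LU decomposition A = L*U (L has unit diagonal).
    No pivoting: only safe for matrices like the one below. O(n^3)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    U = [[0.0] * n for _ in range(n)]
    for i in range(n):
        L[i][i] = 1.0
        for j in range(i, n):        # row i of U
            U[i][j] = A[i][j] - sum(L[i][k] * U[k][j] for k in range(i))
        for j in range(i + 1, n):    # column i of L
            L[j][i] = (A[j][i] - sum(L[j][k] * U[k][i] for k in range(i))) / U[i][i]
    return L, U

def lu_solve(L, U, b):
    """Solve L*U*x = b by forward then back substitution: only O(n^2)."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):               # forward: L*y = b
        y[i] = b[i] - sum(L[i][k] * y[k] for k in range(i))
    x = [0.0] * n
    for i in reversed(range(n)):     # backward: U*x = y
        x[i] = (y[i] - sum(U[i][k] * x[k] for k in range(i + 1, n))) / U[i][i]
    return x

A = [[4.0, 1.0, 0.0],
     [1.0, 4.0, 1.0],
     [0.0, 1.0, 4.0]]
L, U = lu_factor(A)                  # expensive step, done once

bs = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]
xs = [lu_solve(L, U, b) for b in bs]  # each extra b is cheap
```

In practice you would use a library routine (with pivoting) rather than hand-rolled code; the point is only that the factorization is reused, and A⁻¹ is never formed.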

Linearizing Systems of First Order Nonlinear Differential Equations

The first-order approximation of a one-variable function y = f(x) at x = a is:

f(x) \approx f(a) + \frac{df}{dx}\Big|_{x=a}(x-a)

We call this the linearization of f(x) around the point x = a.

For a function of two variables, e.g. y = f(x_1, x_2), the linearization around a point (a, b) is:

f(x_1,x_2)\approx f(a,b)+\frac{\partial f}{\partial x_1}\Big|_{x_1=a,x_2=b}(x_1-a) + \frac{\partial f}{\partial x_2}\Big|_{x_1=a,x_2=b}(x_2-b)



Now consider a system of two first-order nonlinear differential equations:

\begin{cases}\dot{x}_1 = f_1(x_1,x_2)\\\dot{x}_2 = f_2(x_1,x_2)\end{cases}

To linearize it at (a, b), we linearize each equation separately:

\begin{cases}\dot{x}_1\approx f_1(a,b)+\frac{\partial f_1}{\partial x_1}\Big|_{x_1=a,x_2=b}(x_1-a) + \frac{\partial f_1}{\partial x_2}\Big|_{x_1=a,x_2=b}(x_2-b)\\\dot{x}_2\approx f_2(a,b)+\frac{\partial f_2}{\partial x_1}\Big|_{x_1=a,x_2=b}(x_1-a) + \frac{\partial f_2}{\partial x_2}\Big|_{x_1=a,x_2=b}(x_2-b)\end{cases}

In general, (a, b) is chosen to be an operating point (equilibrium point), i.e. f_1(a,b) = f_2(a,b) = 0, which leaves:


\left[\begin{matrix}\dot{x}_1\\\dot{x}_2\end{matrix}\right]\approx\left[\begin{matrix}\frac{\partial f_1}{\partial x_1}&\frac{\partial f_1}{\partial x_2}\\\frac{\partial f_2}{\partial x_1}&\frac{\partial f_2}{\partial x_2}\end{matrix}\right]_{x_1=a,\,x_2=b}\left[\begin{matrix}x_1-a\\x_2-b\end{matrix}\right]
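As a concrete check of this Jacobian form, here is a small Python sketch. The pendulum-like system ẋ1 = x2, ẋ2 = −sin(x1) is our own illustrative choice (not from the text), with equilibrium (a, b) = (0, 0):

```python
import math

# Illustrative system: x1' = f1(x1, x2), x2' = f2(x1, x2)
def f1(x1, x2): return x2
def f2(x1, x2): return -math.sin(x1)

def jacobian(fs, a, b, h=1e-6):
    """Finite-difference Jacobian: J[i][j] = d(fs[i])/d(x_j) at (a, b)."""
    return [[(f(a + h, b) - f(a - h, b)) / (2 * h),
             (f(a, b + h) - f(a, b - h)) / (2 * h)] for f in fs]

J = jacobian((f1, f2), 0.0, 0.0)
```

At the equilibrium (0, 0) the analytical Jacobian is [[0, 1], [−1, 0]], so the linearized dynamics are those of a harmonic oscillator.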



What is a non-manifold mesh


Non-manifold meshes can arise while you are editing meshes in tools like Blender, and certain mesh operations cannot be performed on them. A non-manifold mesh has one or more elements with the following properties:

1. An edge incident to more than two faces.

2. Two or more faces connected only by a vertex and not by an edge.

3. Adjacent faces whose normals are pointing in opposite directions.

The video above demonstrates these concepts in Maya.
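Property (1) is the easiest to test programmatically. A minimal sketch for a triangle mesh given as vertex-index triples (an illustrative helper, not any particular tool’s API):

```python
from collections import Counter

def non_manifold_edges(faces):
    """Return edges (as sorted vertex pairs) incident to more than
    two faces, i.e. violating manifold property (1)."""
    counts = Counter()
    for a, b, c in faces:
        for edge in ((a, b), (b, c), (c, a)):
            counts[tuple(sorted(edge))] += 1   # count faces per edge
    return [e for e, n in counts.items() if n > 2]

# Three triangles fanning around the same edge (0, 1): non-manifold.
fan = [(0, 1, 2), (0, 1, 3), (0, 1, 4)]
```

Detecting properties (2) and (3) requires vertex-connectivity and face-normal checks, respectively, which mesh editors perform for you (e.g. Blender’s Select Non-Manifold tool).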

