3blue1brown from patreon

When did Taylor series click for you?

Hey everyone,

I'm putting together a video on Taylor series, and I don't know about you, but when I was a student first learning this topic I definitely did not appreciate how important they are.  It seems that only happens once you run into a situation where you actually need a polynomial approximation to turn a problem from hairy-to-the-point-of-impossible to something simpler to solve.

I'm curious to hear, what was your experience with Taylor series?  When did they first click?  What was the problem you were solving?

-Grant

Comments

When I learned that polynomials are a basis of C. It's such a small thing but it did the trick for me. Similar for Fourier series.

I can't thank you enough

I'm working in research about light transport simulation (creating photorealistic images; CGI and the like), which is performance-critical (usually solving integrals numerically with Monte Carlo techniques). I'd often see papers use polynomials with funky coefficients to approximate complicated functions (often intractable integrals) while being efficient to evaluate. It took me over a year to realize these were just Taylor polynomials, but it felt great when I did! I particularly like how Taylor polynomials can approximate integrals with an intractable closed form, because the integral only needs to be evaluated at a single point, which in turn can often be done numerically.

I learned Taylor series approximations in undergraduate math but it was not until about 5 years later studying biostatistics in graduate school that I first used them in a practical setting to approximate standard errors of statistical estimators. After I finally used them to solve a real problem, I started seeing them everywhere!

Just now when I watched Chapter 10 :-)

Daniel Raynaud

Taylor series clicked for me when I realized how powerful they are as a tool for approximating functions. After learning the basic definition, I created a program in GeoGebra that would graph a function and its nth order approximation, where n was on a slider, and I could see how each term made the polynomial hug tighter and tighter to the curve. All of a sudden, it made sense how complicated functions (especially in physics) could be approximated as polynomials that were much simpler.

Chuck Larrieu
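
A rough Python stand-in for the slider experiment described above, assuming sin(x) expanded about 0 as the example function (the original comment does not say which function was graphed). The name `taylor_sin` is made up for this sketch; `n_terms` plays the role of the slider.

```python
import math


def taylor_sin(x: float, n_terms: int) -> float:
    """Partial sum of the Maclaurin series of sin(x) using n_terms nonzero terms."""
    total = 0.0
    for k in range(n_terms):
        # k-th nonzero term: (-1)^k * x^(2k+1) / (2k+1)!
        total += (-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
    return total


if __name__ == "__main__":
    x = 2.0
    for n_terms in range(1, 7):
        approx = taylor_sin(x, n_terms)
        # Print the approximation and how far it is from the true value.
        print(n_terms, approx, abs(approx - math.sin(x)))
```

Moving the "slider" up by one adds one more term, and at a fixed x the printed error shrinks with each step.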

Thanks for sharing! It's really helpful to read what everyone has to say.

3blue1brown

I recommend approximating a pendulum as a simple harmonic oscillator as a first example. (A couple of earlier commenters also mentioned this one.)

Jacob Rus

Hey Grant, I hope this isn't too late, but I think I have some really good ideas for you, as I finally got an intuitive grasp on Taylor series when working with a physics analogy. I think intuitively, you're going to want to focus on the time derivatives of position. I would start off with talking about approximating a car moving along a path - what would you want to know? You'd want its position at the start (the 0th derivative) and its initial velocity. This would give us a good idea of where we would be at time t. But what if we're accelerating? And on top of that, a change in acceleration? If we incorporate each of those correction factors into the polynomial, we'll be better at approximating our final position! If we theoretically knew everything about the car's change in acceleration, jerk, snap, crackle, pop, etc., we would perfectly predict the position. I think this REALLY comes in handy to get an intuitive feel of Taylor's Remainder Theorem / Lagrange Error. It's like if you took into account only initial position and initial velocity, and you want to see how far off you could be: look at how much the car can accelerate (the max (n+1)th derivative) in that time interval, and then assume that's the acceleration for the entire interval. That's the farthest you could be off from your original estimate. But really, each term is adding this correction factor so that at any time t, we're incorporating not only its velocity (like a linear approximation), but a correction for its acceleration, etc.

Josh B.
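
A small numerical check of the Lagrange error idea in the comment above, as a sketch only. It assumes f(t) = e^t on [0, t] with t >= 0 as the "position" function, chosen because every derivative of e^t is e^t and so the largest (n+1)th derivative on the interval is easy to write down. The function names are made up for this illustration.

```python
import math


def taylor_exp(t: float, degree: int) -> float:
    """Taylor polynomial of e^t about 0, up to and including the t**degree term."""
    return sum(t ** k / math.factorial(k) for k in range(degree + 1))


def lagrange_bound_exp(t: float, degree: int) -> float:
    """Lagrange error bound for taylor_exp on [0, t], assuming t >= 0.

    The (degree+1)th derivative of e^t is e^t, and its maximum on [0, t] is e^t.
    Pretend the function changes at that maximum rate over the whole interval.
    """
    max_next_derivative = math.exp(t)
    return max_next_derivative * t ** (degree + 1) / math.factorial(degree + 1)


if __name__ == "__main__":
    t = 1.5
    for degree in range(6):
        actual_error = abs(math.exp(t) - taylor_exp(t, degree))
        print(degree, actual_error, lagrange_bound_exp(t, degree))
```

With `degree = 1` this is exactly the "position and velocity only" case: the bound is the maximum acceleration times t^2/2.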

I'm not sure I really "got" Taylor series until I started teaching calculus. It's still a learning process to figure out the best ways to explain them to students. One thing that has helped me this semester has been teaching an analysis class from a historical perspective. Getting to discuss how the early calculus pioneers dealt with infinite series was eye-opening; it was much less formal and more intuitive than today's approach (which of course had its drawbacks). The added context made the topic more enjoyable than the usual theorem-proof-theorem-proof pattern. I'm not sure how much this helps for a video pitched at a more introductory level, but I thought it might be worth mentioning.

Ok, so what my teacher did was during the whole linear approximation process of using derivatives to predict, he introduced the idea of a quadratic approximation without talking about taylor series. The idea was the 'best fit parabola' rather than the best fit line. We even did a circular approximation assignment which found the 'best fit circle' (something I find fascinating every day). Once taylor series were introduced to me as a best fit polynomial, everything started to make a lot of sense.

Ankit Agarwal

Thanks for sharing, I agree that this heuristic is incredibly helpful.

3blue1brown

Good to know that the equivalence actually helped things click, I was otherwise planning on focussing mainly on the approximation component. By the way, not *all* functions equal their Taylor expansions, it's only the subset of "analytic functions". This encompasses much of what we deal with on the day-to-day, but if you really broaden your view to all functions (or even all continuous functions), almost none of them are actually analytic.

3blue1brown
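
A standard example of a smooth but non-analytic function, added here as an illustration of the point above (it is not part of the original reply). Every derivative at 0 vanishes, so the Taylor series about 0 is identically zero even though the function is not:

```latex
f(x) =
\begin{cases}
  e^{-1/x^{2}}, & x \neq 0, \\
  0,            & x = 0,
\end{cases}
\qquad
f^{(n)}(0) = 0 \ \text{for every } n \ge 0,
\qquad
\sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!}\, x^{n} = 0 \neq f(x) \ \text{for } x \neq 0 .
```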

Despite having majored in Math at UC Berkeley, honestly Taylor series didn't really click for me until 24 years after I graduated when I took Professor Robert Ghrist's "Calculus Single Variable" MOOC on Coursera to refresh my knowledge in preparation for helping my high-school daughters with calculus: https://www.coursera.org/learn/single-variable-calculus. His innovative take on the calculus curriculum introduced Taylor series very early in the course and used "Big-O" notation, more familiar from classes on computer algorithms, to make infinite polynomials easier to think about. The parallels he draws between discrete and continuous methods in his course were another eye-opener for me. Finally, his animated graphics made understanding the material even easier.

Steve Muench

When I learned about Newton's methods for convex optimization

I'd seen a bunch of practical examples that gave me a good intuition for what it was doing, but I felt that I hadn't really "understood" what was going on until I saw the derivation of the formula and a proof for why it's perfect in the limit. In particular my brain was resisting why those factorial terms suddenly pop up; they seemed like an arbitrary hack to get things to work until I saw the derivation carried out.

dim85
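
For readers with the same question about the factorials, here is one common way the derivation goes, sketched for an expansion about 0. It assumes a power series P with unknown coefficients c_k that is required to match every derivative of f at 0:

```latex
P(x) = \sum_{k=0}^{\infty} c_k x^{k},
\qquad
\frac{d^{n}}{dx^{n}} x^{k} \Big|_{x=0} =
\begin{cases}
  n!, & k = n, \\
  0,  & k \neq n,
\end{cases}
\qquad\Longrightarrow\qquad
P^{(n)}(0) = n!\, c_n .
```

```latex
P^{(n)}(0) = f^{(n)}(0)
\quad\Longrightarrow\quad
c_n = \frac{f^{(n)}(0)}{n!} .
```

The n! is just what is left over after differentiating x^n down to a constant n times, so dividing by it is what makes the nth derivatives agree.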

Hi Grant!

A (probably unusual) time it clicked for me was when relating it to your linear algebra course video about arbitrary vector spaces. In that video you included Taylor polynomials as a vector space with polynomials listed component-by-component as if they were normal vectors. That's when it hit me - when we talk about a function being "equal" to its Taylor approximation, we usually think of errors as getting smaller and smaller. However, in the case of Taylor polynomials, ANY FUNCTION can be written in this form EXACTLY. It hit me that Taylor series are not just better and better approximations, they're another way of writing the same function! The fact that we get better approximations adding more terms is just like writing more decimal parts of a decimal number - it is successive approximations, but in the limit, the things are the same. sin(x) is truly the same as the Taylor series which approximates it, which is what makes Taylor series such a powerful tool in pure mathematics instead of only applied math (e.g. applied to physics), where actual approximations become more handy. This may be a good thing to bring up as an extra - that while partial sums of a Taylor series are "better and better approximations" the series themselves are also precisely equal to the function which they approximate. And the fact that we can write any function down this way (and the fact that it's a vector space, but that's beside the point here) is pretty dang amazing.

Physics is the classic here with small angle approximations letting you solve classic problems like the equation of motion of a pendulum etc. My favourite though would be the proof that any small displacement from a position of stable equilibrium gives simple harmonic oscillation, where you give a Taylor expansion of a general potential and derive simple harmonic motion by removing higher order terms. From a more maths perspective though I'd say random series sums are pretty cool, like using arctan to get π/4.
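
The equilibrium argument mentioned above can be sketched as follows, assuming a particle of mass m in a one-dimensional potential V with a stable equilibrium at x_0 (the symbols k and omega are introduced here for the illustration):

```latex
V(x) = V(x_0) + V'(x_0)\,(x - x_0) + \tfrac{1}{2} V''(x_0)\,(x - x_0)^{2} + O\!\big((x - x_0)^{3}\big),
\qquad
V'(x_0) = 0, \quad k := V''(x_0) > 0 .
```

```latex
F(x) = -V'(x) \approx -k\,(x - x_0)
\quad\Longrightarrow\quad
m\,\ddot{x} = -k\,(x - x_0),
\qquad
\omega = \sqrt{k/m} .
```

Dropping the cubic and higher terms leaves exactly the equation of a simple harmonic oscillator, whatever the original potential looked like.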

A memorable use for me comes from FM-radio. Note that the signal from an FM transmitter has a *modulating frequency*, which essentially means we would encode a basic frequency as a sinusoid *composed with another sinusoid*. A useful question as an electrical engineer is to ask what the Fourier series/transform of this composed sinusoid looks like. That is, how can your run-of-the-mill sinusoids be "combined" (with "linear combinations") to give you this frequency-modulating monster? It turns out that the Taylor series gives us a neat way to see what's going on.

Benjamin Grossmann

A big one was approximating the integral of exp(-x^2) from 0 to 1, aka calculating the cdf of the normal distribution. Then that the radius of convergence was the distance to the nearest singularity in the complex plane. The other was perturbation theory / asymptotic analysis, where you discover that Taylor series aren't actually always the best way to represent a function.
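
A minimal sketch of the first example above: integrate the Maclaurin series of exp(-x^2) term by term on [0, 1] and compare with a reference value from `math.erf`. The function name is made up for this illustration.

```python
import math


def gaussian_integral_0_to_1(n_terms: int) -> float:
    """Approximate the integral of exp(-x^2) over [0, 1] from its Maclaurin series.

    exp(-x^2) = sum_k (-1)^k x^(2k) / k!, and integrating each term from 0 to 1
    gives sum_k (-1)^k / (k! * (2k + 1)).
    """
    return sum(
        (-1) ** k / (math.factorial(k) * (2 * k + 1)) for k in range(n_terms)
    )


if __name__ == "__main__":
    # Reference value: the integral equals sqrt(pi)/2 * erf(1).
    reference = math.sqrt(math.pi) / 2 * math.erf(1.0)
    for n_terms in (1, 2, 4, 8, 16):
        approx = gaussian_integral_0_to_1(n_terms)
        print(n_terms, approx, abs(approx - reference))
```

Because of the k! in the denominator, a handful of terms already agrees with the reference to many digits.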

The best way I had to internalize it was thus: If you're approximating a function around a point, the first layer of doing that is a constant approximation, where you only care about matching the value of the approximation to the value of the function at the point you're approximating around. Obviously, this is a poor approximation. The next step might be to go for a linear approximation, where you match both the value and the first derivative of the approximation to those of the function at that point. Because now you've encoded some information about how the function has changed into your approximation, your approximation is mildly better, but by no means perfect. Then, you might match the value, first derivative, AND second derivative; this is yet better, as you encode yet more information. If you continue this process ad infinitum--that is, if you have the value and EVERY derivative of the function at a single point--if the function is particularly friendly (i.e. analytic), then since you have EXACT knowledge of how the function changes from that point, now your "approximation" (now a Taylor series) is perfect! Of course, this ignores things like radius of convergence, non-analyticity, and the remainder term, but this heuristic was very enlightening for me, and caused the rest to click into place.

My AP Calc teacher taught it to us using a graphing calculator. First, he plotted a simple y = cos(x) function. Then he plotted y = 1, and asked us what we noticed (the answer was that it intersected the other graph at certain points). Then he did y = 1 - 1/2(x^2). Then 1 - 1/2(x^2) + 1/(4!)x^4. And so on. What really got me to understand it was plotting y = cos(a + x) and y = cos(a) - sin(a)x - cos(a)/2(x^2) ... in Desmos, and then setting "a" to a slider.

What a delightful story!

3blue1brown

My first exposure to the Taylor series expansion was within the context of the Lagrangian method of calculus, where no infinitesimals were used (only algebra). This was in a lecture by Norman Wildberger on differential calculus (lecture 4 on YouTube), where he discusses applying the Lagrangian approach to a quartic function and produces the Taylor series, which he dissects to find the set of all derivatives. For me, this was eye-opening and a refreshing alternative way to view analysis. The lecture is titled "The differential calculus for curves, via Lagrange!"

It first clicked when I was learning about the free-air effect in geophysics, which is the slight change in the value of gravity as you move away from the Earth's surface. First, we know that gravitational acceleration is proportional to R^(-2), where R is the distance between where the gravity is measured and the center of mass. But since R^(-2) is hard to build intuition for and to calculate with, what geophysicists did was to Taylor expand g(r) = GM/(r)^2 to its first two terms at the point r=R (radius of the Earth), which gives g(h) = g0 - 2g0/R * h, where g(h) is the gravity measured at height h from the surface of the Earth (h = r - R), and g0 is the gravity at the surface (h=0). So the constant term gives a reference gravity, and the slope term tells how gravity changes linearly as you move away from the surface, given that h << R (even the highest mountain is ~1/1000th of R). It is useful not just because you can understand how gravity actually changes near the surface. We can also take our gravity measurements at whatever height we want and eliminate this effect later on.
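
A quick numerical comparison of the exact inverse-square law with the two-term expansion above. The values of `G0` and `R_EARTH` are round-number assumptions for this sketch, not figures from the comment, and the function names are made up.

```python
G0 = 9.81          # m/s^2, assumed gravity at the surface (h = 0)
R_EARTH = 6.371e6  # m, assumed mean radius of the Earth


def gravity_exact(h: float) -> float:
    """Inverse-square gravity at height h above the surface: g0 * (R / (R + h))^2."""
    return G0 * (R_EARTH / (R_EARTH + h)) ** 2


def gravity_linear(h: float) -> float:
    """First two Taylor terms about h = 0: g0 - (2 * g0 / R) * h."""
    return G0 - 2.0 * G0 / R_EARTH * h


if __name__ == "__main__":
    for h in (0.0, 100.0, 1000.0, 8848.0):
        exact = gravity_exact(h)
        linear = gravity_linear(h)
        print(h, exact, linear, abs(exact - linear) / G0)
```

Under these assumed constants the slope 2 * G0 / R_EARTH comes out near 3.08e-6 per second squared, and the printed relative error stays tiny for any height reachable on foot.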

Sorry for previous comments not used to safari on phone! What clicked for me while studying the topic last year in calculus, was first of all realising that we use functions to fit to events, and that adding derivatives makes our prediction of what will occur next more precise. This was easiest to see when considering the first and second derivative and how the second derivative takes into account the curvature.

Hi Grant. I was unsatisfied with my book's proof of Taylor's theorem (invent a function and apply Rolle's theorem to it) so I wrote my own using the mean value theorem for integrals. That's when I got it. First I considered the question, "okay, how bad is the linear approximation L to a function f?" So I thought, L(x) - f(x) = integral of L'(t) - f'(t) from 0 to x. So then the question became "so how bad is the approximation L' to f'?" Well, L'(t) - f'(t) = -(integral of f''(s) from 0 to t). What you end up with is an integral formula for the error in terms of f'', and we arrived at it naturally without "inventing a function". That's how I came to understand the proof better. As for explaining the use of Taylor polynomials, I agree with the previous poster who just said it continues the line of thought, "linear approximations are great, quadratic approximations can be better!" Those are my thoughts, looking forward to the vid.

Jacob Mirra
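
Written out, the argument above might look like the following. This is a sketch assuming L(x) = f(0) + f'(0) x is the linear approximation about 0, f'' is continuous, and M denotes the largest value of |f''| on the interval:

```latex
f(x) - L(x)
  = \int_{0}^{x} \big( f'(t) - f'(0) \big)\, dt
  = \int_{0}^{x} \!\! \int_{0}^{t} f''(s)\, ds\, dt
  = \int_{0}^{x} (x - s)\, f''(s)\, ds ,
```

```latex
\big| f(x) - L(x) \big| \le M\, \frac{x^{2}}{2},
\qquad
M = \max_{0 \le s \le x} \big| f''(s) \big| .
```

The last equality in the first line comes from swapping the order of integration. Repeating the same step on f'' gives the error of the quadratic approximation in terms of f''', and so on.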

For me, it was seeing the expansions of sin(x), cos(x), and e^x and then performing various 'operations' (deriving, limits, integrals, combinations, etc.). I know these are convenient examples, but it provided a better way for me to convince myself that sin(x) and cos(x) were more deeply linked. 'Doing calculus' on polynomials was much more concrete to me than on trig or transcendentals. Learning Taylor polynomials/series helped lots of things click. Series/expansions quickly became my favorite parts of studying Calculus.

Colin Williams

In university physics, when we were examining pendula, and we needed a small angle approximation for sin in order to solve the otherwise not-very-solvable differential equation resulting from brute-force application of Newton's laws. Ordinarily, we might have just compared the graphs of sin(x) and x, but in this case the prof referenced the Taylor series and wrote sin(x) = x + O(x^2), using the big-O notation I had learned in computer science. I don't know if it'd help, but Taylor series are also how it finally clicked why the derivative of sin is cos.

Okuno Zankoku

I remember thinking about a problem for days. It was one of those tasks that the teacher sets for homework and gives you extra points for if you find the answer, but I couldn't figure it out. I don't remember the exact problem, but it was something related to calculating a limit involving an exponential function. I had a dentist appointment, so I went to the dentist and sat down in the chair. The dentist administered the anaesthesia and told me to relax, and suddenly I saw it so clearly: I had to use the Taylor series of the exponential function, and that way I could get the limit. I wanted to stand up and start writing, but the dentist didn't let me because the anaesthesia was about to kick in, and I was worried because I thought I was going to forget. So I repeated the idea to myself again and again while the dentist worked on my tooth. As soon as he finished I went home and wrote my idea down. It worked. I got the extra credit.

I wanted an analytical way to prove that the limit as x approached 0 of sin(x)/x was 1. I knew that the squeeze theorem could be a way of doing that, but it didn't satisfy me. It wasn't until I learned the Taylor series approximation for sin(x), and then learned what a Taylor series approximation was, that I understood and appreciated it.
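
The computation alluded to above is short enough to write out. It assumes the series for sin(x) is already in hand; note that the usual derivation of that series relies on the derivative of sin, which itself is typically obtained from this very limit, so this is better read as a consistency check than as an independent proof.

```latex
\frac{\sin x}{x}
  = \frac{1}{x} \left( x - \frac{x^{3}}{3!} + \frac{x^{5}}{5!} - \cdots \right)
  = 1 - \frac{x^{2}}{6} + \frac{x^{4}}{120} - \cdots
  \;\longrightarrow\; 1
  \quad \text{as } x \to 0 .
```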

I realized the importance of Taylor series while studying optimization, in the context of Newton's method.

I first appreciated them when I was taking Robert Ghrist's Calculus class on Coursera. The approach he took followed his "Funny Little Calculus Text", which is available for free on-line. His course revolved around Taylor series and the exponential function. That was where I realized that 1) Taylor series are really general, and 2) they're really easy to integrate and differentiate!

I always felt that I 'got' tangent line approximations, since they are intuitive, easy to visualize, etc. It was also pretty clear that tangent line approximations are generally much better approximations than constant approximations. Taylor series really clicked for me when I realized that you could think of a Taylor polynomial as taking a tangent line approximation of the nth derivative of the function, and then integrating back up. The factorials made sense, as well as how the error formulas could be easily derived.

This isn't rigorously correct, but I think it is still a valid intuition of what is going on. I also had a lot of trouble with Taylor series when I first took Calc II. After a lot of thinking, I came up with an analogy that emphasizes that the (somewhat imposing) Taylor series formula is just a trick to match the derivatives of the function you are approximating. I think most students have some intuition on derivatives being rates of change. Suppose I have some (analytic) function f(x), and its Taylor series T(x) about 0. How do I know they are the same? Imagine f describes your position in a car, and T my position in a car. The formula guarantees that f(0) = T(0), so at time 0 we are in exactly the same location. How about a little time delta_t later? Well, all of our derivatives are the same, so we've moved exactly the same distance in that amount of time. Alternatively, you can think about what would be required if you were to ever pass me in your car. You would have to accelerate more than me (since at time 0, we were in the same position and the same velocity), which would require your second derivative being bigger than mine. But at time 0 our second derivatives are the same, so this requires your third derivative being bigger than mine. But at time 0 our third derivatives are the same, so this requires your fourth derivative to be bigger than mine... on and on for all of our derivatives. Basically, I imagine us being in lock-step, because if we weren't, our derivatives would differ somewhere, but they can't, because the n+1st derivative describes how the nth derivative changes, and all our derivatives match at time 0. This is my idea for why T(x) = f(x) everywhere. Sorry I am not great at describing intuition, but this is about where the formula made sense to me.

For me it was how the Taylor series can be used to calculate (or approximate) the trigonometric functions. In the case of the trigonometric functions, the function is not explicitly stated in terms of its argument (as for example in f(x) = 2x), so they cannot be calculated from their definition. I think the most important idea about the Taylor series is that we don't necessarily need to know how to compute the original function in order to get a sufficiently precise approximation of it. By taking some property of the original function which is impossible to solve (here the successive derivatives at a point), and by constructing a function that is easy to solve and satisfies these properties, it is actually possible to solve a problem that was initially impossible.

Probably not the first time I understood Taylor series, but definitely the memory that stuck with me the most: We were doing Riccati comparison. The Riccati operator is of the form $A(t)=J^{-1}(t)*J'(t)$ for some $J=t*1-\frac{1}{6} t^3 R(0) + o(t^4)$. So we wanted to solve $J^{-1} J'$ and wrote it out. I had no idea where we were going (please no coordinates!) and suddenly the prof just flips the sign on the t^3 term and drops the ^-1. No one had any idea what was going on and he just told us to Taylor 1/id. In hindsight it's really trivial (and super useful!) but right then it really blew my mind. (Also it meant no coordinates for the proof!) On a (mildly interesting) side note, I got some intuition on the error handling when I watched your last video and noticed that Taylor expansion is equivalent to repeated nested iteration of $f(t)=f(0)+\int_0^t f'(\tau)\,d\tau$ (the next iteration would be the same formula for f' in terms of f'' and so on). I always knew there were a bunch of approximations for the error but now I can easily get it as n integrals of f^(n) from 0 to t (which goes to 0 really fast if you know f^(n) is bounded by something that gets dominated by factorials).

Jan Nienhaus

I only realized its importance in my physics classes once I realized that the exact solution is usually too complicated to gain any intuition from. In fact, my first discussion in honors physics was an in-depth review of the Taylor series, so I would say physics is their biggest use case. Also, it may be too high level, but in using Newton's method for minimization, there is a nice intuition of approximating the function as a parabola (or paraboloid) at each iteration, then jumping directly to the minimum of the approximation and continuing. This is a great use case that crosses all disciplines since everyone loves optimization (or non-linear root finding).
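
A one-dimensional sketch of the "fit a parabola, jump to its minimum" picture above. At each point x the second-order Taylor model is q(y) = f(x) + f'(x)(y - x) + f''(x)(y - x)^2 / 2, which is minimized at y = x - f'(x)/f''(x) when f''(x) > 0. The test function f(x) = e^x - 2x and the name `newton_minimize` are assumptions made for this illustration.

```python
import math
from typing import Callable


def newton_minimize(
    grad: Callable[[float], float],
    hess: Callable[[float], float],
    x0: float,
    tol: float = 1e-12,
    max_iter: int = 50,
) -> float:
    """Minimize a 1-D function by repeatedly jumping to the minimum of its local parabola.

    Assumes hess(x) > 0 along the way, so each parabola actually has a minimum.
    """
    x = x0
    for _ in range(max_iter):
        step = grad(x) / hess(x)  # offset from x to the vertex of the quadratic model
        x -= step
        if abs(step) < tol:
            break
    return x


if __name__ == "__main__":
    # f(x) = exp(x) - 2x has f'(x) = exp(x) - 2 and f''(x) = exp(x); minimum at ln 2.
    x_min = newton_minimize(lambda x: math.exp(x) - 2.0, math.exp, x0=0.0)
    print(x_min, math.log(2.0))
```

The same update applied to f' instead of f is Newton's root-finding method, which is why the two uses are often mentioned together.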

They have not really clicked for me either, but for what it is worth, the reason I am interested in them is that they are used to approximate functions in performing gradient descent for machine learning.

For me the key insight was that you could actually create _any_ sequence of functions — polynomial or otherwise — and claim that they approximated any other function, as long as you bundled the difference into an error term. There's nothing about the definition of an error term which says that you're working with an approximation that's any good. What's unique about the Taylor series is that the _sequence of error terms_ always approaches zero. They clicked for me as soon as I realized how special this is.

Karen

I don't remember exactly when I "got it", but I do remember one revolutionary moment: adding more and more terms to a series approximating a sinusoid and watching as it incrementally increased the interval on which the approximation worked.

Taylor series clicked for me after learning integration by parts. I was working with an integral like the following: $\int x e^x \, dx$. The normal way you'd apply integration by parts is to simplify the expression. I started thinking about what would happen if I went the wrong way and made it more complex instead. After doing that for a couple of iterations I saw that it made an infinite series. I then started working with other functions, such as doing integration by parts the wrong way on $\int e^{-x} \, dx$. The above function is handy since you always have positive sums when applying integration by parts the wrong way. With the above $e^{-x}$ integral and the appropriate manipulations you can get the Taylor series for $e^x$.
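
One way the "wrong way" computation can be carried out, reconstructed here as a sketch (the comment does not spell out its manipulations). It uses definite integrals from 0 to x and, at each step, integrates the polynomial factor while differentiating e^{-t}:

```latex
\int_{0}^{x} e^{-t}\, dt
  = x e^{-x} + \int_{0}^{x} t\, e^{-t}\, dt
  = x e^{-x} + \frac{x^{2}}{2} e^{-x} + \int_{0}^{x} \frac{t^{2}}{2}\, e^{-t}\, dt
  = \cdots
  = e^{-x} \sum_{k=1}^{n} \frac{x^{k}}{k!} + \int_{0}^{x} \frac{t^{n}}{n!}\, e^{-t}\, dt .
```

```latex
1 - e^{-x} = e^{-x} \sum_{k=1}^{\infty} \frac{x^{k}}{k!}
\quad\Longrightarrow\quad
e^{x} = 1 + \sum_{k=1}^{\infty} \frac{x^{k}}{k!} = \sum_{k=0}^{\infty} \frac{x^{k}}{k!} .
```

The left-hand integral equals 1 - e^{-x}, and the leftover integral tends to 0 as n grows because of the n! in its denominator, which gives the second line after multiplying through by e^x.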

Not quite an answer to the question you posed. But if you do have the time, would you mind exploring the Indian Mathematician Madhava's contribution to infinite series? I would like an independent assessment of his actual contribution. There is a lot of talk among RW Hindu chauvinists. I suspect much of it is exaggerated!

Wasn't my click experience but a couple of weeks ago Mathologer used them to prove the irrationality of e, which I'd never seen before and it was super easy to follow and exciting.

Matthew O'Connor

I seem to remember that at first only the linear part of the expansion really clicked for me. The higher order derivatives I found more difficult to visualise and I didn't really understand why there was a factorial underneath.

Sanjeevan Ahilan

This came in layers for me so I can't say I really understand it but I'll give you what I understand so far. 1. I got some idea of how it worked in calc 1. Namely that you can start with a linear approximation (that you can get from a slope at a point) and then continue the pattern but I don't think this was really understanding it. 2. I was interested in exact real arithmetic and they came up as approximations you could compute a good upper bound on error for. 3. A friend was showing me how sheaves worked (I still need more work there) but in the process he got me to think about the local structure of a function. Well the local structure of a function is effectively the derivatives of the function at that point. And you can take all of that structure and compute the function at that point! That's what a Taylor series expansion is. It's a characterization of a function at a point.

Jake Ehrlich

It never did so far :) I am still in 12th grade, so I didn't study Taylor series at school yet. My private teacher taught me what it is and stuff, but I still don't know how to build one, so to speak, but I think I can see why you would need one.

Limits. They approximate limits fairly well without being overly complex functions.

