From "100 Great Problems of Elementary Mathematics: Their History and Solution", by Heinrich Doerrie, translated by David Antin, Dover, 1965. ISBN 0-486-61348-8. Sturm's Problem of the Number of Roots "Find the number of real roots of an algebraic equation with real coefficients over a given interval". This very important algebraic problem was solved in a surprisingly simple way in 1829 by the French mathematician Charles Sturm (1803-1855). The paper containing the famous Sturm theorem appeared in the eleventh volume of the Bulletin des sciences de Ferussac and bears the title, "Memoire sur la resolution des equations numeriques." "With this discovery," says Liouville, "Sturm at once simplified and perfected the elements of algebra, enriching them with new results." SOLUTION. We distinguish two cases: I. The real roots of the equation in question are all simple over the given interval. II. The equation also possesses multiple real roots over the interval. We will first show that the second case leads us back to the first. Let the prescribed equation F(x)=0 have the distinct roots alpha, beta, gamma, ..., and let the root alpha be a-fold, beta b-fold, gamma c-fold, ..., so that F(x) = (x-alpha)^a (x-beta)^b (x-gamma)^c ... For the derivative F'(x) of F(x) we obtain F'(x)/F(x) = a/(x-alpha) + b/(x-beta) + c/(x-gamma) + ... a(x-beta)(x-gamma)(x-delta)...+b(x-alpha)(x-gamma)(x-delta)...+... = ------------------------------------------------------------------ (x-alpha)(x-beta)(x-gamma)... If we then call the numerator of this fraction p(x) and the denominator q(x) and set the whole rational function F(x)/q(x) equal to G(x), then F(x) = G(x) q(x) and F'(x) = G(x) p(x). Now the functions p(x) and q(x) have no common divisor. (The factor x-beta of q(x) may, for example, go into all the terms of p(x) except the second with no remainder.) It follows from this that G(x) is the greatest common divisor of F(x) and F'(x). This can be determined easily from the divisional algorithm and can therefore be considered known, as a result of which q(x) is known also. The equation F(x)=0 then falls into the two equations q(x)=0 and G(x)=0, the first of which possesses only simple roots, while the second can be further reduced in the same way that F(x)=0 was. An equation with multiple roots can therefore always be transformed into equations (with known coefficients) possessing only simple roots. Consequently, it is sufficient to solve the problem for the first case. Let f(x)=0 be an algebraic equation all of whose roots are simple. The derivative f'(x) of f(x) then vanishes for none of these roots and the highest common divisor of the functions f(x) and f'(x) is a constant K that differs from zero. We use the divisional algorithm to determine the highest common divisor of f(x) and f'(x), writing, for the sake of convenience in representation, f0(x) and f1(x) instead of f(x) and f'(x), and calling the quotients resulting from the successive divisions q0(x), q1(x), q2(x), ... and the remainders -f2(x), -f3(x), .... If we also drop the argument sign for the sake of brevity, we obtain the following scheme: (0) f0 = q0 f1 - f2 (1) f1 = q1 f2 - f3 (2) f2 = q2 f3 - f4, etc. In this scheme there must at last appear -- at the very latest with the remainder K -- a remainder -fs(x) that does not vanish at any point of the interval and consequently possesses the same sign over the whole interval. Here we break off the algorithm. The functions involved f0, f1, f2, ..., fs form a "Sturm chain" and in this connections are called Sturm functions. The Sturm functions possess the following three properties: 1. Two neighboring functions do not vanish simultaneously at any point of the interval. 2. At a null point of a Sturm function its two neighboring functions are of different sign. 3. Within a sufficiently small area surrounding a zero point of f0(x), f1(x) is everywhere greater than zero or everywhere smaller than zero. PROOF OF 1. If, for example, f2 and f3 vanish at any point of an interval, f4 [according to (2)] also vanishes at this point, and consequently f5 also [according to (3)], and so forth, so that finally [according to the last line of the algorithm] fs also vanishes, which, however, contradicts our assumption. PROOF OF 2. If the function f3 vanishes at the point sigma, for example, of the interval, then it follows from (2) that f2(sigma) = -f4(sigma). PROOF OF 3. This proof follows from the known theorem: A function [f0(x)] rises or falls at a point depending on whether its derivative [f1(x)] at that point is greater or smaller than zero. We now select any point x of the interval, note the sign of the values f0(x), f1(x), ..., fs(x), and obtain a "Sturm sign chain" (to obtain an unequivocal sign, however, it must be assumed that none of the designated s+1 function values is zero). The sign chain will contain sign sequences (++ and --) and sign changes (+- and -+). We will consider the number Z(x) of sign changes in the sign chain and the changes undergone by Z(x) when x passes through the interval. A change can occur only if one or more of the Sturm functions changes sign, i.e., passes over from negative (positive) values through zero to positive (negative) values. We will accordingly study the effect produced on Z(x) by the passage of a function fv(x) through zero. Let k be a point at which fv disappears, h a point situated to the left, and l a point to the right of k and so close to k that over the interval h to l the following holds true: (1) fv(x) does not vanish except when x=k; (2) every neighbor (fv+1,fv-1) of fv does not change sign. We must distinguish between the cases v>0 and v=0; in the first case we are concerned with the triplet fv-1, fv, fv+1, in the second, with the pair f0, f1. In the triplet, fv-1 and fv+1 possess either the + and - sign or the - and + sign at all three points h, k, l. Thus, whatever the sign of fv may be at these points, the triplet possesses one change of sign for each of the three arguments h, k, l. The passage through zero of the function fv does not change the number of sign changes in the chain! In the pair, f1 has either the + or - sign at all three points h, k, l. In the first case, f0 is increasing and is thus negative at h and positive at l. In the second case, f0 is decreasing and is positive at point h, and negative at l. In both cases a sign change is lost. From our investigation we learn that: The Sturm sign chain undergoes a change in the number of sign changes Z(x) only when x passes through a null point of f(x); and specifically, the chain then loses (with an increasing x) exactly one sign change. Thus, if x passes through the interval (the ends of which do not represent roots of f(x)=0) from left to right, the sign chain loses exactly as many sign changes as there are null points of f(x) within the interval. Result: STURM'S THEOREM: The number of real roots of an algebraic equation with real coefficients whose real roots are simple over an interval the end points of which are not roots is equal to the difference between the numbers of sign changes of the Sturm sign chains formed for the interval ends. NOTE. The same considerations can also be applied unchanged to the series formed when we multiply f0, f1, f2, ..., fs by any positive constants; this series is then likewise designated as a Sturm chain. In the formation of the Sturm function chain all fractional coefficients are accordingly avoided. EXAMPLE 1. Determine the number and situation of the real roots of the equation x^5 - 3 x - 1 = 0. The Sturm chain is f0 = x^5 - 3 x - 1, f1 = 5 x^4 - 3, f2 = 12 x + 5, f3 = 1. The signs of f for x = -2, -1, 0, +1, +2 are ----------------------- x | f0 | f1 | f2 | f3 ----------------------- -2 | - | + | - | + -1 | + | + | - | + 0 | - | - | + | + +1 | - | + | + | + +2 | + | + | + | + ----------------------- The equation thus has three real roots: one between -2 and -1, one between -1 and 0, one between +1 and +2. The other two roots are complex. EXAMPLE 2. Determine the number of real roots of the equation x^5-ax-b=0 when a and b are positive magnitudes and 4^4 a^5 > 5^5 b^4. The Sturm chain reads x^5-ax-b, 5x^4-a, 4ax+5b, 4^4 a^5 - 5^5 b^4. For the values x = -oo [- infinity] and +oo it has the signs -+-+ and ++++, respectively. The equation has three real roots and two complex roots.