Marx's Mathematical Manuscripts 1881

I. First Drafts

Written: August, 1881;
Source: Marx's Mathematical Manuscripts, New Park Publications, 1983;
First published: in Russian translation, in Pod znamenem marksizma, 1933.

Newton: born 1642, †1727 (85 years old). Philosophiae naturalis principia mathematica (first published 1687; cf. Lemma I and Lemma XI, Schol.)

Then in particular: Analysis per quantitatum series fluxiones etc., first published 1711, but composed in 1665, while Leibnitz first made the same discovery in 1676.

Leibnitz: born 1646, †1716 (70 years old).

Lagrange: born 1736, †during the Empire (Napoleon I); he is the discoverer of the method of variations. Théorie des fonctions analytiques (1797 and 1813).

D’Alembert: born 1717, †1783 (66 years old). Traité des fluides, 1744.

1) Newton. The velocities or fluxions of, for example, the variables x, y etc. are denoted by x., y. etc. For example, if u and x are connected quantities (fluents) generated by continuous movement, then u. and x. denote their rates of increase, and therefore u./x. the ratio of the rates at which their increments are generated.

Since the numerical quantities of all possible magnitudes may be represented by straight lines, and the moments or infinitely small portions of the quantities generated = products of their velocities and the infinitely small time intervals during which these velocities exist,56 so then [we have] τ denoting these infinitely small time intervals, and the moments of x and y represented by τx. and τy., respectively.

For example: y = uz; [with] y., z., u. denoting the velocities at which y, z, u respectively [are] increasing, then the moments of y, z, u are τy., τz., τu., and we obtain

y = uz ,

y + τy. = (u + τu.) (z + τz.) = uz + uτz. + zτu. + τ²u.z. ;


τy. = uτz. + zτu. + τ²u.z. .

Since τ is infinitesimally small, it disappears by itself, and the product τ²u.z. disappears altogether even more readily, since it involves not the infinitely small period of time τ but rather its 2nd power.

(If τ = 1/million, then τ² = 1/(1 million × 1 million)).

We thus obtain

y. = u.z + z.u ,

or the fluxion of y = uz is u.z + z.u .57
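Newton's result can be checked numerically. Treating u and z as fluents of an illustrative "time" t (the particular fluents u = t² and z = t³ are assumptions for this sketch, not Newton's), the difference quotient of y = uz approaches u.z + z.u as the interval τ shrinks:

```python
# Numerical check of the fluxion of y = uz: as the time interval tau
# shrinks, the difference quotient of uz approaches u'z + z'u.

def u(t):      # illustrative fluent u = t**2
    return t ** 2

def z(t):      # illustrative fluent z = t**3
    return t ** 3

def u_dot(t):  # its fluxion (rate of increase), 2t
    return 2 * t

def z_dot(t):  # its fluxion, 3t**2
    return 3 * t ** 2

t = 1.5
claimed = u_dot(t) * z(t) + z_dot(t) * u(t)   # u'z + z'u

for tau in (1e-2, 1e-4, 1e-6):
    quotient = (u(t + tau) * z(t + tau) - u(t) * z(t)) / tau
    print(tau, quotient, claimed)   # quotient approaches claimed
```

Here y = uz = t⁵, so the fluxion 5t⁴ agrees with u.z + z.u, as the printed values confirm.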

2) Leibnitz. The differential of uz is to be found.

u becomes u + du, z becomes z + dz; so that

uz + d(uz) = (u + du) (z + dz) = uz + udz +zdu + dudz .

If from this the given quantity uz is subtracted, then there remains udz + zdu + dudz as the increment; dudz, the product d’un infiniment petit du par un autre infiniment petit dz, (of an infinitely small du times another infinitely small dz)* is an infinitesimal of the second order and disappears before the infinitesimal udz and zdu of the first order; therefore

d(uz) = udz + zdu .58

[3)] D’Alembert. Puts the problem in general terms thus:

If [we have]

y = f(x) ,

y1 = f(x + h) ;

[we are] to determine what the value of (y1 - y)/h becomes when the quantity h disappears, and thus what is the value of 0/0.59
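D'Alembert's question can be illustrated numerically; a sketch with the assumed example f(x) = x², where the ratio tends to 2x even though numerator and denominator each vanish:

```python
# What does (y1 - y)/h become as h vanishes? Numerator and denominator
# both go to 0, yet the ratio approaches a definite value.

def f(x):
    return x ** 2

x = 3.0
for h in (1.0, 0.1, 0.01, 0.001):
    y, y1 = f(x), f(x + h)
    print(h, (y1 - y) / h)   # tends to 2*x = 6 as h shrinks
```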


Newton and Leibnitz, like the majority of their successors, from the beginning performed operations on the ground of the differential calculus, and therefore valued differential expressions from the beginning as operational formulae whose real equivalent is to be found. All of their intelligence was concentrated on that. If the independent variable x goes to x1, then the dependent variable goes to y1. x1 - x, however, is necessarily equal to some difference, let us say, = h. This is contained in the very concept of variables. In no way, however, does it follow from this that this difference, which = dx, is a vanished [quantity], so that in fact it = 0. It may represent a finite difference as well. If, however, we suppose from the very beginning that x, when it increases, goes to x + x. (the τ which Newton uses serves no purpose in his analysis of the fundamental functions and so may be suppressed60), or, with Leibnitz, goes to x + dx, then differential expressions immediately become operational symbols (Operationssymbole) without their algebraic origin being evident.

To 15*2 (Newton).

Let us take Newton’s beginning equation for the product uz that is to be differentiated; then:

y = uz ,

y + τy. = (u + u.τ) (z + z.τ) .

If we toss out the τ, as he does himself if you please, after he develops the first differential equation, we then obtain:

y + y. = (u + u.) (z + z.) ,

y + y. = uz + u.z + z.u + z.u. ,

y + y. - uz = u.z + z.u + u.z. .

So that, since uz = y,

y. = u.z + z.u + u.z. .

And in order to obtain the correct result u.z. must be suppressed.

Now, whence arises the term to be forcibly suppressed, u.z.?

Quite simply from the fact that the differentials y. of y, u. of u, and z. of z have from the very beginning been imparted by definition*3 a separate, independent existence from the variable quantities from which they arose, without having been derived in any mathematical way at all.
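The point about the term u.z. can be made concrete: with finite increments the expansion of (u + du)(z + dz) is exact, and the remainder du·dz is precisely the term that must be suppressed by fiat. A sketch in exact rational arithmetic (the particular numbers are illustrative assumptions):

```python
# With finite increments the increment of y = uz is exactly
# u*dz + z*du + du*dz; the last term does not vanish on its own.
from fractions import Fraction

u, z = Fraction(2), Fraction(5)
du, dz = Fraction(1, 10), Fraction(1, 10)

increment = (u + du) * (z + dz) - u * z   # full increment of uz
main_part = u * dz + z * du               # what the rule retains
print(increment - main_part)              # the suppressed term du*dz = 1/100
```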

On the one hand one sees what usefulness this presumed existence of dy, dx or y., x. has, since from the very beginning, as soon as the variables increase, I have only to substitute in the algebraic function the binomials y + y., x + x. etc. and then may manipulate (manövrieren) these themselves as ordinary algebraic quantities.

I obtain, for example, if I have y = ax:

y + y. = ax + ax. ;

so that

y - ax + y. = ax. ;


y. = ax. .

I have therewith immediately obtained the result: the differential of the dependent variable is equal to the increment of ax, namely ax.; it is equal to the real value a derived from ax*4 (that this is a constant quantity here is accidental and does nothing to alter the generality of the result, since it is due to the circumstance that the variable x appears here to the first power). If I generalise this result,61 then I know y = f(x), for this means that y is the variable dependent on x. If I call the quantity derived from f(x), i.e. the real element of the increment, f’(x), then the general result is:

y. = f’(x)x. .

I thus know from the very beginning that the equivalent of the differential of the dependent variable y is equal to the first derived function of the independent variable, multiplied by its differential, that is dx or x. .

So then, generally expressed, if

y = f(x)


dy = f’(x)dx

or y. = the real coefficient in x (except where a constant appears because x is to the first power) times x..

But y. = ax. gives me immediately y./x. = a, and in general:

y./x. = f’(x) .

I have thus found for the differential and the differential coefficients two fully-developed operational formulae which form the basis of all of differential calculus.

And furthermore, put in general terms, I have obtained, by means of assuming dx, dy etc. or x., y. etc. to be independent, insulated increments of x and y, the enormous advantage, distinctive to the differential calculus, that all functions of the variables are expressed from the very beginning in differential forms.

Were I thus to develop the essential functions of the variables in this manner, such as ax, ax ± b, xy, x/y, x^n, a^x, log x, as well as the elementary trigonometric functions, then the determination of dy, dy/dx would thus become completely tamed, like the multiplication table in arithmetic.
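That "multiplication table" can be spot-checked by comparing each claimed derivative with a central difference quotient; a sketch in which the particular functions and sample points are assumptions chosen for illustration:

```python
# Elementary functions paired with their claimed derivatives, each
# verified against a central difference quotient.
import math

h = 1e-6
cases = [
    (lambda x: 5 * x,       lambda x: 5.0,                    2.0),  # ax
    (lambda x: 5 * x + 3,   lambda x: 5.0,                    2.0),  # ax + b
    (lambda x: x ** 4,      lambda x: 4 * x ** 3,             2.0),  # x^n
    (lambda x: 2 ** x,      lambda x: math.log(2) * 2 ** x,   2.0),  # a^x
    (lambda x: math.log(x), lambda x: 1 / x,                  2.0),  # log x
    (lambda x: math.sin(x), lambda x: math.cos(x),            1.0),  # sin x
]
for f, fprime, x in cases:
    quotient = (f(x + h) - f(x - h)) / (2 * h)
    assert abs(quotient - fprime(x)) < 1e-4
print("all claimed derivatives confirmed")
```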

If we now look, however, on the reverse side we find immediately that the entire original operation is mathematically false.

Let us take a perfectly simple example: y = x². If x increases then it obtains an indeterminate increment h, and the variable y dependent on it obtains an indeterminate increment k, and we obtain

y + k = (x + h)² = x² + 2hx + h² ,

a formula which is given to us by the binom[ial theorem].


y + k - x² or y + k - y = 2hx + h² ;


(y + k) - y or k = 2hx + h² ;

if we divide both sides by h then:

k/h = 2x + h .

We now set h = 0, and this becomes

2x + h = 2x + 0 = 2x .

On the other side, however, k/h goes to k/0 . Since, however, y only went to y + k because x went to x + h, and then y + k goes back to y when h goes to 0, therefore when x + h goes back to x + 0, to x. So then k also goes to 0 and k/0 = 0/0, which may be expressed as dy/dx or y./x.. We thus obtain:

0/0 or y./x. = 2x .

If on the other hand we [substitute h = 0] in

y + k - x² = 2hx + h² or (y + k) - y = 2xh + h²

(h is only replaced by the symbol dx after it has previously been set equal to 0 in its original form), we then obtain k = 0 + 0 = 0, and the sole result that we have reached is the insight into our assumption, merely that y goes to y + k, if x goes to x + h ... so that if x + h = x + 0 = x, then y + k = y, or k = 0.

In no way do we obtain what Newton makes of it:

k = 2xdx + dxdx

or, in Newton’s way of writing:

y. = 2xx. + x.x. ;

h only becomes x., and therefore k becomes y., as soon as h has passed the hellish ride through 0, that is, subsequent to the difference x1 - x (or (x + h) - x) and therefore that of y1 - y as well (= (y + k) - y) having been reduced to their absolutely minimum expressions (Minimalausdruck), x - x = 0 and y - y = 0.

Since Newton, however, does not immediately determine the increments of the variables x, y, etc. by means of mathematical derivation, but instead immediately stamps x., y., etc. as differentials, they cannot be set = 0; for otherwise, were the result 0, which is algebraically expressed as setting this increment from the very beginning = 0, it would follow from that, just as above in the equation

(y + k) - y = 2xh + h² ,

h would immediately be set equal to 0, therefore k = 0, and consequently in the final analysis we would obtain 0 = 0. The nullification of h may not take place prior to the first derived function of x, here 2x, having been freed of the factor h through division, thus:

(y1 - y)/h = 2x + h .

Only then may the finite differences be annulled. The differential coefficient

dy/dx = 2x

therefore also must have previously been developed,62 before we may obtain the differential

dy = 2xdx .
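The required order of operations (divide by the finite h first, annul it only afterwards) can be checked exactly in rational arithmetic; a sketch, with the point x = 3 chosen for illustration:

```python
# For y = x**2: form k = (x + h)**2 - x**2 = 2xh + h**2, divide by h
# while h is still finite (k/h = 2x + h holds exactly), and only then
# set h = 0. Setting h = 0 before dividing yields only k = 0, i.e. 0 = 0.
from fractions import Fraction

x = Fraction(3)
for h in (Fraction(1), Fraction(1, 10), Fraction(1, 1000)):
    k = (x + h) ** 2 - x ** 2
    assert k / h == 2 * x + h      # exact for every finite h
print(2 * x)  # the divided form at h = 0: the derivative 2x = 6
```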

Therefore nothing more remains than to imagine the increments h of the variable to be infinitely small increments and to give them as such independent existence, in the symbols x., y. etc. or dx, dy [etc.] for example. But infinitely small quantities are quantities just like those which are infinitely large (the word infinitely (unendlich) [small] only means in fact indefinitely (unbestimmt) small); the dy, dx etc. or y., x. [etc.] therefore also take part in the calculation just like ordinary algebraic quantities, and in the equation above

(y + k) - y or k = 2xdx + dxdx

the dxdx has the same right to existence as 2xdx does; the reasoning (Raisonnement) is therefore most peculiar by which it is forcibly suppressed, namely, by direct use of the relativity of the concept because it is infinitely small compared to dx, and thus as well to 2xdx, or to 2xx. ...

But (Oder), if in

y. = u.z + z.u + u.z.

the u.z. is suppressed because it is infinitely small compared to u.z or z.u, then one would thereby be forced to admit mathematically that u.z + z.u is only an approximate value (Annäherungswert), in imagination as close as you like. This type of manoeuvre occurs also in ordinary algebra.

But then in walks the still greater miracle that by this method you don’t obtain an approximate value at all, but rather the unique exact value (even when as above it is only symbolically correct) of the derived function, such as in the example

y. = 2xx. + x.x. .

If you suppress here x.x., you then obtain:

y. = 2xx.


y./x. = 2x ,

which is the correct first derived function of x², as the binom[ial theorem] has already proved.

But the miracle is no miracle. It would only be a miracle if no exact result emerged through the forcible suppression of x.x.. That is to say, one suppresses merely a computational mistake which nevertheless is an unavoidable consequence of a method which brings in the undefined increment of the variable, i.e. h, immediately as the differential dx or x., a completed operational symbol, and thereby also produces from the very beginning in the differential calculus a characteristic manner of calculation different from the usual algebra.


The general direction of the algebraic method which we have applied may be expressed as follows:

Given f(x), first develop the ‘preliminary derivative’, which we would like to call f1(x):

1) f1(x) = Δy/Δx or Δy/Δx = f1(x) .

From this equation it follows

Δy = f1(x)Δx .

So that as well

Δf(x) = f1(x)Δx

(since y = f(x), [thus] Δy = Δf(x) ) .

By means of setting x1 - x = 0, so that y1 - y = 0 as well, we obtain

[2)] dy/dx = f’(x) .


dy = f’(x)dx ;

so that also

df(x) = f’(x)dx

(since y = f(x), dy = df(x)).

When we have once developed

1) Δf(x) = f1(x)Δx


2) df(x) = f’(x)dx

is only the differential expression of 1).
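A sketch of the two steps for an assumed example, f(x) = x³: the preliminary derivative f1 is a function of both x and the finite difference, and the final derivative is the value of that expression once the difference is set to zero:

```python
# Algebraic method for f(x) = x**3: the preliminary derivative
# Δy/Δx simplifies to 3x**2 + 3x*Δx + Δx**2 for every finite Δx;
# the derived function f'(x) = 3x**2 is this expression with Δx = 0.
from fractions import Fraction

def f(x):
    return x ** 3

def f1(x, dx):
    # preliminary derivative Δy/Δx, defined only for dx != 0
    return (f(x + dx) - f(x)) / dx

x = Fraction(7)
for dx in (Fraction(1), Fraction(1, 3), Fraction(1, 100)):
    assert f1(x, dx) == 3 * x ** 2 + 3 * x * dx + dx ** 2
print(3 * x ** 2)   # f'(x) once the finite difference is annulled: 147
```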


1) If we have x going to x1, then

A) x1 - x = Δx ;

whence the following conclusions may be drawn

Aa) Δx = x1 - x ; a) x1 - Δx = x ;

Δx, the difference between x1 and x, is therefore positively expressed as the increment of x; for when it is subtracted again from x1 the latter returns once more to its original state, to x.

The difference may therefore be expressed in two ways: directly as the difference between the increased variable and its state before the increase, and this is its negative expression; positively as the increment,*5 as a result: as the increment of x to the state in which it has not yet grown, and this is the positive expression.

We shall see how this double formulation plays a role in the history of differential calculus.

b) x1 = x + Δx .

x1 is the increased x itself; its growth is not separated from it; x1 is the completely indeterminate form of its growth. This formula distinguishes the increased x, namely x1, from its original form prior to the increase, from x, but it does not distinguish x from its own increment. The relationship between x1 and x may therefore only be expressed negatively, as a difference, as x1 - x. In contrast, in

x1 = x + Δx

1) The difference is expressed positively as an increment of x.

2) Its increase is therefore not expressed as a difference, but instead as the sum of itself in its original state + its increment.

3) Technically x is expelled from its monomial into a binomial, and wherever x appears to any power in the original function a binomial composed of itself and its increment enters for the increased x; the binomial (x + Δx)^m in general for x^m. The development of the increase of x is therefore in fact a simple application of the binomial theorem. Since x enters as the first and Δx as the second term of this binomial - which is given by their very relationship, since x must be [there] before the formation of its increment Δx - by means of the binomial, in the event only the functions of x will be derived, while Δx figures next to it as a factor raised to increasing powers; indeed, Δx to the first power must [appear], so that Δx¹ is a factor of the second term of the resulting series, that is, of the first derived function of x, derived using the binomial theorem. This shows up perfectly when x is given to the second power. x² goes to (x + Δx)², which is nothing more than the multiplication of x + Δx by itself, [and which] leads to x² + 2xΔx + Δx²: that is, the first term must be the original function of x and the first derived function of x², namely [2]x here, comprises the second term together with the factor Δx¹, which entered into the first term only as the factor Δx⁰ = 1. So then, the derivative is not found by means of differentiation but rather by means of the application of the binomial theorem, therefore multiplication; and this because the increased variable x1 takes part from the very beginning as a binomial, x + Δx.

4) Although Δx in x + Δx is just as indefinite, so far as its magnitude goes, as the indefinite variable x itself, Δx is defined as a distinct quantity separate from x like a fruit beside the mother who had previously borne her (als Frucht neben ihrer Mutter, bevor diese geschwangert war).

x + Δx not only expresses in an indefinite way the fact that x has increased as a variable; rather, it [also] expresses by how much it has grown, namely, by Δx.

5) x never appears as x1; the whole development centres around the increment Δx as soon as the derivative has been found by means of the binomial theorem, by means, that is, of substituting x + Δx for x in a definite way (in bestimmten Grad). On the left-hand side, however, in (y1 - y)/Δx, if the Δx becomes = 0, it finally appears as x1 - x again, so that:

(y1 - y)/Δx = (y1 - y)/(x1 - x) .*6

The positive side, where x1 - x = 0 takes place, namely x1 becoming = x, can therefore never enter into the development, since x1 as such never enters into the side of the resultant series (Entwicklungsreihe); the real mystery of the differential calculus makes itself evident as never before.

6) If y = f(x) and y1 = f(x + Δx), then we can say that in using this method the development of y1 solves the problem of finding the derivative.

c) x + Δx = x1 (so that y + Δy = y1 as well). Δx here may only appear in the form Δx = x1 - x, therefore in the negative form of the difference between x1 and x, and not in the positive form of the increment of x, as in x1 = x + Δx.

1) Here the increased x is distinguished as x1 from itself, before it grows, namely from x; but x1 does not appear as an x increased by Δx, so x1 therefore remains just exactly as indefinite as x is.

2) Furthermore: however x enters into any original function, so x1 does as the increased variable in the original function now altered by the increase. For example, if x takes part in the function x³, so does x1 in the function x1³.

Whereas previously, by means of substituting (x + Δx) wherever x appeared in the original function, the derivative had been provided ready-made by the use of the binomial, leaving it burdened with the factor Δx and the first of other terms in x burdened with Δx² etc., so now there is just as little which can be derived directly from the immediate form of the monomial - x1³ - as could be got from x³. It does provide, however, the difference x1³ - x³. We know from algebra that all differences of the form x³ - a³ are divisible by x - a; the given case, therefore, is divisible by x1 - x. In therefore dividing x1³ - x³ by x1 - x (instead of, [as] previously, multiplying the term (x + Δx) by itself to the degree specified by the function), we obtain an expression of the form (x1 - x)P, wherein nothing is affected whether the original function of x contains many terms (and so contains x to various powers) or as in our example is of a single term. This x1 - x passes by division to the denominator of y1 - y on the left-hand side and thus produces (y1 - y)/(x1 - x) there, the ratio of the difference of the function to the difference of the independent variable x in its abstract difference-formula (Differenzform). The development of the difference between the function expressed in x1 and that expressed in x into terms, all of which have x1 - x as a factor, may well require algebraic manipulation (Manöver) to a greater or lesser degree, and thus may not always shed as much light as the form x1³ - x³. This has no effect on the method.

When by its nature the original function does not allow the direct development into (x1 - x)P, as was the case with f(x) = uz (two variables both dependent on x), (x1 - x) appears [in] the factor 1/(x1 - x). Furthermore, after the removal of x1 - x to the left-hand side by means of dividing both sides by it, x1 - x still continues to exist in P itself (as, for example, in the derivation from y = a^x, where we find

(y1 - y)/(x1 - x) = (a^{x}){(a - 1) + ((x1 - x) - 1)/(1⋅2) ⋅ (a - 1)² + etc.},

where setting x1 - x = 0 produces

= (a^{x}){(a - 1) - (a - 1)²/2 + (a - 1)³/3 - etc.} ;

this is only possible when, as in the example just given, it so happens that setting x1 - x = 0 [allows] it to disappear, and then always leaves a positive result behind in its place. In other words the (x1 - x)s left behind in P may not be combined with the rest of the elements of P as factors (as multiplicands). P would otherwise be factorable into P = p(x1 - x), and then, since x1 - x has already been set = 0, into p⋅0; hence P = 0 ...63


The first finite difference, x1³ - x³, where y = x³ and y1 = x1³, has therefore been evolved to

y1 - y = (x1 - x)P ,


(y1 - y)/(x1 - x) = P .

P, an expression combining x1 and x, is = f1, the derivative of the first finite difference, whence x1 - x has been quite eliminated, as well as those of higher degree, (x1 - x)² etc. x1 and x may therefore only be combined in positive expressions, such as x1 + x, x1x, x1/x, sqrt{x1x} etc. Were therefore x1 to be now set = x, these expressions would then become 2x, x², x/x or 1, sqrt{xx} or x etc., respectively, and only on the left-hand side, where x1 - x comprises the denominator, is 0 produced and therefore the symbolic differential coefficient etc.
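For the assumed example y = x³ the whole chain can be verified directly: the difference factors as (x1 - x)P with P = x1² + x1x + x², an expression in which x1 and x are combined only positively, and setting x1 = x in P yields 3x²:

```python
# y1 - y = x1**3 - x**3 factors identically as (x1 - x) * P with
# P = x1**2 + x1*x + x**2, which is free of the factor x1 - x.

def P(x1, x):
    return x1 ** 2 + x1 * x + x ** 2

# spot-check the identity at a few integer points
for x1, x in [(5, 2), (7, 3), (-1, 4)]:
    assert x1 ** 3 - x ** 3 == (x1 - x) * P(x1, x)

x = 3
print(P(x, x))   # x1 set equal to x: P becomes 3*x**2 = 27
```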



56 This conclusion (due to Newton) requires clarification: ‘since the numerical quantities of all possible magnitudes may be represented as straight lines’, the variation of any quantity may be represented as a sort of linear motion of a variable velocity. And since during an infinitely small interval of time the speed of motion can be considered to be fixed, then the path, nearly a point, corresponding to this small time interval (of course corresponding also to the variation of our quantity) is equal to the product of this speed (fluxion) and the infinitely small time interval, τ. Therefore ‘moments, or infinitely small portions of the quantities generated = the products of their velocities and the infinitely small time intervals’. Regarding the metaphysical nature of Newton’s attempt to provide a basis for the concepts of ‘fluent’, ‘fluxion’, and ‘moment’, corresponding to our ‘function’, ‘derivative’, and ‘differential’, defining them in terms of mechanics, see Appendix II, pp156-157.
57 It was explained in Note 49 that Marx intended to return to the illumination of the history of the development of differential calculus by means of the example of the history of the theorem on the differential of a product. So he left a vacant space following his unfinished extract from Hind’s text. There, after being repeated one more time this section is introduced as an example of the very theorem on the differential of a product in Newton’s treatment. (This theorem is introduced as example 3 in Hind’s textbook; see Hind, p.109.)
* In French in the original. - Trans.
58 In Hind’s textbook Leibnitz’s method is not illustrated in the example of the theorem on the differential of a product, so Marx turned to Boucharlat’s textbook. This paragraph is an extract from the latter work (see Boucharlat, p.165).
59 This sentence appears in the extract from Hind’s textbook cited above (Hind, p.106). Further on, however, Marx does not introduce the theorem on the differential of a product as developed by Hind. After this text follow five pages in Marx’s notebook which have been omitted (pp.16-20). They deal primarily with calculations concerning theorems on the differentiation of fractional and compound functions as well as the solution of problems related to the parabolic curve y² = ax. We retain only the comments, written at intervals on pp.16-18, in which Marx emphasizes the fact that Newton and Leibnitz began immediately with the operational formulae of differential calculus. Then under the rubric ‘Ad Newton’ Marx subjects these methods of Newton and Leibnitz to the criticism that all such methods, notwithstanding all the advantages they bring, inevitably imply the introduction of actually infinitely small quantities and their attendant difficulties. Here again the theorem on the differential of a product is used as the basic example.
60 By x., y., z. Newton and his followers usually signified the rate of change (fluxion) of the variables x, y, z (fluents) the derivatives, that is, of x, y, z, with respect to that variable which plays the role of ‘time’; by τx., τy., τz. they designated the ‘moments’ corresponding to the Leibnitzian differentials or infinitely small increments. However, the Newtonians often also used x., y., z. for the ‘moments’ or differentials. See Appendix III p.160.
*2 See pages 49-51 in this edition.
*3 Original: ‘Difinition’, presumably ‘Definition’ - Trans.
*4 That is, y./x. = a - Trans.
61 This discusses the heuristic generalisation where, in the formula y. = ax. (1), y is simply treated as a certain function f(x), while the constant a becomes a new function f’(x) derived from this f(x); according to this, formula (1) becomes a special case of the more general formula y. = f’(x)x. (2). Since x., y. are treated as increments, even though infinitely small, the factor f’(x) is therefore a function not only of x but also of x.; the ‘derived’ function f’(x) in formula (2) turns out not to be independent of x.. It is exactly this fact (which compelled the Newtonians to suppress forcibly the terms containing x., even though the latter must be different from zero for formula (2) to have any meaning) which serves as the basis for the critique of the Newtonian definition of the derivative of the function y = f(x) as the ratio y./x., to which Marx returns several lines below.
62 That is, obtained in the form of a ‘real’ expression, not containing differential symbols.
*5 Marx added here in pencil ‘or decrement’ - Ed.

*6 Marx added here in pencil : Δy/Δx. - Ed.

63 Several more lines of unclear meaning are omitted.