Talk:Big O notation/Archive 3

This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

Archive 1

Archive 2

Archive 3

Archive 4

Definition of f(x) = O(g(x)) as x\to a

Currently, the definition reads:

[...] if and only if there exist positive numbers δ and M such that

|f(x)| ≤ M |g(x)|       for  |x − a| < δ

I believe that most people in the field of analysis would not require the inequality to hold when x=a, i.e., the last part should read "for 0<|x-a|<\delta". As a reference, consider "Introduction to perturbation methods" by M. H. Holmes. (Shanghai Horst (talk) 03:46, 31 October 2016 (UTC)) — Preceding unsigned comment added by Shanghai Horst (talk • contribs) 03:45, 31 October 2016 (UTC)

You are right. I modified the definition. Sapphorain (talk) 08:19, 31 October 2016 (UTC)

Equivalent definitions of Big O notation for multivariate functions defined on \mathbb{R}^n

The article currently states "Equivalently, the condition that {\displaystyle x_{i}\geq M} x_{i}\geq M for some {\displaystyle i} i can be replaced with the condition that {\displaystyle \|{\vec {x}}\|_{\infty }\geq M} {\displaystyle \|{\vec {x}}\|_{\infty }\geq M}, where {\displaystyle \|{\vec {x}}\|_{\infty }} {\displaystyle \|{\vec {x}}\|_{\infty }} denotes the Chebyshev norm. "

This is true of all norms, not just the Chebyshev norm, as all norms on a finite dimensional euclidean space are comparable. — Preceding unsigned comment added by 73.44.30.231 (talk) 00:53, 21 November 2016 (UTC)

Inconsistencies

The long quote given in the subsection "Other notation" is not appropriate as it is. Either it contains notation which is not defined in the page ("n belongs to n^2"; θ(…)), or it contains gross misprints. In both cases it should be replaced by another reference. Sapphorain (talk) 08:45, 30 November 2016 (UTC)

There are three long quotes from the same book in the subsection "Other notation". I pointed several times that they contained inconsistencies, either from original misprints from the text, or because the contributor didn't copy them correctly. Some of these inconsistencies were corrected by the said contributor who apparently (?) has a copy of the book available. But some are still in the text (in the definition of O(g(n)), n appears to be also a variable, this is absurd; "f(n) belongs to g(n)" doesn't make sense, etc). Now if the original text is so smeared by misprints that it's unreadable, it is not our role to guess how to correct them, and the source is not a good source and should be replaced by something else. (In addition, two of these quotes are not really about the symbol O(…), but contain explanations involving the notation Θ(…); and the reader doesn't know at this point what Θ is, since it is only briefly refered to at the very end of the page: this is not adequate). Sapphorain (talk) 16:07, 30 November 2016 (UTC)

A sloppy formulation such as

O(g(n))=\{f(n):

there exist positive constants c and n₀ such that

0\leq f(n)\leq cg(n)

for all

n\geq n_{0}\}

,

although understandable by a mathematician, and possibly accepted by a sloppy mathematician, is just not acceptable in an encyclopedia. The symbol n can reasonably only denote each number satisfying the condition described on the right side of the equality, and thus belonging to some variable (depending on the choice of the function f) set of numbers. The two first occurrences of the symbol n simply don't make sense and should be suppressed. Sapphorain (talk) 23:06, 30 November 2016 (UTC)

… I suppressed them in the encyclopedia's text, and (of course) kept them in the quote from the source. Sapphorain (talk) 22:41, 1 December 2016 (UTC)

Issues on "Big omega notation"

The section on omega notation has several issues:

It begins not by giving an introduction to the topic but instead telling that "There are two [...] definitions of the statement [...]" and missing the definitions altogether.
The definition names are not mentioned in the introduction.
The first (Hardy-Littlewood's) definition is written in mathematical jargon, making it difficult to understand to those that are just trying to achieve a basic understanding of the topic.
The second (Knuth's) definition is incomplete. It erroneously presents the statement $f(x)=\Omega (g(x))\Leftrightarrow g(x)=O(f(x))$ as the definition itself, when it is only what Knuth added to Hardy-Littlewood's to make his stronger version of $\Omega$ .
It uses at least the following weasel phrasing to describe a definition: "very widespread"

I propose to address the issues in the following way:

Phrasing the introduction like this, we convey the idea that two definitions exist but we also include the approximate definitions in English: Consider the following statement: $f(x)=\Omega (g(x))$ . In analytic number theory, it means that $f$ is not dominated by $g$ . In computational complexity theory however, it means that $g$ is an asymptotic lower bound for $f$ . Note that the English definitions were taken from the table under the section "Family of Bachmann–Landau notations".
We would then introduce the names and DOB of the two definitions as follows: Chronologically, the first definition was written by Hardy-Littlewood in 1914 and is used in analytic number theory. The second definition was introduced in 1976 by Donald Knuth.
This becomes a non-issue when we introduce the topic as written in (1).
We would change the current text in the following way (changes in bold): In 1976 Donald Knuth published a paper to justify his use of the $\Omega$ -symbol to describe a stronger property. Knuth wrote: "For all the applications I have seen so far in computer science, a stronger requirement […] is much more appropriate". He ~~defined~~ added the following statement to Hardy-Littlewood's definition $f(x)=\Omega (g(x))\Leftrightarrow g(x)=O(f(x))$ with the comment: "Although I have changed Hardy and Littlewood's definition of $\Omega$ , I feel justified in doing so because their definition is by no means in wide use, and because there are other ways to say what they want to say in the comparatively rare cases when their definition applies".
This becomes a non-issue when we introduce the topic as written in (1). Negrulio (talk) 02:00, 20 December 2016 (UTC)

I agree with this modification. I would rather write: The statement $f(x)=\Omega (g(x))$ has two distinct meanings in mathematics. And I would remove the "however" below.
I also agree. With the restriction that a definition is not "written" by his inventor; it is proposed, introduced, stated, etc.
This section is described with precise mathematical notation (and not with "mathematical jargon"). It should stay.
I strongly disagree with this proposition. There is nothing erroneous in presenting $f(x)=\Omega (g(x))\Leftrightarrow g(x)=O(f(x))$ as Knuth's definition (whether or not it was the original way Knuth phrased it). It is the simpler way to express it and it is complete. It is a completely different definition, and is not simply a condition that was "added" to the Hardy-Littlewood definition. (I would rather say that something was removed from it: try for instance to define the symbols $\Omega _{+},\Omega _{-},\Omega _{\pm }$ starting with the Knuth definition…)
"Widespread" is not a weasel word here. Both definitions are very widespread, as a quick look at MathSciNet shows. Sapphorain (talk) 09:30, 20 December 2016 (UTC)

Thanks for your quick reply. Can we compromise on replacing the content under Big Omega notation with the following text?

"The statement $f(x)=\Omega (g(x))$ has two distinct meanings in mathematics. In analytic number theory, it means that $f$ is not dominated by $g$ . In computational complexity theory, it means that $g$ is an asymptotic lower bound for $f$ .

Chronologically, the first definition was introduced by Hardy-Littlewood in 1914 and is used in analytic number theory. The second definition was proposed in 1976 by Donald Knuth.

We now introduce the two definitions."

After this introduction, the The Hardy–Littlewood definition heading would follow. No further changes would be made. Negrulio (talk) 02:03, 21 December 2016 (UTC)

After the displayed

f(x)=\Omega (g(x))

, the precisions

"…

(x\rightarrow a),

where a is some real number, ∞, or −∞, where f and g are real functions defined in a neighbourhood of a, and where g is positive in this neighborhood."

should however not be removed; otherwise neither of the explanations "f is not dominated by g" and "g is an asymptotic lower bound for f" is correct (or makes sense). Moreover both mentioned explanations lack precision: one should rather write "f is not dominated by g asymptotically" and "Cg is an asymptotic lower bound for f for some positive constant C". Sapphorain (talk) 08:46, 21 December 2016 (UTC)

OK. How about,

"Intuitively, the statement $f(x)=\Omega (g(x))$ has two distinct meanings in mathematics. In analytic number theory, it means that $f$ is not dominated by $g$ asymptotically. In computational complexity theory, it means that $g$ is an asymptotic lower bound for $f$ .

"Formally, the statement should be written $f(x)=\Omega (g(x))(x\rightarrow a)$ , where $a$ is some real number, ∞, or −∞, where $f$ and $g$ are real functions defined in a neighbourhood of $a$ , and where $g$ is positive in this neighborhood.

"In analytic number theory, the statement's definition is $\exists k>0\;\forall n_{0}\;\exists n>n_{0}\;f(n)\geq k\cdot g(n)$ , whereas in computational complexity theory it is $\exists k>0\;\exists n_{0}\;\forall n>n_{0}\;f(n)\geq k\cdot g(n)$ .

Chronologically, the first definition was introduced by Hardy-Littlewood in 1914 and is used in analytic number theory. The second definition was proposed in 1976 by Donald Knuth.

We now introduce the two definitions." Negrulio (talk) 15:41, 25 December 2016 (UTC)

In all this I see only one possible improvement to the introduction as it is now, which would be to add after the present version the last paragraph: "In analytic number theory, it means that

|f|

is not dominated by

g

asymptotically. In computational complexity theory, it means that

Cg

is an asymptotic lower bound for

f

, for some positive constant C.

(I see no reason to first give an incomplete and partly false statement; there is nothing "intuitive" about an incomplete and false statement (nor about the fact that this statement has two meanings in mathematics). The third paragraph is confusing where it is as it addresses only the special case where the parameter a is +infinity; it is also redundant, as it is contained anyway just below in the table "Family of Bachmann-Landau notations"; also, it is much clearer to express these definitions directly by using the notions of limsup and liminf and (for the Knuth definition) with reference to the O-notation, as it is also done just below. The last sentence is totally useless). Sapphorain (talk) 22:14, 26 December 2016 (UTC)

Improper use of subscripted Omega symbols

The symbols (Omega_H) and (Omega_K) have only been used in one single article (Vitanyi-Meertens), in which the authors essentially disapprove of the various Knuth definitions of asymptotic symbols, and approve the Hardy-Littlewood definition of Omega: they thus propose to change the definitions of the other Knuth symbols accordingly. (It is of course needless to point it, but let's point it anyway: this proposition has never been adopted by anybody). Moreover the subscripts K and H are used there only in order to label the two different definitions of the various symbols, similarly as a formula label and set in parentheses on one side of the formula or assertion they label. These subscripts have otherwise never been used in the literature, and certainly not in place of the symbol Omega in one of its two usual meanings. It is thus not proper, even for clarity purposes, for a wikipedia article to replace the symbol Omega by subscripted versions which are never used elsewhere. Sapphorain (talk) 22:47, 14 March 2017 (UTC)

Aha, I see we were typing at the same time :). I agree about this. --JBL (talk) 22:57, 14 March 2017 (UTC)

Edits to table

Mathnerd314159 has recent made two edits to the big table in the section Family of Bachmann–Landau notations; I reverted the first, and the second was responsive to my edit summary. However, I still have some qualms about the edit. In particular:

I rather like the informal definitions, which take the (useless and incomprehensible) formal definitions and present them in a few words. (How strongly do I feel that they are better than the limit definitions that have been added instead? Not terribly.)
The citation-dump in the table title sections is silly and not very helpful for purposes of verification.
I do not find the explanation "from smallest to largest" helpful at all; am I correct that it is not supported by the citation? The analogy to orderings on the real line is ok, except for (trivial) it duplicates one symbol and (not trivial) the symbol ≈ has no fixed meaning.
Finally, and most importantly, I object to the invented notations $\Omega _{K}$ and $\Omega _{H}$ . No one uses such notations, and it is not good practice to invent new notations for Wikipedia articles.

I am happy to discuss ways to resolve these issues. (I'll go take care of the trivial one right now.) --JBL (talk) 22:52, 14 March 2017 (UTC)

(1) Can you explain more why "for some positive k" is clearer than "∃ k > 0" or"for every fixed positive number k" is clearer than "∀ k>0"? How do you feel about

\exists k\in \mathbb {R} _{>0}

? (following Positive real numbers) (2) The citation dump is because horizontal space is limited, I had them per-formula before but there's not much room there. (Maybe that wasn't the issue with the table though?) (4) As Sapphorain mentioned, the notations

\Omega _{K}

and

\Omega _{H}

are from the paper "Big Omega versus the wild functions", page 6. I don't feel strongly about them; it does seem like the notation is limited to that section of that paper. (3) The paper also has a line diagram ordering the various notations, which is where I created the phrase "smallest to largest" (i.e., left to right on said diagram). <, = >, etc. are discussed in several places in the paper. ~ and ≈ aren't in there but Knuth mentions that ~ is a subset of Big Theta, so I think the approximation notation is reasonable (≈ is usually used for a "loose sense" of equality; technically I guess ≅ might be more appropriate, as on page 2 Prinsheim is said to have used it for ~, but whatever). --Mathnerd314159 (talk) 23:37, 14 March 2017 (UTC)

Non-sequitur

In response to this edit summary by Sapphorain: the citation is to an article from 1951; presumably 25 = 1976 - 1951. Since Hardy and Littlewood were using their definition in 1914, this is incredibly silly. Since the citation predates Knuth, the cited article can't be addressing his claim, so it can't support "however" anything. This is just some editor writing in their own personal rebuttal to Knuth. But it is also a nonsequitur in that it does not address the substance of Knuth's claims. I would like to re-remove it. --JBL (talk) 23:09, 14 March 2017 (UTC)

The citation is not to some article, but to Titchmarsh's textbook on the Riemann zeta function, published in 1951, and which is in the private library of every mathematician working in analytic number theory. Titchmarsh makes frequent use in it of the symbol Omega, and this just proves that Knuth was wrong in his assertion. That's all. This is not non sequitur, and this is not silly: it deserves to be mentioned. Sapphorain (talk) 23:19, 14 March 2017 (UTC)

If you replace the word "article" in my post with "book" it does not change the substance, which you have not really responded to. Knuth's statement is 100% consistent with "this notation is used in a textbook in 1951." Indeed, it is consistent even with "this notation is commonly understood by analytic number theorists." If you would like to attach a WP policy to the words "silly nonsequitur," it is WP:SYNTH -- you will notice that it is not possible to rewrite this sentence in a way that would pass synth because there is no secondary source here. --JBL (talk) 23:32, 14 March 2017 (UTC)

I just don't understand what you say. Knuth's assertion that "their definition is by no means in wide use" is clearly contradicted by the content of Titchmarsh's book. I think you are commenting on a subject you are not familiar with. Sapphorain (talk) 23:48, 14 March 2017 (UTC)

That something is used among analytic number theorists is completely compatible with it not being "in wide use." The number of analytic number theorists is very small, in absolute terms and even as a fraction of the mathematical community. But even if I accepted 100% the position that Knuth was completely in the wrong as a substantive matter, the sentence is both extremely silly as written (because 1914 < 1951) and a clear case of synth. --JBL (talk) 23:56, 14 March 2017 (UTC)

JBL is applying the rules correctly. We can't add our own disproofs of sources. In order to get this into the article we need (1) a reliable source making the point you want to make, (2) an argument that the point satisfies WP:DUE and other content requirements. McKay (talk) 03:14, 15 March 2017 (UTC)

I rephrased the mention of Titchmarsh's book. Sapphorain (talk) 07:31, 15 March 2017 (UTC)

Example graph

I found the "Example of Big O notation" graph (https://en.wikipedia.org/wiki/File:Big-O-notation.png) to be confusing and not helpful. First, it isn't obvious from the graph whether which curve is the algorithm's performance and which curve is the Big O complexity. It's explained in the corresponding text, but using a formal notation that the article hasn't yet covered. Second, I think it would be better to choose a _common_ Big O complexity for $g(x)$ , such as $1$ , $log(n)$ , $n$ , $n^{2}$ , or $2^{n}$ . Yes, $g(x)$ can have peaks and valleys so long as it meets the formal criteria, but in practice we'll usually choose to call an algorithm, say, $O(n^{2})$ , and not $O(n^{2}-9n+4)$ even if it would give a slightly tighter bound. Third, because both $f(x)$ and $g(x)$ are so wobbly, it's hard to have a sense of how they behave outside the graph's small range. The text asserts that $f(x)\leq cg(x)$ whenever $x\geq x_{0}$ , but the graph doesn't give me much confidence that that's actually the case.

I think the graph would be improved by choosing an $f(x)$ that's $O(n^{2})$ , graphing it along with $y=c\times n^{2}$ for some $c$ , and saying “ $f(x)\in O(n^{2})$ because for $x\geq x_{0}$ , $f(x)\leq c\times n^{2}$ .”

--dzhim (talk) 19:24, 14 August 2017 (UTC)

I agree with your third point. We should use functions for which we can give formal definitions in tbe caption, so the reader can check their behavior outside the shown range by himself, if she wants. If I understood you right, this would also resolve your 2nd point.

Concerning your 1st point, note that the Big-O formalism is defined independently of algorithms and their complexity (although this is its main application). So in the picture, neither f nor g need to be related to any algorithm. - Again I agree with you that the formal definition details might better be omitted from the caption. Instead, we could write

"Example functions f(x)=... and g(x)=... . Everywhere beyond some x (marked in the picture), g grows faster than f; this can be expressed in Big-O notation as f=O(g)."

After the formal definition is given in the article, we could refer back to the image and give the x0 and c for its functions. At that point, they should be understandable. - Jochen Burghardt (talk) 20:38, 14 August 2017 (UTC)

⊂

Near the end of the section titled Little-o notation the right facing "U" is used as follows:

o ( f ) ⊂ O ( f ) {\displaystyle o(f)\subset O(f)} o(f)\subset O(f) (and thus the above properties apply with most combinations of o and O).

However the Set (mathematics) page under Subsets says,

"The expressions A ⊂ B and B ⊃ A are used differently by different authors; some authors use them to mean the same as A ⊆ B (respectively B ⊇ A), whereas others use them to mean the same as A ⊊ B (respectively B ⊋ A)."

So what exactly does ⊂ mean as used in this page?

Thanks.

29Flavors (talk) 21:01, 12 December 2017 (UTC)

Good point 29Flavors. The containment is strict except in some uninteresting degenerate cases; but the use of this notation is nonstandard and not helpful. I've now cleaned up the little-o section substantially. (I hope you will forgive me for replacing the ref tags with a direct wikilink in your comment.) --JBL (talk) 22:34, 12 December 2017 (UTC)

Article organization?

Is there a reason that this article is titled "Big O notation" when it is actually about asymptotic notation in general? It seems more logical to title the page "Bachmann–Landau notation" or "Asymptotic notation" instead and make "Big O notation" redirect to this article. The article would have to be reworked considerably to institute this change, but it would be a much clearer presentation of ideas. Joshua Jurgensmeier (talk) 21:34, 15 December 2017 (UTC)

Hardy-Littlewood notation

> Hence $f(x)=\Omega _{R}(g(x))$ is the negation of $f(x)<o(g(x))$ , and $f(x)=\Omega _{L}(g(x))$ is the negation of $f(x)>o(g(x))$ .

I have never seen the notation $f(x)<o(g(x))$ and $f(x)>o(g(x))$ (which are not introduced on this page either), and for the life of me cannot figure out, based on the definition of $o()$ what they mean and why anyone would ever use them. Is there a reason for this notation to appear at all? — Preceding unsigned comment added by Ceacy (talk • contribs) 17:45, 31 March 2018 (UTC)

Well, there is a first time for everything. It is the shortest way to define properly

f(x)=\Omega _{R}(g(x))

and

f(x)=\Omega _{L}(g(x))

in terms of the previously defined o symbol. If one is not convinced or satisfied, one can stick to the definitions provided just above. Sapphorain (talk) 20:52, 31 March 2018 (UTC)

I don't understand what <o and >o are supposed to mean. o is not a function, the result of which can be compared to something else; it is only a part of a larger piece of notation that also includes an equal sign. So f(n)=o(g(n)) is meaningful (it uses the full =o notation); f(n)<o(g(n)) is not (because only =o has been defined, not <o). Perhaps rather than <o and >o you meant =o and =ω? —David Eppstein (talk) 21:55, 31 March 2018 (UTC)

f(x)<o(g(x))

means that for every

\epsilon >0

,

f(x)<\epsilon g(x)

as soon as

x

is sufficiently large. Keeping in mind that the test function

g

is by convention always positive, negating this is an alternate short way of expressing that

f(x)=\Omega _{R}(g(x))

. Similarly

f(x)>o(g(x))

means that for every

\epsilon >0

,

f(x)>-\epsilon g(x)

as soon as

x

is sufficiently large, and negating this corresponds to the assertion

f(x)=\Omega _{L}(g(x))

. This is the classical intuitive way of presenting the notions now denoted by

\Omega _{\pm }

, and this is the way it was taught to me when I took a first introductory course in analytic number theory in 1975. It might not be completely rigorous, but it has the merit of being short and clear (or so I thought...). In any case it is only a comment to the precise - but not so short and clear - definitions which are just above. Sapphorain (talk) 22:50, 31 March 2018 (UTC)

Regardless of whether it is the "right" way or not (I am not an analytic number theorist, and in theoretical computer science the

<o(\cdot )

and

>o(\cdot )

notations are completely absent): I'd argue that then they need to be defined before they are used. Their definition does not follow from the definition of little-o in the page, and they are never actually introduced nor mentioned before that specific section. Ceacy (talk) 19:15, 1 April 2018 (UTC)

Then suppress the whole thing, I couldn't care less. I don't see any point to argue on that matter any further. Sapphorain (talk) 19:58, 1 April 2018 (UTC)

Erm. Alright. Ceacy (talk) 15:41, 2 April 2018 (UTC)

Subscripted by a parameter

A notation I have seen both in number theory and in the theory of parameterized algorithms is $O_{k}(\dots )$ , used to indicate that the constant of proportionality depends on a parameter $k$ (or more than one parameter, separated by commas). I.e., if the usual notation $f(x)=O(g(x))$ really means something like $\exists C,N>0:\forall x>N:f(x)\leq Cg(x)$ , then this subscripted notation indicates one more level of quantification: $\forall k:\exists C,N>0:\forall x>N:f(x)\leq Cg(x)$ . But I don't see this in our article. Should it be added? (The parameterized algorithms community also uses $O^{*}$ meaning to ignore polynomial factors of the input size and look only at the dependence on the parameter but that may be too specialized to add.) —David Eppstein (talk) 15:20, 27 September 2018 (UTC)

I've seen and I use this notation in talks and lectures (in number theory). But I can't remember having seen it in any published material. Do you? If you do maybe it can be mentioned in the article. Sapphorain (talk) 16:02, 27 September 2018 (UTC)

There's a published example for parameterized algorithm analysis in doi:10.4230/LIPIcs.SWAT.2018.15 (see Thm 2. p. 15:3; this particular example chosen only because I had convenient access to the TeX source making it easier to find). I'm not as familiar with the published number theory literature but I've definitely also seen it in talks in that context so I imagine it also appears in publications. —David Eppstein (talk) 22:24, 27 September 2018 (UTC)

I wasn’t attentive enough: this has been under my eyes all the time! G. Tenenbaum uses the notation (not very often) in « Introduction à la théorie analytique et probabiliste des nombres » (for instance Paris, Belin 2008). He even points in the Notation section at the beginning of the book:

« Nous utilisons indifféremment la notation de Landau f=O(g) et celle de Vinogradov f<<g pour signifier que |f]≤C|g| pour une constante convenable C, qui peut être absolue ou dépendre de certains paramètres — auquel cas la dépendance pourra être indiquée en indice ».

(« We indifferently use Landau’s and Vinogradov’s notations, in order to indicate |f]≤C|g| for some appropriate constant C, that can be absolute or depend on some parameters —in which case the dependency can be indicated with an index » ). Sapphorain (talk) 09:56, 28 September 2018 (UTC)

Removed polylogarithmic

Reinstated my VERY bad. Missed a bracket — Preceding unsigned comment added by 129.78.208.4 (talk) 22:17, 29 November 2006 (UTC)

Inconsistency in definition of the big Omega notation

With the current definitions in the table in the section "Family of Bachmann–Landau notations", we have

f(n)=\Omega (g(n))\Leftrightarrow \exists k>0\;\exists n_{0}\;\forall n>n_{0}\;f(n)\geq k\cdot g(n)

So that

n=\Omega (-n^{2})

This is probably a mistake, which (I think) can be fixed by taking the absolute value of g(n) on the right side of the equation.

Having $n=\Omega (-n^{2})$ leads to an inconsistency: According to the section "The Knuth definition" we have

f(x)=\Omega (g(x))\Leftrightarrow g(x)=O(f(x))

And so we should have

-n^{2}=O(n)

Which is obviously not consistent with the definition of the big O notation in the table in the section "Family of Bachmann–Landau notations":

f(n)=O(g(n))\Leftrightarrow \exists k>0\;\exists n_{0}\;\forall n>n_{0}\;|f(n)|\leq k\cdot g(n)

I understand that this is a minor problem since it is not very usual to take the big O or big Omega of negative functions, but I still think the definitions should be consistent in order to prevent confusion.

I'd like to hear your thoughts (or corrections, if I'm wrong in any of my observations). — Preceding unsigned comment added by 83.85.150.155 (talk) 20:22, 1 December 2018 (UTC)

The test function g is assumed to be positive. This is clearly stated at the beginning of the page. Sapphorain (talk) 21:39, 1 December 2018 (UTC)

"Tight" bounds?

The article refers to terms "tight" and "tighter", but these terms are never defined! Alas, other pages (e.g. "bin packing") refer to this page as the one giving a formal definition of this term; moreover, "Asymptotically tight bound" redirects here. Yes, the term is intuitively clear, but a math-related page should have clear definitions.

I think a short section giving a formal definition of the term "tight bound" and referring to Theta-notation is needed (e.g. as a subsection of section "8. Related asymptotic notations"), and once such a section is created the redirection from "Asymptotically tight bound" should link directly there.— Preceding unsigned comment added by 128.97.82.220 (talk) 20:21, 10 September 2009 (UTC)

I don't feel a definition fits here, but I added a wikilink the first time the term is used. LudwikSzymonJaniuk (talk) 08:36, 9 April 2019 (UTC)

Confusion in using O in small data structures

We used to say heapsort like heap uses O(log n) for both Insert and ExtractMin. When we use it in an algorithm analysis we bound the maximal size of the heap during the course of the algorithm, use it in the bound and multiply it by the number of times we have used it. In the extreme case (Insert, ExtractMin) operation pairs repeated M times this calculation gives O(0) total time. (What is obviously wrong.) This is why we implicitly assume the call of the method costs at least O(1), so we either have to say for example the heapsort operation complexities are O(1+log n) or we have to say there is a convention to implicitly add 1+. Hippo.69 (talk) 09:32, 10 December 2020 (UTC)

I have moved your comment to the bottom of the page, per standard Wikipedia talk-page conventions. Big O (etc.) is an asymptotic notation: it applies a restriction for sufficiently large inputs only. --JBL (talk) 12:23, 10 December 2020 (UTC)

I disagree with JBL, and User:Hippo.69 is right. It is of course true that O is an asymptotic notation, and it is true that it applies a restriction for sufficiently large inputs only. But this « sufficiently large input » (the x_0 of the formal definition) can very well and is often explicitly defined. From the above I gather it is the case in the theory of algorithms, but it is also the case in analytic number theory. When the test function is log x (natural log) x_0 is very often chosen to be 2, in order to avoid the obvious problem described above. But not necessarily. I have right here on my desk three papers constantly using the estimate O( log(ex)), so that it is valid for every x≥1 (as log(ex)= log x+1, this corresponds exactly to what User:Hippo.69 is talking about. And the notation is so common in number theory that the authors don’t even take anymore the trouble of stating that their estimates are valid for all x≥1.--Sapphorain (talk) 16:02, 10 December 2020 (UTC)

Sorry, but I don't understand Hippo.69's problem description. I guess, worst-case time complexity is meant, right? Which code is analyzed to have complexity O(0), and why? - Jochen Burghardt (talk) 23:23, 10 December 2020 (UTC)

I don't understand the description either. But it is clear to me that if for instance an algorithm requires O(log n) steps in order to treat an n-digits number when n is large, then if one wants the estimate to be valid for a 1-digit number as well, some expression like O(log n+1) or O(log(en)) has to be used.--Sapphorain (talk) 08:39, 11 December 2020 (UTC)

I think maybe they are referring to the issue that bounding computational complexity by

\log n

runs into technical difficulties if you allow

n=0

, for instance as the time to perform certain operations on an empty heap. So it would not be correct to say that the time for all

n

is at most

C\log n

for some constant

C

, because that statement involves undefined quantities for

n=0

. But this problem goes away when you expand the definition of

O

-notation and notice that it does not involve all

n

, but rather all sufficiently large values of

n

. When we write

f(n)=O(g(n))

, it is ok for

g

to be undefined (or unreasonably small) for some of the values on which

f

is defined, as long as

g

is defined and bounds

f

for all sufficiently large values. So it is valid to write that the time is

O(\log n)

, without special cases or 1+ faffery, even though the time is defined for

n=0

and

\log n

is not. Log's lack of definition there merely implies that 0 is not sufficiently large, which should not be surprising. And similarly, Sapphorain's example merely tells us that 1 is also not sufficiently large. —David Eppstein (talk) 08:46, 11 December 2020 (UTC)

If

g(n)

and

f(n)

are defined, and

g(n)>0

, for

n\geq n_{1}

, then

f(n)=O(g(n))

means that for some constant

C

we have

|f(n)|\leq Cg(n)

for

n\geq n_{1}

. Even though the definition only gives an inequality for

n\geq n_{0}

with

n_{0}

unspecified, the constant

C

can always be increased to cover the finite number of cases between

n_{1}

and

n_{0}

. McKay (talk) 01:55, 15 December 2020 (UTC)

Yes but the objection (appears to be) about the situation in which

g(n)=0

for small n. And in this case increasing the constant doesn't work. (But also it's not necessary, because of the cutoff.) --JBL (talk) 14:19, 22 December 2020 (UTC)

Formatting of the O

Astro-Tom-ical (talk · contribs) has been adding \mathcal LaTeX formatting to all the O's here. I believe this is an uncommon affectation. I checked two standard algorithms texts, CLRS Introduction to Algorithms and Goodrich&Tamassia; both just use a normal italic capital O. I have seen script O's in research papers (and probably occasionally used them myself) but in my experience they are less commonly used than just a normal italic capital O. I spot-checked some analytic number theory sources (as the other big user of O-notation and a source of possible divergence in notation) but they used the normal italic capital O as well. I think we should go with the common usage here rather than the script font, but I welcome discussion of this. See also Talk:Big O notation/Archive 2#calligraphic O from 2011 where the previous consensus was to avoid script O's. If we are going to revisit this, then in the meantime per WP:BRD I think the O's should stay in their former formatting, without script fonts. —David Eppstein (talk) 07:09, 1 August 2020 (UTC)

I agree that the standard notation is

O(n)

and not

{\mathcal {O}}(n)

. I worry that some people are being misled by the logo at the upper right of the page, misled into thinking that it should be

{\mathcal {O}}(n)

. Something should be done about that logo. Ebony Jackson (talk) 23:04, 12 December 2020 (UTC)

(I'm used to

{\mathcal {O}}

, but never explicitly checked textbooks or research articles about this issue, so I guess, David Eppstein and Ebony Jackson are right.) Nevertheless, I suggest to mention the notation

{\mathcal {O}}

in the section "Typesetting", as a deviating one often seen. Reasons are: (1) Readers may have seen

{\mathcal {O}}

in some book and be trying to find out in wikipedia what it is supposed to mean; they should find an explanation here. (2) Without any discussion of variants, the section doesn't make sense. If everyone would use

O

, why state that Knuth does so, too? - Jochen Burghardt (talk) 11:51, 13 December 2020 (UTC)

OK, I understand your points. We can mention it for now, though I will reword slightly so as not to suggest that this is the standard thing to do when using LaTeX. By the way, it would be good if someone could find some references to the use of

{\mathcal {O}}

in actual publications by top experts in the field. The link to the blog is not convincing. Ebony Jackson (talk) 17:12, 13 December 2020 (UTC)

Your change to "some authors" is fine. I couldn't find the calligraphic notation in a textbook, but a search for "mathcal O computational complexity" found a couple of peer reviewed articles: [1] [2] [3] (the last one is open-access, and may be an unreviewed technical report). It is hard to search for that, since the string "mathcal" usually doesn't appear in Pdf documents, and the TeX sources are usually not found in the web. - Jochen Burghardt (talk) 19:17, 13 December 2020 (UTC)

I don't have it with me, but I think

{\mathcal {O}}

is used in Concrete Mathematics. McKay (talk) 01:40, 15 December 2020 (UTC)

I found a pdf in the web (Ronald L. Graham and Donald E. Knuth and Oren Patashnik (1994). Concrete Mathematics: A Foundation for Computer Science (2nd ed.). Addison-Wesley Publishing Company.), it uses a non-calligraphic, italic O (sect.9.2, p.443). - Jochen Burghardt (talk) 18:18, 17 December 2020 (UTC)

The old template image File:Big-o-approx-logo.png (showing "

{\mathcal {O}},\sim

")was introduced by Krauss in Apr 2016; maybe (s)he can provide a textbook source for that notation? As for the new template image File:BigOnotationfunctionapprox popular.svg (showing "

O(),\sim

"), wouldn't it be better to include an example expression between the parantheses, such as "

O(n),\sim

" ? - Jochen Burghardt (talk) 10:45, 22 December 2020 (UTC)

As someone who regularly uses

{\mathcal {O}}

in their own publications, I must admit that I was a little bit puzzled by this discussion. While I would not feel confident to make a statement on Informatics, I'd dare say that this notation, while not universal, is definitely not uncommon in Numerics. A quick check through my PDFs e.g. returned the following textbooks where it is employed:

Logg et al., "The FEniCS Book"
Quarteroni et al., "Scientific Computing with Matlab and Octave"
Turner et al., "Applied Scientific Computing With Python"
Braess, "Finite Elements"
Hackbusch, "Theorie und Numerik elliptischer Differentialgleichungen"

InfoBroker2020 (talk) 15:05, 23 April 2021 (UTC)

Edit war over math markup

I just rolled the page all the way back to this revision dated June 22. Can we please not fight over how big function application parentheses should be? Elwoz (talk) 23:31, 26 July 2021 (UTC)

Thanks. I have restored two edits that I made in the interim (that have nothing to do with the parentheses nonsense). --JBL (talk) 00:02, 27 July 2021 (UTC)

= vs. ϵ and footnote 11

If Knuth really said that, he was having an off day. Mathematicians, like the rest of us, would indeed use the English word "is" that way, but the = relation is the paradigmatic equivalence relation: it's reflexive, symmetric, and transitive. A mathematician might write something like

∃ man | man = Aristotle

(although multi-letter variable names are unusual), but never

man = Aristotle

unadorned. And of course the most natural and common way to express "Aristotle is a man" formally would be

Aristotle ϵ men

or perhaps

is_a(Aristotle, man)

The ϵ and is_a relations are definitely not equivalence relations.

Sorry, I know practically everyone abuses the notation, and it's fine for the article to say so, but I think it's not fine for it to pretend that there's some justification for it, especially not by an appeal to the authority of mathematicians. Briankharvey (talk) 01:00, 12 May 2021 (UTC)

Not that I wish to endorse the inane substance of your comment (the symbol "=" is used for many things, not all of them equivalence relations, as in the relevant example f = O(g)), but the Knuth quote is not about the question being considered in that paragraph (indeed Knuth does not address the question in the cited reference) and is redundant with the discussion that precedes it, so I have removed it. --JBL (talk) 02:25, 13 May 2021 (UTC)

Seeing this discussion only now. The Knuth quote was not about the equals sign everywhere, but about the equals sign in the context of O-notation and the like. Here's what he says in the 1998 ocalc letter:

I would begin my ideal calculus course by introducing a simpler “A notation,” which means “absolutely at most.” For example, A(2) stands for a quantity whose absolute value is less than or equal to 2. […] I would of course explain that the equality sign is not symmetric with respect to such notations; we have 3=A(5) and 4=A(5) but not 3=4, nor can we say that A(5)=4. We can, however, say that A(0)=0. As de Bruijn points out in [1, 1.2], mathematicians customarily use the = sign as they use the word “is” in English: Aristotle is a man, but a man isn’t necessarily Aristotle. […] Once students have caught on to the idea of A notation, they are ready for O notation, which is even less specific.

There's a similar quote in Concrete Mathematics, written about ten years earlier:

If `=' really means `⊆', why don't we use `⊆' instead of abusing the equals sign? There are four reasons. […] Third, tradition. We often read `=' as the word `is'. For instance we verbalize the formula $H_{n}=O(\log n)$ by saying "H sub n is Big Oh of log n." And in English, this `is' is one-way. We say that a bird is an animal, but we don't say that an animal is a bird; “animal’ is a crudification of “bird.’

Hope this makes sense, Shreevatsa (talk) 14:36, 5 September 2021 (UTC)

Definition

The restriction that the function $g$ should be strictly positive for large $x$ , in the definitions of $O(g)$ and $o(g)$ , say for $x\to +\infty$ , is not heavy, yet a bit annoying (for in computations it forces to unnecessary checks on $g$ ) and in fact not needed (the only use of the restriction is to define $f=o(g)$ as $f/g\to 0$ , and $f=O(g)$ analogously). A better definition (and I think the standard one) is as follows. Define $o(1)$ and $O(1)$ with the exact meaning as in the article; then by definition $o(g):=g\cdot o(1)$ and $O(g):=g\cdot O(1)$ . Is there any objection to this enlarged definition? pm a 13:20, 11 August 2021 (UTC)

Yes, I object! This proposition doesn’t make sense (and a fortiori is certainly not « standard »). In the standard notation in analysis and number theory, the symbols

o(1)

and

O(1)

are not defined and have no meaning just by themselves. Only the properties

f(x)=o(1)

and

f(x)=O(1)

are defined (in which the symbol

=

has very little to do with the standard equality sign and has no meaning just by itself either). There is undoubtedly a way to extend the usual definition to test functions

g

that vanish at some points, but in order to cite it in the article one requires a reliable source referring to it. --Sapphorain (talk) 15:53, 11 August 2021 (UTC)

”This proposition (...)“ : it refers to what you wrote after, right? :) Of course

O(1)

(for

x\to x_{0}

) formally denote a set: the set of all functions in the given domain that are locally bounded at

x_{0}

, and yes, “

=

“ in “

f=O(1)

“ is not a standard equality sign but rather stands for “

\in

”. So what? By

f(x)O(1)

, we denote the set of functions of the form

f(x)g(x)

with

g

in

O(1)

, or equivalently such that there exist a nbd

U

of

x_{0}

and a constant

M

such that

|f(x)\leq M|g(x)|

for all

x\in U

, which is exactly the use you want (dominate

f

by

g

). We also use the notation

O(f(x))

in place of

f(x)O(1)

and

o(f)

in place of

f(x)o(1)

. No need to write it as a quotient, since you then have to make needless restrictions on

g.

I don’t know who introduced this silly quotient. pm a 21:53, 17 September 2021 (UTC)

If you think

O(\dots )

denotes a set, how do you interpret formulas like

\ln(n!)=n\ln n-n+O(\ln n)

? —David Eppstein (talk) 22:43, 17 September 2021 (UTC)

In the usual way: If H is a linear subspace in a vector space and v a vector, then v+H is the affine subspace we get by adding v to every element of H. The “it’s a set” perspective is I think consistent with std ways one writes about rings, modules, & algebras. (Of course in the absence of any RS discussion this has nothing to do with what the article should say.) —JBL (talk) 11:19, 18 September 2021 (UTC)

It's not hard to work out what set the larger expression should refer to, but what is unclear from the "it's a set" attempts at definition are what arithmetic operations are allowed on sets in this way. Vector spaces are the wrong type of data to be using here, for instance, because O-notation also needs to appear in exponents. —David Eppstein (talk) 18:03, 18 September 2021 (UTC)

The space of functions from some domain to the reals is certainly an algebra over a field and that gets you both the multiplicative notation PMajer wants to use and the additive notation you asked about (but not exponentiation). I doubt that "algebras over

\mathbb {R}

with exponentiation" (or composition, or whatever) are a heavily studied family of algebraic objects, but I am confident that a definition could be made, and given such a definition it would be entirely natural to write

2^{S}=\{2^{s}\colon s\in S\}

when S is a subset of such an object. The problem is not that it's wrong to think of

O(\ldots )

as a set (perhaps in some algebraic structure), it's that the relevant algebraic structures aren't important/interesting. --JBL (talk) 19:15, 18 September 2021 (UTC)

The difficulty with trying to do things this way is that you need to continue defining set operations ad infinitum and ad nauseam for every monotonic function people might want to apply to an O-notation quantity; for instance, it is not uncommon to see notations like

\log ^{O(1)}n

, so it is not just exponentiation with a constant base that needs to be handled here. I think it is better and clearer to understand [something = something else containing an O] as a syntactical shorthand for [

\exists C,N:\forall n>N:

something ≤ something else with the O replaced by C] (for the version of O where the limit is at infinity) rather than as a notation in which the O is meaningful by itself divorced from its surrounding expression. This point of view also makes clearer where the problems are with notations like

O(m+n\log n)

(used frequently by computer scientists with no idea that there might be a problem with this notation): if you just think "it means some kind of set of bivariate functions" it looks like an easy step from single-step O-notation, but if you try to translate it into explicit quantifiers the questions about whether it only applies to situations where both variables are large, or what happens when one stays bounded and the other doesn't, become more apparent. —David Eppstein (talk) 20:48, 18 September 2021 (UTC)

Note that the issue of the thread was originally not about the interpretation of

O(g)

and

o(g)

as sets. I only quoted the "set theoretic definition", because I think it was alluded to by user Sapphorain (precisely, in their sentence: (..) in which $=$ has very little to do with the standard equality sign ), as a (not very relevant) objection to my original post. Rather, the point of my OP was: is it necessary to require the function $g$ to be non-vanishing in a dotted nbd of $x_{0}$ ? Of course it is not: it becomes an unnecessary restriction, if we adopt the (customary) definition: "

f=O(g)

as

x\to x_{0}

means there exists a nbd

U

of

x_{0}

and a constant

M

such that

|f(x)|\leq M|g(x)|

for all

x\in U

"). So what are pro and contra of the two definitions ("restricted" vs "generalized")? As I see it, if you use

O(g)

mainly to describe growth rates or speed of approximatins, you do not actually need the generalized definition:

g

will always be some special comparison function, like

x^{n},\log x,\log \log x,e^{x},etc.

, always strictly positive and usually even monotone in a (dotted) nbd of

x_{0}.

But if you use the O&o notation within computations, then "

g

non-vanishing" appears as a really unnecessary and annoying restriction; most rules of calculus of Landau's notation do not need it, and you'd like to write "f=O(g)" in a computation without having to check whether

g

is locally strictly positive (like in "

{\big (}f=O(g)\,\&\,g=o(h){\big )}\implies f=o(h)

", that is the rule "

O(o)=o

", etc). The only reason to require

g

locally non-vanishing, is to state equivalently "f=O(g)" as "the ratio

f(x)/g(x)

is locally bounded at

x_{0}

", which is believed to be somehow easier to understand for a freshman. Yet in my experience the best (and more didactically clear) approach is: first, define

o(1)

(infinitesimal) and

O(1)

(locally bounded) and start using it; then at some moment introduce

O(g)

and

o(g)

as alternative symbols/forms for

g\cdot O(1)

and

g\cdot o(1)

respectively, of course without restrictions on

g

. pm a 15:44, 30 September 2021 (UTC)

As to the interpretation of

O(g)

and

o(g)

as sets (of which I’m not a fan, for this article) I'd refer to a nice discussion in the book Concrete Mathematics, where they give a very clear explanation of Landau's notation --in particular to its asymmetry, in terms of set inclusions: according to it,

f=O(g)

formally stands for

f\in O(g)

, and

O(f)=O(g)

stands for

O(f)\subset O(g)

;

f(O(g))

is the image of the set

O(g)

through

f,

etc. In other words the verb to be in "f is

O(g)

is like when they say "Socrates is a biped" :). But, of course, we may as well explain it saying that, e.g.

o(1)

stands for some infinitesimal quantity that we can't/don't want to make explicit, so that in any new occurrence of

o(1)

we are possibly denoting a new different function, as in

o(1)+o(1)=o(1)

.pm a 15:44, 30 September 2021 (UTC)

Historical Precursors

Big-O notation did have numerous historical precursors going back to Sharaf al-Din al-Tusi and even, to a point, Archimedes. The idea that the behavior of two functions being extremely different from each other for small values but approaching each other as they approach infinity predates Calculus and even predates the conception of a "function." Should these historical precursors be discussed? CessnaMan1989 (talk) 15:46, 14 October 2021 (UTC)

Are there any reliable sources on this that make the connection? --JBL (talk) 18:32, 14 October 2021 (UTC)

Sure. In "Data Structures and Algorithms in Python" by Goodrich and others, they introduce asymptotic analysis by mentioning the precursors to Big-O notation(https://www.oreilly.com/library/view/data-structures-and/9781118290279/08_chap03.html#ch003-sec021). CessnaMan1989 (talk) 16:19, 15 October 2021 (UTC)

I am not able to view the whole chapter via that link, only a portion of the introductory paragraph. It does mention Archimedes, apparently as a framing device, but the idea that Archimedes' principle has any concrete connection (rather than a vague affinity, if you squint right) to big O notation is ridiculous. Information about historical precursors would be great, provided it is sourced to reliable sources -- and here a reliable source should be one that is likely to have been vetted for historical accuracy (e.g., a textbook on data structures in Python seems like a mediocre option at best). If you do have access to such sources, I encourage you to be bold and introduce content directly. --JBL (talk) 18:17, 15 October 2021 (UTC)

I'm referring to one of the footnotes that references Archimedes' other work, the work that is seen as a precursor to Calculus where he has notation for indicating that the characteristics of some shapes begin to resemble those of other shapes when the quantity of such characteristics are "exhausted" to infinity. This is described in other books too. I'll try to find another one that everybody can view. CessnaMan1989 (talk) 15:01, 16 October 2021 (UTC)

The connection between vague notions of limiting processes and this specific notation is too tenuous to make sense for this article. Maybe Limit (mathematics) would be a better place. Precursor notions that would make sense here are ones from asymptotic analysis where constant factors are considered unimportant. Probably this started sometime around the work of Euler and the Bernoullis, not really Archimedes (who was I think more interested in limiting processes for exact values rather than growth rates). —David Eppstein (talk) 18:48, 16 October 2021 (UTC)

The problem with placing it in the Limit article is that a Limit is fundamentally different than the limiting behavior of a function, albeit to the chagrin of many math students in highschool and college, including myself at one time. A limit is either a numerical value, infinity, or, I suppose, undefined. This is different. Here, we're talking about how people denoted that families of various mathematical objects would come to resemble other mathematical objects in certain zones, usually when extended to infinity. Archimedes's notation that he used to denote that the concept/form of the regular polygon approached that of the circle when the regular polygon had an infinite number of sides is a precursor. CessnaMan1989 (talk) 15:43, 17 October 2021 (UTC)

Not a precursor to O-notation. To the notion of limits used to define O-notation, maybe. —David Eppstein (talk) 16:38, 17 October 2021 (UTC)

I am inclined to agree with DE; Archimedes might have recognized some version of the modern notion of limit as related to his ideas, but he certainly would not have recognized the big O. Also, again, a textbook on data structures in Python seems like a mediocre source at best for historical claims, even moreso when we're talking about asides in footnotes. --JBL (talk) 20:43, 17 October 2021 (UTC)

@David Eppstein and JayBeeE You both seem to be saying the notations used to denote concepts that are precursors to other concepts aren't necessarily precursors themselves. While I agree in a some sense, I think they're functionally relevant. For example, I think the article on the Greek Alphabet should contain references to Linear B even though the writing systems aren't closely related, and it does. However, I know when I'm outvoted. However, what about the more direct precursors such as those of Euler that Bachmann himself references in "Die analytische Zahlentheorie"? CessnaMan1989 (talk) 21:03, 17 October 2021 (UTC)

It is possible in principle to imagine a situation where A is a precursor to B and B is a precursor to C and we talk about A in the article on C; but you would want to base that discussion on good sources that discuss A in the context of C. I think that instead of asking questions like your latest you should look at good sources (if you have any available) and either propose language based on them or add it directly to the article. (Do you notice a theme in my comments?) --JBL (talk) 02:05, 18 October 2021 (UTC)

I want to add the error notation for Euler's Method because Bachmann mentions it as a precursor in his own work, but I want to make sure I have a copy of the work in English first. CessnaMan1989 (talk) 20:13, 1 November 2021 (UTC)