Primary Number Theory

This note is a brief summary of important concepts in primary number theory which might be useful in computer science and quantum computation. Start writing from Nov 6 2018.

Primary Number TheoryFundamentalsDivideFundamental Theorem of ArithmeticFactorization of FactorialModular ArithmeticBasicGreatest Common DivisorRepresentation Theorem for the gcdEuclid's algorithmTheorem for the equivalence of gcdThe algorithm finding gcdComplexity analysisFinding multiplicative inverseFrom Chinese remainder theorem to Fermat's little theoremChinese remainder theoremFermat's little theoremEuler totient functionEuler's TheoremReduction of factoring to order-findingOrderThe First stepThe Second StepThe algorithm for factoringReduce order finding to factoringContinued FractionsNotationContinued fraction algorithmApproximation AccuracyIndefinite EquationLinear indefinite equation with one variableLinear indefinite equation with multiple variablesMore on equations in modular arithmeticSolve higher order equationsDefinition of quadratic residueModulo primeLegendre symbolPrimitive rootIndexReferences

Fundamentals

Divide

Definition: d divides n, written as , if there exits an integer k, s.t. .

Properties of divide:

  1. (Transitivity) If , , then .
  2. If , , then , where x and y are intergers.

Fundamental Theorem of Arithmetic

Theorem 1.1: Any integer a greater than 1 has a prime factorization of the form , where are distinct prime numbers. And such prime factorization is unique.

Proof: 1. It is easy to show the existence of such factorization by induction and the definition of composite number. 2. The uniqueness of such factorization can be easily shown by Euclid's lemma. And there is also direct argument by contradiction, see here.

Comment: (Complexity class of prime factorization problem) The prime factorization problem is neither known in P nor in NPC. It is conjectured this problem might be in NPI. There is an efficient algorithm for this problem in term of quantum computer, which is well-known Shor algorithm and hence the problem is in BQP.

Euclid's Lemma: If prime p, , then or at least one of them is true.

Proof: We prove this lemma without Bezout's identity. Say , since , there is integer m s.t. , namely . Since is coprime, then is the simplest fraction. And all other fractions of the same value take the form . That is to say, , namely . QED.

Factorization of Factorial

Definition: give the largest integer no more than a. eg. for .

Lemma 1.1: Suppose a and b two positive integers, count of integers t which and is .

Theorem 1.2: In the prime factorization of , the index of p is given by

Proof: Note only a few terms are nonzero, so the sum to infinite make sense. Imagine we factorize every integer from 2 to n, say among them there are numbers with factor p of power , then

where is just the count of numbers which can be divide by and no more than , which truns out as . QED.

Corollary 1.1: is an integer.

Proof: . Therefore we have

The above formula is just

Modular Arithmetic

Basic

The arithmetic of remainders, add, minus and times are easily defined. Such formulas are written with in the end. The inverse operation of multiplication: an effective division can sometimes defined, which we will elaborate in detail below.

properties of modular arithmetic:

Suppose , we have

Greatest Common Divisor

Definition: The greates common divisor of a and b, written as , is the greatest common elements from the divisor lists of a and b.

Comment: (Naive complexity) Based on the definition of gcd, the problem of finding gcd of two numbers is not easier than prime factorization. We will show clever algorithms finding gcd below which is in P.

Definition of coprime: a,b are said to be coprime, if .

Representation Theorem for the gcd

Theorem 2: is the least positive integer that can be written as , where x and y are integers (not necessarily positive ones).

Proof: Let be the least positive number. Since , then . Namely, . On the other hand, we can show , but among such numbers, gcd is the largest one, therefore, . So .

To show by contradiction, suppose , then , recall the expression of s, we have , which is contradict with the fact than s is the least positive number of such form. The same argument applies to . QED.

Corollary 2.1: If , then .

Definition of multiplicative inverse: An integer b is said to be the integer a's multiplicative inverse modulo n if .

Corollary 2.2: If there are integers x and y, s.t. , then , namely, a is coprime to b.

Proof: Obvious from the representation theorem.

Corollary 2.3: Let n be integer greater than 1. An integer a has a multiplicative inverse modulo n iff .

Proof: Suppose , then . And hence .

Comment: (the effective division) We define as the number satisfying , and such an operation plays the role of division in modular arithmetic. Note must be coprime to n s.t. the inverse is well defined, then the elements of arithetic are subset of . The inverse in unique in terms of modular arithmetic, suppose , then . Since , leads to .

Euclid's algorithm

In this section, we will focus on the euclid's algorithm for finding gcd, including its correctness and complexity analysis. In practical, there can be faster implementation than Euclid's method which heavily make use of the shift operations, see here for the optimal algorithm in implementation.

Theorem for the equivalence of gcd

Theorem 3.1: Let r be the remainder when a is divided by b (a>b), then .

Proof: Since and , we have . By corollary 2.1, we have .

On the other hand, , we have , then . QED.

The algorithm finding gcd

Euclid's Algorithm: Iteratively replace the larger number with the remainder, until the remainder is zero. Then the divider (smaller number) in the last round is the gcd. The correctness of the algorithm is ensured by theorem 3.

Comment: (Algorithm finding the representation of gcd) As a byproduct, this algorithm can also be used to decompose gcd into standard representation as in theorem 2. Namely, find integer x,y, s.t. . This is done by going into the reverse direction on Euclid's algorithm, which take the larger and larger numbers into the expression, until we are back with the two numbers a and b, and the prefactor of them are x and y.

Complexity analysis

Suppose a, b can be represented by L bit strings. Namely . The key observation is that during the algorithm , where is the remainder of i round. (Since if ). Therefore, we need only times of iterations. If we further consider the complexity of arithmetic operations in each iteration, the overall complexity is . And the same complexity applies to the problem of finding x and y of the representation for gcd.

Finding multiplicative inverse

Euclid's Algorithm can also be used to find the modular inverse. This idea is directly from corollary 2.3. Since , we only need to run Euclid's algorithm to find the representation of . The corresponding coefficient is .

Now we are equiped with algorithms to solve linear equation in modular arithmetic as , where a and n are coprime. The solution is , and can be solved efficiently.

Theorem 3.2: has solution iff . If it has solution, then the number of solutions (in the sense of modulo m) is .

Proof: The eq. has solution iff . According to Bezout's identity (thm 7.2), this is equivalent to . If there is a solution , then all solution can be expressed as for any integer d. Under modulo m, there is d different solutions. QED.

From Chinese remainder theorem to Fermat's little theorem

Chinese remainder theorem

Theorem 4.1: Suppose are positive integers s.t. any pair of them are coprime. Then the system of equations

has a solution. Moreover, any two solutions are equal modulo .

Proof: Construct the solution explicitly. Define . Note , we have . Then the solution is . This can be verified noting the fact .

Suppose x and x' are two different solutions to the system of equations. Then and thus . QED.

Comment: (existence of a small solution) Based on the proof, there must be a solution . If the explicitly constructed solution , we use the solution .

Comment: (deep structure of this thm) In fact, Chinese remainder theorem is deeply correlated with the rings decomposition and hence play a central role in the proof of primary number theory. That is to say, if we take a high level view from advanced algebra, most of the conclusions and progress from primary number theory are correlated with the decomposition of rings. In fact, thm 4.1 gives the duality between and which is trivial. Besides thm 4.1 also gives the duality between and which is nontrivial and actually the proof of Lemma 4.3 below.

Fermat's little theorem

Lemma 4.1: Suppose p is prime and , then .

Proof: Consider . Since , then . However , so .

Theorem 4.2: (Fermat's little theorem) Suppose p is prime, then for any integer a: . Moreover, if , then .

Proof: The second half is easy. Since means . Then is well defined. We have .

WLOG, we prove the first half with positive a by induction on a. The case a=1 works for the theorem. Suppose the thm holds true for a, namely . For , we have . Due to lemma 4.1, we have . QED.

Euler totient function

Definition: is defined as the number of positive integers that are coprime to n and less than n.

For prime p, it is obvious . And for , the only non coprime number is . Therefore, which is the following lemma.

Lemma 4.2: For prime p, .

Lemma 4.3: If a and b are coprime, then .

Proof: Cosider the system of equations, . From theorem 4.1, there is a one to one correspondence between the solution and the input . Furthermore, if we add constrait , this is equivalent to (as x has no common nontrivial divisor with a or b). Therefore, there is a bijector between the two side vs . There are such pairs of and pairs of .

Theorem 4.3: (Expression of Euler function) For integer n with prime factorization , we have .

Proof: Combine lemma 4.2 and lemma 4.3, obvious.

Euler's Theorem

Theorem 4.4: Suppose is coprime to n, we have .

Proof: We first show , namely the case where p is prime. When , we are back to Fermat's little theorem. By induction, assume this is true for , namely , then by lemma 4.2, we have .

For the more general case, note every integer has prime factorization. And these terms are coprime with each other. Then since . Since are coprime to each other, we have the conclusion .

Comment: (The algebra structure of modular arithmetic) Define all the elements from which is coprime to n as . Such a structure is actually a ring with two operations (add and times in the modular sense). And . If we only focus on multiplication operation, this is a group. And we have the following theorem describing the structure of such groups.

Theorem 4.5: is cyclic iff , where p is an odd prime and k>0.

Proof: It is straightforward to show the results if we are equiped knowledge with abstract algebra, see this note for the proof.

Reduction of factoring to order-finding

Order

Definition: Suppose N is a positive integer and x is coprime to N, the order of x modulo N is defined to be the least positive integer r such that .

Comment: Note the periodic behavior of and .

The First step

Theorem 5.1: Suppose N is a composite number and a is coprime to N. Suppose r is the order of a modulo N. If r is even and , then N has a nontrivial factor as .

Proof: Since , and thus N must have a common factor with at least one of them. But since r is order of a, namely . Besides, we have the condition , therefore both term contains nontrivial factor of N. Specifically, is a nontrivial factor of N.

Comment: nontrivial solution means that .

The Second Step

Lemma 5.1: Let p be an odd prime and be the largest power of 2 that . Then with probability , divides the order modulo of a randomly chosen element from .

Proof: Note that is even so . By theorem 4.5, there exist a generator g, s.t. every element in can be written as for k range from 1 to . Let r be the order of modulo , consider two cases. 1) When k is odd, from , we have . And thus . 2) When k is even, . Thus . But since is the largest power of 2 which can divide , we conclude that .

Comment: (structure of ) There are two classes of elements . For even k, where r is the order of . For odd k, .

Theorem 5.2: Suppose is the prime factorization of an odd composite positive integer. Let x be chosen randomly from , and let r be the order of x modular N. Then

Proof: We show that . And label them as condition 1 and condition 2.

By theorem 4.1, choosing x from is equivalent to choosing independetly from and requiring the system of equations for each j. Let be the order of modulo and be the largest power of 2 that divides . We will show that to satisfy condition 1 or 2, all have to take the same value as .

Note , since leads to , namely . 1) If r is odd, all is odd, then . 2) If , then . Therefore, . Together with the fact , we have , where s and t are both odd integers. Thus we have .

Let's now figure out the probability of all are equal. Say we randomly pick and determine , then all terms must be the same. Suppose , which is the largest power of 2 that divides phi. The probability of such event is no more than 1. Then based on the lemma 5.1, . In total, we show the probability is no more than .

Comments: In other words, let N be odd and not a power of a prime (N at least be factorized into two different primes). is picked randomly from (pick which is coprime to N), say the order of a modulo N is r, then the probability that r is even and is no less than one half. This fact is directly from thm 5.2. is due to the fact that r is the order of a (smallest integer make ).

The algorithm for factoring

Algorithm: (factoring with subroutine of order finding)

  1. Make sure the input N is not even (else return 2 as a factor) and not power of some prime (else return such prime as factor).
  2. Randomly choose . Compute , if , return it as a factor, else continue.
  3. Determine the order r of x modulo N.
  4. If r is even and , continue, else the algorithm fails, try another x to repeat the algorithm.
  5. Compute , and test to see which is a non-trivial factor of N. Return such factor.

Correctness: The correctness of the above algorithm is obvious following theorem following thm 5.1 and 5.2. In the first two steps, we are creating the condition of thm 5.2. The step 4 is successful with probability more than one half. So the randomized algorithm can achieve exponential success probability with several more trials. Step 5 is following thm 5.1.

Complexity analysis: It is obvious all steps can be done in polynomial time except step 3. The factor finding subroutine has no known classical algorithm with P complexity (though no theoretical lower bound of complexity, either). However, step 3 can be achieved by quantum algorithm named after Shor in polynomial times.

The complexity of step 1: check whether N is power of some prime. The approach is as following. We calculate for . If any of these is an integer, we have found a factor. Otherwise, such N is not a power of an integer. So the complexity is still in polynomial time.

Finally the input of the above algorithm is promised to be a composite number. What if the input might be prime itself. Firstly, we already have determinstic primality test known as AKS primality test which is in P. And we can use such a subroutine before step 1 to make sure N is not a prime. Secondly, the algorithm itself is a good randomized primality test procedure. If the algorithm fails sufficient rounds, we are confident that the input N is a prime with high probability.

Reduce order finding to factoring

Let's finally explore the other direction and finish concluding the equivalence between the two problems. The idea is from this answer. Suppose we have efficient way to factor composite number and get prime factorizations. To find the least r such that where a and N is coprime, we first factorize N to get using Euler thm 4.3. Then we factorize . Note based on thm 4.4, r must be of the form , where . Let , we divide by each time, and compute , if this value is 1, we let . Continue the process until for all , .

Algorithm: (order finding with subroutine of factoring)

  1. With input N and a coprime with each other, factorize to compute and then factor to get its prime factors .
  2. Let , for each , if is an integer and , let , continue step 2. Else, m is the final result of the order.

Complexity analysis: The correctness of the above algorithm is obvious. The complexity except the factoring part is also in polynomial time. To see this, note , and . We try loop in step 2 at most round, in each round we reduce with one prime factor, in each loop we test at most integers and each check can also be carried out in steps by divide and conquer of the multiplication. To sum up, the total complexity is .

Comment: (main conclusion) Factoring problem is equivalent to order finding problem in the sense of complexity. No one is harder than the other.

Continued Fractions

Notation

Say real number which can be written as , such an expression for real numbers are continued fractions. The idea is to describe real number with integer alone and without base dependence.

A finite simple continued fraction is finite, such an expression occurs iff the number to be expressed is rational. Simple denotes that all are integers. Finite denote the sequence of is finite.

nth convergent of a continued fraction is the first n digit of the continued fraction.

Continued fraction algorithm

Theorem 6.1: A rational number great than 1 has a representation as a finite continued fraction. Such representation can be found by continued fraction algorithm.

Proof: Directly construct by the algorithm. The algorithm iteratively decompose the number into integer part and true fraction part (which can be treat as the reciprocal of the denominator), continue the process until we meet 1 in the numerator of the decomposition. In this algorithm, the numerator decreases strictly, so rational number must have finite continued fraction.

Comment: (ambiguity of the above algorithm) The only possible ambiguity comes at the final stage of the above algorithm. Namely, either or . That is to say, .

Theorem 6.2: (expression for given continued fraction) Given , we have . are defined inductively by and . And for ,

Proof: Proof by induction, case for is easy to check. For , . By induction hypothesis, let , based on the recusive formula and the hypothesis that (the same is true for q), we have . QED.

Comment: Note in thm 6.2, needn't be integers. But if they are integers, so are p and q.

Corollary 6.1: . And thus .

Proof: Use induction on n to show the first half. If the first half is true, based on corollary 2.2, are coprime.

Approximation Accuracy

Comment: (length of continued fraction to represent a rational number) Based on thm 6.2, is increasing with n and and similarly . Therefore and thus and which is the legth of corresponding continued fraction.

Theorem 6.3: Let x be a rational number and suppose . Then is a convergent of the continued fraction for x.

Proof: Let , and is the approximation given by thm 6.2. Namely . Define as , so . Define as

Then we have (check this by plug the definition of ). Namely, . Choosing n even and based on corollary 6.1, we have . Therefore and , where is the convergent of x. QED.

Theorem 6.4: (optimal approximation) is a real number, is its kth convergent. Then for any rational number with denominator no more than , is the closest one wth .

The summary above are complete for understanding on RSA or Shor algorithm. Below are some more results relevant to number theory, and the proof might be omitted.

Indefinite Equation

Linear indefinite equation with one variable

Theorem 7.1: Equation with a,b,c as integers and a,b nonzero, suppose such equation has a solution . Let and , all solutions can be expressed as , where t is any integer.

Proof: The given expressions are easy to check are solutions. To show they are all solutions, use the fact . Since , then we have and similar for . QED.

Theorem 7.2: (Bezout's identity) Eq. in thm 7.1 has integer solution iff .

Proof: obvious considering the fact in thm 2.

Comment: (solve such eq. in practical) Namely how we can solve , where a and b are coprime. This is just the problem we meet in the comment of Euclid's algorithm and we have solved such problem by utilizing Euclid's algorithm there.

Linear indefinite equation with multiple variables

Theorem 7.3: Eq. , where are all integers and WLOG let all nonzero. The equation has integer solution iff .

Proof: If there is a set of solution , then .

If , we prove it by induction on n. case is thm 7.2. Let , then . Following the induction hypothesis, we have solution s.t. . Now we are left with . According to thm 7.2 again, this equation has integer solution.QED.

Comment: The above proof also provide an approach to solve the linear indefinite equation with multiple variables in practice. Just do it recursively as equations with two variables.

More on equations in modular arithmetic

Solve higher order equations

Comment: Firstly, for , suppose m has prime factorization , due to thm 4.1, we have shown it has one to one correspondence with a set of equations . We now explore how to solve such equations.

Theorem 8.1: Let is a solution for , where p is a prime and . Then such a solution () is also a solution for .

Example: Instead of a general proof, let's show how to do this by an specific example. Problem let , solve .

. has solution . For , by Taylor expansion, we have , since the higher order of the expansion is 0 modulo 9. This is . We are back to . Therefore, , and is the solution for . Repeat the process by pluging the solution into , we have

, which is , therefore, . We have as the final solution.

Comment: Base on thm 8.1, we can use solution for mod p equation to get solution for mod power of p equation and finally get solution for mod any integer equation. The problem is reduced to solve equation with mod prime. We introduce some property of such equation below without proof.

Theorem 8.2: Consider the equation , where is order n polynomial and p is prime, such a equation is equivalent to , where is polynomial with order less than p.

Theorem 8.3: If the order of is no more than p, then has n solution iff is a polynomial with all coefficients as mutiple of p. In other case, such equation has solutions less than n.

Definition of quadratic residue

We focus on second order equation now.

Definition: where a is coprime to m. If such equation has a solution, we call a is qudratic residue modulo m. Otherwise a is called qudratic nonresidue modulo m.

Comment: The backgroud of this is that, for the existence of solutions, we can reduce to equation where a and are coprime.

Modulo prime

Theorem 8.4: (Euler criteria) , (a is quadratic residue module p) iff . And a is qudratic nonresidue iff .

Theorem 8.5: For , the number of quadratic residue and quadratic nonresidue are both .

Legendre symbol

Definition: Define where p is prime. This function gives 1 if a is quadratic residue modulo p and -1 if a is quadratic nonresidue. Otherwise gives 0 if . According to thm 8.4, we have .

Propeties of Legendre symbol:

Theorem 8.6: (Quadratic reciprocity) .

Combined these tools, one can quickly determin whether some value is quadratic residue.

Primitive root

Definition: Suppose r is the order of a modulo n. If , we call a primitive root modulo n.

Comment: Due to thm 4.5, there is primitive root modulo n iff n is where p is odd prime The primitive root is just the generator of corresponding cyclic group. For one modulo n, there might be more than one primitive root.

Index

Definition: Say m has primitive root g, , we call the index of a based g modulo m. (the least nonnegative one is written as ).

Theorem 8.7: There is primitive root modulo m and , , the equation has solution iff , the number of solutions is d.

Comment: For a cyclic group , if is a generator, than is a generator iff . Therefore, the number of generators of cylic group is . And for , if it is a cylic group, the number of generators is . More general, for cylic group and generator , the subgroup generated by is of the size .

Since there may be more than one primitive root modulo m, then the index is defined with ambguity. However we can show the consistence of thm 8.7. Namely, suppose is different indexes of same x based different root, then iff . (It is obvious , where are two different roots. Combine the fact , we have and thus .)

References