Research Interests of Hugh C. Williams
My work in computational number theory extends from analyzing the complexity of number theoretic algorithms to actually implementing and testing such algorithms. I have worked and continue to work in applications of computing techniques to solving algebraic and elementary number theoretic problems. I am currently interested in developing parallel algorithms for solving certain number theoretic algorithms, particularly regulator and class number computation in real quadratic number fields.
Computing invariants in quadratic fields
One of the oldest and most difficult problems facing anyone attempting to perform arithmetic in number fields or function fields is the computation of the regulator and the ideal class number. These invariants are not only fundamental to the structure of the number fields being investigated, but the difficulty of computing them provides some idea of how hard it is to break cryptosystems based on the corresponding field. Thus, developing efficient techniques for computing these invariants is of great importance.
The best method for computing the class number and regulator that is currently known is Mike Jacobson’s implementation of the subexponential algorithm; however, this technique computes these invariants of a real quadratic field only conditionally. Jacobson and I were able to extend slightly his earlier methods to compute regulators and class numbers of real quadratic fields with discriminants of 80 digits. Later, we were able to compute the regulator and class number of a quadratic field with a discriminant of 101 digits. These are the largest such invariants ever computed, and although they are conditional, display the power of these new techniques.
These ideas were also applied to the problem of finding certain quadratic polynomials that will produce many prime values. In order to compute the density of primes (under Hardy and Littlewood's conjecture F) for these polynomials, it is necessary to evaluate a very slowly converging infinite product. This can be transformed into a more rapidly converging product if the class number and regulator of a certain quadratic number field can be evaluated. As a result of this work, we were able to display a polynomial that has the largest asymptotic density of prime values for any polynomial of this type currently known.
Unconditional evaluation of the regulator
Although the regulator determined by Jacobson’s algorithm is only conditionally correct, it is easy to verify rigorously that this algorithm computes an integral multiple of the regulator. Having made this determination, the next problem is to establish that the integral multiplier of the regulator in Jacobson's regulator is 1. There are two phases to doing this:
Establish that the regulator must exceed some predetermined bound.
Prove that for no integer less than a certain amount can we have the regulator being Jacobson's regulator divided by that integer.
It is interesting that both of these phases can be parallelized. Furthermore, the technique should with an appropriate representation of the ideals involved be completely integral. That is, we should not at any point have to work with any numbers but integers. This means that we do not have to deal with approximations to irrational numbers and the concomitant loss of rigour that often occurs as a result. Currently, Jacobson and I are collaborating on this problem and we already have some preliminary results. Recently, we were able to compute unconditionally a regulator for a quadratic field with a 55-digit discriminant. This was done with only 16 processors; thus, when fully parallelized the new algorithm may allow us to compute regulators for fields of perhaps 60 or 65 digit discriminants.
Other problems involving real quadratic fields
Our ability to compute the regulator for fields of large discriminant allows us to develop some new algorithms for attacking certain types of Diophantine equations. We used this information to solve some difficult Diophantine equations that were left unsolved in a paper of Attila Pettö. We have also developed improved algorithms for computing the regulator and class number for many fields of more modest sized discriminants. These were used to verify the very difficult Ankeny-Artin-Chowla conjecture for all primes less than 200,000,000,000, and to test the Cohen-Lenstra heuristics on the distribution of the odd part of the class number. Indeed, with our new techniques Herman te Riele and I were able to compute all the class numbers for all real quadratic fields of prime discriminant less than 200,000,000,000. The results obtained agreed with what the Cohen-Lenstra heuristics predicted. This is the most extensive numerical verification of this conjecture ever performed and was only possible because of our new techniques.
Development of continued fractions for quadratic irrationalities
It is an extremely interesting and important problem to detect infinite families of positive integers D for which one can readily describe the fundamental unit (and the set of principal reduced ideals) in the quadratic field formed by adjoining the square root of D to the rationals. This problem is also connected to that of finding real quadratic fields of large class number. The problem is very old and several examples of such families are given in the 12th chapter of Dickson's History of the Theory of Numbers.
I have been very interested in the problem of exhibiting new families of such D values and determining the fundamental unit in the corresponding field. Alf van der Poorten and I have discussed a class of cases D=F(X) where F is a quadratic polynomial with leading coefficient a square, for which one obtains particularly small units essentially because the period length of the continued fraction expansion of the square root of D is, somewhat surprisingly, independent of the integer parameter X. Indeed, if a condition of Schinzel is satisfied and the absolute term C of F(X) is positive and not a perfect square, then the continued fraction of the square root of D can usually be expressed very simply in terms of the continued fraction of the square root of C for any sufficiently large X.
There are certain parametric families of values of D for which the period length of the continued fraction expansion of the square root of D is a linear function of one of the parameters used in the representation of D. Such families have been the subject of much study by certain authors, particularly Leon Bernstein who noticed that there is a pattern of cycles within the continued fraction period. I have been able to show that this phenomenon of cycles occurs in much greater generality than Bernstein was able to detect. I am currently working on extending these results to more general families of values of D.
Cryptography, the design and implementation of secure communication systems, plays a vital role in e-commerce, general Internet usage, large data base security and defence system security, to mention only a few. Indeed, cryptography is an essential component of any installation in which secure communication is needed.
If a sender and receiver of a message wish to ensure that no other unauthorized party can read their transmission, they will make use of a particular cryptosystem. A conventional cryptosystem can be thought of as a large collection of transformations, any one of which will render the original message unintelligible, but in order for the receiver to read the message, he must know which particular transformation was used by the sender.
The information that identifies the transformation used by the sender is called the key. To maintain security, it is essential that the key be kept secret from any possible adversary. It is important to recognize that there are several possible objectives of the eavesdropper. One of these may simply be to read the transmissions, but it is also possible that he may wish to forge messages or send false messages or simply delete transmissions. We distinguish between a privacy system and an authentication system. A privacy system protects against the unauthorized extraction of information from a given transmission, whereas an authentication system protects against the unauthorized injection of information into a transmission.
In both privacy and authentication systems it is vital that the key be known to only the sender and receiver of the messages. This means at some point the key must be communicated between the sender and receiver over a different and very secure transmission channel than that used for the transformed (encrypted) messages. As such communication channels are expensive and often inconvenient to use, the objective of much of modern cryptography has been to try to eliminate them altogether.
The security of the Digital Signature Standard (DSS), a standard for electronic signatures endorsed by the United States National Institute for Standards and Technology (NIST), is based on the presumed difficulty of the Discrete Logarithm Problem (DLP) in mathematical structures called finite fields. Also, a widely used procedure for exchanging secret cryptographic keys (the Diffie-Hellman protocol) derives its security from the belief in the difficulty of the DLP in finite fields. Intense investigation of this DLP has revealed that it is easier to solve than was previously thought. The technique (the Number Field Sieve) that was used to determine this result does not seem to be applicable to the DLP over other mathematical structures such as algebraic number fields and function fields. In particular, the DLP in function fields of low genus has proved so far to be very resistant to attack by all known methods. Indeed, the elliptic curve system, a special case of function field based schemes, is widely used in commercial applications. The study of the DLP in algebraic number fields and, particularly, in function fields is still very much in its infancy, and more research must be done before it can be adopted by institutions such as banks, securities firms and the insurance industry.
Although function fields as a basis for cryptography have several advantages over number fields, such as greater speed and cleaner computer implementation, it should be pointed out that from an algorithmic and cryptographic perspective, they are less well understood than number fields. It is important to investigate number fields for possible applications in cryptographic settings. Furthermore, it is always sound cryptographic practice to have access to as many different systems as possible. This ensures that the sender has a choice of possible schemes, a very useful feature if one or more of them is compromised.
The simplest number fields are the quadratic fields; performing arithmetic in these structures is relatively efficient and simple compared to doing this in other algebraic number fields. Nevertheless, they still possess many of the complicating features that make them resistant to methods that have proved to be successful in other structures. There are two types of quadratic fields: imaginary and real. The cryptographic aspects of the former have been the subject of intense scrutiny by Professor Buchmann and his research team at the Technical University of Darmstadt. One of my interests is the investigation of algorithms for implementing key exchange protocols and for solving the DLP in real quadratic fields.
In 1994, Scheidler, Buchmann and I described a technique for conducting a secure cryptographic key exchange by making use of the infrastructure of the principal ideal class of a real quadratic field. This system involves the communication partners agreeing on a certain reduced principal ideal of the number field from which the information needed to produce the key can be extracted by a mutually agreed-upon process. The partners never exchange this ideal over their communication channel, but instead provide each other with enough information that each can compute it. However, anyone who is eavesdropping on his or her communication channel will not get enough information to determine what this ideal is. The security of this scheme depends upon the number of reduced principal ideals in the quadratic number field and the difficulty of solving the DLP in the field.
The first of these problems is easily handled by use of the Cohen-Lenstra heuristics on the distribution of the odd part of the class number. This strongly suggests that the class number is very likely to be small when the discriminant, an important invariant of the field, is a prime or 4 times a prime. Thus, if the discriminant is chosen in this way and is about six hundred bits, we can be assured that there will very likely be at least 2^{250} reduced principal ideals, making it impossible to break the system by exhaustive search.
The only reasonably likely security problem is the possibility that the DLP can be solved quickly in the field. This problem has been examined by a number of authors, and Jacobson’s implementation of the sub exponential algorithm is the current state of the art in solving this problem. It must be emphasized that all of this work is based on the assumption of the truth of a generalized Riemann hypothesis (GRH), an assumption about the location of the zeros of a certain function of a complex variable. Nevertheless, there are still a number of problems associated with Jacobson's algorithm. Giesbrecht, Jacobson and Strorjohann have very recently addressed some of these regarding the matrix operations, but there remains a still difficult problem, essentially involving precision, that is very tricky to deal with. This is a subject that I am currently investigating.
It is important to realize that even the most improved version of Jacobson's algorithm is still very slow when the values of the discriminant become large. For example, it took more than 87 days for his algorithm (implemented on 16 550MHz Pentium III computers) to compute the regulator, another important invariant, of a certain real quadratic field with a discriminant of 101 decimal digits. As computation of the regulator is essentially an instance of the DLP, this gives some indication of how difficult the DLP is to solve, particularly when one considers that the discriminant in this case was an easy one. It seems that it is possible to construct discriminants where the solution of the DLP appears to be very difficult indeed.
Improved implementation
The current version of our key exchange protocol is still somewhat slow. This is not necessarily that serious a problem because key exchanges are done infrequently. However, if we want to use the method as a signature scheme, speed becomes a more important consideration. By developing a new representation for the ideals being considered, we have been able to lower the precision needed at the expense of increasing the complexity of the second communication round.
This latter has proved to be no real problem as it executes much more rapidly than the first round protocol and seems not to be needed after all because after extensive testing we have found that the communication partners always get the same key ideal after the first round. After profiling our implementation we found that by far the most time-consuming part of the protocol is the ideal reduction phase--97% of the time is spent on ideal reduction. Clearly, any further attempts to optimize the algorithm must begin there. One possible improvement is to incorporate the NUCOMP algorithm of Shanks, which lowers the size of the numbers needed in the computations. Our preliminary work on this indicates that this should effect a considerable saving.
Benchmarking
As there is no rigorous mathematical proof of the security of our (or almost any other) cryptosystem, the only way we can certify its security and effectiveness is to test it extensively. It will be necessary to conduct very large-scale numerical experiments in order to acquire the data needed to determine accurately the security of our cryptographic scheme.
This should be possible because the data generated by our hardware configuration should allow for accurate and reliable extrapolation. We have funding from Alberta Ingenuity, the Alberta Science and Research Investments Program, the Canada Foundation for Innovation and the University of Calgary to set up a high-speed computing laboratory for testing and benchmarking cryptographic systems. The lab should be up and running early in 2003.
Design and Development of Special-purpose Hardware Devices
A number sieve is a special-purpose computing device that finds solutions to large systems of single variable linear congruences with various distinct coprime moduli. The mechanism detects these solutions simply by searching through all integers up to a certain, preselected bound. While this approach may sound naive, it is in fact possible through a judicious exploitation of parallelism to make these devices execute at a very rapid rate. For solving certain problems, the use of these machines is the fastest method known. The construction of these devices has a very long history, with the first having been built in 1912. They were originally developed to assist in the integer factoring problem, but as better methods of doing this have evolved, they are now being used for other purposes.
For many years I have participated in the design and construction of number sieves. The most recent of these devices, the MSSU, completed in 1995, is the fastest and most powerful in existence. For most problems this machine can canvass the integers at the rate of 1 to 3 trillion numbers per second. This rate is over two orders of magnitude faster than that of any previous sieve, and while it may not be as fast as some supercomputers, it operates at only a very small fraction of the cost. I am currently working with a graduate student, utilizing FPGA (Field Programmable Gate Array) technology, to produce a much faster and more versatile device.
For the past several years I have been much involved in applying number sieves to the solution of a number of problems arising in number theory and cryptography. These range through the tabulation of a difficult function of Erdös and Selfridge, an investigation and tabulation of pseudopowers, calculation of certain quadratic polynomials which have a high density of prime values, development of primality tests and tabulation of pseudosquares, to the determination of primes for which a certain character sum must be positive.
A. Granville, R. Mollin and I were able to use the MSSU to assist in proving a rather difficult conjecture of Mollin concerning an upper bound on the least inert prime in a real quadratic number field. I should also point out that we also used the MSSU to show for the first time that both 110 and 435 can be written as a sum of three cubes, a problem of much interest to
diophantine analysts.
We have also been able to find a use for a number sieve in cryptography. In the real quadratic field based Diffie-Hellman protocol, it is necessary to use a field with a discriminant that will frustrate as much as possible any attempt by a cryptanalyst to break the system. We discovered that such a discriminant should be a quadratic nonresidue for as many of the small primes as possible. By using the MSSU, we were very efficiently able to find quite large discriminants that are quadratic nonresidues for all primes less than 527. It should be possible to improve this by using a new idea for sieve implementation due to Dan Bernstein.
History of Mathematics and Computation
I have an interest in many aspects of the history of computation in mathematics, particularly in number theory. For example, I have collected and published much information on number sieves and I maintain a keen interest in the history of the Pell equation. Also, my book, Édouard Lucas and Primality Testing, is a research monograph on the history of primality testing from classical antiquity to today. In particular, I show the tremendous influence on primality testing wielded by Édouard Lucas, the first individual to show that primality testing could be performed, at least on certain forms of numbers, without recourse to the laborious and tedious process of trial division.
I then go on to show how the modern methods of primality testing have evolved since Lucas' time. Indeed, half of the book is devoted to developments that took place since Lucas' death in 1891. These methods culminate in the two main techniques that are used today: the Jacobi sums algorithm and the elliptic curve algorithm. I also was able to show that Lucas, before elliptic curves were even considered by mathematicians, was very close indeed to a test for primes that was similar to today's elliptic curve methods. Unfortunately, his premature death precluded any further development of his early ideas.