Assignment 3 ============ Linear congruential generator ----------------------------- In class, we discussed the linear congruential generator scheme. Starting from a single initial seed :math:`\mathsf{x_0}` (a non-negative integer), the sequence unfolds according to :math:`\mathsf{x_{n+1} = {(ax_n + c) \mod m}}`. The most convenient choice of modulus is :math:`\mathsf{m = 2^{64}}`, such that the modulus operation is effected automatically as overflow on a 64-bit fixed-width binary register. In that case, if we take ``x`` to be of type ``UInt64``, each next number in the sequence is found with a single multiplication and addition: ``x = a*x + c``. The trick is to choose good values of the multiplier and increment. Two popular choices are shown in the table. .. list-table:: :widths: 12 18 18 :header-rows: 1 * - scheme - multiplier (:math:`\mathsf{a}`) - increment (:math:`\mathsf{c}`) * - LLNL/SPRNG - 2862933555777941757 - 3037000493 * - Knuth (MMIX) - 6364136223846793005 - 1442695040888963407 These numbers are designed to satisfy the Hull–Dobell Theorem and thus ensure a repeat time of :math:`\mathsf{2^{64}}`, the maximum period. A known deficiency is that the least significant bits are somewhat less random than the most significant bits. A convenient way to write this in Julia is with a *closure*, in which a function returns a function that captures a set of parameters provided by the user as arguments. .. code-block:: Julia function create_lcg(a0, c0, x0) a = a0 c = c0 x = x0 return function() x = a*x + c return x end end my_seed = UInt64(1975) lcg = create_lcg(UInt64(2862933555777941757), UInt64(3037000493), my_seed) # generate pseudorandom 64-bit sequences lcg() => 0x8516d481892a6308 lcg() => 0x0ac522dfb57e5215 # interpret as signed, two's complement integers reinterpret(Int64,lcg()) => 712743970554937838 reinterpret(Int64,lcg()) => -6940842903748281501 # rescale to produce numbers in the closed interval [0,1] lcg()/(UInt64(0)-1) => 0.660163672390821 lcg()/(UInt64(0)-1) => 0.037941720613110005 1. Explain why the period of the linear congruential generator described above can never exceed :math:`\mathsf{2^{64}}`. 2. It's sad/amusing to read about `IBM's RANDU generator `_, a notorious linear congruential generator developed by the company in the 1960s for use on its mainframes. It is widely considered one of the most flawed random number generators ever created. Many scientific results from that era that were based on Monte Carlo and stochastic methods are suspect. Make an instance of RANDU using ``randu = create_lcg(UInt32(65539),UInt32(0), UInt32(seed))}``. Generate 3000 numbers and output them to a file ``randu.dat`` in three-column format as follows: .. list-table:: :widths: 10 10 10 :header-rows: 0 * - :math:`\mathsf{x_0}` - :math:`\mathsf{x_1}` - :math:`\mathsf{x_2}` * - :math:`\mathsf{x_3}` - :math:`\mathsf{x_4}` - :math:`\mathsf{x_5}` * - :math:`\mathsf{x_6}` - :math:`\mathsf{x_7}` - :math:`\mathsf{x_8}` * - - :math:`\vdots` - * - :math:`\mathsf{x_{2997}}` - :math:`\mathsf{x_{2998}}` - :math:`\mathsf{x_{2999}}` In ``gnuplot``, type ``splot 'randu.dat' with points pointtype 7 pointsize 0.5``, which can be abbreviated to ``splot 'randu.dat' w p pt 7 ps 0.5``. Rotate the three-dimensional image until you can identify the flaw in the randomness. Include a screenshot or pdf of the revealing plot. Manipulating floating-point representations at the bit Level ------------------------------------------------------------ In Julia the ``reinterpret`` function can be used to read an integer's bit pattern as a floating-point type. This is distinct from standard *type casting*, which tries to preserve value. .. code-block:: Julia # Reinterpret the integer 4607182418800017408 as a Float64 UInt64(4607182418800017408) => 0x3ff0000000000000 Int64(0x3ff0000000000000) => 4607182418800017408 reinterpret(Float64, 0x3ff0000000000000) => 1.0 # On the other hand, casting gives Float64(0x3ff0000000000000) # = > 4.6071824188000174e18 Here, ``UInt64`` and ``Float64`` are the types we'd call ``unsigned long int`` (or ``uint64_t``) and ``double`` in C++. Recall that the 64-bit IEEE Standard 754 uses a bit pattern with 1 sign bit, 11 exponent bits, and 52 mantissa bits: .. list-table:: :header-rows: 0 * - :math:`\mathsf{s}` - :math:`\mathsf{e_{10}}` - :math:`\mathsf{e_{9}}` - :math:`\cdots` - :math:`\mathsf{e_{0}}` - :math:`\mathsf{m_{51}}` - :math:`\mathsf{m_{50}}` - :math:`\cdots` - :math:`\mathsf{m_{0}}` 3. Here, you will generate pseudorandom numbers in the range :math:`\mathsf{[0,1)}`. Furthermore, you will implement a shuffling scheme that renders the numbers more randomized. Start by populating an array ``num_bank`` with 1024 entries, each a 64-bit integer produced by a high-quality linear congruential generator. Once this initialization step is complete, carry out the following steps to produce the next number. Generate a 64-bit unsigned integer :math:`\mathsf{x = b_{63} b_{62} \cdots b_2 b_1 b_0}`. Ignore the two least significant bits (:math:`\mathsf{b_0}` and :math:`\mathsf{b_1}`). Harvest the next 10 bits to create a number :math:`\mathsf{k = 1 + b_{11}b_{10}\cdots b_4b_3b_2}` (with :math:`\mathsf{1 \le k \le 1024}`), which we will treat as a (one-based) array index. Swap the values of :math:`\mathsf{x}` and the :math:`\mathsf{k}\!` th element of `num_bank`. From the new :math:`\mathsf{x}`, create a floating point number :math:`\mathsf{1 \le f < 2}` by transcribing the 52 most significant bits as follows: .. list-table:: :header-rows: 0 * - :math:`\mathsf{0}` - :math:`\mathsf{0}` - :math:`\mathsf{1}` - :math:`\mathsf{1}` - :math:`\mathsf{1}` - :math:`\cdots` - :math:`\mathsf{1}` - :math:`\mathsf{1}` - :math:`\mathsf{b_{63}}` - :math:`\mathsf{b_{62}}` - :math:`\cdots` - :math:`\mathsf{b_{14}}` - :math:`\mathsf{b_{13}}` - :math:`\mathsf{b_{12}}` Then return :math:`f-1`. Examine the program ``shuffled-lcg.jl``. Complete the missing return value so that each call to ``next_rnd()`` produces a floating-point number in the interval :math:`\mathsf{[0,1)}` from the highest 52 bits in ``x`` following the shuffling scheme described above. .. code-block:: Julia num_bank = Vector{UInt64}(undef, 1024) for i in 1:1024 num_bank[i] = lcg() end function next_rnd() x = lcg() k = x & ((1 << 12) - 1) k = (k >> 2) + 1 # swap values num_bank[k], x = x, num_bank[k] return ???? end All you need to report for this question is a valid Julia expression for the correct return value. *Hint:* my solution used the ``reinterpret`` function, a bit shift, a bitwise OR with a mask, and a floating-point subtraction. 4. There are various `run tests `_ that are used evaluate the apparent randomness of a pseudorandom sequence. Here is one scheme. For each number :math:`\mathsf{x}`, let :math:`\mathsf{x < 1/2}` map to symbol ":math:`\mathsf{-}`" and :math:`\mathsf{x \ge 1/2}` map to ":math:`\mathsf{+}`". Then we can break up sequences :math:`\mathsf{(x_0, x_1, x_2, \ldots)}` into contiguous runs of common symbols; e.g. .. math:: \mathsf{\underbrace{+ + +}_3 \overbrace{- - - - -}^5 \underbrace{+ + +}_3 - \underbrace{+ + + + + +}_6 - \underbrace{+ + +}_3} shows 7 runs of size 3, 5, 3, 1, 6, 1, and 3. Prove that :math:`\mathsf{\textsf{Prob}(1\textsf{-run}) = 1/2}`, :math:`\mathsf{\textsf{Prob}(2\textsf{-run}) = 1/4}`, ..., and :math:`\mathsf{\textsf{Prob}(n\textsf{-run}) = 2^{-n}}`. 5. Modify ``shuffled-lcg.jl`` so that it outputs a histogram of the frequency of :math:`\mathsf{n}`-runs for :math:`\mathsf{n = 1, 2, \ldots, 20}`. You should be able to reproduce the terminal session below. For the two seeds shown, report the histogram entries for :math:`\mathsf{n = 5, 6, 7}`. .. code-block:: Julia $ julia shuffled-lcg.jl Usage: julia shuffled-lcg.jl $ julia shuffled-lcg.jl 1975 1 0.5004000000 2 0.2484000000 3 0.1244000000 4 0.0657000000 . . 20 0.0000000000 $ julia shuffled-lcg.jl 87263 1 0.5046000000 2 0.2529000000 3 0.1194000000 4 0.0607000000 . . 20 0.0000000000 6. Generate :math:`\mathsf{N=20}` such histograms, based on 20 different seeds. Collate the data sets to report the aggregate histogram in the format :math:`\mathsf{n}`, :math:`\mathsf{\textsf{AVE}[\textsf{Prob}(n\textsf{-runs})]}`, :math:`\mathsf{\textsf{ERR}[\textsf{Prob}(n\textsf{-runs})]}`, where :math:`\mathsf{\textsf{AVE}}` denotes the mean over the :math:`\mathsf{N}` instantiations and :math:`\mathsf{\textsf{ERR} = \sqrt{\textsf{VAR}/N}}` is the square root of the variance from the mean divided by :math:`\mathsf{N}`. By typing ``plot 'hist-all.dat' u 1:2:3 w e, 2**(-x)`` and then ``set logscale y; replot`` in gnuplot, you should be able to produce a plot similar to this: .. image:: figures/runs.pdf :width: 560 :alt: Alternative text for accessibility :align: center