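Author: Qian Yang (钱洋), School of Management, Hefei University of Technology. Email: 1563178220@qq. This post may contain errors or omissions; comments and discussion are welcome.
Do not repost without the author's permission.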

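Table of contents

  • Background
  • Generating random numbers with numpy in Python
    • Generating a 1-D array of random numbers
    • Generating a 2-D array of random numbers
    • Normalized random numbers
    • Standard normal random numbers
    • Multivariate normal random numbers
  • Generating random numbers with Math3 in Java
    • CorrelatedRandomVectorGenerator
      • Usage example
    • GaussianRandomGenerator
      • Usage example
    • HaltonSequenceGenerator
      • Usage example
    • JDKRandomGenerator
      • Usage example
    • SobolSequenceGenerator
      • Usage example
    • UniformRandomGenerator
      • Usage example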

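Background

When implementing machine learning algorithms, parameters often have to be initialized randomly, for example in models trained with variational inference. Random number generators therefore matter in both Java and Python.

In Python, numpy provides functions for generating many kinds of random numbers. This post first shows how to generate random numbers with numpy, and then shows how to generate the same kinds of random numbers in Java with Math3 (Apache Commons Math 3).

Generating random numbers with numpy in Python

Generating a 1-D array of random numbers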

import numpy as np
a1 = np.random.sample(3)
print(a1)

Sample output (no seed is set, so the values differ on every run):

[0.894403   0.75327423 0.0598    ]

Generating a 2-D array of random numbers

import numpy as np
a2 = np.random.random([3, 5])
print(a2)

Sample output:

[[0.74259942 0.14265614 0.39788471 0.24822603 0.70212864]
 [0.24499887 0.10752136 0.87938368 0.66949099 0.60077382]
 [0.93464286 0.12540026 0.23024034 0.01755745 0.1791168 ]]

Normalized random numbers

import numpy as np
a2 = np.random.random([3, 5])
a2 = a2 / a2.sum(1)[:, np.newaxis]  # divide each row by its sum so that every row sums to 1
print(a2)

Sample output (each row sums to 1):

[[0.20371704 0.13704266 0.07736116 0.07489164 0.50698749]
 [0.16472605 0.10869033 0.44036911 0.10437194 0.18184257]
 [0.12031567 0.04564694 0.0950007  0.33511749 0.40391919]]
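
To make the normalization step explicit, here is a minimal pure-Python sketch of the same row-wise operation, using only the standard library instead of numpy. The helper name normalize_rows is made up for this illustration. Rows normalized this way can be read as discrete probability distributions, which is one common way to initialize model parameters.

```python
import random


def normalize_rows(matrix):
    """Divide every entry by the sum of its row, so each row sums to 1."""
    return [[value / sum(row) for value in row] for row in matrix]


rng = random.Random(0)  # fixed seed so the run is repeatable
raw = [[rng.random() for _ in range(5)] for _ in range(3)]
normalized = normalize_rows(raw)

for row in normalized:
    print(row, "sum =", sum(row))
```
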

Standard normal random numbers

import numpy as np
c = np.random.normal(size=(3,4))
print(c)

Sample output:

[[-0.3369725  -1.05351817 -0.84444184  0.43715886]
 [-0.56812588  0.15303606  0.50248202  0.95384482]
 [-0.63582981  0.44559096 -1.91725906 -0.70182715]]

Multivariate normal random numbers

import numpy as np
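V = np.random.multivariate_normal(np.zeros(5), np.identity(5) * 5, size=3)
print(V)

This draws 3 samples from a 5-dimensional multivariate normal distribution with zero mean and covariance matrix 5·I, that is, every component has variance 5 and the components are uncorrelated. Sample output: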

[[ 2.91755746 -1.67030031 -1.0542531   0.13214101 -2.03207468]
 [-1.86659205  0.14574427 -4.24525326 -3.91111677 -2.81316827]
 [-1.57533411  2.54300223 -0.69052118 -3.19566595  3.21427621]]
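
Because the covariance matrix is 5·I, each component is an independent normal variable with mean 0 and standard deviation √5. The following sketch checks this with the standard library only; it is an illustration under that assumption, not a replacement for np.random.multivariate_normal, and the seed and sample size are arbitrary choices.

```python
import math
import random

rng = random.Random(42)          # fixed seed for repeatability
dim, variance, n = 5, 5.0, 20000
sigma = math.sqrt(variance)      # standard deviation of each component

# With covariance 5*I the components are independent N(0, 5) variables.
samples = [[rng.gauss(0.0, sigma) for _ in range(dim)] for _ in range(n)]

flat = [x for row in samples for x in row]
mean = sum(flat) / len(flat)
sample_var = sum((x - mean) ** 2 for x in flat) / (len(flat) - 1)
print("mean ~", round(mean, 3), "variance ~", round(sample_var, 3))
```
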

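Generating random numbers with Math3 in Java

Math3 (Apache Commons Math 3, package org.apache.commons.math3.random) provides several generators. This post covers CorrelatedRandomVectorGenerator, GaussianRandomGenerator, HaltonSequenceGenerator, JDKRandomGenerator, SobolSequenceGenerator, and UniformRandomGenerator.

CorrelatedRandomVectorGenerator

  • CorrelatedRandomVectorGenerator: this generator draws random vectors from a multivariate normal distribution.

The source code shows that CorrelatedRandomVectorGenerator has two constructors. The first is: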

    /**
     * Builds a correlated random vector generator from its mean
     * vector and covariance matrix.
     *
     * @param mean Expected mean values for all components.
     * @param covariance Covariance matrix.
     * @param small Diagonal elements threshold under which  column are
     * considered to be dependent on previous ones and are discarded
     * @param generator underlying generator for uncorrelated normalized
     * components.
     * @throws org.apache.commons.math3.linear.NonPositiveDefiniteMatrixException
     * if the covariance matrix is not strictly positive definite.
     * @throws DimensionMismatchException if the mean and covariance
     * arrays dimensions do not match.
     */
    public CorrelatedRandomVectorGenerator(double[] mean,
                                           RealMatrix covariance, double small,
                                           NormalizedRandomGenerator generator) {
        int order = covariance.getRowDimension();
        if (mean.length != order) {
            throw new DimensionMismatchException(mean.length, order);
        }
        this.mean = mean.clone();

        final RectangularCholeskyDecomposition decomposition =
            new RectangularCholeskyDecomposition(covariance, small);
        root = decomposition.getRootMatrix();

        this.generator = generator;
        normalized = new double[decomposition.getRank()];

    }

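This constructor takes the mean vector, the covariance matrix, a double threshold (small, the diagonal-element threshold below which a column is treated as dependent on the previous ones and discarded), and a NormalizedRandomGenerator instance. Its use is shown in the usage example later in this section.
The second constructor is:

    /**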
     * Builds a null mean random correlated vector generator from its
     * covariance matrix.
     *
     * @param covariance Covariance matrix.
     * @param small Diagonal elements threshold under which  column are
     * considered to be dependent on previous ones and are discarded.
     * @param generator Underlying generator for uncorrelated normalized
     * components.
     * @throws org.apache.commons.math3.linear.NonPositiveDefiniteMatrixException
     * if the covariance matrix is not strictly positive definite.
     */
    public CorrelatedRandomVectorGenerator(RealMatrix covariance, double small,
                                           NormalizedRandomGenerator generator) {
        int order = covariance.getRowDimension();
        mean = new double[order];
        for (int i = 0; i < order; ++i) {
            mean[i] = 0;
        }

        final RectangularCholeskyDecomposition decomposition =
            new RectangularCholeskyDecomposition(covariance, small);
        root = decomposition.getRootMatrix();

        this.generator = generator;
        normalized = new double[decomposition.getRank()];

    }

To draw a random vector from the multivariate normal distribution, call the nextVector() method:

    /** Generate a correlated random vector.
     * @return a random vector as an array of double. The returned array
     * is created at each call, the caller can do what it wants with it.
     */
    public double[] nextVector() {

        // generate uncorrelated vector
        for (int i = 0; i < normalized.length; ++i) {
            normalized[i] = generator.nextNormalizedDouble();
        }

        // compute correlated vector
        double[] correlated = new double[mean.length];
        for (int i = 0; i < correlated.length; ++i) {
            correlated[i] = mean[i];
            for (int j = 0; j < root.getColumnDimension(); ++j) {
                correlated[i] += root.getEntry(i, j) * normalized[j];
            }
        }

        return correlated;

    }
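
The core of nextVector() is the formula correlated = mean + root · normalized, where root is a matrix with root · rootᵀ equal to the covariance. The sketch below reproduces that idea in plain Python. It is a simplification: it uses an ordinary Cholesky factorization and therefore assumes a strictly positive definite covariance, whereas the Java class uses RectangularCholeskyDecomposition. The function names and the 2×2 covariance matrix are made up for this illustration.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular L with L * L^T == cov (cov must be positive definite)."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L


def next_vector(mean, root, rng):
    """Mirror of nextVector(): correlated = mean + root * normalized."""
    normalized = [rng.gauss(0.0, 1.0) for _ in range(len(root[0]))]
    return [m + sum(r * z for r, z in zip(row, normalized))
            for m, row in zip(mean, root)]


mean = [1.0, 2.0]
cov = [[4.0, 2.0], [2.0, 3.0]]   # made-up positive definite covariance
root = cholesky(cov)
rng = random.Random(7)           # fixed seed for repeatability
print(next_vector(mean, root, rng))
```
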

Usage example

The following example shows how to use CorrelatedRandomVectorGenerator.

import org.apache.commons.math3.linear.MatrixUtils;
import org.apache.commons.math3.linear.RealMatrix;
import org.apache.commons.math3.random.CorrelatedRandomVectorGenerator;
import org.apache.commons.math3.random.GaussianRandomGenerator;
import org.apache.commons.math3.random.JDKRandomGenerator;
import org.apache.commons.math3.random.RandomGenerator;

public class MultivariateGaussianGeneratorTest2 {

	public static void main(String[] args) {
		RandomGenerator rg = new JDKRandomGenerator();
		rg.setSeed(17399225432L);  // random seed
		GaussianRandomGenerator rawGenerator = new GaussianRandomGenerator(rg);
		double[] mean = {1, 2, 5};
		double[][] arrA = {{1, 2, 3}, {3, 4, 5}, {4, 5, 6}};
		RealMatrix matrixA = MatrixUtils.createRealMatrix(arrA);
		// build the covariance matrix as A * A^T
		RealMatrix covariance = matrixA.multiply(matrixA.transpose());
		System.out.println(covariance);
		// construct the generator; the threshold "small" is scaled by the matrix norm
		CorrelatedRandomVectorGenerator generator =
		    new CorrelatedRandomVectorGenerator(mean, covariance, 1.0e-12 * covariance.getNorm(), rawGenerator);
		// draw one random vector and print its components
		double[] randomVector = generator.nextVector();
		for (double d : randomVector) {
			System.out.println(d);
		}
	}
}

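As the program shows, it sets the mean vector, builds the covariance matrix by multiplying a matrix by its transpose (A·Aᵀ), and uses both as input to draw one random vector from the multivariate normal distribution.
The output is as follows (the first line is the covariance matrix):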

Array2DRowRealMatrix{{14.0,26.0,32.0},{26.0,50.0,62.0},{32.0,62.0,77.0}}
8.241122654913196
15.481679575594983
21.601958035935876
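
One detail of this example is worth checking by hand: the matrix A has linearly dependent rows, so the covariance A·Aᵀ is singular. That appears to be the situation the small threshold is meant for. The sketch below, in plain Python with helper names made up for this illustration, recomputes the covariance, confirms that its determinant is 0, and checks that the printed sample lies in the subspace spanned by the columns of A.

```python
def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


A = [[1, 2, 3], [3, 4, 5], [4, 5, 6]]
covariance = matmul(A, transpose(A))
print(covariance)        # [[14, 26, 32], [26, 50, 62], [32, 62, 77]]
print(det3(covariance))  # 0, so the covariance matrix is singular

# row1 - 3*row2 + 2*row3 of A is zero, so w is orthogonal to every column
# of A and w . (x - mean) must vanish for any generated vector x.
w = [1, -3, 2]
mean = [1, 2, 5]
x = [8.241122654913196, 15.481679575594983, 21.601958035935876]
residual = sum(wi * (xi - mi) for wi, xi, mi in zip(w, x, mean))
print(abs(residual) < 1e-9)  # True
```
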

GaussianRandomGenerator

This class was already used in the code above. It draws a single value from the standard normal distribution. Its constructor is:

    /** Create a new generator.
     * @param generator underlying random generator to use
     */
    public GaussianRandomGenerator(final RandomGenerator generator) {
        this.generator = generator;
    }

The method that generates one random number is:

    /** Generate a random scalar with null mean and unit standard deviation.
     * @return a random scalar with null mean and unit standard deviation
     */
    public double nextNormalizedDouble() {
        return generator.nextGaussian();
    }
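
Since nextNormalizedDouble() returns a draw with mean 0 and standard deviation 1, a normal variable with any other mean and standard deviation can be obtained by shifting and scaling. The sketch below illustrates this in Python; the function next_gaussian and the values mu = 10, sigma = 2 are assumptions made for this example and are not part of Math3.

```python
import random

rng = random.Random(3)  # fixed seed for repeatability


def next_normalized_double():
    """Stand-in for nextNormalizedDouble(): one draw from N(0, 1)."""
    return rng.gauss(0.0, 1.0)


def next_gaussian(mu, sigma):
    """Shift and scale a standard normal draw to get N(mu, sigma^2)."""
    return mu + sigma * next_normalized_double()


n = 50000
draws = [next_gaussian(10.0, 2.0) for _ in range(n)]
mean = sum(draws) / n
std = (sum((x - mean) ** 2 for x in draws) / (n - 1)) ** 0.5
print("mean ~", round(mean, 2), "std ~", round(std, 2))
```
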

Usage example

The following example shows how to use it:

import org.apache.commons.math3.random.GaussianRandomGenerator;
import org.apache.commons.math3.random.JDKRandomGenerator;
import org.apache.commons.math3.random.RandomGenerator;

public class GaussianRandomTest {

	public static void main(String[] args) {
		RandomGenerator rg = new JDKRandomGenerator();
		// rg.setSeed(17399225432L);  // uncomment to fix the seed and get reproducible output
		GaussianRandomGenerator rawGenerator = new GaussianRandomGenerator(rg);
		for (int i = 0; i < 10; i++) {
			double g = rawGenerator.nextNormalizedDouble();
			System.out.println(g);
		}
	}
}

Running the program prints 10 random numbers to the console.

HaltonSequenceGenerator

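Halton sequences are commonly used in Monte Carlo estimation. They are deterministic low-discrepancy (quasi-random) sequences rather than truly random numbers: each dimension uses a prime as its base, such as 2 or 3, and the interval between 0 and 1 is subdivided more and more finely in that base. For example, base 2 and base 3 give:

1/2, 1/4, 3/4, 1/8, 5/8, 3/8, 7/8, 1/16, 9/16, ...
1/3, 2/3, 1/9, 4/9, 7/9, 2/9, 5/9, 8/9, 1/27, ...

In Math3, the HaltonSequenceGenerator class has the following constructors: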

    /**
     * Construct a new Halton sequence generator for the given space dimension.
     *
     * @param dimension the space dimension
     * @throws OutOfRangeException if the space dimension is outside the allowed range of [1, 40]
     */
    public HaltonSequenceGenerator(final int dimension) throws OutOfRangeException {
        this(dimension, PRIMES, WEIGHTS);
    }

That is, it sets the dimension of the generated points.
The second constructor is:

    /**
     * Construct a new Halton sequence generator with the given base numbers and weights for each dimension.
     * The length of the bases array defines the space dimension and is required to be &gt; 0.
     *
     * @param dimension the space dimension
     * @param bases the base number for each dimension, entries should be (pairwise) prime, may not be null
     * @param weights the weights used during scrambling, may be null in which case no scrambling will be performed
     * @throws NullArgumentException if base is null
     * @throws OutOfRangeException if the space dimension is outside the range [1, len], where
     *   len refers to the length of the bases array
     * @throws DimensionMismatchException if weights is non-null and the length of the input arrays differ
     */
    public HaltonSequenceGenerator(final int dimension, final int[] bases, final int[] weights)
            throws NullArgumentException, OutOfRangeException, DimensionMismatchException {

        MathUtils.checkNotNull(bases);

        if (dimension < 1 || dimension > bases.length) {
            throw new OutOfRangeException(dimension, 1, PRIMES.length);
        }

        if (weights != null && weights.length != bases.length) {
            throw new DimensionMismatchException(weights.length, bases.length);
        }

        this.dimension = dimension;
        this.base = bases.clone();
        this.weight = weights == null ? null : weights.clone();
        count = 0;
    }

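That is, it takes the dimension of the points to generate, the bases to use (an array of primes), and the weights.
The default bases defined in the class are: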

    /** The first 40 primes. */
    private static final int[] PRIMES = new int[] {
        2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67,
        71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139,
        149, 151, 157, 163, 167, 173
    };

The default weights are:

    /** The optimal weights used for scrambling of the first 40 dimension. */
    private static final int[] WEIGHTS = new int[] {
        1, 2, 3, 3, 8, 11, 12, 14, 7, 18, 12, 13, 17, 18, 29, 14, 18, 43, 41,
        44, 40, 30, 47, 65, 71, 28, 40, 60, 79, 89, 56, 50, 52, 61, 108, 56,
        66, 63, 60, 66
    };

To generate points, use the following two methods of the class:

    /** {@inheritDoc} */
    public double[] nextVector() {
        final double[] v = new double[dimension];
        for (int i = 0; i < dimension; i++) {
            int index = count;
            double f = 1.0 / base[i];

            int j = 0;
            while (index > 0) {
                final int digit = scramble(i, j, base[i], index % base[i]);
                v[i] += f * digit;
                index /= base[i]; // floor( index / base )
                f /= base[i];
            }
        }
        count++;
        return v;
    }

    /**
     * Skip to the i-th point in the Halton sequence.
     * <p>
     * This operation can be performed in O(1).
     *
     * @param index the index in the sequence to skip to
     * @return the i-th point in the Halton sequence
     * @throws NotPositiveException if index &lt; 0
     */
    public double[] skipTo(final int index) throws NotPositiveException {
        count = index;
        return nextVector();
    }
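
The digit loop inside nextVector() is the classic radical-inverse construction: write the index in the given base and mirror its digits around the radix point. The Python sketch below implements that loop without scrambling (the equivalent of passing null weights) and uses exact fractions so the result can be compared with the base-2 and base-3 sequences listed earlier. The function name radical_inverse is chosen for this illustration.

```python
from fractions import Fraction


def radical_inverse(index, base):
    """Reflect the base-`base` digits of `index` around the radix point."""
    result = Fraction(0)
    f = Fraction(1, base)
    while index > 0:
        result += f * (index % base)  # next digit, least significant first
        index //= base                # floor(index / base)
        f /= base
    return result


for base in (2, 3):
    print(base, [str(radical_inverse(i, base)) for i in range(1, 10)])
```

Index 0 maps to 0 in every base, which is why the first vector of an unscrambled Halton sequence is the origin.
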

Usage example

The following example shows how to use HaltonSequenceGenerator.

import org.apache.commons.math3.random.HaltonSequenceGenerator;
public class HaltonSequenceTest {

	public static void main(String[] args) {
		/***** First approach: default bases and weights *****/
		HaltonSequenceGenerator randomVectorGenerator = new HaltonSequenceGenerator(3);
		// jump to index 999999 (skipTo also returns the point at that index)
		randomVectorGenerator.skipTo(999999);
		// fetch the next point in the sequence
		double[] b = randomVectorGenerator.nextVector();
		for (int i = 0; i < b.length; i++) {
			System.out.println(b[i]);
		}
		/***** Second approach: custom bases, no scrambling weights *****/
		System.out.println("....... points from the second constructor .........");
		HaltonSequenceGenerator randomVectorGenerator1 = new HaltonSequenceGenerator(4, new int[] { 3, 5, 7, 11, 13 }, null);
		// jump to index 999999
		randomVectorGenerator1.skipTo(999999);
		// fetch the next point in the sequence
		double[] b1 = randomVectorGenerator1.nextVector();
		for (int i = 0; i < b1.length; i++) {
			System.out.println(b1[i]);
		}
	}
}

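Running the program prints the 3 components of the first point, followed by the 4 components of the second.

Changing the index passed to skipTo() selects a different point of the sequence, and therefore different output values.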
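
To show why such sequences are useful in Monte Carlo estimation, the sketch below estimates π by counting how many 2-D points fall inside the quarter unit circle, once with unscrambled Halton points (bases 2 and 3) and once with pseudo-random points. It is a plain-Python illustration with made-up function names, not the Math3 implementation. Low-discrepancy points often give a smaller error at the same sample size, although a single run does not prove it.

```python
import math
import random


def radical_inverse(index, base):
    """Unscrambled Halton coordinate for the given index and base."""
    result, f = 0.0, 1.0 / base
    while index > 0:
        result += f * (index % base)
        index //= base
        f /= base
    return result


def halton_point(index, bases=(2, 3)):
    return [radical_inverse(index, b) for b in bases]


def estimate_pi(points):
    """4 * (fraction of points inside the quarter unit circle)."""
    inside = sum(1 for x, y in points if x * x + y * y <= 1.0)
    return 4.0 * inside / len(points)


n = 4000
halton_points = [halton_point(i) for i in range(1, n + 1)]
rng = random.Random(0)  # fixed seed for repeatability
random_points = [[rng.random(), rng.random()] for _ in range(n)]

pi_halton = estimate_pi(halton_points)
pi_random = estimate_pi(random_points)
print("Halton:", pi_halton, "error", abs(pi_halton - math.pi))
print("random:", pi_random, "error", abs(pi_random - math.pi))
```
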

JDKRandomGenerator

The JDKRandomGenerator class extends java.util.Random, so it is simple to use. Its main constructors are:

    /**
     * Create a new JDKRandomGenerator with a default seed.
     */
    public JDKRandomGenerator() {
        super();
    }

    /**
     * Create a new JDKRandomGenerator with the given seed.
     *
     * @param seed initial seed
     * @since 3.6
     */
    public JDKRandomGenerator(int seed) {
        setSeed(seed);
    }

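The class also provides two additional methods for setting the random seed, setSeed(int) and setSeed(int[]).

It can call the following methods of the Random class: next, nextBoolean, nextBytes, nextDouble, nextFloat, nextGaussian, nextInt (two overloads), nextLong, and setSeed.

Usage example

The following is a usage example: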

import org.apache.commons.math3.random.JDKRandomGenerator;
import org.apache.commons.math3.random.RandomGenerator;

public class JDKRandomTest {

	public static void main(String[] args) {
		RandomGenerator rg = new JDKRandomGenerator();
		for (int i = 0; i < 2; i++) {
			System.out.println("double:" + rg.nextDouble());
			System.out.println("boolean:" + rg.nextBoolean());
			System.out.println("float:"  + rg.nextFloat());
			System.out.println("gaussian:" + rg.nextGaussian());
			System.out.println("int:" + rg.nextInt());
			System.out.println("long:" + rg.nextLong());
		}
		
	}
}

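Running the program prints two rounds of values, one line per type in each round.

SobolSequenceGenerator

The SobolSequenceGenerator class has two constructors; the first one is the one normally used:

    /**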
     * Construct a new Sobol sequence generator for the given space dimension.
     *
     * @param dimension the space dimension
     * @throws OutOfRangeException if the space dimension is outside the allowed range of [1, 1000]
     */
    public SobolSequenceGenerator(final int dimension) throws OutOfRangeException {
        if (dimension < 1 || dimension > MAX_DIMENSION) {
            throw new OutOfRangeException(dimension, 1, MAX_DIMENSION);
        }

        // initialize the other dimensions with direction numbers from a resource
        final InputStream is = getClass().getResourceAsStream(RESOURCE_NAME);
        if (is == null) {
            throw new MathInternalError();
        }

        this.dimension = dimension;

        // init data structures
        direction = new long[dimension][BITS + 1];
        x = new long[dimension];

        try {
            initFromStream(is);
        } catch (IOException e) {
            // the internal resource file could not be read -> should not happen
            throw new MathInternalError();
        } catch (MathParseException e) {
            // the internal resource file could not be parsed -> should not happen
            throw new MathInternalError();
        } finally {
            try {
                is.close();
            } catch (IOException e) { // NOPMD
                // ignore
            }
        }
    }

    /**
     * Construct a new Sobol sequence generator for the given space dimension with
     * direction vectors loaded from the given stream.
     * <p>
     * The expected format is identical to the files available from
     * <a href="http://web.maths.unsw.edu.au/~fkuo/sobol/">Stephen Joe and Frances Kuo</a>.
     * The first line will be ignored as it is assumed to contain only the column headers.
     * The columns are:
     * <ul>
     *  <li>d: the dimension</li>
     *  <li>s: the degree of the primitive polynomial</li>
     *  <li>a: the number representing the coefficients</li>
     *  <li>m: the list of initial direction numbers</li>
     * </ul>
     * Example:
     * <pre>
     * d       s       a       m_i
     * 2       1       0       1
     * 3       2       1       1 3
     * </pre>
     * <p>
     * The input stream <i>must</i> be an ASCII text containing one valid direction vector per line.
     *
     * @param dimension the space dimension
     * @param is the stream to read the direction vectors from
     * @throws NotStrictlyPositiveException if the space dimension is &lt; 1
     * @throws OutOfRangeException if the space dimension is outside the range [1, max], where
     *   max refers to the maximum dimension found in the input stream
     * @throws MathParseException if the content in the stream could not be parsed successfully
     * @throws IOException if an error occurs while reading from the input stream
     */
    public SobolSequenceGenerator(final int dimension, final InputStream is)
            throws NotStrictlyPositiveException, MathParseException, IOException {

        if (dimension < 1) {
            throw new NotStrictlyPositiveException(dimension);
        }

        this.dimension = dimension;

        // init data structures
        direction = new long[dimension][BITS + 1];
        x = new long[dimension];

        // initialize the other dimensions with direction numbers from the stream
        int lastDimension = initFromStream(is);
        if (lastDimension < dimension) {
            throw new OutOfRangeException(dimension, 1, lastDimension);
        }
    }

Figure: points generated with this generator.

Usage example

The following is a usage example:

import java.util.ArrayList;
import java.util.List;
import org.apache.commons.math3.linear.Array2DRowRealMatrix;
import org.apache.commons.math3.linear.RealMatrix;
import org.apache.commons.math3.random.SobolSequenceGenerator;

public class SobolSequenceTest {

	public static void main(String[] args) {
		// generate a single vector (test case)
		SobolSequenceGenerator generator = new SobolSequenceGenerator(5);
		// skip ahead in the sequence; the very first point (index 0) is the all-zero vector
		generator.skipTo(999999);
		System.out.println("..............................");
		double[] vector = generator.nextVector();
		for (int i = 0; i < vector.length; i++) {
			System.out.println(vector[i]);
		}
		System.out.println("........... one vector from SobolSequenceGenerator .......");
		// generate several vectors and store each one as a column matrix
		System.out.println("............. building matrices of points ...............");
		List<RealMatrix> points = new ArrayList<RealMatrix>();
		for (int i = 0; i < 3; i++) {
			double[] vector1 = generator.nextVector();
			RealMatrix pointMatrix = new Array2DRowRealMatrix(vector1);
			points.add(pointMatrix);
		}
		for (int i = 0; i < points.size(); i++) {
			System.out.println(points.get(i));
		}
	}
}

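Running the program prints one 5-dimensional vector, followed by three more vectors printed as column matrices.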

UniformRandomGenerator

This class draws random numbers from a uniform distribution. UniformRandomGenerator implements the NormalizedRandomGenerator interface. Its constructor is:

    /** Create a new generator.
     * @param generator underlying random generator to use
     */
    public UniformRandomGenerator(RandomGenerator generator) {
        this.generator = generator;
    }

Part of its source code is shown below:

    /** Generate a random scalar with null mean and unit standard deviation.
     * <p>The number generated is uniformly distributed between -&sqrt;(3)
     * and +&sqrt;(3).</p>
     * @return a random scalar with null mean and unit standard deviation
     */
    public double nextNormalizedDouble() {
        return SQRT3 * (2 * generator.nextDouble() - 1.0);
    }

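The source shows that nextNormalizedDouble() returns values in [-√3, +√3): SQRT3 is set to √3, and generator.nextDouble() returns values in [0, 1). Because a uniform distribution on [-a, a] has variance a²/3, choosing a = √3 gives zero mean and unit variance.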
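
The bound √3 is what gives the uniform distribution unit variance: a uniform variable on [-a, a] has variance a²/3, which equals 1 when a = √3. The Python sketch below applies the same formula as the Java source and checks the sample mean and variance empirically. The seed and sample size are arbitrary choices for this illustration.

```python
import math
import random

SQRT3 = math.sqrt(3.0)
rng = random.Random(10)  # fixed seed for repeatability


def next_normalized_double():
    """Same formula as the Java source: SQRT3 * (2 * u - 1), u in [0, 1)."""
    return SQRT3 * (2.0 * rng.random() - 1.0)


n = 100000
samples = [next_normalized_double() for _ in range(n)]
mean = sum(samples) / n
variance = sum((x - mean) ** 2 for x in samples) / (n - 1)
print("mean ~", round(mean, 3), "variance ~", round(variance, 3))
```
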

Usage example

import org.apache.commons.math3.random.JDKRandomGenerator;
import org.apache.commons.math3.random.RandomGenerator;
import org.apache.commons.math3.random.UniformRandomGenerator;

public class UniformRandomTest {

	public static void main(String[] args) {
		RandomGenerator rg = new JDKRandomGenerator();
		rg.setSeed(10);
		UniformRandomGenerator generator = new UniformRandomGenerator(rg);
		double[] sample = new double[10];
		for (int i = 0; i < sample.length; ++i) {
			sample[i] = generator.nextNormalizedDouble();
			System.out.println(sample[i]);
		}
	}
}
