Heads or tails?

  • Topic: random numbers, statistics

  • Task A:
    1. Make a guess: if you toss a coin 100 times, how long is the longest streak of heads or tails you will get?

    2. Implement a method to generate a random sequence of coin tosses.

    3. Generate sequences of 100 tosses and count the longest streak. Are the results close to what you guessed?

  • Task B:
    1. Implement a function to calculate the average, standard deviation and standard error of the mean for a sequence or a distribution of values.

    2. Generate many sequences of 100 coin tosses and calculate the average length of the longest streak. Also calculate the error estimate for the average.

    3. Study how the average and error estimates change as you create more sequences.

  • Template: coins.py


coins.py

coins.calculate_statistics(samples)

Calculates and returns the sample mean, variance, standard deviation and error of mean for the given set of samples.

For a sequence of \(N\) values for a random variable \(X\), the mean is estimated by the average

\[\mu_X = \langle X \rangle = \frac{1}{N} \sum_i X_i\]

and the variance by the sample variance

\[\sigma^2_X = \frac{1}{N-1} \sum_i (X_i - \mu_X)^2.\]

Standard deviation is the square root of variance

\[\sigma_X = \sqrt{ \sigma^2_X }\]

and the standard error of mean is

\[\Delta X = \frac{ \sigma_X }{ \sqrt{ N } }.\]

Note

This function is incomplete!

Parameters:

samples (list) – sequence of random results, \(X_i\)

Returns:

mean, variance, standard deviation, error of mean

Return type:

float, float, float, float
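
The formulas above translate almost line by line into code. The following is only a minimal sketch of one possible way to complete the function, not the template's reference solution. Raising a ValueError for fewer than two samples is an assumption made here, because the \(1/(N-1)\) factor is undefined for \(N = 1\).

```python
import math


def calculate_statistics(samples):
    """Return mean, sample variance, standard deviation and error of mean."""
    n = len(samples)
    if n < 2:
        # Assumption of this sketch: refuse input for which 1/(N-1) is undefined.
        raise ValueError("at least two samples are needed")

    # Mean: (1/N) * sum_i X_i
    mean = sum(samples) / n

    # Sample variance: (1/(N-1)) * sum_i (X_i - mean)^2
    variance = sum((x - mean) ** 2 for x in samples) / (n - 1)

    # Standard deviation and standard error of the mean
    std = math.sqrt(variance)
    error = std / math.sqrt(n)

    return mean, variance, std, error


# Example: four samples with mean 4 and sample variance 8/3
mean, variance, std, error = calculate_statistics([2, 4, 4, 6])
print(mean, variance, std, error)
```

For the four samples in the example, \(N = 4\), so the error of mean is exactly half the standard deviation.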

coins.calculate_statistics_for_distribution(distribution)

Calculates and returns the sample mean, variance, standard deviation and error of mean for the given discrete distribution.

The distribution \(f\) is a list or an array specifying the number of times each result 0, 1, 2, 3, … was measured. For instance, if 4 is measured 12 times, \(f(4) = 12\). The total number of measurements is \(N = \sum_k f(k)\).

For a distribution, the mean is estimated by

\[\mu_X = \langle X \rangle = \frac{1}{N} \sum_k k f(k)\]

and the variance by the sample variance

\[\sigma^2_X = \frac{1}{N-1} \sum_k f(k)(k - \mu_X)^2.\]

Standard deviation is the square root of variance

\[\sigma_X = \sqrt{ \sigma^2_X }\]

and the standard error of mean is

\[\Delta X = \frac{ \sigma_X }{ \sqrt{ N } }.\]

Note

This function is incomplete!

Parameters:

distribution (list) – vector containing the number of times each result was obtained

Returns:

mean, variance, standard deviation, error of mean

Return type:

float, float, float, float
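
The same statistics can be computed directly from the counts, with the list index playing the role of the measured value \(k\). This is a minimal sketch of one possible completion, assuming that \(N\) is the total number of measurements, \(\sum_k f(k)\). The ValueError for \(N < 2\) is an assumption of this sketch.

```python
import math


def calculate_statistics_for_distribution(distribution):
    """Return mean, sample variance, standard deviation and error of mean
    for a list of counts, where distribution[k] is how often k was measured."""
    # Total number of measurements, N = sum_k f(k)
    n = sum(distribution)
    if n < 2:
        # Assumption of this sketch: refuse input for which 1/(N-1) is undefined.
        raise ValueError("at least two measurements are needed")

    # Mean: (1/N) * sum_k k f(k)
    mean = sum(k * f for k, f in enumerate(distribution)) / n

    # Sample variance: (1/(N-1)) * sum_k f(k) (k - mean)^2
    variance = sum(f * (k - mean) ** 2 for k, f in enumerate(distribution)) / (n - 1)

    std = math.sqrt(variance)
    error = std / math.sqrt(n)

    return mean, variance, std, error


# The counts below encode the samples 2, 4, 4, 6:
# the value 2 once, the value 4 twice, the value 6 once.
counts = [0, 0, 1, 0, 2, 0, 1]
print(calculate_statistics_for_distribution(counts))
```

Working from counts avoids storing every individual result, which helps when the number of repeated experiments is large.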

coins.count_max_streak(sequence)

Calculates the length of the longest run of consecutive identical values in a sequence.

Parameters:

sequence (list) – a list of values

Returns:

the length of the longest run of identical values

Return type:

int
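
A single pass over the sequence is enough to find the longest streak: compare each value with the one before it, extend the current run on a match and restart it otherwise. The sketch below illustrates that idea and is not necessarily how the template implements it. Returning 0 for an empty sequence is an assumption made here.

```python
def count_max_streak(sequence):
    """Return the length of the longest run of consecutive identical values."""
    if len(sequence) == 0:
        # Assumption of this sketch: an empty sequence has no streak.
        return 0

    longest = 1  # longest run seen so far
    current = 1  # length of the run ending at the current position

    # Walk through neighbouring pairs (previous value, current value).
    for previous, value in zip(sequence, sequence[1:]):
        if value == previous:
            current += 1
            longest = max(longest, current)
        else:
            current = 1

    return longest


print(count_max_streak([1, 1, 0, 0, 0, 1]))  # the three zeros form the longest streak
```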

coins.get_random_sequence(n_toss=100)

Generate a random sequence of coin tosses.

The function returns a random list of ones (heads) and zeros (tails).

Note

This function is incomplete!

Parameters:

n_toss (int) – number of tosses

Returns:

random sequence of 0’s and 1’s

Return type:

list
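
One simple way to complete this function uses Python's standard random module. The following is a minimal sketch under that assumption. The template may intend a different generator, such as NumPy's.

```python
import random


def get_random_sequence(n_toss=100):
    """Return a list of n_toss fair coin tosses: 1 for heads, 0 for tails."""
    # random.randint(0, 1) returns 0 or 1, each with probability 1/2.
    return [random.randint(0, 1) for _ in range(n_toss)]


random.seed(1)  # fix the seed only to make the example reproducible
tosses = get_random_sequence(20)
print(tosses)
```

Seeding the generator is useful while debugging, but the seed should be left unset when collecting statistics over many sequences.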

coins.main(n_toss, n_repeats)

The main program.

Generates either one or several coin toss sequences and analyses the results.

Parameters:
  • n_toss (int) – the length of each sequence

  • n_repeats (int) – the number of sequences

coins.print_progress(step, total)

Prints a progress bar.

Parameters:
  • step (int) – progress counter

  • total (int) – counter at completion
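
The documentation does not specify what the progress bar looks like, so the following is only an illustration of how such a function might work. The helper format_progress, its width parameter and the bar layout are all hypothetical and not part of coins.py.

```python
import sys


def format_progress(step, total, width=40):
    """Hypothetical helper: build a text bar such as '[#####-----]  50.0%'."""
    fraction = step / total if total > 0 else 1.0
    fraction = min(max(fraction, 0.0), 1.0)  # clamp to the range [0, 1]
    filled = int(round(width * fraction))
    bar = "#" * filled + "-" * (width - filled)
    return f"[{bar}] {100 * fraction:5.1f}%"


def print_progress(step, total):
    """Redraw the progress bar on the current terminal line."""
    # The carriage return moves the cursor back to the start of the line.
    sys.stdout.write("\r" + format_progress(step, total))
    if step >= total:
        sys.stdout.write("\n")  # finish the line once the work is done
    sys.stdout.flush()


for step in range(1, 11):
    print_progress(step, 10)
```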

coins.run_series(n_toss, n_repeats)

Runs a series of coin toss sequences and calculates statistics for the longest streak of heads or tails.

The method flips a coin a given number of times using get_random_sequence() and then counts the length of the longest streak of heads or tails in the sequence with count_max_streak().

It then repeats this experiment a given number of times and records the length of the longest streak in each sequence.

Finally, calculate_statistics() or calculate_statistics_for_distribution() is used to calculate statistics for the length of the longest streak.

Parameters:
  • n_toss (int) – the length of each sequence

  • n_repeats (int) – the number of sequences

Returns:

distribution, mean, standard deviation, error of mean for the longest streak

Return type:

array, float, float, float
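
The workflow described for run_series() can be sketched end to end as follows. To keep the example self-contained, it includes compact stand-ins for the helper functions. These are assumptions for illustration and may differ from the actual code in coins.py. The sketch also returns the distribution as a plain list rather than an array.

```python
import math
import random


def get_random_sequence(n_toss=100):
    """Stand-in: list of fair coin tosses, 1 for heads and 0 for tails."""
    return [random.randint(0, 1) for _ in range(n_toss)]


def count_max_streak(sequence):
    """Stand-in: length of the longest run of consecutive identical values."""
    longest = current = 0
    previous = None
    for value in sequence:
        current = current + 1 if value == previous else 1
        longest = max(longest, current)
        previous = value
    return longest


def calculate_statistics(samples):
    """Stand-in: mean, sample variance, standard deviation, error of mean."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / (n - 1)
    std = math.sqrt(variance)
    return mean, variance, std, std / math.sqrt(n)


def run_series(n_toss, n_repeats):
    """Repeat the coin toss experiment and analyse the longest streaks."""
    # distribution[k] counts how many sequences had a longest streak of k.
    distribution = [0] * (n_toss + 1)
    streaks = []
    for _ in range(n_repeats):
        streak = count_max_streak(get_random_sequence(n_toss))
        streaks.append(streak)
        distribution[streak] += 1
    mean, _variance, std, error = calculate_statistics(streaks)
    return distribution, mean, std, error


# Task B, step 3: watch the error estimate as more sequences are generated.
random.seed(0)
for n_repeats in (10, 100, 1000):
    distribution, mean, std, error = run_series(100, n_repeats)
    print(f"{n_repeats:5d} sequences: {mean:.2f} +/- {error:.2f} (std {std:.2f})")
```

As more sequences are generated, the standard deviation should settle to a roughly constant value, because it describes the spread of individual results. The error of mean keeps shrinking in proportion to \(1/\sqrt{N}\), where \(N\) is the number of sequences.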