Sampling in Python according to a discrete probability distribution

Keywords: Python

1, Probability list + sample list

Task description: we often have a probability list and a sample list, where the probability list gives the probability that each sample is selected and its entries sum to 1. For example, take [0.7, 0.2, 0.1] and ['iron man', 'Captain America', 'Thor']. The elements of the two lists correspond one to one, so together they state that 'iron man' is selected with probability 0.7, 'Captain America' with probability 0.2, and 'Thor' with probability 0.1. Our goal is to sample from ['iron man', 'Captain America', 'Thor'] according to a discrete probability distribution such as [0.7, 0.2, 0.1], drawing one sample at a time (multiple samples can of course be drawn as well).

In fact, Python makes this task fairly simple: random.choices from the standard library accepts a weights argument. See the code below for details.

code

import random

# input: probability distribution and the corresponding samples
list_probability = [0.005, 0.015, 0.08, 0.25, 0.3, 0.25, 0.08, 0.015, 0.005]
list_player_role = ['Black widow', 'Spider-Man', 'The Incredible Hulk', 'Thor', 'Iron Man', 'Dr. strange', 'Captain America', 'panther', 'Eagle eye']
# sampling: draw one element, weighted by list_probability
result = random.choices(list_player_role, weights=list_probability, k=1)[0]
# output: one sample drawn according to the probability distribution
print(result)

# check whether the sampling follows the probability distribution
frequency = [0] * len(list_player_role)
trying_times = 100000
for _ in range(trying_times):
    result = random.choices(list_player_role, weights=list_probability, k=1)[0]
    # count the draw; list.index raises ValueError if the result is not a known role
    frequency[list_player_role.index(result)] += 1
# compare each role's specified probability with its observed frequency
for i in range(len(frequency)):
    print('Role:%s\t probability: %.3f\t frequency: %d/%d=%.4f' % (list_player_role[i], list_probability[i], frequency[i], trying_times, frequency[i]/trying_times))

output

Iron Man
Role:Black widow	 probability: 0.005	 frequency: 489/100000=0.0049
Role:Spider-Man	 probability: 0.015	 frequency: 1558/100000=0.0156
Role:The Incredible Hulk	 probability: 0.080	 frequency: 8011/100000=0.0801
Role:Thor	 probability: 0.250	 frequency: 25094/100000=0.2509
Role:Iron Man	 probability: 0.300	 frequency: 29957/100000=0.2996
Role:Dr. strange	 probability: 0.250	 frequency: 24958/100000=0.2496
Role:Captain America	 probability: 0.080	 frequency: 7867/100000=0.0787
Role:panther	 probability: 0.015	 frequency: 1551/100000=0.0155
Role:Eagle eye	 probability: 0.005	 frequency: 515/100000=0.0052

Each frequency in the output is close to its corresponding probability, which shows that the sampling process does follow the probability distribution we specified.
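Because random.choices takes a k argument, the same check can also be done with a single call and tallied with collections.Counter. The following is only a minimal sketch of that idea, not the code that produced the output above; the function name empirical_frequencies and its seed parameter are assumptions made for illustration.

```python
import random
from collections import Counter


def empirical_frequencies(samples, probabilities, trials, seed=None):
    """Draw `trials` samples in one call and return each sample's relative frequency."""
    # A private generator, so that seeding does not touch the global random state
    rng = random.Random(seed)
    draws = rng.choices(samples, weights=probabilities, k=trials)
    counts = Counter(draws)
    # Counter yields 0 for samples that were never drawn
    return {sample: counts[sample] / trials for sample in samples}


roles = ['iron man', 'Captain America', 'Thor']
probs = [0.7, 0.2, 0.1]
for role, freq in empirical_frequencies(roles, probs, trials=100000).items():
    print('Role:%s\t frequency: %.4f' % (role, freq))
```

Passing a fixed seed makes a run repeatable, which can help when comparing results between machines.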

2, Probability list only

Task description: no sample list is given, only a probability list, and sampling returns an index into the probability list. For example, given [0.7, 0.2, 0.1], an output of 0 means the entry with probability 0.7 was drawn, an output of 1 means the entry with probability 0.2 was drawn, and an output of 2 means the entry with probability 0.1 was drawn.

code

import random

# input: probability distribution
list_probability = [0.005, 0.015, 0.08, 0.25, 0.3, 0.25, 0.08, 0.015, 0.005]

# sampling: the candidates are the indices 0 .. len(list_probability) - 1
index = list(range(len(list_probability)))
probability_index = random.choices(index, weights=list_probability, k=1)[0]

# output: one index drawn according to the probability distribution
print(probability_index)

output

The sampling above was tested only on plain Python lists. Open source libraries such as NumPy and PyTorch provide corresponding functions as well, for example numpy.random.choice (with its p argument) and torch.multinomial.
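To see how this kind of weighted index sampling works underneath, the index can also be drawn by hand with a running sum and a binary search. This is a minimal sketch of that general technique using only the standard library, not code from the original post; the function name sample_index and its rng parameter are hypothetical.

```python
import bisect
import itertools
import random


def sample_index(probabilities, rng=random):
    """Return an index i with probability proportional to probabilities[i]."""
    # Running totals, e.g. [0.7, 0.2, 0.1] -> [0.7, 0.9, 1.0] (up to float rounding)
    cumulative = list(itertools.accumulate(probabilities))
    total = cumulative[-1]
    # A uniform point in [0, total); scaling by total tolerates sums slightly off 1
    point = rng.random() * total
    # Position of the first running total that is strictly greater than the point
    index = bisect.bisect_right(cumulative, point)
    # Guard against a float rounding edge case that could land one past the end
    return min(index, len(probabilities) - 1)


list_probability = [0.7, 0.2, 0.1]
print(sample_index(list_probability))
```

Rebuilding the running totals on every call keeps the sketch short; when many indices are drawn from the same distribution, it would be cheaper to compute them once and reuse them.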

3, Reference

Choose element(s) from List with different probability in Python

Posted by project168 on Tue, 26 Oct 2021 06:27:52 -0700

Programmer Group
