Question
I have a dict called "mat_contents" of length 5. The data is stored under "traindata" and the corresponding labels under "trainlabels". I want to extract a given number of samples that have a given label value: for instance, 60 samples (out of 80) from "traindata" whose label in "trainlabels" equals 1. I have seen some similar examples on here, but they do not match my case.
Assume this example input:
traindata trainlabels
a 1
b 2
c 2
d 1
e 1
f 2
If I want to extract two random samples of traindata whose trainlabels value is 2, the result could be:
b
f
Answer 1:
import numpy as np
labels = [k for k, v in mat_contents.items() if v == 1]  # keys whose value (label) is 1
result = np.random.choice(labels, 2, replace=False)  # 2 random picks, no repeats
The first line collects the keys of your dictionary whose value is 1, and the second line picks a random subset of 2 of them without replacement. This assumes numpy is imported as np and that mat_contents maps each sample (key) to its label (value).
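If mat_contents instead holds two parallel arrays under "traindata" and "trainlabels", as the question describes, a minimal sketch could pick indices by label first and then index the data. The toy values below are the question's example input; the helper name sample_by_label, the seed, and the layout (samples along the first axis) are assumptions made for illustration.

```python
import numpy as np

# Toy stand-in for mat_contents built from the question's example input.
# Assumed layout: two parallel arrays, one sample per position along axis 0.
mat_contents = {
    "traindata": np.array(["a", "b", "c", "d", "e", "f"]),
    "trainlabels": np.array([1, 2, 2, 1, 1, 2]),
}


def sample_by_label(contents, label, n, seed=None):
    """Return n randomly chosen samples whose label equals `label` (hypothetical helper)."""
    data = np.asarray(contents["traindata"])
    # ravel() flattens labels stored as (N, 1) or (1, N), which is how
    # scipy.io.loadmat typically returns vectors (assuming the dict came from a .mat file)
    labels = np.asarray(contents["trainlabels"]).ravel()
    candidates = np.flatnonzero(labels == label)  # indices of matching samples
    rng = np.random.default_rng(seed)
    # replace=False gives distinct samples; raises ValueError if n > len(candidates)
    chosen = rng.choice(candidates, size=n, replace=False)
    return data[chosen]


picked = sample_by_label(mat_contents, label=2, n=2, seed=0)
print(picked)
```

For the case in the question, the call would be sample_by_label(mat_contents, label=1, n=60), provided at least 60 samples carry label 1.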
Answer 2:
Could you use a pandas DataFrame for this? See: Pandas DataFrame Sampling. This is an example that I have used in the past:
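import pandas as pd
keeping = 0.8  # fraction of rows to keep for each label
source = "/path/to/some/file"
# pd.DataFrame() cannot load a path; read_csv assumes a CSV file with a trainlabels column
df = pd.read_csv(source)
ones = df[df.trainlabels == 1].sample(frac=keeping)
twos = df[df.trainlabels == 2].sample(frac=keeping)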
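The snippet above keeps a fraction of the rows for each label. For a fixed number of rows, which is what the question asks for, sample also accepts n. A minimal sketch on the question's toy input follows; the DataFrame is built inline here rather than loaded from a file, and random_state is set only to make the draw repeatable.

```python
import pandas as pd

# The question's toy input as a DataFrame (column names taken from the question)
df = pd.DataFrame({
    "traindata": ["a", "b", "c", "d", "e", "f"],
    "trainlabels": [1, 2, 2, 1, 1, 2],
})

# Exactly 2 random rows with label 2, drawn without replacement
twos = df[df["trainlabels"] == 2].sample(n=2, random_state=0)

# The same count for every label at once (needs pandas 1.1 or newer);
# n must not exceed the size of the smallest group
per_label = df.groupby("trainlabels").sample(n=2, random_state=0)

print(twos["traindata"].tolist())
print(per_label)
```

Filtering first is the simplest route when only one label matters; the groupby form avoids repeating the filter for each label.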
Source: https://stackoverflow.com/questions/45219770/python-subtract-a-number-of-samples-from-a-given-in-a-dictionary-structure