Question
I have a few .csv documents, each with 20,000 records or more.
Basically they're simple, something like this:
numer,produkt,date
202,produkt A its sad,20.04.2019
203,produkt A its sad,21.04.2019
204,produkt A its sad,22.04.2019
etc.
I want to print this info:
A "produkt A its sad" appears 6 times
A "produkt B" appears 3 times
A "produkt C" appears 2 times
Based on another answer on Stack Overflow, I wrote:
import csv
from collections import Counter
with open ('base2.csv', encoding="utf8") as csv_file:
    csv_reader = csv.reader(csv_file)
    produkt = [row[0] for row in csv_file]
for (k,v) in Counter(produkt).items():
    print ("A %s appears %d times" % (k, v))
I'm a newbie at Python, so it's probably something stupid :)
The output is:
A n appears 1 times
A 2 appears 11 times
Answer 1:
Your issue is that when you use a list comprehension to build the list of products, you are reading from the file, not from the CSV reader object.
produkt = [row[0] for row in csv_file]
This says: read each line of the file, store each line in turn in the variable row, and take the first character (index 0) of the string that row holds.
Instead, assuming you want produkt, which is the second field (index 1), you should change this line to:
produkt = [row[1] for row in csv_reader]
That would also pick up the header line, though. Since your file has headers, I would use csv.DictReader and select the column you're interested in by name, like this:
csv_reader = csv.DictReader(csv_file)
produkts = [row['produkt'] for row in csv_reader]
for (k, v) in Counter(produkts).items():
    print("A %s appears %d times" % (k, v))
That way it's clear which column you're counting, without having to rely on a numeric index.
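To make the difference concrete, here is a minimal sketch that runs both versions on a small in-memory sample. The SAMPLE text and the helper names first_chars and count_products are made up for illustration; io.StringIO just stands in for the opened base2.csv.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for base2.csv (same columns as in the question).
SAMPLE = (
    "numer,produkt,date\n"
    "202,produkt A its sad,20.04.2019\n"
    "203,produkt A its sad,21.04.2019\n"
    "204,produkt B,22.04.2019\n"
)


def first_chars(text):
    """Reproduce the bug: iterating a file object yields whole lines (str),
    so row[0] is only the first character of each line."""
    return [row[0] for row in io.StringIO(text)]


def count_products(csv_file):
    """Count the values in the 'produkt' column.
    DictReader uses the first row as field names, so the header is not counted."""
    reader = csv.DictReader(csv_file)
    return Counter(row["produkt"] for row in reader)


print(first_chars(SAMPLE))  # ['n', '2', '2', '2']

counts = count_products(io.StringIO(SAMPLE))
for name, n in counts.items():
    print("A %s appears %d times" % (name, n))
```

The first print shows where the "n" and "2" in the question's output come from: they are the first characters of the header line and of the data lines.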
Answer 2:
In your produkt = [row[0] for row in csv_file], the variable row is a string, so row[0] is just its first character (index 0). I replaced it with row.split(",")[1] and got the intended result.
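One caveat with splitting by hand, shown as a small sketch: if a product name ever contains a comma, the CSV will quote that field, and a plain split(",") cuts it in two, while csv.reader handles the quoting. The row below is invented for illustration and is not from the question's data.

```python
import csv
import io

# Hypothetical row whose product name contains a comma, so it is quoted in the CSV.
line = '205,"produkt D, large",23.04.2019\n'

# Naive split: cuts the quoted field at its inner comma and keeps the quote.
naive = line.split(",")[1]

# csv.reader: understands the quoting and returns the whole field.
parsed = next(csv.reader(io.StringIO(line)))[1]

print(naive)   # "produkt D
print(parsed)  # produkt D, large
```

For data like the sample in the question (no commas inside fields), both approaches give the same product names.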
Answer 3:
I was reading from csv_file instead of csv_reader.
So produkt = [row[0] for row in csv_file] essentially says: read each line from the file, store it as row, then take the first character of that line.
I replaced csv_file with csv_reader and it works.
Thanks to @chrisdoyle
Answer 4:
You need to iterate over the csv_reader object, not csv_file.
import csv
from collections import Counter
with open("base2.csv", encoding="utf8") as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    next(csv_reader, None)  # skip the header row (numer,produkt,date)
    frequency = Counter([row[1] for row in csv_reader])
    # In the question, the line above iterated over csv_file;
    # it should iterate over csv_reader.
for k, v in frequency.items():
    print("{} appears {} times".format(k, v))
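If you also want the lines ordered from most to least frequent, as in the example output in the question, Counter.most_common() returns the (value, count) pairs sorted by count. A minimal sketch, with a hand-made products list standing in for the values read from column 1 of the CSV:

```python
from collections import Counter

# Hypothetical product values, as they might come out of column 1 of the CSV.
products = ["produkt A its sad"] * 6 + ["produkt B"] * 3 + ["produkt C"] * 2

frequency = Counter(products)

# most_common() with no argument returns every (value, count) pair,
# highest count first.
ranked = frequency.most_common()

for name, count in ranked:
    print('A "{}" appears {} times'.format(name, count))
```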
Source: https://stackoverflow.com/questions/61343769/how-to-check-frequency-in-csv-file-on-python