I am grabbing data from the Google Sheets API and running a KS test on it. However, I only want to run the KS test on a number, but the string consists of words as
Given strings coming from the Google Sheets API, run kstest on the last number of each string.
A better way would be to get the numbers straight from the Google Sheets API, store them, and feed them to stats.kstest.
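As a rough sketch of that idea, suppose the sheet keeps each field in its own column, so the number can be read from its column without any string splitting. The `response` dict below is a hypothetical stand-in shaped like the JSON returned by the Sheets API `spreadsheets.values.get` method (a `values` list of rows); the range name and the helper `column_as_floats` are assumptions made for illustration only.

```python
# Hypothetical response, shaped like the JSON from spreadsheets.values.get.
# The range and the column layout are assumptions, not taken from the question.
response = {
    "range": "Sheet1!A1:E3",
    "majorDimension": "ROWS",
    "values": [
        ["2020-09-15 00:05:13", "chemsense", "co", "concentration", "-0.51058"],
        ["2020-09-15 00:05:43", "chemsense", "co", "concentration", "-0.75889"],
        ["2020-09-15 00:06:09", "chemsense", "co", "concentration", "-1.23385"],
    ],
}


def column_as_floats(response, index):
    """Return column `index` of every row as a float, skipping short rows."""
    numbers = []
    for row in response.get("values", []):
        if len(row) > index:
            numbers.append(float(row[index]))
    return numbers


x = column_as_floats(response, 4)
print(x)
```

The resulting list `x` could then be passed to stats.kstest directly. If each row actually arrives as one comma-separated string in a single cell, this does not apply and the string has to be split instead.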
You can split the string using str.split, then convert the resulting item to float.
>>> s = '2020-09-15 00:05:43,chemsense,co,concentration,-0.75889,'
>>> s.split(',')
['2020-09-15 00:05:43', 'chemsense', 'co', 'concentration', '-0.75889', '']
>>> s.split(',')[4] # get the number (5th item in the list)
'-0.75889'
>>> float(s.split(',')[4]) # convert to float type
-0.75889
>>> round(float(s.split(',')[4]), 2) # round to 2 decimal places
-0.76
from scipy import stats
# Assuming strings coming back from API are in a list
rows = [  # named rows rather than str, to avoid shadowing the built-in str type
    '2020-09-15 00:05:13,chemsense,co,concentration,-0.51058,',
    '2020-09-15 00:05:43,chemsense,co,concentration,-0.75889,',
    '2020-09-15 00:06:09,chemsense,co,concentration,-1.23385,',
    '2020-09-15 00:06:33,chemsense,co,concentration,-1.23191,',
    '2020-09-15 00:06:58,chemsense,co,concentration,-0.94495,',
    '2020-09-15 00:07:23,chemsense,co,concentration,-1.16024,'
]
x = []
for s in rows:
    # take the 5th comma-separated field and convert it to float
    x.append(float(s.split(',')[4]))
# compare the sample against the standard normal distribution
print(stats.kstest(x, 'norm'))
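Indexing with [4] relies on the number always being the 5th field. If the field count can vary, a slightly more defensive variant is to take the last field that parses as a number, which also skips the empty field left by the trailing comma. This is only a sketch; the helper name `last_number` is made up for illustration.

```python
def last_number(line):
    """Return the last comma-separated field of `line` that parses as a float."""
    for field in reversed(line.split(',')):
        field = field.strip()
        if not field:
            continue  # skip the empty field left by a trailing comma
        try:
            return float(field)
        except ValueError:
            continue  # not numeric (e.g. 'concentration'), keep looking
    raise ValueError(f"no numeric field in {line!r}")


lines = [
    '2020-09-15 00:05:13,chemsense,co,concentration,-0.51058,',
    '2020-09-15 00:05:43,chemsense,co,concentration,-0.75889,',
]
x = [last_number(s) for s in lines]
print(x)
```

The list `x` can then be handed to stats.kstest(x, 'norm') exactly as in the script above.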