Not sure what the problem is here... All I want is the first and only element in this series:
>>> a
1 0-5fffd6b57084003b1b582ff1e56855a6!1-AB87696
Look at the following code:
import pandas as pd
import numpy as np
data1 = pd.Series(['a','b','c'],index=['1','3','5'])
data2 = pd.Series(['a','b','c'],index=[1,3,5])
print('keys data1: '+str(data1.keys()))
print('keys data2: '+str(data2.keys()))
# Newer pandas versions no longer provide Index.base; .values shows the same underlying array
print('base data1: '+str(data1.index.values))
print('base data2: '+str(data2.index.values))
print(data1['1':'3']) # Here we use the dictionary like slicing
print(data1[1:3]) # Here we use the integer like slicing
print(data2[1:3]) # Here we use the integer like slicing
keys data1: Index(['1', '3', '5'], dtype='object')
keys data2: Int64Index([1, 3, 5], dtype='int64')
base data1: ['1' '3' '5']
base data2: [1 3 5]
1 a
3 b
dtype: object
3 b
5 c
dtype: object
3 b
5 c
dtype: object
For data1, the dtype of the index is object; for data2 it is int64. In his Data Science Handbook, Jake VanderPlas writes: "a Series object acts in many ways like a one-dimensional NumPy array, and in many ways like a standard Python dictionary". Hence, if the index is of type "object", as in the case of data1, we have two different ways to access the values:
1. By dictionary-like (label) slicing/indexing:
data1['1':'3'] --> a,b
2. By integer-like (positional) slicing/indexing:
data1[1:3] --> b,c
If the index dtype is int64, as in the case of data2, pandas cannot tell from the slice alone whether we mean labels or positions, so it defaults to positional slicing. Consequently data2[1:3] gives b,c, just as data1[1:3] does. Note that this applies to slices only: a single integer key such as data2[1] is still looked up as a label and returns a.
VanderPlas also points out one critical thing to keep in mind here:
"Notice that when you are slicing with an explicit index (i.e., data['a':'c']), the final index is included in the slice, while when you’re slicing with an implicit index (i.e., data[0:2]), the final index is excluded from the slice.[...] These slicing and indexing conventions can be a source of confusion."
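The difference in the quoted rule can be seen in a minimal sketch. The series `data` and its values are made up for illustration and are not taken from the handbook or from the code above:

```python
import pandas as pd

# Hypothetical example data with string labels (an assumption for illustration).
data = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])

# Explicit (label) slice: the stop label 'c' is part of the result.
explicit = data['a':'c']

# Implicit (positional) slice: the stop position 2 is not part of the result.
implicit = data[0:2]

print(list(explicit.index))
print(list(implicit.index))
```

The label slice keeps three elements (a, b, c), while the positional slice keeps only two (a, b).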
To avoid this confusion, you can use loc for label-based slicing/indexing and iloc for position-based slicing/indexing, like:
import pandas as pd
import numpy as np
data1 = pd.Series(['a','b','c'],index=['1','3','5'])
data2 = pd.Series(['a','b','c'],index=[1,3,5])
print('data1.iloc[0:2]: ',str(data1.iloc[0:2]),sep='\n',end='\n\n')
# print(data1.loc[1:3]) --> Throws an error because the index has no integer labels 1 or 3 (the labels are strings)
print('data1.loc["1":"3"]: ',str(data1.loc['1':'3']),sep='\n',end='\n\n')
print('data2.iloc[0:2]: ',str(data2.iloc[0:2]),sep='\n',end='\n\n')
print('data2.loc[1:3]: ',str(data2.loc[1:3]),sep='\n',end='\n\n') #Note that contrary to usual python slices, both the start and the stop are included
data1.iloc[0:2]:
1 a
3 b
dtype: object
data1.loc["1":"3"]:
1 a
3 b
dtype: object
data2.iloc[0:2]:
1 a
3 b
dtype: object
data2.loc[1:3]:
1 a
3 b
dtype: object
So data2.loc[1:3] explicitly looks up the labels 1 and 3 in the index and returns everything from the first through the second, both included, while data2.iloc[0:2] returns the elements from position 0 up to, but not including, position 2.
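Applied to the original question, this could look like the sketch below. It assumes that the series `a` has a single element under the integer label 1, as the printed output suggests; the value used here is a made-up placeholder, not the real one:

```python
import pandas as pd

# Hypothetical stand-in for the one-element series `a` from the question:
# integer label 1, placeholder value.
a = pd.Series(['some-value'], index=[1])

# By position: works regardless of what the index labels are.
first_by_position = a.iloc[0]

# By label: works only if you know the label is 1.
first_by_label = a.loc[1]

print(first_by_position)
print(first_by_label)
```

With an integer index, a plain a[0] would be looked up as a label, and there is no label 0 in this sketch, so iloc[0] is the unambiguous way to ask for "the first element".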