pandas series can't get index


not sure what the problem is here... all i want is the first and only element in this series

>>> a
1    0-5fffd6b57084003b1b582ff1e56855a6!1-AB87696         


        
2 Answers
  •  情歌与酒
    2020-12-21 10:11

    Look at the following code:

    import pandas as pd
    import numpy as np
    
    
    data1 = pd.Series(['a','b','c'],index=['1','3','5'])
    data2 = pd.Series(['a','b','c'],index=[1,3,5])
    
    print('keys data1: '+str(data1.keys()))
    print('keys data2: '+str(data2.keys()))
    
    
    # Index.base was removed in pandas 1.0; .values returns the same underlying array
    print('base data1: '+str(data1.index.values))
    print('base data2: '+str(data2.index.values))
    
    
    
    print(data1['1':'3']) # Dictionary-like (label) slicing: the end label '3' is included
    print(data1[1:3]) # Integer-like (positional) slicing: position 3 is excluded
    print(data2[1:3]) # Integer slice on an integer index: also positional
    
    keys data1: Index(['1', '3', '5'], dtype='object')
    keys data2: Int64Index([1, 3, 5], dtype='int64')
    base data1: ['1' '3' '5']
    base data2: [1 3 5]
    1    a
    3    b
    dtype: object
    3    b
    5    c
    dtype: object
    3    b
    5    c
    dtype: object
    

    For data1, the dtype of the index is object; for data2 it is int64. In his Python Data Science Handbook, Jake VanderPlas writes: "a Series object acts in many ways like a one-dimensional NumPy array, and in many ways like a standard Python dictionary". Hence, if the index is of dtype object, as in the case of data1, we have two different ways to access the values:
    1. By dictionary-like (label-based) slicing/indexing:

      data1['1':'3'] --> a,b

    2. By integer-like (positional) slicing/indexing:

      data1[1:3] --> b,c

    If the index dtype is int64, as in the case of data2, pandas cannot tell from an integer slice alone whether we want positional or dictionary-like slicing, so it defaults to positional slicing. Consequently, data2[1:3] returns b,c, just as data1[1:3] does with integer-like slicing.
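    One detail worth checking on your own pandas version: the positional default described above concerns slices. As far as I know, a single integer key on an integer index is still looked up as a label. A minimal sketch, reusing data2 from the answer (the variable names scalar and sliced are only illustrative):

```python
import pandas as pd

# Same Series as data2 in the answer: integer labels 1, 3, 5.
data2 = pd.Series(['a', 'b', 'c'], index=[1, 3, 5])

# A single integer key is treated as a label lookup (label 1).
scalar = data2[1]

# An integer slice is treated as positional (positions 1 and 2, end excluded).
sliced = data2[1:3]

print(repr(scalar))
print(sliced.tolist(), sliced.index.tolist())
```

    So the same number 1 means "label 1" in one expression and "position 1" in the other, which is the ambiguity that loc and iloc remove.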

    Nevertheless, VanderPlas points out one critical thing to keep in mind in that case: "Notice that when you are slicing with an explicit index (i.e., data['a':'c']), the final index is included in the slice, while when you’re slicing with an implicit index (i.e., data[0:2]), the final index is excluded from the slice. [...] These slicing and indexing conventions can be a source of confusion."
    To avoid this confusion, you can use loc for label-based slicing/indexing and iloc for position-based slicing/indexing, like this:

    import pandas as pd
    import numpy as np
    
    
    data1 = pd.Series(['a','b','c'],index=['1','3','5'])
    data2 = pd.Series(['a','b','c'],index=[1,3,5])
    
    print('data1.iloc[0:2]: ',str(data1.iloc[0:2]),sep='\n',end='\n\n')
    # print(data1.loc[1:3]) --> Throws an error because the index has no integer labels 1 or 3 (the labels are strings)
    print('data1.loc["1":"3"]: ',str(data1.loc['1':'3']),sep='\n',end='\n\n')
    
    print('data2.iloc[0:2]: ',str(data2.iloc[0:2]),sep='\n',end='\n\n')
    print('data2.loc[1:3]: ',str(data2.loc[1:3]),sep='\n',end='\n\n') # Note that, contrary to usual Python slices, both the start and the stop are included
    
    
    data1.iloc[0:2]: 
    1    a
    3    b
    dtype: object
    
    data1.loc["1":"3"]: 
    1    a
    3    b
    dtype: object
    
    data2.iloc[0:2]: 
    1    a
    3    b
    dtype: object
    
    data2.loc[1:3]: 
    1    a
    3    b
    dtype: object
    

    So data2.loc[1:3] looks up the labels 1 and 3 in the index and returns the values between them, both ends included, while data2.iloc[0:2] returns the values from position 0 up to, but not including, position 2.
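    Applied to the question itself, this suggests using iloc to fetch the first and only element. The sketch below is an assumption about the asker's data: the question's output is truncated, so I am guessing that the Series has a single integer label 1 and that a[0] was the failing lookup.

```python
import pandas as pd

# Assumed reconstruction of the Series from the question (one row, label 1).
a = pd.Series(['0-5fffd6b57084003b1b582ff1e56855a6!1-AB87696'], index=[1])

# a[0] is a label lookup on an integer index, and there is no label 0.
try:
    a[0]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Position-based access does not depend on the labels at all.
first_by_position = a.iloc[0]

# Label-based access works if you know the label.
first_by_label = a.loc[1]

print(lookup_failed)
print(first_by_position)
```

    iloc[0] is the safer choice here because it keeps working whatever label the remaining row happens to carry after filtering.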
