How to change column values or create values in a new column based on values in existing column?

你说的曾经没有我的故事 submitted on 2021-02-10 10:56:30

Question


I am new to ML and data science (I recently completed a Master's in Business Analytics) and am learning as much as I can on my own while looking for positions in data science and business analytics.

I am working on a personal project to build ML models that predict whether a customer will show up for an existing appointment.

During initial data analysis, I noticed that my "No-show" column contains the values "Yes" and "No". According to the metadata, if a customer scheduled an appointment and showed up, the value in the "No-show" column is "No"; if a customer scheduled an appointment and did not show up, the value is "Yes". For the ML algorithms, I need "Yes" to become 1 and "No" to become 0.

I realize that there are 2 ways to tackle this problem:

  1. write code to change the values of the "No-show" column in place
  2. create a new "Outcome" column whose values depend on the values in "No-show"

I tried writing code for both cases, but I keep getting errors. Below are the methods I attempted; none of them works. I appreciate your help in advance!

1.

if my_df["No-show"] == "Yes":
    my_df["No-show"] == 1
elif my_df["Outcome"] == "No":
    my_df["No-show"] == 0
else:
    print("Something went wrong")

Error: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
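
A likely reason for this error: comparing a whole column to a string produces one True/False value per row, and `if` cannot reduce that Series to a single boolean. Note also that `==` inside the branches only compares and never assigns. One way to recode the column in place is a minimal sketch like the following, using `Series.map` with a dictionary. The small `my_df` built here is hypothetical sample data, not the real appointments table.

```python
import pandas as pd

# Hypothetical sample data standing in for the real appointments table.
my_df = pd.DataFrame({"No-show": ["Yes", "No", "No", "Yes"]})

# Comparing a column to a string yields one boolean per row, not a single bool,
# which is why `if my_df["No-show"] == "Yes":` raises the ambiguity error.
mask = my_df["No-show"] == "Yes"

# Map each label to its numeric code and assign the result back with `=`.
# Labels missing from the dictionary become NaN, so unexpected values stand out.
my_df["No-show"] = my_df["No-show"].map({"Yes": 1, "No": 0})
```

After running this, `my_df["No-show"].isna().sum()` is a quick check for any labels other than "Yes" or "No".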

2.1

my_df["Outcome"] = 0

if my_df["No-show"] == "Yes":
    my_df["Outcome"] == 1
elif my_df["No-show"] == "No":
    my_df["Outcome"] == 0
else:
    print("Something went wrong")

Error: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
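
For the new-column variant, a vectorized comparison can replace the `if`/`elif` chain entirely. The sketch below assumes the column holds only "Yes" and "No"; any other value would silently become 0, so it is worth validating the column first. The sample `my_df` is hypothetical.

```python
import pandas as pd

# Hypothetical sample data; the real DataFrame would come from the appointments file.
my_df = pd.DataFrame({"No-show": ["Yes", "No", "Yes"]})

# The comparison gives a boolean Series (True where the customer did not show up);
# casting it to int turns True into 1 and False into 0.
my_df["Outcome"] = (my_df["No-show"] == "Yes").astype(int)
```

The original "No-show" column is left untouched, which keeps the raw labels available for later checks.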

2.2

my_df["Outcome"] = 0

for val in my_df.iterrows():
    if my_df["No-show"] == "Yes":
        my_df["Outcome"] == 1
    elif my_df["No-show"] == "No":
        my_df["Outcome"] == 0
    else:
        print("Something went wrong")

Error: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
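
The loop above still compares the entire column on every iteration, so it fails for the same reason. If an explicit loop is preferred, a sketch along these lines should work: `iterrows()` yields `(index, row)` pairs, the comparison uses the single value in `row`, and the assignment goes through `DataFrame.at` with `=`. The sample data and the `unexpected` list are hypothetical additions for illustration, and a vectorized approach is usually much faster on large tables.

```python
import pandas as pd

# Hypothetical sample data, including one unexpected label.
my_df = pd.DataFrame({"No-show": ["Yes", "No", "Maybe"]})
my_df["Outcome"] = 0

unexpected = []  # row labels whose value is neither "Yes" nor "No"

for idx, row in my_df.iterrows():
    value = row["No-show"]  # a single string for this row, safe to use in `if`
    if value == "Yes":
        my_df.at[idx, "Outcome"] = 1  # `=` assigns; `==` would only compare
    elif value == "No":
        my_df.at[idx, "Outcome"] = 0
    else:
        unexpected.append(idx)
        print(f"Unexpected value at row {idx}: {value!r}")
```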

Thank you for your help, and congratulate me on my first question on StackOverflow! Looking forward to giving back to the community! :)

Source: https://stackoverflow.com/questions/60855163/how-to-change-column-values-or-create-values-in-a-new-column-based-on-values-in
