问题
I'm new and studying machine learning. I stumble upon a tutorial I found online and I'd like to make the program work so I'll get a better understanding. However, I'm getting problems about loading the CSV File into the Jupyter Notebook.
I get this error:
File "<ipython-input-2-70e07fb5b537>", line 2
student_data = pd.read_csv("C:\Users\xxxx\Desktop\student-intervention-
system\student-data.csv")
^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in
position 2-3: truncated \UXXXXXXXX escape
and here is the code:
I followed tutorials online regarding this error but none worked. Does anyone know how to fix it?
3rd attempt with r"path"
I've tried also "\" and utf-8 but none worked.
I'm using the latest version of Anaconda Windows 7 Python 3.7
回答1:
Use raw string notation for your Windows path. In python '\' have meaning in python. Try instead do string like this r"path":
student_data = pd.read_csv(r"C:\Users\xxxx\Desktop\student-intervention-
system\student-data.csv")
If it doesnt work try this way:
import os
path = os.path.join('c:' + os.sep, 'Users', 'xxxx', 'Desktop', 'student-intervention-system', 'student-data.csv')
student_data = pd.read_csv(path)
回答2:
Either replace all backslashes
\with frontslashes/or place arbefore your filepath string to avoid this error. It is not a matter of your folder name being too long.
As Bohun Mielecki mentioned, the \ character which is typically used to denote file structure in Windows has a different function when written within a string.
From Python3 Documentation: The backslash
\character is used to escape characters that otherwise have a special meaning, such as newline, backslash itself, or the quote character.
How this particularly affects your statement is that in the line
student_data = pd.read_csv("C:\Users\xxxx\Desktop\student-intervention-
system\student-data.csv")
\Users matches the escape sequence \Uxxxxxxxx whereby xxxxxxxx refers to a Character with 32-bit hex value xxxxxxxx. Because of this, Python tries to find a 32-bit hex value. However as the -sers from Users doesn't match the xxxxxxxx format, you get the error:
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
The reason why your code works now is that you have placed a r in front of 'C:\Users\xxxx\Desktop\project\student-data.csv'. This tells python not to process the backslash character / as it usually does and read the whole string as-is.
I hope this helps you better understand your problem. If you need any more clarification, do let me know.
Source: Python 3 Documentation
回答3:
Try
pd.read_csv('file_name',encoding = "utf-8")
回答4:
I found the problem. The problem is my folder name that is really long. I changed my folder name into "project" and the data is now finally loaded! Silly!
回答5:
Try this student_data = pd.read_csv("C:/Users/xxxx/Desktop/student-intervention-
system/student-data.csv").
Replacing backslashes in that code it will work for you.
来源:https://stackoverflow.com/questions/55209661/how-to-load-csv-file-in-jupyter-notebook