ipython-notebook

PySpark in iPython notebook raises Py4JJavaError when using count() and first()

Submitted by 白昼怎懂夜的黑 on 2019-11-27 15:10:16
I am using PySpark (v2.1.0) in an IPython notebook (Python 3.6) over virtualenv on my Mac (Sierra 10.12.3 Beta).

1. I launched the IPython notebook by running this in Terminal:

PYSPARK_PYTHON=python3 PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" /Applications/spark-2.1.0-bin-hadoop2.7/bin/pyspark

2. I loaded my file into the Spark context and confirmed it loaded:

>>> lines = sc.textFile("/Users/PanchusMac/Dropbox/Learn_py/Virtual_Env/pyspark/README.md")
>>> for i in lines.collect():
...     print(i)

This worked fine and printed the result to my console as shown:

# Apache Spark
Spark is a fast …

Pandas dataframe hide index functionality?

Submitted by 安稳与你 on 2019-11-27 14:21:20
Is it possible to hide the index when displaying pandas dataframes, so that only the column names appear at the top of the table? This would need to work both for the HTML representation in an IPython notebook and for the to_latex() function (which I'm using with nbconvert). Ta.

Answer: Set index=False.

For the IPython notebook:

print(df.to_string(index=False))

For to_latex:

df.to_latex(index=False)

As @waitingkuo pointed out, index=False is what you need. If you want to keep the nice table layout within your IPython notebook, you can use:

from IPython.display import display, HTML
display(HTML(df.to_html…

How do I run Python asyncio code in a Jupyter notebook?

Submitted by 孤人 on 2019-11-27 13:40:46
Question: I have some asyncio code which runs fine in the Python interpreter (CPython 3.6.2). I would now like to run it inside a Jupyter notebook with an IPython kernel. I can run it with

import asyncio
asyncio.get_event_loop().run_forever()

and while that seems to work, it also appears to block the notebook and doesn't play nicely with it. My understanding is that Jupyter uses Tornado under the hood, so I tried to install a Tornado event loop as recommended in the Tornado docs:

from …
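As a minimal sketch of running a coroutine to completion without run_forever() (the fetch_value/main names are hypothetical stand-ins for the asker's code): in a plain script, asyncio.run manages the loop; in recent ipykernel versions a loop is already running in the cell, so there you would write `results = await main()` at the top level instead.

```python
import asyncio

async def fetch_value(delay, value):
    # Stand-in for real async work
    await asyncio.sleep(delay)
    return value

async def main():
    # Run two coroutines concurrently and collect their results in order
    results = await asyncio.gather(
        fetch_value(0.01, "a"),
        fetch_value(0.01, "b"),
    )
    return results

# Script context: asyncio.run creates and closes the loop itself.
# In a Jupyter cell this raises "event loop already running";
# use `results = await main()` there instead.
results = asyncio.run(main())
print(results)  # ['a', 'b']
```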

Can you capture the output of ipython's magic methods? (timeit)

Submitted by 会有一股神秘感。 on 2019-11-27 13:34:00
Question: I want to capture and plot the results from five or so timeit calls with logarithmically increasing sizes of N, to show how methodX() scales with input. So far I have tried:

output = %timeit -r 10 results = methodX(N)

It does not work, and I can't find any information in the docs either. I feel like you should at least be able to intercept the string that is printed; after that I can parse it to extract my info. Has anyone done this or tried? PS: this is in an IPython notebook, if that makes a difference.

Answer 1: This …
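One way to get numbers rather than printed strings is to skip the magic and use the stdlib timeit module directly; a sketch, where method_x is a hypothetical stand-in for the function being benchmarked:

```python
import timeit

def method_x(n):
    # Hypothetical stand-in for the methodX being measured
    return sum(range(n))

sizes = [10, 100, 1000, 10000, 100000]  # logarithmically increasing N
timings = []
for n in sizes:
    # repeat() returns one total time per run; keep the best of 5 runs
    best = min(timeit.repeat(lambda: method_x(n), repeat=5, number=10))
    timings.append(best)

# `timings` can now be plotted against `sizes` to show scaling
print(list(zip(sizes, timings)))
```

Inside IPython itself, `result = %timeit -o methodX(N)` captures a TimeitResult object (with attributes such as best and average) instead of only printing, which avoids string parsing entirely.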

Inline animations in Jupyter

Submitted by 风格不统一 on 2019-11-27 11:48:35
I have a Python animation script (using matplotlib's FuncAnimation) which runs in Spyder but not in Jupyter. I have tried following various suggestions, such as adding %matplotlib inline and changing the matplotlib backend to Qt4Agg, all without success. I have also tried running several example animations (from Jupyter tutorials), none of which have worked. Sometimes I get an error message, and sometimes the plot appears but does not animate. Incidentally, I have gotten pyplot.plot() to work using %matplotlib inline. Does anyone know of a working Jupyter notebook with a SIMPLE inline …
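One approach that sidesteps the backend problem, assuming a reasonably recent matplotlib: render the FuncAnimation to self-contained HTML+JavaScript with to_jshtml() and display that, instead of relying on an interactive backend. A sketch (the sine-wave animation is made up for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; rendering happens offscreen
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.animation import FuncAnimation

fig, ax = plt.subplots()
x = np.linspace(0, 2 * np.pi, 100)
(line,) = ax.plot(x, np.sin(x))

def update(frame):
    # Shift the sine wave a little each frame
    line.set_ydata(np.sin(x + frame / 10))
    return (line,)

anim = FuncAnimation(fig, update, frames=20, interval=50, blit=True)

# Render the whole animation as HTML+JS that plays inline in a notebook
html = anim.to_jshtml()
plt.close(fig)
```

In a notebook cell you would then show it with `from IPython.display import HTML; HTML(html)`, which plays regardless of which interactive backend is available.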

How to display line numbers in IPython Notebook code cell by default

Submitted by 可紊 on 2019-11-27 11:10:00
Question: I would like my default display for IPython notebook code cells to include line numbers. I learned from "Showing line numbers in IPython/Jupyter Notebooks" that I can toggle this with Ctrl-M L, which is great, but manual. To include line numbers by default, I would need to add something to my ipython_notebook_config.py file. Unless I've missed something, there is no explanation of how to do this in the documentation.

Answer 1: In your custom.js file (location depends on your OS), put …

Verifying PEP8 in iPython notebook code

Submitted by 半城伤御伤魂 on 2019-11-27 10:29:57
Question: Is there an easy way to check that IPython notebook code is PEP8-compliant while it's being written?

Answer 1: In case this helps anyone, I'm using:

conttest "jupyter nbconvert notebook.ipynb --stdout --to script | flake8 - --ignore=W391"

conttest reruns the command when changes to the notebook are saved; flake8 - tells flake8 to take its input from stdin; --ignore=W391 is needed because the output of jupyter nbconvert seems to always end with a "blank line at end of file", and I don't want flake8 to complain about …

Calling pylab.savefig without display in ipython

Submitted by 断了今生、忘了曾经 on 2019-11-27 10:21:09
I need to create a figure in a file without displaying it within the IPython notebook. I am not clear on the interaction between IPython and matplotlib.pylab in this regard, but when I call pylab.savefig("test.png"), the current figure gets displayed in addition to being saved in test.png. When automating the creation of a large set of plot files, this is often undesirable; the same applies when an intermediate file for external processing by another app is desired. Not sure if this is a matplotlib or IPython notebook question.

Answer: This is a matplotlib question, and you can get around this by …
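A minimal sketch of saving without displaying (the plot data and output path are made up for illustration): select a non-interactive backend and close the figure after saving. With the inline backend, it is the figure left open at the end of a cell that triggers display, so the plt.close(fig) call is the important part.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: figures are never shown
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

path = os.path.join(tempfile.gettempdir(), "test.png")
fig.savefig(path)
plt.close(fig)  # drop the figure so the notebook does not display it
```

In a notebook where %matplotlib inline is already active, keeping the `fig.savefig(...)` followed immediately by `plt.close(fig)` pattern achieves the same effect without switching backends.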

How do I change the autoindent to 2 space in IPython notebook

Submitted by 房东的猫 on 2019-11-27 10:20:15
Question: I find that developing functions in an IPython notebook allows me to work quickly. When I'm happy with the results, I copy-paste them to a file. The autoindent is 4 spaces, but the coding style for indentation at my company is 2 spaces. How do I change the autoindent to 2 spaces?

Answer 1: Based on this question and the options found here, in your custom.js file (location depends on your OS) put:

IPython.Cell.options_default.cm_config.indentUnit = 2;

On my machine the file is located in ~/.ipython/profile …

Jupyter notebook display two pandas tables side by side

Submitted by ε祈祈猫儿з on 2019-11-27 10:14:42
I have two pandas dataframes and I would like to display them in a Jupyter notebook. Doing something like:

display(df1)
display(df2)

shows them one below the other. I would like to have the second dataframe to the right of the first one. There is a similar question, but it looks like the person there is satisfied either with merging them into one dataframe or with showing the difference between them. That will not work for me: in my case the dataframes can represent completely different (non-comparable) elements, and their sizes can differ. My main goal is to save space.

Answer: You could override the …
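One common workaround is to wrap each table's HTML in an inline-block div so the tables sit on one row; a sketch, where display_side_by_side is a hypothetical helper (not a pandas or IPython built-in) and the toy dataframes are made up:

```python
import pandas as pd
from IPython.display import HTML, display

def display_side_by_side(*dfs):
    # Hypothetical helper: render each dataframe as HTML and place the
    # tables in inline-block divs so they line up horizontally.
    html = "".join(
        '<div style="display:inline-block; vertical-align:top; '
        'margin-right:2em">' + df.to_html() + "</div>"
        for df in dfs
    )
    display(HTML(html))
    return html

df1 = pd.DataFrame({"a": [1, 2]})
df2 = pd.DataFrame({"b": [3, 4, 5]})
html = display_side_by_side(df1, df2)
```

Because each table keeps its own div, the dataframes can have completely different shapes and contents, which matches the non-comparable-elements requirement above.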