Converting to (not from) ipython Notebook format

刺人心 2020-11-29 17:10

IPython Notebook comes with nbconvert, which can export notebooks to other formats. But how do I convert text in the opposite direction? I ask because I already ha

10 Answers
  • 2020-11-29 17:21

    A Python example of how to build an IPython notebook in the v4 format:

    # -*- coding: utf-8 -*-
    import os
    from base64 import encodebytes  # encodestring was removed in Python 3.9
    
    # nbformat is the standalone package that replaced IPython.nbformat
    from nbformat.v4 import (
        new_code_cell, new_markdown_cell, new_notebook,
        new_output, new_raw_cell
    )
    
    # some random base64-encoded *text*
    png = encodebytes(os.urandom(5)).decode('ascii')
    jpeg = encodebytes(os.urandom(6)).decode('ascii')
    
    cells = []
    cells.append(new_markdown_cell(
        source='Some NumPy Examples',
    ))
    
    
    cells.append(new_code_cell(
        source='import numpy',
        execution_count=1,
    ))
    
    cells.append(new_markdown_cell(
        source='A random array',
    ))
    
    cells.append(new_raw_cell(
        source='A random array',
    ))
    
    cells.append(new_markdown_cell(
        source=u'## My Heading',
    ))
    
    cells.append(new_code_cell(
        source='a = numpy.random.rand(100)',
        execution_count=2,
    ))
    cells.append(new_code_cell(
        source='a = 10\nb = 5\n',
        execution_count=3,
    ))
    cells.append(new_code_cell(
        source='a = 10\nb = 5',
        execution_count=4,
    ))
    
    cells.append(new_code_cell(
        source=u'print("ünîcødé")',
        execution_count=3,
        outputs=[new_output(
            output_type=u'execute_result',
            data={
                'text/plain': u'<array a>',
                'text/html': u'The HTML rep',
                'text/latex': u'$a$',
                'image/png': png,
                'image/jpeg': jpeg,
                'image/svg+xml': u'<svg>',
                'application/json': {
                    'key': 'value'
                },
                'application/javascript': u'var i=0;'
            },
            execution_count=3
        ),new_output(
            output_type=u'display_data',
            data={
                'text/plain': u'<array a>',
                'text/html': u'The HTML rep',
                'text/latex': u'$a$',
                'image/png': png,
                'image/jpeg': jpeg,
                'image/svg+xml': u'<svg>',
                'application/json': {
                    'key': 'value'
                },
                'application/javascript': u'var i=0;'
            },
        ),new_output(
            output_type=u'error',
            ename=u'NameError',
            evalue=u'NameError was here',
            traceback=[u'frame 0', u'frame 1', u'frame 2']
        ),new_output(
            output_type=u'stream',
            text='foo\rbar\r\n'
        ),new_output(
            output_type=u'stream',
            name='stderr',
            text='\rfoo\rbar\n'
        )]
    ))
    
    nb0 = new_notebook(cells=cells,
        metadata={
            'language': 'python',
        }
    )
    
    import nbformat
    
    # Write the notebook as v4 JSON; the with-block closes the file for us
    with open('test.ipynb', 'w', encoding='utf-8') as f:
        nbformat.write(nb0, f, version=4)
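
    The file written this way is plain JSON, so it can be inspected without any notebook library. The sketch below is not part of the answer above: it uses a small hand-written v4 document as a stand-in for the file contents (the cell values are illustrative assumptions) and summarises its cells with the standard json module only. `summarize_notebook` is a hypothetical helper name.

```python
import json
from collections import Counter

# A hand-written stand-in for the contents of a small v4 .ipynb file.
# In practice the text would come from disk, e.g. open('test.ipynb').read().
NOTEBOOK_TEXT = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": "## My Heading"},
        {"cell_type": "code", "metadata": {}, "execution_count": 1,
         "outputs": [], "source": "import numpy"},
        {"cell_type": "raw", "metadata": {}, "source": "A random array"},
    ],
})


def summarize_notebook(text):
    """Return (format_version, Counter of cell types) for notebook JSON text."""
    nb = json.loads(text)
    kinds = Counter(cell["cell_type"] for cell in nb["cells"])
    return nb["nbformat"], kinds


version, kinds = summarize_notebook(NOTEBOOK_TEXT)
print(version, dict(kinds))
```

    This is only a quick sanity check of the cell layout; it does not validate the document against the notebook schema.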
    
  • 2020-11-29 17:27

    An improvement on @p-toccaceli's answer: it now also restores markdown cells, and it trims leading and trailing blank lines from each cell.

        import nbformat
        from nbformat.v4 import new_code_cell,new_markdown_cell,new_notebook
    
        import codecs
    
        sourceFile = "changeMe.py"     # <<<< change
        destFile = "changeMe.ipynb"    # <<<< change
    
    
        def parsePy(fn):
            """Generator that parses a .py file exported from an IPython notebook.

            Yields (text, kind) tuples, where kind is 0 for a code cell (whatever
            is between occurrences of "# In[*]:") and 1 for a markdown cell
            (a comment line of the form "# # ...").
            """
            with open(fn, "r") as f:
                lines = []
                for l in f:
                    l1 = l.strip()
                    if l1.startswith('# In[') and l1.endswith(']:'):
                        # Cell marker: flush the code collected so far, drop the marker
                        if lines:
                            yield ("".join(lines).strip(), 0)
                        lines = []
                        continue
                    elif l1.startswith('# ') and l1[2:].startswith('#'):
                        # Markdown heading exported as "# # Heading"
                        if lines:
                            yield ("".join(lines).strip(), 0)
                        yield (l1[2:].strip(), 1)
                        lines = []
                        continue
                    lines.append(l)
                if lines:
                    yield ("".join(lines).strip(), 0)
    
        # Create the code cells by parsing the file in input
        cells = []
        for c, code in parsePy(sourceFile):
            if len(c) == 0:
                continue
            if code == 0:
                cells.append(new_code_cell(source=c))
            elif code == 1:
                cells.append(new_markdown_cell(source=c))
    
        # This creates a V4 Notebook with the code cells extracted above
        nb0 = new_notebook(cells=cells,
                           metadata={'language': 'python',})
    
        with codecs.open(destFile, encoding='utf-8', mode='w') as f:
            nbformat.write(nb0, f, 4)
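
    The parser above only recognises single comment lines of the form `# # Heading` as markdown. Multi-line markdown that was exported as a block of comments stays inside a code cell. A minimal sketch of the missing step, assuming every markdown line starts with `#` optionally followed by one space (`uncomment_block` is a hypothetical helper, not part of nbformat):

```python
def uncomment_block(lines):
    """Strip one leading '#' (and one following space) from each line.

    Returns the markdown text, or None if some non-blank line is not a
    comment, which signals that the block should stay a code cell.
    """
    out = []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            # Keep blank lines so paragraphs stay separated.
            out.append("")
            continue
        if not stripped.startswith("#"):
            return None
        body = stripped[1:]
        if body.startswith(" "):
            body = body[1:]
        out.append(body)
    return "\n".join(out).strip()


print(uncomment_block(["# # Title", "#", "# Some *text*."]))
print(uncomment_block(["# comment", "x = 1"]))
```

    Note that a code cell consisting only of comments would also be turned into markdown by this rule, so it is a heuristic rather than an exact inverse of the export.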
    
  • 2020-11-29 17:28

    Given the example by Volodimir Kopey, I put together a bare-bones script to convert a .py obtained by exporting from a .ipynb back into a V4 .ipynb.

    I hacked this script together when I edited (in a proper IDE) a .py I had exported from a Notebook and I wanted to go back to Notebook to run it cell by cell.

    The script handles only code cells. The exported .py does not contain much else, anyway.

    import nbformat
    from nbformat.v4 import new_code_cell,new_notebook
    
    import codecs
    
    sourceFile = "changeMe.py"     # <<<< change
    destFile = "changeMe.ipynb"    # <<<< change
    
    
    def parsePy(fn):
        """Generator that parses a .py file exported from an IPython notebook
        and extracts the code cells (whatever is between occurrences of
        "# In[*]:"). Yields one string per cell, containing one or more lines.
        """
        with open(fn,"r") as f:
            lines = []
            for l in f:
                l1 = l.strip()
                if l1.startswith('# In[') and l1.endswith(']:') and lines:
                    yield "".join(lines)
                    lines = []
                    continue
                lines.append(l)
            if lines:
                yield "".join(lines)
    
    # Create the code cells by parsing the file in input
    cells = []
    for c in parsePy(sourceFile):
        cells.append(new_code_cell(source=c))
    
    # This creates a V4 Notebook with the code cells extracted above
    nb0 = new_notebook(cells=cells,
                       metadata={'language': 'python',})
    
    with codecs.open(destFile, encoding='utf-8', mode='w') as f:
        nbformat.write(nb0, f, 4)
    

    No guarantees, but it worked for me.
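
    The marker test in parsePy can also be expressed as a regular expression, which makes it easy to try the splitting logic on a string before touching any files. A sketch under the assumption that cells are separated by comment lines such as `# In[3]:` or `# In[ ]:` (`split_cells` and `CELL_MARKER` are hypothetical names):

```python
import re

# Matches marker lines such as "# In[12]:", "# In[ ]:" or "# In[*]:".
CELL_MARKER = re.compile(r"^\s*# In\[[^\]]*\]:\s*$")


def split_cells(text):
    """Split exported script text into a list of code-cell sources."""
    cells, current = [], []
    for line in text.splitlines():
        if CELL_MARKER.match(line):
            # A marker closes the chunk collected so far.
            cells.append("\n".join(current).strip())
            current = []
        else:
            current.append(line)
    cells.append("\n".join(current).strip())
    # Drop chunks that were empty or whitespace only.
    return [c for c in cells if c]


sample = (
    "# coding: utf-8\n\n"
    "# In[1]:\n\nimport os\n\n"
    "# In[2]:\n\nprint(os.getcwd())\n"
)
print(split_cells(sample))
```

    With this input the `# coding: utf-8` header comes out as its own first chunk; filter it out if you do not want it as a cell.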

  • 2020-11-29 17:32

    Since the code in the accepted answer does not work anymore, I have added this self-answer that shows how to import into a notebook with the current (v4) API.

    Input format

    Versions 2 and 3 of the IPython Notebook API can import a python script with special structuring comments, and break it up into cells as desired. Here's a sample input file (original documentation here). The first two lines are ignored, and optional. (In fact, the reader will ignore coding: and <nbformat> lines anywhere in the file.)

    # -*- coding: utf-8 -*-
    # <nbformat>3.0</nbformat>
    
    # <markdowncell>
    
    # The simplest notebook. Markdown cells are embedded in comments, 
    # so the file is a valid `python` script. 
    # Be sure to **leave a space** after the comment character!
    
    # <codecell>
    
    print("Hello, IPython")
    
    # <rawcell>
    
    # Raw cell contents are not formatted as markdown
    

    (The API also accepts the obsolete directives <htmlcell> and <headingcell level=...>, which are immediately transformed to other types.)
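
    The markers are simple enough that the splitting step can be sketched by hand. The following is a rough, standard-library-only illustration, not the nbformat implementation: it recognises only the three markers shown in the sample above (the obsolete directives are not handled), treats text before the first marker as code, and builds a minimal v4 JSON structure directly. `py_to_cells` and `cells_to_v4_json` are hypothetical names.

```python
import json
import re

MARKER = re.compile(r"^# <(code|markdown|raw)cell>\s*$")
# Header lines that the answer says the reader ignores.
IGNORED = re.compile(r"^# (-\*- coding:.*|<nbformat>.*</nbformat>)\s*$")


def _finish(kind, lines):
    """Turn the collected lines into a (kind, source) pair, or None if empty."""
    body = "\n".join(lines).strip()
    if not body:
        return None
    if kind != "code":
        # Markdown and raw cells are stored as comments: drop the "# " prefix.
        body = "\n".join(re.sub(r"^# ?", "", ln) for ln in body.splitlines())
    return (kind, body)


def py_to_cells(text):
    """Split marked-up script text into (cell_type, source) pairs."""
    cells, kind, buf = [], "code", []
    for line in text.splitlines():
        m = MARKER.match(line)
        if m:
            cell = _finish(kind, buf)
            if cell:
                cells.append(cell)
            kind, buf = m.group(1), []
        elif not IGNORED.match(line):
            buf.append(line)
    cell = _finish(kind, buf)
    if cell:
        cells.append(cell)
    return cells


def cells_to_v4_json(cells):
    """Serialise (cell_type, source) pairs as a minimal v4 notebook."""
    nb_cells = []
    for kind, source in cells:
        cell = {"cell_type": kind, "metadata": {}, "source": source}
        if kind == "code":
            cell["execution_count"] = None
            cell["outputs"] = []
        nb_cells.append(cell)
    nb = {"cells": nb_cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 0}
    return json.dumps(nb, indent=1)


sample = """# -*- coding: utf-8 -*-
# <nbformat>3.0</nbformat>

# <markdowncell>

# The simplest notebook.

# <codecell>

print("Hello, IPython")
"""
cells = py_to_cells(sample)
print(cells)
print(cells_to_v4_json(cells))
```

    For real use, building the cells with the nbformat functions is the safer route, since the library keeps up with changes to the file format.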

    How to import it

    For some reason, this format is not supported by version 4 of the Notebook API. It's still a nice format, so it's worth the trouble to support it by importing into version 3 and upgrading. In principle it's just two lines of code, plus i/o:

    from nbformat import v3, v4  # formerly IPython.nbformat
    
    with open("input-file.py") as fpin:
        text = fpin.read()
    
    nbook = v3.reads_py(text)
    nbook = v4.upgrade(nbook)  # Upgrade v3 to v4
    
    jsonform = v4.writes(nbook) + "\n"
    with open("output-file.ipynb", "w") as fpout:
        fpout.write(jsonform)
    

    But not so fast! The notebook API has a nasty bug: if the last cell in the input is a markdown cell, v3.reads_py() will lose it. The simplest workaround is to tack a bogus <markdowncell> onto the end: the bug deletes it, and everyone is happy. So do the following before you pass text to v3.reads_py():

    text += """
    # <markdowncell>
    
    # If you can read this, reads_py() is no longer broken! 
    """
    
  • 2020-11-29 17:39

    The following works for IPython 3, but not IPython 4.

    The IPython API has functions for reading and writing notebook files. You should use this API and not create JSON directly. For example, the following code snippet converts a script test.py into a notebook test.ipynb.

    import IPython.nbformat.current as nbf
    nb = nbf.read(open('test.py', 'r'), 'py')
    nbf.write(nb, open('test.ipynb', 'w'), 'ipynb')
    

    As for the format of the .py file understood by nbf.read, it is best to simply look at the parser class IPython.nbformat.v3.nbpy.PyReader. The code can be found here (it is not very large):

    https://github.com/ipython/ipython/blob/master/jupyter_nbformat/v3/nbpy.py

    Edit: This answer was originally written for IPython 3. I don't know how to do this properly with IPython 4. Here is an updated version of the link above, pointing to the version of nbpy.py from the IPython 3.2.1 release:

    https://github.com/ipython/ipython/blob/rel-3.2.1/IPython/nbformat/v3/nbpy.py

    Basically you use special comments such as # <codecell> or # <markdowncell> to separate the individual cells. Look at the line.startswith statements in PyReader.to_notebook for a complete list.

  • 2020-11-29 17:39

    You can use the script py2nb from https://github.com/sklam/py2nb

    You will have to use a certain syntax for your *.py but it's rather simple to use (look at the example in the 'samples' folder)
