How to call a module written with argparse in an iPython notebook

I am trying to pass BioPython sequences to Ilya Stepanov's implementation of Ukkonen's suffix tree algorithm in iPython's notebook environment. I am stumbling on the argparse component.

I have never had to deal directly with argparse before. How can I use this without rewriting main()?

By the by, this writeup of Ukkonen's algorithm is fantastic.


I've had a similar problem before, but with optparse instead of argparse.

You don't need to change anything in the original script, just assign a new list to sys.argv like so:

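import sys
from Bio import SeqIO

if __name__ == "__main__":
    path = '/path/to/sequences.txt'
    sequences = [str(record.seq) for record in SeqIO.parse(path, 'fasta')]
    # argparse skips sys.argv[0] (the program name), so put a placeholder
    # there and pass the sequences as positional STRING arguments.
    sys.argv = ['ukkonen.py'] + sequences
    main()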
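
One detail worth checking with this approach: argparse (like optparse) parses sys.argv[1:], so the first element of the list is treated as the program name and never parsed as an argument. The following is a minimal sketch of that behaviour, not the original script. The parser is a stand-in that mirrors the script's two arguments, and the names build_parser, parse_with_argv and the program name 'ukkonen.py' are assumptions made for illustration.

```python
import argparse
import sys


def build_parser():
    # Stand-in for the parser defined in the script (hypothetical helper name).
    parser = argparse.ArgumentParser()
    parser.add_argument('strings', metavar='STRING', nargs='*')
    parser.add_argument('-f', '--file')
    return parser


def parse_with_argv(argv):
    # Temporarily replace sys.argv, then restore it so the notebook kernel
    # is not left with a modified argument list.
    saved = sys.argv
    sys.argv = argv
    try:
        return build_parser().parse_args()
    finally:
        sys.argv = saved


# Element 0 plays the role of the program name and is never parsed.
args = parse_with_argv(['ukkonen.py', 'GATTACA', 'TTACAG'])
print(args.strings)
print(args.file)
```

Restoring sys.argv afterwards is a design choice rather than a requirement; it just keeps later cells in the same notebook from seeing the modified list.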

An alternative way to use argparse in IPython notebooks is to pass a list of strings to parse_args() instead of letting it read sys.argv. The call in question is:

args = parser.parse_args() (line 303 in the git repo you referenced)

It would look something like this:

import argparse

parser = argparse.ArgumentParser(
    description='Searching longest common substring. '
                "Uses Ukkonen's suffix tree algorithm and generalized suffix tree. "
                'Written by Ilya Stepanov (c) 2013')

parser.add_argument(
    'strings',
    metavar='STRING',
    nargs='*',
    help='String for searching',
)

parser.add_argument(
    '-f',
    '--file',
    help='Path for input file. First line should contain number of lines to search in'
)

and

args = parser.parse_args("AAA --file /path/to/sequences.txt".split())
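
To make the idea concrete, here is a small self-contained sketch of parsing an explicit list. It uses a shortened stand-in parser rather than the script's own, and the path containing a space is a made-up example. It also shows shlex.split from the standard library, which may be preferable to str.split when an argument contains spaces.

```python
import argparse
import shlex

# Shortened stand-in for the script's parser.
parser = argparse.ArgumentParser(description='Searching longest common substring.')
parser.add_argument('strings', metavar='STRING', nargs='*', help='String for searching')
parser.add_argument('-f', '--file', help='Path for input file.')

# When a list is passed, sys.argv is never consulted.
args = parser.parse_args(['AAA', '--file', '/path/to/sequences.txt'])
print(args.strings, args.file)

# shlex.split honours shell-style quoting, unlike str.split.
# The path below is hypothetical.
quoted = parser.parse_args(shlex.split('AAA --file "/path/with spaces/sequences.txt"'))
print(quoted.file)
```

The parsed values come back as attributes on the returned namespace (args.strings and args.file here), which is what the rest of the script would read.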

Edit: It works


I ended up using BioPython to extract the sequences and then editing Ilya Stepanov's implementation to remove the argparse methods.

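import imp  # removed in Python 3.12; on newer versions use importlib instead
from Bio import SeqIO

seqs = []
# Load the edited script as a module named 'lcsm'
lcsm = imp.load_source('lcsm', '/path/to/ukkonen.py')
for record in SeqIO.parse('/path/to/sequences.txt', 'fasta'):
    seqs.append(record)
lcsm.main(seqs)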

For the algorithm, I had main() take one argument, his strings variable. However, this sends the algorithm a list of BioPython SeqRecord objects rather than plain strings, which the re module doesn't like. So I had to extract the sequence string from each record, changing

suffix_tree.append_string(s)

to

suffix_tree.append_string(str(s.seq))

which seems kind of brittle, but that's all I've got for now.
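
A possibly less brittle variant is to convert the records to plain strings in the notebook before they reach the algorithm, so the suffix tree code itself stays untouched. This is only a sketch under assumptions: it presumes the edited main() accepts a list of strings, to_plain_strings is a hypothetical helper, and FakeRecord merely imitates the .seq attribute of a BioPython record so that the example runs without BioPython installed.

```python
class FakeRecord:
    """Stand-in for a BioPython SeqRecord: only the .seq attribute is imitated."""

    def __init__(self, seq):
        self.seq = seq


def to_plain_strings(records):
    # Use str(item.seq) when the item has a .seq attribute; otherwise fall
    # back to str(item), so plain strings pass through unchanged.
    return [str(getattr(item, 'seq', item)) for item in records]


seqs = to_plain_strings([FakeRecord('GATTACA'), 'TTACAG'])
print(seqs)
```

With real data, the list passed to the helper would be the records from SeqIO.parse('/path/to/sequences.txt', 'fasta'), and the result would go to lcsm.main(seqs).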
