Creating a stream to iterate over from a string in Python
I want to create a stream from a string in Python so that it's equivalent to reading the string as if it's read from a text file. Something like:
for line in open('myfile.txt'): print line
except that the contents of 'myfile.txt' are stored in a string s. Is this the correct/best way to do it?
s = StringIO.StringIO("a\tb\nc\td\n")
for line in s: print line
I want to create a stream from a string in Python so that it's equivalent to reading the string as if it's read from a text file.
Is this the correct/best way to do it?
Yes, unless you really do want it in a list.
If it is intended to be consumed line by line, the way you are doing it makes sense.
StringIO() creates a file-like object.
File objects have a method, .readlines(), which materializes the object as a list. Instead of materializing the data in a list, you can iterate over it, which uses less memory:
# from StringIO import StringIO # Python 2 import
from io import StringIO # Python 3 import
txt = "foo\nbar\nbaz"
Here we append each line to a list, so that we can demonstrate iterating over the file-like object while keeping a handle on the data. (More efficient would be list(file_like_io).)
m_1 = []
file_like_io = StringIO(txt)
for line in file_like_io:
    m_1.append(line)
and now:
>>> m_1
['foo\n', 'bar\n', 'baz']
You can return your io object to any index point with seek:
>>> file_like_io.seek(0)
0
>>> file_like_io.tell()  # print where we are in the object now
0
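As a small sketch of why rewinding matters (the variable names here are only for illustration): once the stream has been read to the end, a second pass yields nothing until seek(0) moves the position back to the start.

```python
from io import StringIO  # Python 3 import

stream = StringIO("foo\nbar\nbaz")

first_pass = list(stream)    # reads the stream to the end
second_pass = list(stream)   # position is at the end, so nothing is left

stream.seek(0)               # rewind to the beginning
third_pass = list(stream)    # the full contents are available again

print(first_pass)
print(second_pass)
print(third_pass)
```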
If you really want it in a list
.readlines() materializes the StringIO iterator as if one did list(io); this is considered less preferable.
>>> m_2 = file_like_io.readlines()
And we can see that our results are the same:
>>> m_1==m_2
True
Keep in mind that iteration splits after each newline and keeps the newline in the text, so print adds a second newline to every line and the output comes out double-spaced.
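A minimal sketch of two common ways to avoid the double-spacing, assuming Python 3's print function: either strip the trailing newline from each line, or tell print not to add its own.

```python
from io import StringIO  # Python 3 import

txt = "foo\nbar\nbaz"

# Option 1: remove the trailing newline before printing.
stripped = [line.rstrip("\n") for line in StringIO(txt)]
for line in stripped:
    print(line)

# Option 2: keep each line as-is and suppress print's own newline.
for line in StringIO(txt):
    print(line, end="")
```

Option 1 is the safer choice if the lines are going to be processed further, since the stored strings no longer carry the newline.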
You could roll your own with a simple generator function like this:
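def string_stream(s, separators="\n"):
    """Yield the pieces of s between separator characters (separators are dropped)."""
    start = 0
    for end in range(len(s)):
        if s[end] in separators:
            yield s[start:end]
            start = end + 1
    # Yield any text left after the last separator.
    if start < len(s):
        yield s[start:]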
Example usage:
>>> stream = string_stream("foo\tbar\nbaz\n", "\t\n")
>>> for s in stream:
...     print(s)
...
foo
bar
baz
cStringIO may be faster (I haven't tested), but this would give you flexibility in defining/consuming separators.