How to sort the contents of a file in Python
I'm trying to figure out a simple way to sort the words from a file, but the newlines ("\n") are always returned when I print the words. How could I improve this code to make it work properly? I'm using Python 2.7. Thanks in advance.
def sorting(self):
    filename = ("food.txt")
    file_handle = open(filename, "r")
    for word in file_handle:
        word = word.split()
        print sorted(file_handle)
    file_handle.close()
You actually have two problems here.
The big one is that print sorted(file_handle) reads and sorts the whole rest of the file and prints that out. You're doing that inside the loop, once per line. So, what happens is that you read the first line, split it, ignore the result, sort and print all the lines after the first, and then you're done.
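To see that behaviour in isolation, here is a minimal sketch. It assumes an in-memory io.StringIO object as a stand-in for food.txt (the contents are made up), and it is written so it should run on Python 3 as well as 2.7:

```python
import io

# Hypothetical stand-in for food.txt; StringIO iterates line by line like a file.
file_handle = io.StringIO(u"pear\napple\nfig\n")

printed = []
for line in file_handle:
    # The loop has consumed "pear\n"; sorted() now drains every remaining line.
    printed.append(sorted(file_handle))
    # Nothing is left to read, so the loop ends after this single pass.

# printed == [['apple\n', 'fig\n']]
# The first line is missing, and the newlines are still attached.
```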
What you want to do is accumulate all the words as you go along, then sort and print that. Like this:
def sorting(self):
    filename = ("food.txt")
    file_handle = open(filename, "r")
    words = []
    for line in file_handle:
        words += line.split()
    file_handle.close()
    print sorted(words)
Or, if you want to print the sorted words one per line, instead of as one giant list, change the last line to:
print '\n'.join(sorted(words))
For the second, more minor problem, the one you asked about, you just need to strip off the newlines. So, change the words += line to this:
words += line.strip().split()
However, if you had solved the first problem, you wouldn't even have noticed this one. If you have a line like "one two three\n", and you call split() on it, you will get back ["one", "two", "three"], with no \n to worry about. So, you don't actually even need to solve this one.
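A quick check of that claim, as a small sketch that runs on both Python 2.7 and 3. The last line shows one case where the newline does survive: passing an explicit separator to split().

```python
line = "one two three\n"

# With no arguments, split() breaks on any run of whitespace,
# and that includes the trailing newline.
plain = line.split()             # ['one', 'two', 'three']

# Stripping first gives the same words, so it is redundant here.
stripped = line.strip().split()  # ['one', 'two', 'three']

# An explicit separator only splits on that exact string,
# so the newline stays attached to the last word.
with_sep = line.split(" ")       # ['one', 'two', 'three\n']
```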
While we're at it, there are a few other improvements you could make here:
- Use a with statement to close the file instead of doing it manually.
- return the list of words (so you can do various different things with it, instead of just printing it and returning nothing).
- Take the filename as a parameter instead of hardcoding it.
- If you don't want duplicate words, use a set rather than a list.
- Consider stripping punctuation off the edges of each word; otherwise you end up with "that." as a word. Doing even this basic kind of natural-language processing is non-trivial, so I won't show an example here. (For example, you probably want "John's" to be a word, you may or may not want "jack-o-lantern" to be one word instead of three; you almost certainly don't want "two-three" to be one word…)
- The self parameter is only needed in methods of classes. This doesn't appear to be in any class. (If it is, it's not doing anything with self, so there's no visible reason for it to be in a class. You might have some reason which would be visible in your larger program, of course.)
So, anyway:
def sorting(filename):
    words = []
    with open(filename) as file_handle:
        for line in file_handle:
            words += line.split()
    return sorted(words)

print '\n'.join(sorting('food.txt'))
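The code above leaves out the set and punctuation suggestions. A rough sketch of both follows, with some assumptions: unique_sorted_words is a hypothetical helper, the sample text is made up, and stripping string.punctuation from the edges of each word is only one crude choice. It does nothing about the harder cases such as "two-three".

```python
import io
import string


def unique_sorted_words(file_handle):
    """Hypothetical helper: sorted, de-duplicated words from an open text file."""
    words = set()  # a set silently drops duplicates
    for line in file_handle:
        for word in line.split():
            # Strip punctuation from both ends only; inner characters
            # such as the apostrophe in "John's" are left alone.
            word = word.strip(string.punctuation)
            if word:  # skip tokens that were nothing but punctuation
                words.add(word)
    return sorted(words)


# Made-up sample text standing in for a real file.
sample = io.StringIO(u"I like that.\nJohn's jack-o-lantern, that one.\n")
result = unique_sorted_words(sample)
# result == ['I', "John's", 'jack-o-lantern', 'like', 'one', 'that']
```

Note that the default sort puts uppercase words before lowercase ones; passing key=str.lower to sorted() is one way to sort case-insensitively.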
Strip the newline (and any other surrounding whitespace, because you probably don't want it) from each line, and collect the words so that you sort them once, after the loop:
def sorting(self):
    filename = ("food.txt")
    file_handle = open(filename, "r")
    words = []
    for line in file_handle:
        words += line.strip().split()
    file_handle.close()
    print sorted(words)
Otherwise you can just remove the last character with line[:-1].split(), although that cuts off a real character if the final line of the file has no trailing newline.
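A small sketch of where slicing off the last character goes wrong, assuming made-up lines in which the final one has no trailing newline (runs on Python 2.7 and 3):

```python
# Hypothetical file contents; the last line does not end with "\n".
lines = ["apple pie\n", "fig"]

# Slicing always removes one character, newline or not.
sliced = [line[:-1].split() for line in lines]
# sliced == [['apple', 'pie'], ['fi']]

# rstrip("\n") only removes newlines that are actually there.
safe = [line.rstrip("\n").split() for line in lines]
# safe == [['apple', 'pie'], ['fig']]
```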
Use .strip(). It removes leading and trailing whitespace (including "\n") by default. You can also pass it a string of other characters to strip. This will leave just the words.
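For illustration, a short sketch of how strip() behaves with and without an argument. The sample strings are made up, and the code runs on Python 2.7 and 3:

```python
raw = "  --bread--\n"

# No argument: leading and trailing whitespace is removed, newline included.
default = raw.strip()       # '--bread--'

# With an argument: every character in the string is stripped from both
# ends, so list the whitespace characters explicitly if you still want them gone.
custom = raw.strip(" -\n")  # 'bread'

# strip() only touches the ends; a newline in the middle is kept.
inner = "a\nb".strip("\n")  # 'a\nb'
```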