How to determine content type of a string
I receive some data as a string. I need to write the data to a file, but the problem is that sometimes the data is compressed/zipped and sometimes it's just plain text. I need to determine the content-type so I know whether to write it to a .txt file or a .tgz file. Any ideas on how to accomplish this? Can I use mime type somehow even though my data is a string, not a file?
Thanks.
Both gzip and zip put distinctive headers before the compressed data, byte sequences that are rather unlikely to appear at the start of a human-readable string. If the choice is only between these formats, you can make a faster check than mimetypes would provide.
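A rough sketch of such a check, assuming the data is already available as bytes (the name sniff_compression is made up for illustration and is not from any library):

```python
import gzip

# Signatures found at the start of each format. gzip streams begin with
# 1F 8B; zip archives normally begin with "PK\x03\x04" (an empty archive
# begins with "PK\x05\x06" instead).
GZIP_MAGIC = b'\x1f\x8b'
ZIP_MAGICS = (b'PK\x03\x04', b'PK\x05\x06')


def sniff_compression(data):
    """Return 'gzip', 'zip', or None based on the leading bytes of data."""
    if data.startswith(GZIP_MAGIC):
        return 'gzip'
    if data.startswith(ZIP_MAGICS):
        return 'zip'
    return None


print(sniff_compression(gzip.compress(b'Hello World.')))  # gzip
print(sniff_compression(b"Hello World. I'm plaintext."))  # None
```

Plain text could in principle start with the same bytes, so this remains a guess, just a cheap one.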
If the file is downloaded from a web server, you should have a Content-Type header to look at. However, you are at the mercy of the web server as to whether that header truly describes the type of the file.
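One way to act on that caveat is to treat the header as a hint and compare it with what the payload itself looks like. The following is only a sketch: the helper names and the list of media types are assumptions chosen for illustration, and the header value is assumed to have been read from the response already.

```python
from email.message import Message

# Media types that commonly label gzip or zip payloads (illustrative,
# not exhaustive).
COMPRESSED_TYPES = {
    'application/gzip',
    'application/x-gzip',
    'application/zip',
}


def header_says_compressed(content_type):
    """Return True if a Content-Type header value names a compressed type."""
    msg = Message()
    msg['Content-Type'] = content_type
    # get_content_type() lowercases the value and drops parameters such
    # as "; charset=utf-8".
    return msg.get_content_type() in COMPRESSED_TYPES


def classify(content_type, data):
    """Return (compressed, header_agrees) for a payload and its header."""
    # The payload's leading bytes decide; the header is only cross-checked.
    compressed = data.startswith((b'\x1f\x8b', b'PK\x03\x04'))
    return compressed, compressed == header_says_compressed(content_type)
```

If header_agrees comes back False, the server's label and the bytes disagree, and the bytes are usually the safer thing to believe.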
Another alternative is to use a heuristic to guess the file type. This can often be done by looking at the first few bytes of the file.
As some of the other answers have already suggested, you can look at the first bytes of the file:
#!/usr/bin/env python3
# $ cat hello.txt
# Hello World. I'm plaintext.
# $ cat hello.txt | gzip > hello.txt.gz
from struct import unpack

# 1F 8B 08 00: gzip magic number (1F 8B), deflate method (08), no flags (00).
# Files gzipped with other flags (e.g. a stored filename) differ in byte 4.
magic = (b'\x1f', b'\x8b', b'\x08', b'\x00')

for filename in ['hello.txt', 'hello.txt.gz']:
    with open(filename, 'rb') as handle:
        # Read the first four bytes as four single-byte values
        s = unpack('cccc', handle.read(4))
    if s == magic:
        print(filename, 'seems gzipped')
    else:
        print(filename, 'seems not gzipped')

# =>
# hello.txt seems not gzipped
# hello.txt.gz seems gzipped
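Since the question is about data held in memory rather than a file on disk, the same idea might be applied directly to the received bytes. This is a minimal sketch: write_payload and its parameters are hypothetical names, and it assumes, as the question implies, that any gzipped payload is a tarball that belongs in a .tgz file.

```python
import gzip
import os
import tempfile


def write_payload(data, directory, stem='payload'):
    """Write data to <stem>.tgz if it looks gzipped, else to <stem>.txt."""
    # Only the first two bytes (1F 8B) are the gzip signature; the bytes
    # after them (method, flags) can vary between gzip files.
    extension = '.tgz' if data[:2] == b'\x1f\x8b' else '.txt'
    path = os.path.join(directory, stem + extension)
    # Binary mode writes the bytes unchanged in both cases.
    with open(path, 'wb') as handle:
        handle.write(data)
    return path


with tempfile.TemporaryDirectory() as tmp:
    gz_path = write_payload(gzip.compress(b'Hello World.'), tmp)
    txt_path = write_payload(b"Hello World. I'm plaintext.", tmp)
    print(os.path.basename(gz_path))   # payload.tgz
    print(os.path.basename(txt_path))  # payload.txt
```

If the data arrives as a text string instead of bytes, it would have to be encoded first, and the right encoding depends on how the string was obtained.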