How do I fix a ValueError: read of closed file exception?

This simple Python 3 script:

import urllib.request

host = "scholar.google.com"
link = "/scholar.bib?q=info:K7uZdMSvdQ0J:scholar.google.com/&output=citation&hl=en&as_sdt=1,14&ct=citation&cd=0"
url = "http://" + host + link
filename = "cite0.bib"
print(url)
urllib.request.urlretrieve(url, filename)

raises this exception:

Traceback (most recent call last):
  File "C:UsersricardoDesktopGoogle-ScholarBibTextest2.py", line 8, in <module>
    urllib.request.urlretrieve(url, filename)
  File "C:Python32liburllibrequest.py", line 150, in urlretrieve
    return _urlopener.retrieve(url, filename, reporthook, data)
  File "C:Python32liburllibrequest.py", line 1597, in retrieve
    block = fp.read(bs)
ValueError: read of closed file

I thought this might be a temporary problem, so I added some simple exception handling like so:

import random
import time
import urllib.request

host = "scholar.google.com"
link = "/scholar.bib?q=info:K7uZdMSvdQ0J:scholar.google.com/&output=citation&hl=en&as_sdt=1,14&ct=citation&cd=0"
url = "http://" + host + link
filename = "cite0.bib"
print(url)
while True:
    try:
        print("Downloading...")
        time.sleep(random.randint(0, 5))
        urllib.request.urlretrieve(url, filename)
        break
    except ValueError:
        pass

but this just prints Downloading... ad infinitum.


Your URL return a 403 code error and apparently urllib.request.urlretrieve is not good at detecting all the HTTP errors, because it's using urllib.request.FancyURLopener and this latest try to swallow error by returning an urlinfo instead of raising an error.

About the fix if you still want to use urlretrieve you can override FancyURLopener like this (code included to also show the error):

import urllib.request
from urllib.request import FancyURLopener


class FixFancyURLOpener(FancyURLopener):

    def http_error_default(self, url, fp, errcode, errmsg, headers):
        if errcode == 403:
            raise ValueError("403")
        return super(FixFancyURLOpener, self).http_error_default(
            url, fp, errcode, errmsg, headers
        )

# Monkey Patch
urllib.request.FancyURLopener = FixFancyURLOpener

url = "http://scholar.google.com/scholar.bib?q=info:K7uZdMSvdQ0J:scholar.google.com/&output=citation&hl=en&as_sdt=1,14&ct=citation&cd=0"
urllib.request.urlretrieve(url, "cite0.bib")

Else and this is what i recommend you can use urllib.request.urlopen like so:

fp = urllib.request.urlopen('http://scholar.google.com/scholar.bib?q=info:K7uZdMSvdQ0J:scholar.google.com/&output=citation&hl=en&as_sdt=1,14&ct=citation&cd=0')
with open("citi0.bib", "w") as fo:
    fo.write(fp.read())
链接地址: http://www.djcxy.com/p/61532.html

上一篇: 是否可以使用SBT反射?

下一篇: 如何修复ValueError:读取关闭的文件异常?