How to pass a hidden recaptcha with mechanize?

I am trying to complete a form on a website automatically for academic purposes using Python's mechanize.

When a human completes the form and submits it, there is no recaptcha.

But when I fill in the form's controls via mechanize, there is a hidden control that appears to be a reCAPTCHA:

<HiddenControl(recaptcha_response_field=manual_challenge)>

Since this recaptcha is never shown to a human, I don't know what it is looking for, or for that matter what a manual_challenge is.

Thus my question is, how can I pass this challenge so I can continue with automation / mechanize?

I've posted the script I've been using below, in case some fault lies with it.

import mechanize

# constants
TEXT = "hello world!"

br = mechanize.Browser()
#ignore robots.txt
br.set_handle_robots(False)

br.addheaders = [('User-agent', 'Firefox')]

#open the page
response = br.open("http://somewebsite.com")

#this is the only form available 
br.select_form("form2")

br.form.set_all_readonly(False)

cText = br.form.find_control("text")
cText.value = TEXT

#now submit our response
response = br.submit()
br.back()

# verify the url for error checking
print(response.geturl())

# write the response body to a file (binary mode, since read() may return bytes)
s = response.read()
with open("test.txt", "wb") as w:
    w.write(s)

This site obviously has protection set up against robots like yours. If this is really for academic purposes, email them and ask for the data.

Getting around the site's protection measures is a different thing altogether, but you should look into how they know you are a bot: is there any JavaScript you are not running, are you sending the mechanize user agent, and so on. You probably don't want to enter that battlefield with them, though.
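As a starting point for that investigation, it can help to see which hidden fields the page serves and whether they only appear inside a <noscript> block, since a client that runs no JavaScript (such as mechanize) will pick those up while a normal browser will not. A control named recaptcha_response_field with the value manual_challenge resembles the no-JavaScript fallback markup of the older reCAPTCHA widget, but whether this site uses it that way is an assumption. The sketch below is Python 3 with only the standard library; the class name, the function name, and the sample markup are hypothetical and not taken from the site in the question.

```python
from html.parser import HTMLParser


class HiddenFieldLister(HTMLParser):
    """Collect (name, value, inside_noscript) for every hidden <input>."""

    def __init__(self):
        super().__init__()
        self.noscript_depth = 0   # > 0 while inside a <noscript> element
        self.hidden_fields = []

    def handle_starttag(self, tag, attrs):
        if tag == "noscript":
            self.noscript_depth += 1
        elif tag == "input":
            attr = dict(attrs)
            if (attr.get("type") or "").lower() == "hidden":
                self.hidden_fields.append(
                    (attr.get("name"), attr.get("value"), self.noscript_depth > 0)
                )

    def handle_endtag(self, tag):
        if tag == "noscript" and self.noscript_depth > 0:
            self.noscript_depth -= 1


def list_hidden_fields(html):
    """Return the hidden inputs found in an HTML string, in document order."""
    parser = HiddenFieldLister()
    parser.feed(html)
    parser.close()
    return parser.hidden_fields


# Hypothetical markup for illustration only.
SAMPLE_HTML = """
<form name="form2" method="post">
  <textarea name="text"></textarea>
  <noscript>
    <input type="hidden" name="recaptcha_response_field" value="manual_challenge">
  </noscript>
  <input type="hidden" name="token" value="abc">
</form>
"""

for name, value, in_noscript in list_hidden_fields(SAMPLE_HTML):
    print(name, value, "noscript-only" if in_noscript else "always present")
```

To try this on a real page, pass the decoded page source to list_hidden_fields. If the recaptcha field turns out to be noscript-only, that would suggest the site treats clients without JavaScript differently, which is one more reason to ask the site owners for the data directly.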
