python - Detecting forms (and filling them in) with Scrapy


I'm struggling to find a generic approach to detecting a form in HTML and submitting it. When the page structure is known in advance for a given page, I of course have several options:

-- Selenium/WebDriver (by filling in the fields and 'clicking' the button; a rough sketch follows the requests example below)

-- determining the form of the POST query manually and reconstructing it with urllib2 directly:

import urllib2
import urllib
import lxml.html as lh

url = "http://apply.ovoenergycareers.co.uk/vacancies/#results"
params = urllib.urlencode([('field_36[]', 73), ('field_37[]', 76),
                           ('field_32[]', 82)])
response = urllib2.urlopen(url, params)

or with requests:

import requests

r = requests.post("http://apply.ovoenergycareers.co.uk/vacancies/#results",
                  data='manager')
r.text
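For completeness, the Selenium/WebDriver route mentioned as the first option might look roughly like this; the CSS selector simply targets the form's first text input, which is an assumption about the page rather than something taken from it:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
driver.get("http://apply.ovoenergycareers.co.uk/vacancies/")

# Assumes the search box is the form's first text input.
search_box = driver.find_element(By.CSS_SELECTOR, "form input[type='text']")
search_box.send_keys("manager")
search_box.submit()          # submits the enclosing form

print(driver.page_source)    # results page after the submission
driver.quit()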

But although all such forms involve a POST request, input fields and a submit button, they vary in implementation under the hood. When the number of pages to be scraped gets into the hundreds, it's not feasible to define a custom form-filling approach for each one.

My understanding is that Scrapy's main added value is its ability to follow links. I presume this could include links arrived at via form submission. Can that ability be used to build a generic approach to "following" a form submission?

Clarification: in the case of a form with several dropdown menus, I am typically leaving these at their default values and only filling in a search term in the text input field. Locating that field and 'filling it in' is the main challenge here.
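One way to prototype the "locate the text field and fill it in" part generically is with lxml's form helpers. A minimal sketch, assuming the form of interest is the first one on the page, that its first text input is the search box, and that the form is submitted via POST:

import lxml.html
import requests

url = "http://apply.ovoenergycareers.co.uk/vacancies/"
doc = lxml.html.fromstring(requests.get(url).text)
doc.make_links_absolute(url)             # so form.action becomes a full URL

form = doc.forms[0]                      # assumes the form of interest comes first
for field in form.inputs:
    # the first text/search input is treated as the search box
    if getattr(field, "type", None) in ("text", "search"):
        form.fields[field.name] = "manager"   # the search term from the question
        break

# Dropdowns keep whatever value the page pre-selected; only the text field changes.
response = requests.post(form.action or url, data=form.form_values())
print(response.text)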

Link extractors cannot follow form submissions in Scrapy. There is, however, a mechanism called FormRequest that is designed to ease submitting forms.

Note that FormRequest cannot handle forms when JavaScript is involved in the submission.
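To illustrate, here is a minimal spider using FormRequest.from_response: it pre-populates the request with the form's existing values (so dropdowns keep their page defaults) and overrides only the search field. The field name 'keywords' and the result selector are assumptions for illustration, not taken from the actual site:

import scrapy

class VacancySpider(scrapy.Spider):
    name = "vacancies"
    start_urls = ["http://apply.ovoenergycareers.co.uk/vacancies/"]

    def parse(self, response):
        # from_response copies the form's current values and overrides
        # only what is given in formdata.
        yield scrapy.FormRequest.from_response(
            response,
            formdata={"keywords": "manager"},   # hypothetical field name
            callback=self.parse_results,
        )

    def parse_results(self, response):
        # Illustrative extraction from the results page.
        for title in response.css("h3::text").extract():
            yield {"title": title}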

