Website form login using Python urllib2


I've been trying to learn to use the urllib2 package in Python. I tried to log in as a student (the left form) on a signup page for maths students: http://reg.maths.lth.se/. I have inspected the code (using Firebug), and the left form should apparently be submitted using POST with a key called pnr, whose value should be a string 10 characters long (the last part perhaps cannot be seen from the HTML code, but it is a social security number, so I know how long it should be). Note that the action in the header for the appropriate POST method is another URL, namely http://reg.maths.lth.se/login/student.

I tried the following (with a fake pnr in the example below; I used my real number in my own code).

    import urllib
    import urllib2

    url = 'http://reg.maths.lth.se/'
    values = dict(pnr='0000000000')
    data = urllib.urlencode(values)
    req = urllib2.Request(url, data)
    resp = urllib2.urlopen(req)
    page = resp.read()

    print page

While this executes, the print shows the source code of the original page http://reg.maths.lth.se/, so it doesn't seem that I am logged in. Also, I can add any key/value pairs to the values dictionary and it doesn't produce any error, which seems strange to me.

Also, if I go to the page http://reg.maths.lth.se/login/student, there is no POST method for submitting data.

Any suggestions?

If you inspect the request sent to the server when you enter the number and submit the form, you will notice that it is a POST request with pnr and _token parameters.

You are missing the _token parameter, which you need to extract from the HTML source of the page. It is a hidden input element:

<input name="_token" type="hidden" value="wrbj5x05vvdlzmgzqydfxkufcfsjsldhknmhtu6m"> 
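As a rough sketch of what "extracting the token" means, here is one way to read that hidden value using only the Python 3 standard library. The names TokenParser and extract_token are my own, and the sample string is just the element shown above wrapped in a form tag, so treat this as an illustration rather than the site's actual markup:

```python
from html.parser import HTMLParser


class TokenParser(HTMLParser):
    """Remember the value of the first <input name="_token"> element."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        is_token_input = tag == "input" and attributes.get("name") == "_token"
        if is_token_input and self.token is None:
            self.token = attributes.get("value")


def extract_token(html):
    """Return the _token value found in html, or None if there is none."""
    parser = TokenParser()
    parser.feed(html)
    return parser.token


# Sample markup (assumed): the hidden input from above inside a form.
sample = '<form><input name="_token" type="hidden" value="wrbj5x05vvdlzmgzqydfxkufcfsjsldhknmhtu6m"></form>'
print(extract_token(sample))
```

The token most likely changes between visits, so it has to be read from a freshly fetched page each time rather than hard-coded.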

I suggest looking into tools like mechanize, MechanicalSoup or RoboBrowser, which ease form submission. You may also parse the HTML with an HTML parser such as BeautifulSoup yourself, extract the token and send it via urllib2 or requests:

    import requests
    from bs4 import BeautifulSoup

    pnr = "00000000"

    url = "http://reg.maths.lth.se/"
    login_url = "http://reg.maths.lth.se/login/student"
    with requests.Session() as session:
        # extract the token from the hidden input on the main page
        response = session.get(url)
        soup = BeautifulSoup(response.content, "html.parser")
        token = soup.find("input", {"name": "_token"})["value"]

        # submit the form
        session.post(login_url, data={
            "_token": token,
            "pnr": pnr
        })

        # navigate to the main page again (should be logged in)
        response = session.get(url)

        soup = BeautifulSoup(response.content, "html.parser")
        print(soup.title)
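Since the question was about urllib2, here is a hedged sketch of the same flow with urllib.request, its Python 3 successor. It assumes the site tracks the login through cookies, which is why the opener carries a cookie jar. The helper names (make_opener, build_login_request, login) and the extract_token hook are hypothetical, and I have not run this against the live site:

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar

URL = "http://reg.maths.lth.se/"
LOGIN_URL = "http://reg.maths.lth.se/login/student"


def make_opener():
    """Return an opener that stores and resends cookies, like a session."""
    jar = CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def build_login_request(token, pnr):
    """Build the POST request carrying both form fields."""
    fields = {"_token": token, "pnr": pnr}
    body = urllib.parse.urlencode(fields).encode("ascii")
    # Supplying data makes urllib send a POST instead of a GET.
    return urllib.request.Request(LOGIN_URL, data=body)


def login(pnr, extract_token):
    """Fetch the form, post the login, and return the main page HTML.

    extract_token is a caller-supplied function that maps HTML text to
    the _token value (a hypothetical hook, e.g. a BeautifulSoup lookup).
    """
    opener = make_opener()
    html = opener.open(URL).read().decode("utf-8", "replace")
    opener.open(build_login_request(extract_token(html), pnr))
    # Same opener, so the cookies set during login are sent again.
    return opener.open(URL).read().decode("utf-8", "replace")
```

Note that the POST goes to the login URL rather than the front page, and that all three requests share one opener. Using a separate urlopen call for each step would drop the cookies in between.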
