I have used Python and Beautiful Soup for several screen scraping projects in the past. But I have a more general question about how to get at a particular web site that I cannot figure out. I know this is not directly a Python question, but I am not sure where to ask it.
My past experience is like this. When I go to a database-backed web page, there is often a form that you fill in with the information needed for a lookup. When the results come back, the URL sometimes shows exactly how to request that page with the form parameters as data, e.g.
www.testsite.com/my-webpage?param1=username
So once you know this, you can bypass the form and build the output URL directly, then scrape the data.
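As a rough sketch of what I mean, this is roughly how I build such a URL in Python. The host, path, and `param1` come from the made-up example above; the `http://` scheme and the function name `build_lookup_url` are just assumptions for illustration.

```python
from urllib.parse import urlencode

# Hypothetical base URL, taken from the made-up example above.
# The "http://" scheme is an assumption; the example shows none.
BASE_URL = "http://www.testsite.com/my-webpage"


def build_lookup_url(username, base_url=BASE_URL):
    """Return the results-page URL for a lookup, skipping the form."""
    # urlencode escapes spaces and punctuation in the value.
    query = urlencode({"param1": username})
    return f"{base_url}?{query}"


url = build_lookup_url("mitch smith")
print(url)  # http://www.testsite.com/my-webpage?param1=mitch+smith
```

The resulting URL is what I then fetch and hand to Beautiful Soup.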
The page I want appears to be served by a Microsoft stack (it is an .aspx page), and it returns something like
https://wwwtestsite.org/PublicAccess/Search.aspx?ID=300&RefineSearch=1
In other words, nothing in the URL indicates what I searched for, and nothing gives me a way to ask directly for a parameter like the “username” shown above.
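To make that concrete, here is a quick sketch that pulls apart the query string of the result URL above using only the standard library. The variable names are mine; the URL is copied as it appears in the address bar.

```python
from urllib.parse import urlsplit, parse_qs

# Result URL as shown in the browser after submitting the form.
result_url = "https://wwwtestsite.org/PublicAccess/Search.aspx?ID=300&RefineSearch=1"

# Split off the query string and parse it into a dict of lists.
params = parse_qs(urlsplit(result_url).query)
print(params)  # {'ID': ['300'], 'RefineSearch': ['1']}
```

Only `ID` and `RefineSearch` come back, so whatever I typed into the form is not being carried in the URL.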
Am I completely barking up the wrong tree here? Or would I need to do a more sophisticated analysis of the HTTP request and response, via the browser developer console or Wireshark?
So I am looking for very high-level guidance about (I guess) how to figure out the code behind the page.
Thanks,
Mitch