Why, when I retrieve the response from a URL using HttpWebRequest, do I
end up with HTML that is different from what IE gets, even if I set the
HTTP_USER_AGENT to be the same as IE's?

Here's a super-simple example. Note that in IE when you load this
URL, there are "£" symbols in front of all of the prices (verified by
View | Source), but in the HttpWebRequest response there are not...the
actual HTML is definitely different...

HttpWebRequest eRequest =
(HttpWebRequest)WebRequest.Create(
"http://cgi6.ebay.co.uk/aw-cgi/eBayISAPI.dll?ViewBids&item=2340661957");
eRequest.Headers.Add("HTTP_USER_AGENT",
"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)");
eRequest.Headers.Add("HTTP_ACCEPT", "*/*");
eRequest.Headers.Add("HTTP_ACCEPT_LANGUAGE", "en-us");

WebResponse eResponse = eRequest.GetResponse();

string eContent = new
StreamReader(eResponse.GetResponseStream()).ReadToEnd();
Debug.WriteLine(eContent);

Any help appreciated.

BTW - I thought it might be cookies, so I disabled cookies in IE to
see if that made a difference - it doesn't!

Re: Why is HttpWebRequest different than IE? by John

John
Fri Aug 29 16:07:19 CDT 2003

"Dan" <turbo_pub@certes.net> wrote in message
news:67b944e5.0308291239.6c2c8a95@posting.google.com...
> Why when I retrieve the response from a URL using HttpWebRequest, do I
> end up with HTML that is different than IE, even if I set the
> HTTP_USER_AGENT to be the same as IE?
>
> Here's a super-simple example. Note that in IE when you load this
> URL, there are "£" symbols in front of all of the prices (verified by
> View | Source), but in the HttpWebRequest response there are not...the
> actual HTML is definitely different...
>
> HttpWebRequest eRequest =
> (HttpWebRequest)WebRequest.Create(
> "http://cgi6.ebay.co.uk/aw-cgi/eBayISAPI.dll?ViewBids&item=2340661957");
> eRequest.Headers.Add("HTTP_USER_AGENT",
> "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR
> 1.0.3705)");
> eRequest.Headers.Add("HTTP_ACCEPT", "*/*");
> eRequest.Headers.Add("HTTP_ACCEPT_LANGUAGE", "en-us");
>
> WebResponse eResponse = eRequest.GetResponse();
>
> string eContent = new
> StreamReader(eResponse.GetResponseStream()).ReadToEnd();
> Debug.WriteLine(eContent);
>
> Any help appreciated.
>
> BTW - I thought it might be cookies, so I disabled cookies in IE to
> see if that made a difference - it doesn't!

This sounds like an encoding difference.

You should look at the messages sent and received by IE and your program
with a network sniffer like ProxyTrace from http://pocketsoap.com. You could
compare them and see the differences.
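
If a sniffer is not at hand, a rough first comparison can also be made
from code by printing what the program sends and what the server
returns. The following is only a sketch: the class name HeaderDump and
the BuildRequest helper are made up for illustration, and the URL is
passed on the command line. It also uses the header names as they
appear on the wire. HTTP_USER_AGENT is the CGI/server-variable
spelling; the actual header is User-Agent, which HttpWebRequest
exposes through its UserAgent property.

```csharp
using System;
using System.Net;

public static class HeaderDump
{
    // Hypothetical helper: builds the request with real HTTP header names.
    // User-Agent and Accept are set through their dedicated properties.
    public static HttpWebRequest BuildRequest(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.UserAgent =
            "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)";
        request.Accept = "*/*";
        request.Headers.Add("Accept-Language", "en-us");
        return request;
    }

    public static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("usage: HeaderDump <url>");
            return;
        }

        HttpWebRequest request = BuildRequest(args[0]);
        Console.WriteLine("--- request headers ---");
        Console.WriteLine(request.Headers);

        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            Console.WriteLine("--- response headers ---");
            Console.WriteLine(response.Headers);
            // A Content-Type without "charset=" means the server did not
            // declare an encoding, so the client has to guess one.
            Console.WriteLine("Content-Type: " + response.ContentType);
        }
    }
}
```

This only shows the headers as the framework sees them, so a sniffer
is still the better tool for comparing against what IE really sends.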

--
John Saunders
Internet Engineer
john.saunders@surfcontrol.com




Re: Why is HttpWebRequest different than IE? by Joerg

Joerg
Sat Aug 30 10:02:46 CDT 2003

Dan wrote:

> Why when I retrieve the response from a URL using HttpWebRequest, do
> I end up with HTML that is different than IE, even if I set the
> HTTP_USER_AGENT to be the same as IE?
>
> Here's a super-simple example. Note that in IE when you load this
> URL, there are "£" symbols in front of all of the prices (verified by
> View | Source), but in the HttpWebRequest response there are
> not...the actual HTML is definitely different...
>
> HttpWebRequest eRequest =
> (HttpWebRequest)WebRequest.Create(
> "http://cgi6.ebay.co.uk/aw-cgi/eBayISAPI.dll?ViewBids&item=2340661957");
> eRequest.Headers.Add("HTTP_USER_AGENT",
> "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)");
> eRequest.Headers.Add("HTTP_ACCEPT", "*/*");
> eRequest.Headers.Add("HTTP_ACCEPT_LANGUAGE", "en-us");
>
> WebResponse eResponse = eRequest.GetResponse();
>
> string eContent = new
> StreamReader(eResponse.GetResponseStream()).ReadToEnd();
> Debug.WriteLine(eContent);

The console almost certainly uses a different encoding than the actual
content received from the web server. Thus, dumping non-ASCII web
content to the console "as is" is going to give you some funny or
missing characters.
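
One way to take the console (or debug output window) out of the
picture is to write the text to a file in a known encoding and open
that file in a viewer that understands it. A minimal sketch, where the
class name DumpToFile, the Save helper, and the file name are all
invented for illustration:

```csharp
using System;
using System.IO;
using System.Text;

public static class DumpToFile
{
    // Writes text to a file as UTF-8, sidestepping whatever code page
    // the console or debug window happens to use.
    public static void Save(string path, string content)
    {
        using (StreamWriter writer = new StreamWriter(path, false, Encoding.UTF8))
        {
            writer.Write(content);
        }
    }

    public static void Main()
    {
        // Hypothetical output location; any writable path will do.
        string path = Path.Combine(Path.GetTempPath(), "response-dump.html");
        Save(path, "<b>\u00A310.00</b>");   // \u00A3 is the pound sign
        Console.WriteLine("Wrote " + path);
    }
}
```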

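In addition to that, you blindly decode the response assuming it's
encoded with UTF-8 (the default for the StreamReader constructor you
are using). Unfortunately, www.ebay.co.uk is using an 8-bit encoding,
in which "£" is a single byte that is not valid UTF-8 on its own, so
the decoder drops or mangles it. Which encoding it is, the server does
not bother to specify, but since it's running on IIS4, Windows-1252 is
a safe bet ;-)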
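
To make this concrete, here is a small sketch that decodes the same
bytes both ways. It assumes the Windows-1252 guess is right, and the
class name DecodeDemo and the ReadAll helper are made up for the
example. The RegisterProvider line is only there for current .NET
versions, where code page 1252 is not available by default; on the
.NET Framework it should not be needed.

```csharp
using System;
using System.IO;
using System.Text;

public static class DecodeDemo
{
    // Reads a whole stream using an explicit encoding instead of the
    // StreamReader default (UTF-8).
    public static string ReadAll(Stream body, Encoding encoding)
    {
        using (StreamReader reader = new StreamReader(body, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Assumption: running on .NET Core / .NET 5 or later.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding windows1252 = Encoding.GetEncoding(1252);

        // The text "£10" as an 8-bit server would send it:
        // 0xA3 is the pound sign in Windows-1252.
        byte[] raw = { 0xA3, 0x31, 0x30 };

        string asUtf8 = ReadAll(new MemoryStream(raw), Encoding.UTF8);
        string as1252 = ReadAll(new MemoryStream(raw), windows1252);

        // A lone 0xA3 is not valid UTF-8, so the pound sign does not survive.
        Console.WriteLine(asUtf8.Contains("\u00A3"));   // False
        Console.WriteLine(as1252 == "\u00A310");        // True
    }
}
```

Applied to the code in the original post, that would mean passing the
encoding as a second argument, along the lines of
new StreamReader(eResponse.GetResponseStream(), Encoding.GetEncoding(1252)).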

Cheers,
--
Joerg Jooss
joerg.jooss@gmx.net