Remove HTTP headers from a raw response
Let's say we make a request to a URL and get back the raw response, like this:
HTTP/1.1 200 OK
Date: Wed, 28 Apr 2010 14:39:13 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Set-Cookie: PREF=ID=e2bca72563dfffcc:TM=1272465553:LM=1272465553:S=ZN2zv8oxlFPT1BJG; expires=Fri, 27-Apr-2012 14:39:13 GMT; path=/; domain=.google.co.uk
Server: gws
X-XSS-Protection: 1; mode=block
Connection: close

<!doctype html><html><head>...</head><body>...</body></html>
What would be the best way to remove the HTTP headers from the response in C#? With regexes? Parsing it into some kind of HTTPResponse object and using only the body?
EDIT:
I'm using SOCKS to make the request; that's why I get the raw response.
Headers and body are separated by an empty line. It is really easier to do this without a regex: just search for the first empty line.
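As a rough illustration of the "search for the first empty line" idea, here is a minimal sketch. The class name RawHttp and the method SplitResponse are hypothetical names made up for this example, and it assumes the whole response has already been decoded into a string. It looks for CRLF CRLF first and, as an assumption about lenient servers, also accepts a bare LF LF.

```csharp
using System;

public static class RawHttp
{
    // Hypothetical helper (not from the original answers): splits a raw HTTP
    // response at the first empty line into the header block and the body.
    public static (string Headers, string Body) SplitResponse(string raw)
    {
        if (raw == null) throw new ArgumentNullException(nameof(raw));

        // The standard separator is CRLF CRLF; a bare LF LF is accepted
        // here as an assumption about non-conforming servers.
        string[] delimiters = { "\r\n\r\n", "\n\n" };
        int bestIndex = -1;
        int bestLength = 0;

        // Pick whichever delimiter occurs earliest in the response.
        foreach (string delimiter in delimiters)
        {
            int index = raw.IndexOf(delimiter, StringComparison.Ordinal);
            if (index >= 0 && (bestIndex < 0 || index < bestIndex))
            {
                bestIndex = index;
                bestLength = delimiter.Length;
            }
        }

        // No empty line found: treat everything as headers, with an empty body.
        if (bestIndex < 0) return (raw, string.Empty);

        return (raw.Substring(0, bestIndex), raw.Substring(bestIndex + bestLength));
    }
}
```

It could be called as var (headers, body) = RawHttp.SplitResponse(rawResponse); where rawResponse is whatever string came back over the socket. Note that this only splits the message: a body sent with chunked transfer encoding or compression would still need to be decoded separately.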
If you use the HttpWebRequest class, you get back an HttpWebResponse object, which in turn exposes the headers as a collection through its Headers property. You can then remove them, parse them, or do whatever you wish with them.
Note that using the substring method will leave you with a leading carriage return. I used this:
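// Headers end at the first empty line, i.e. the first CRLF CRLF sequence.
// (StringComparison requires "using System;".)
string HTTPHeaderDelimiter = "\r\n\r\n";
int delimiterIndex = RawHTTPResponse.IndexOf(HTTPHeaderDelimiter, StringComparison.Ordinal);
// Only accept a 200 OK response that actually contains the header/body separator.
if (RawHTTPResponse.StartsWith("HTTP/1.1 200 OK", StringComparison.Ordinal) && delimiterIndex > -1)
{
    // Skip past the whole delimiter so the payload has no leading CR/LF.
    HTTPPayload = RawHTTPResponse.Substring(delimiterIndex + HTTPHeaderDelimiter.Length);
}
else
{
    return;
}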