
Thread: HTTP Basics

  1. #1
    Junior Member
    Join Date
    Dec 2002
    Posts
    14

    HTTP Basics

    As I was searching through AntiOnline, looking for ways a "newbie" like me could contribute, I realized that nobody had posted (or at least, I could not find) a general tutorial on HTTP. "Hey!" I said to myself, "I know (in general) how HTTP works! I'm sure someone would appreciate a tutorial on HTTP." So, I have set off to write this thing, and I hope it will be useful to someone out there.

    HTTP, or the Hypertext Transfer Protocol, is one of the most important protocols used on the Internet today. It is the protocol that web clients (browsers) use to transfer web pages (and other files) from a web server to your computer. It can also be used to transfer files from your computer to a web server. Actually, the term "files" can be confusing, because when I send this tutorial in, I am not sending a file per se; I am sending the text that I am typing into the form. A better term (the one used in the RFC for HTTP) is resource; a resource can be any type of data. By the way, HTTP is defined in RFC2616 (which I am going to be referring to a lot during the course of writing this). I would suggest that you read the RFC after reading this, because I am not going to be creating a comprehensive list of request methods, status codes, and headers. I could not shorten the RFC's lists without leaving out important parts, and I do not think I should copy and paste the lists into here; that would be a waste of space.

    Even though the RFC can be confusing, HTTP is really very simple. Still, I think I'll include an example that you can try on your computer before confusing you with too many definitions. You can send HTTP messages to a server from any basic Telnet client. Just open up your Telnet client and connect on port 80 to any computer running a web server (not every computer will have its web server running on port 80, but most will). Here's what I did to connect to google.com:
    Code:
    telnet> open google.com 80
    Trying 216.239.35.100...
    Connected to www.google.com (216.239.35.100).
    Escape character is '^]'.
    Oh, before I go on: everyone, please do not use google.com. The steps I am outlining are standardized and should work on any web server. I do not want Google to think that they are being attacked by hackers. Also, no, in case you were getting suspicious, this is not illegal; this is the same thing your web client does (except your web client probably does it faster than you can type). Okay, now on to our next step: requesting a page.
    Code:
    GET / HTTP/1.1
    
    Note the extra blank line in the code segment. It is necessary. Just press Enter twice after you have typed the "GET / HTTP/1.1". This segment of code is fairly self-explanatory. You want to get the "/" page (actually, you are asking for the index of the "/" folder of the web site, since you do not know the name of any files). "HTTP/1.1" simply indicates to the server that you are using version 1.1 of HTTP instead of 1.0 or (gasp) 0.9. All this, so far, is equivalent to typing "http://google.com/" into your web client's address bar. After pressing Enter twice, you should get a response similar to this:
    Code:
    HTTP/1.1 200 OK
    Content-Length: 2486
    Server: GWS/2.0
    Date: Wed, 01 Jan 2003 21:34:40 GMT
    Content-Type: text/html
    Cache-control: private
    Set-Cookie: PREF=ID=109a32574e737804:TM=1041456880:LM=1041456880:S=IQSsfydCTaF_UqVz; expires=Sun, 17-Jan-2038 19:14:07 GMT; path=/; domain=.google.com
    
    <html>
    ...
    </html>
    I replaced the HTML code with ellipses because the actual HTML has no significance in this tutorial. The first line that you get is the status line of the reply, which carries the status code. In this case, the code was 200 OK. This is the response that you probably want your client to receive when you are surfing the web. One code that I am sure everyone is familiar with is the 404 Not Found code (grimace). The rest of the lines, up to the blank line, are headers. They tell you about the resource, and one of them asks your client to set a cookie (which I would not worry about at this point).
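
    For anyone who would rather script the exchange than type it, here is a minimal Python sketch of the same transaction. To avoid poking at a real site, it assumes a throwaway server on 127.0.0.1 built from Python's standard http.server module. HelloHandler and fetch_raw are names made up for this example, and the canned HTML body is only a placeholder.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a real web server: every GET gets the same page."""

    def do_GET(self):
        body = b"<html>...</html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_raw(host, port, request):
    """Send raw request bytes and return everything the server sends back."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# The same bytes typed into Telnet above: the request line, then a blank line.
response = fetch_raw("127.0.0.1", port, b"GET / HTTP/1.1\r\n\r\n")

server.shutdown()
server.server_close()

status_line = response.split(b"\r\n", 1)[0].decode("ascii")
print(status_line)
```

    The test server closes the connection after one response, which is what lets the read loop finish. A real server may keep the connection open, so a real client would have to pay attention to Content-Length instead.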

    Now that you have gotten a (hopefully) clear example of a simple HTTP transaction, I will go into some definitions of what an HTTP message is. All HTTP messages are either requests, like what we typed, or responses, like what we received. Each message starts with a request line, like "GET / HTTP/1.1", or a status line, like "HTTP/1.1 200 OK". Note that when I say line, I mean a string of text followed by a CRLF. CRLF is a carriage return character followed by a line feed character; the RFC also recommends that tolerant applications accept a lone line feed as a line break. In short, in this case, CRLF = Enter key equivalent. I hope that did not confuse anyone. After the request or status line come as many header lines, like "Content-Type: text/html", as you or the server want. Then comes a blank line and, depending on the type of request or status line, a message body.
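
    The layout described above (start line, header lines, blank line, body) can be sketched in a few lines of Python. This is only an illustration: split_message is a made-up name, and the sketch assumes strict CRLF line endings and one header per line, which is less than a real parser has to cope with.

```python
def split_message(raw):
    """Split a raw HTTP message into (start_line, headers, body).

    Assumes CRLF line endings and no folded (multi-line) headers.
    """
    # The first blank line separates the head of the message from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers = []
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return start_line, headers, body


raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Length: 16\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<html>...</html>"
)
start_line, headers, body = split_message(raw)
print(start_line)  # the status line
print(headers)     # list of (name, value) pairs
print(body)        # whatever follows the blank line
```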

    I should probably describe requests in more depth. First, a request line consists of: "Method URL HTTP-Version CRLF". Then, as I said before, you have optional header lines, a blank line, and an optional message body. An example of a method would be "GET". An example of a URL would be "/". An example of an HTTP-Version would be "HTTP/1.1".
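
    As a rough sketch of that structure, the Python below assembles a request from its parts and takes a request line apart again. The function names build_request and parse_request_line are invented for this example, and the code assumes single spaces between the three parts of the request line.

```python
def build_request(method, url, version="HTTP/1.1", headers=(), body=b""):
    """Assemble request line + header lines + blank line + optional body."""
    request_line = f"{method} {url} {version}\r\n"
    header_lines = "".join(f"{name}: {value}\r\n" for name, value in headers)
    return (request_line + header_lines + "\r\n").encode("ascii") + body


def parse_request_line(line):
    """Split 'Method URL HTTP-Version CRLF' into its three parts."""
    method, url, version = line.rstrip("\r\n").split(" ")
    return method, url, version


print(build_request("GET", "/"))
print(parse_request_line("GET / HTTP/1.1\r\n"))
```

    With no headers and no body, build_request("GET", "/") produces exactly the two "lines" typed into Telnet earlier: the request line and the blank line.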

    Well, I hope this has given you enough information to have a general understanding of web clients and servers. I definitely have not given you enough information to program a full-fledged client or server though. If you want to do that, you really need to read the RFC. Also, if you are interested, read RFC2617, HTTP Authentication. That RFC goes into how password-protected web sites can be created and accessed via HTTP.

    Okay, that's it. If anyone has any suggestions or corrections, please send them to me. Thanks for reading.
    Binary005

  2. #2
    Great read and info! Binary005
    Thanks for the rfc#

  3. #3
    Senior Member
    Join Date
    Jan 2002
    Posts
    1,207
    Also note that many web servers these days will not respond correctly to a request unless you include a "Host:" header due to host-header based virtual hosting being used.

    So if you want to probe a particular site to see what headers, etc., it sends back, be sure to send "Host:" in there.

    The Host: header should be like
    Code:
    Host: www.google.com
    or whatever. The web server matches it against sites it runs, and if there is a match then serves that site. If there is no match, Apache sends the default site, but IIS just responds with "No web server is configured at this address" or something.
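
    To make the matching step concrete, here is a toy Python sketch of host-header based virtual hosting with a default-site fallback. It is not how Apache or IIS is actually implemented; host_from_request, pick_site, the site names, and the directory paths are all hypothetical, and the sketch ignores details such as a port number in the Host value.

```python
def host_from_request(raw):
    """Return the value of the Host header (lowercased), or None if absent."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return value.strip().lower()
    return None


def pick_site(raw, sites, default):
    """Choose the site whose name matches the Host header, else the default."""
    return sites.get(host_from_request(raw), default)


# Hypothetical sites sharing one IP address.
sites = {
    "www.example.com": "/srv/example",
    "blog.example.com": "/srv/blog",
}

with_host = b"GET / HTTP/1.1\r\nHost: blog.example.com\r\n\r\n"
without_host = b"GET / HTTP/1.1\r\n\r\n"

print(pick_site(with_host, sites, "/srv/default"))
print(pick_site(without_host, sites, "/srv/default"))
```

    The fallback to a default site mirrors the Apache behaviour described above; a server that refuses unmatched requests would return an error at that point instead.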

  4. #4
    Just Another Geek
    Join Date
    Jul 2002
    Location
    Rotterdam, Netherlands
    Posts
    3,401
    Originally posted here by slarty
    Also note that many web servers these days will not respond correctly to a request unless you include a "Host:" header due to host-header based virtual hosting being used.

    So if you want to probe a particular site to see what headers, etc., it sends back, be sure to send "Host:" in there.

    The Host: header should be like
    Code:
    Host: www.google.com
    or whatever. The web server matches it against sites it runs, and if there is a match then serves that site. If there is no match, Apache sends the default site, but IIS just responds with "No web server is configured at this address" or something.
    You are correct. HTTP/1.1 requires a Host: header. Only HTTP/1.0 requests are valid without one.

  5. #5
    Junior Member
    Join Date
    Dec 2002
    Posts
    14

    Glad people liked it.

    I'm glad people liked it; I tried to be as technically accurate as possible. I did not know the host thing was necessary; I didn't get an indication of that from the RFC. However, I have trouble reading standards. They get really confusing for me. I am really impressed with the level of knowledge some of you have.
    Binary005

  6. #6
    Junior Member
    Join Date
    Dec 2002
    Posts
    2
    helpful,thx

  7. #7

    Question

    Hey, I'm kinda new to this and I've got a question...
    I'm on a DSL connection, and if I want to connect to a remote computer with Telnet,
    do I have to connect through my regular 56k dial-up modem? Is it possible to configure Telnet
    to connect through the DSL modem?

  8. #8
    Junior Member
    Join Date
    Nov 2002
    Posts
    21
    This really helps. I wish I could have read this when I was a noob.

  9. #9
    Junior Member
    Join Date
    Dec 2002
    Posts
    14

    bootieofdarkness

    bootieofdarkness...lol, I wish I could come up with creative names like that. Well, have you actually tried to connect to a server with Telnet? There shouldn't be any problem with DSL. A connection to the Internet is a connection to the Internet. Now, there may be a problem with certain ISPs that you might be using (*cough AOL cough*). I don't know though; I never used AOL. Anyway, just try to telnet to a server out there, and if you cannot get it to work, come back and give us more details. It is hard to figure out a problem if you don't know what factors are involved.
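
    If Telnet itself is the unknown factor, one way to narrow things down is to test the TCP connection directly. The Python sketch below only checks whether a connection to the given port can be opened at all; can_connect is a made-up helper name, and the five-second timeout is an arbitrary choice.

```python
import socket


def can_connect(host, port=80, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name lookup failures.
        return False
```

    Calling can_connect with the name of a web server you are allowed to test and getting True back means the network path is fine and the problem lies with the Telnet client or how it is being used. False points at the connection, a firewall, or the host name.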
    Binary005

  10. #10
    Yeah, I tried connecting a couple of times and it didn't work, and I'm not an AOL user, so I don't know what the problem could be.
    Squirrels have bushy tails
    I cut them off, then I laugh.
    That squirrel has no tail.
