<!--#include virtual="commontop.html" -->
<title>Web Programming Step by Step, Lecture 1: Internet/WWW</title>
</head>
<body>
<div class="layout">
<div id="controls"><!-- DO NOT EDIT --></div>
<div id="currentSlide"><!-- DO NOT EDIT --></div>
<div id="header"></div>
<div id="footer">
<h1><em>Web Programming Step by Step</em>, Chapter 1</h1>
<h2>Internet/WWW; HTML Basics</h2>
</div>
</div>
<div class="presentation">
<div class="slide">
<h1><a href="http://www.webstepbook.com/">Web Programming Step by Step</a></h1>
<h3>Lecture 1<br /> Internet/WWW</h3>
<h4>Reading: Chapter 1</h4>
<p class="license">
Except where otherwise noted, the contents of this presentation are Copyright 2010 Marty Stepp and Jessica Miller.
</p>
<div class="w3c">
<a href="http://validator.w3.org/check/referer"><img src="images/w3c-xhtml11.png" alt="Valid XHTML 1.1" /></a>
<a href="http://jigsaw.w3.org/css-validator/check/referer"><img src="images/w3c-css.png" alt="Valid CSS!" /></a>
</div>
</div>
<div class="slide titleslide">
<h1>1.1: The Internet</h1>
<ul>
<li>
<strong>1.1: The Internet</strong>
</li>
<li>
1.2: The World Wide Web (WWW)
</li>
</ul>
</div>
<div class="slide">
<h1>The Internet</h1>
<div class="centerfigure">
<img src="images/internet.png" alt="The Internet" />
</div>
<ul>
<li>Wikipedia: <a href="http://en.wikipedia.org/wiki/Internet">http://en.wikipedia.org/wiki/Internet</a></li>
<li>a connection of computer networks using the Internet Protocol (IP)</li>
<li>What's the difference between the Internet and the World Wide Web (WWW)?</li>
</ul>
<div class="handout">
<ul>
<li>the Web is the collection of web sites and pages around the world; the Internet is larger and also includes other services such as email, chat, online games, etc.</li>
</ul>
</div>
</div>
<div class="slide">
<h1>
Brief history
<span class="readingsection">(1.1.1)</span>
</h1>
<ul>
<li>began as a US Department of Defense network called <a href="http://en.wikipedia.org/wiki/ARPANET">ARPANET</a> (1960s-70s)</li>
<li>initial services: electronic mail, file transfer</li>
<li>opened to commercial interests in late 80s</li>
<li>WWW created in 1989-91 by <a href="http://en.wikipedia.org/wiki/Tim_Berners-Lee">Tim Berners-Lee</a></li>
<li>popular web browsers released: Netscape 1994, IE 1995</li>
<li>Amazon.com opens in 1995; Google begins as a research project in January 1996</li>
<li class="incremental"><a href="http://www.webhamster.com/">Hamster Dance</a> web page created in 1999 <img src="images/hamster_dance.png" alt="hamster dance" class="incremental" style="vertical-align: top" /></li>
</ul>
</div>
<div class="slide">
<h1>Key aspects of the internet</h1>
<ul>
<li>subnetworks can stand on their own</li>
<li>computers can dynamically join and leave the network</li>
<li>built on open standards; anyone can create a new internet device</li>
<li>lack of centralized control (mostly)</li>
<li>everyone can use it with simple, commonly available software</li>
</ul>
</div>
<div class="slide">
<h1>
People and organizations
<span class="readingsection">(1.1.2)</span>
</h1>
<ul>
<li>Internet Engineering Task Force (<a href="http://en.wikipedia.org/wiki/Internet_Engineering_Task_Force">IETF</a>): internet protocol standards</li>
<li>Internet Corporation for Assigned Names and Numbers (<a href="http://en.wikipedia.org/wiki/ICANN">ICANN</a>): <br />
decides top-level <a href="http://news.com.com/ICANN+rejects+.xxx+domain/2100-1047_3-6071124.html">domain names</a></li>
<li>World Wide Web Consortium (<a href="http://en.wikipedia.org/wiki/World_Wide_Web_Consortium">W3C</a>): web standards</li>
</ul>
<div class="centerfigure">
<img src="images/ietf_logo.gif" alt="IETF" />
<img src="images/icann.jpg" alt="ICANN" />
<img src="images/w3c.png" alt="W3C" />
</div>
</div>
<div class="slide">
<h1>
Layered architecture
<span class="readingsection">(1.1.3)</span>
</h1>
<p>
The internet uses a layered hardware/software architecture (shown here as a simplified version of the "OSI model"):
<img class="rightfigure" src="images/osi_model.png" alt="OSI model" />
</p>
<ul>
<li><em>physical layer</em> : devices such as ethernet, coaxial cables, fiber-optic lines, modems</li>
<li><em>data link layer</em> : basic hardware protocols (ethernet, wifi, DSL PPP)</li>
<li><em>network / internet layer</em> : basic software protocol (IP)</li>
<li><em>transport layer</em> : delivers data to specific programs on top of the network layer; TCP also adds reliability (TCP, UDP)</li>
<li><em>application layer</em> : implements specific communication for each kind of program (HTTP, POP3/IMAP, SSH, FTP)</li>
</ul>
</div>
<div class="slide">
<h1>Internet Protocol (<a href="http://en.wikipedia.org/wiki/Internet_Protocol">IP</a>)</h1>
<ul>
<li>a simple protocol for attempting to send data between two computers</li>
<li>each device has a 32-bit IP address (in IPv4) written as four 8-bit numbers (0-255) <br />
<img src="images/fig1_ip_address.png" alt="IP address" style="width: 507px; margin-top: 20px;" />
</li>
<li>find out your internet IP address: <a href="http://www.whatismyip.com/">whatismyip.com</a></li>
<li>find out your local IP address:
<ul>
<li>in a terminal, type: <code>ipconfig</code> (Windows) or <code>ifconfig</code> (Mac/Linux)</li>
</ul>
</li>
</ul>
</div>
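<div class="slide">
<h1>Example: an IP address as one 32-bit number</h1>
<p>
To make the "four 8-bit numbers" idea concrete, here is a minimal Python sketch that packs a dotted address into a single 32-bit integer. The function name <code>dotted_to_int</code> is an illustrative choice, not part of any library; the address is one that appears elsewhere in this lecture, and the last line cross-checks the result against Python's standard <code>ipaddress</code> module.
</p>

```python
import ipaddress


def dotted_to_int(address: str) -> int:
    """Pack four 0-255 numbers into one 32-bit integer, high byte first."""
    parts = [int(p) for p in address.split(".")]
    if len(parts) != 4 or any(p not in range(256) for p in parts):
        raise ValueError(f"not a dotted IPv4 address: {address!r}")
    value = 0
    for part in parts:
        # Multiplying by 256 shifts the bits left by 8, making room for the next byte.
        value = value * 256 + part
    return value


example = "128.208.3.88"  # address used elsewhere in this lecture
as_int = dotted_to_int(example)
print(as_int)
print(format(as_int, "032b"))  # the same value as 32 binary digits

# Cross-check against the standard library's own conversion.
assert as_int == int(ipaddress.IPv4Address(example))
```

<p>
Each of the four numbers fills exactly 8 of the 32 bits, which is why no part can exceed 255.
</p>
</div>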
<div class="slide">
<h1>Transmission Control Protocol (<a href="http://en.wikipedia.org/wiki/Tcp_protocol">TCP</a>)</h1>
<ul>
<li>adds multiplexing and reliable, ordered message delivery on top of IP</li>
<li>
<strong>multiplexing</strong>: multiple programs using the same IP address
<ul>
<li><strong>port</strong>: a number given to each program or service</li>
<li>port 80: web server (HTTP); port 443 for secure browsing (HTTPS)</li>
<li>port 25: email (SMTP)</li>
<li>port 22: ssh</li>
<li>port 5190: AOL Instant Messenger</li>
<li><a href="http://en.wikipedia.org/wiki/List_of_TCP_and_UDP_port_numbers">more common ports</a></li>
</ul>
</li>
<li>some programs (games, streaming media programs) use simpler <a href="http://en.wikipedia.org/wiki/User_Datagram_Protocol">UDP</a> protocol instead of TCP</li>
</ul>
</div>
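<div class="slide">
<h1>Example: two services, one IP address</h1>
<p>
The multiplexing idea can be tried on a single machine. This rough Python sketch starts two tiny TCP services on the loopback address <code>127.0.0.1</code>, each on a different port chosen by the operating system, and then connects to each one. The helper names <code>start_service</code> and <code>ask</code> and the reply messages are made up for illustration; real services would use well-known ports such as 80 or 25.
</p>

```python
import socket
import threading


def start_service(reply: bytes) -> int:
    """Listen on 127.0.0.1 at an OS-chosen port; answer one client with `reply`."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
    listener.listen(1)
    port = listener.getsockname()[1]

    def serve_once() -> None:
        conn, _ = listener.accept()
        with conn:
            conn.sendall(reply)
        listener.close()

    threading.Thread(target=serve_once, daemon=True).start()
    return port


def ask(port: int) -> bytes:
    """Connect to 127.0.0.1 at `port` and read until the service closes."""
    chunks = []
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        while True:
            data = client.recv(1024)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


# Same IP address, two different ports, two different programs answering.
web_port = start_service(b"hello from the pretend web service")
mail_port = start_service(b"hello from the pretend mail service")
web_reply = ask(web_port)
mail_reply = ask(mail_port)
print(web_port, web_reply)
print(mail_port, mail_reply)
```

<p>
The port numbers printed will vary from run to run; what matters is that the port alone decides which service receives the connection.
</p>
</div>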
<div class="slide titleslide">
<h1>1.2: The World Wide Web (WWW)</h1>
<ul>
<li>
1.1: The Internet
</li>
<li>
<strong>1.2: The World Wide Web (WWW)</strong>
</li>
</ul>
</div>
<div class="slide">
<h1>
<a href="http://en.wikipedia.org/wiki/Web_server">Web servers</a> and
<a href="http://en.wikipedia.org/wiki/Web_browser">browsers</a>
<span class="readingsection">(1.2.1)</span>
</h1>
<div class="topfigure">
<img src="images/webserver.gif" alt="web server" />
</div>
<ul>
<li>
<strong>web server</strong>: software that listens for web page requests
<ul>
<li><a href="http://www.apache.org">Apache</a></li>
<li>Microsoft Internet Information Services (IIS) (<a href="http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/iiiisin2.mspx?mfr=true">part of Windows</a>)</li>
</ul>
</li>
<li><strong>web browser</strong>: fetches/displays documents from web servers
<img class="rightfigure" src="images/web_browser.jpg" alt="Firefox web browser" />
<ul>
<li><a href="http://www.getfirefox.com/">Mozilla Firefox</a></li>
<li>Microsoft <a href="http://www.microsoft.com/windows/products/winfamily/ie/">Internet Explorer</a> (IE)</li>
<li>Apple <a href="http://www.apple.com/safari/">Safari</a></li>
<li><a href="http://www.google.com/chrome/">Google Chrome</a></li>
<li><a href="http://www.opera.com/">Opera</a></li>
</ul>
</li>
</ul>
</div>
<div class="slide">
<h1>
Domain Name System (<a href="http://en.wikipedia.org/wiki/Dns">DNS</a>)
<span class="readingsection">(1.2.2)</span>
</h1>
<ul>
<li>a set of servers that map written names to IP addresses
<ul>
<li>Example: <code>www.cs.washington.edu</code> → <code>128.208.3.88</code></li>
</ul>
</li>
<li>many systems also keep a local table of name-to-address mappings, checked before any DNS server is asked, called a <a href="http://en.wikipedia.org/wiki/Hosts_file">hosts file</a>
<ul>
<li>Windows: <code><a href="file:///C:/Windows/system32/drivers/etc/hosts">C:\Windows\system32\drivers\etc\hosts</a></code></li>
<li>Mac: <code><a href="file:///private/etc/hosts">/private/etc/hosts</a></code></li>
<li>Linux: <code><a href="file:///etc/hosts">/etc/hosts</a></code></li>
</ul>
</li>
</ul>
</div>
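<div class="slide">
<h1>Example: looking up a name in a hosts table</h1>
<p>
As a rough illustration of the name-to-address mapping, the Python sketch below parses text in the usual hosts-file layout (an address followed by one or more names, with <code>#</code> starting a comment) and looks a name up in it. The sample contents, the alias <code>uwcs</code>, and the function names are hypothetical; a real resolver would go on to query a DNS server when the name is not found locally.
</p>

```python
def parse_hosts(text: str) -> dict[str, str]:
    """Map each host name in hosts-file text to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding spaces
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), address)  # the first entry for a name wins
    return table


def lookup(name: str, hosts: dict[str, str]) -> str:
    """Return the address for `name`, or raise if only a DNS query could answer."""
    try:
        return hosts[name.lower()]
    except KeyError:
        raise LookupError(f"{name} is not in the hosts table") from None


# Hypothetical hosts-file contents; the second mapping is the one from the slide.
SAMPLE = """
# address        names
127.0.0.1        localhost
128.208.3.88     www.cs.washington.edu  uwcs
"""

hosts = parse_hosts(SAMPLE)
print(lookup("www.cs.washington.edu", hosts))
```
</div>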
<div class="slide">
<h1>Uniform Resource Locator (<a href="http://en.wikipedia.org/wiki/Url">URL</a>)</h1>
<ul>
<li>an identifier for the location of a document on a web site</li>
<li>
a basic URL:
<pre>
<a href="http://www.aw-bc.com/info/regesstepp/index.html">http://www.aw-bc.com/info/regesstepp/index.html</a>
~~~~ ~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~
protocol host path
</pre>
</li>
<li>
when this URL is entered into the browser, the browser would:
<ul>
<li>ask the DNS server for the IP address of <code>www.aw-bc.com</code></li>
<li>connect to that IP address at port 80</li>
<li>ask the server to <code>GET /info/regesstepp/index.html</code></li>
<li>display the resulting page on the screen</li>
</ul>
</li>
</ul>
</div>
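<div class="slide">
<h1>Example: from URL to request</h1>
<p>
The browser's steps can be sketched with Python's standard <code>urllib.parse</code> module. The function <code>plan_request</code> is an illustrative name; it only works out what the browser would look up, connect to, and ask for, without touching the network. Falling back to port 80 assumes a plain <code>http</code> URL.
</p>

```python
from urllib.parse import urlsplit


def plan_request(url: str) -> dict:
    """Split a URL into the pieces a browser needs before it can fetch the page."""
    parts = urlsplit(url)
    path = parts.path or "/"  # an empty path means the site's root document
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,          # name to hand to DNS
        "port": parts.port or 80,        # assumption: plain http defaults to 80
        "request_line": f"GET {path}",   # command sent once connected
    }


plan = plan_request("http://www.aw-bc.com/info/regesstepp/index.html")
for key, value in plan.items():
    print(key, "=", value)
```
</div>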
<div class="slide">
<h1>More advanced URLs</h1>
<ul>
<li>
<strong>anchor</strong>: jumps to a given section of a web page
<pre>
<a href="http://www.textpad.com/download/index.html#downloads">http://www.textpad.com/download/index.html<strong>#downloads</strong></a>
</pre>
<ul>
<li>fetches <code>index.html</code> then jumps down to part of the page labeled <code>downloads</code></li>
</ul>
</li>
<li>
<strong>port</strong>: for web servers on ports other than the default 80
<pre>
http://www.cs.washington.edu<strong>:8080</strong>/secret/money.txt
</pre>
</li>
<li>
<strong>query string</strong>: a set of parameters passed to a web program
<pre>
<a href="http://www.google.com/search?q=miserable+failure&amp;start=10">http://www.google.com/search<strong>?q=miserable+failure&amp;start=10</strong></a>
</pre>
<ul>
<li>parameter <code>q</code> is set to <code>"miserable failure"</code> (the <code>+</code> encodes a space)</li>
<li>parameter <code>start</code> is set to <code>10</code></li>
</ul>
</li>
</ul>
</div>
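<div class="slide">
<h1>Example: taking an advanced URL apart</h1>
<p>
Here is a short Python sketch, using only the standard <code>urllib.parse</code> module, that pulls the anchor, port, and query string out of the URLs shown above. The query string is built with <code>urlencode</code> so that the encoding of the space as <code>+</code> is visible in both directions; the variable names are arbitrary.
</p>

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Anchor: the part after the '#' names a section of the page.
anchor = urlsplit("http://www.textpad.com/download/index.html#downloads")
print(anchor.path, anchor.fragment)

# Port: given after the host name when the server is not on the default port.
with_port = urlsplit("http://www.cs.washington.edu:8080/secret/money.txt")
print(with_port.hostname, with_port.port)

# Query string: urlencode joins the parameters and turns the space into '+'.
query = urlencode({"q": "miserable failure", "start": 10})
search_url = "http://www.google.com/search?" + query
print(search_url)

# parse_qs reverses the encoding; each parameter maps to a list of values.
params = parse_qs(urlsplit(search_url).query)
print(params)
```

<p>
Values come back from <code>parse_qs</code> as lists of strings because a parameter may legally appear more than once in a query string.
</p>
</div>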
<div class="slide">
<h1>
Hypertext Transfer Protocol
(<a href="http://en.wikipedia.org/wiki/Http_protocol">HTTP</a>)
<span class="readingsection">(1.2.3)</span>
</h1>
<ul>
<li>the set of commands understood by a web server and sent from a browser</li>
<li>
some HTTP commands (your browser sends these internally):
<ul>
<li><code>GET <em>filename</em></code> : download</li>
<li><code>POST <em>filename</em></code> : send a web form response</li>
<li><code>PUT <em>filename</em></code> : upload</li>
</ul>
</li>
<li>
simulating a browser with a terminal window:
<pre class="examplecode shell">
$ <em>telnet www.cs.washington.edu 80</em>
Trying 128.208.3.88...
Connected to 128.208.3.88 (128.208.3.88).
Escape character is '^]'.
<em>GET /index.html</em>
&lt;!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 ..."&gt;
&lt;html&gt;
...
</pre>
</li>
</ul>
</div>
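<div class="slide">
<h1>Example: sending a GET request from a program</h1>
<p>
The telnet session can be approximated in Python with a plain TCP socket. This is a sketch under some assumptions: it sends a minimal HTTP/1.0 request with a <code>Host</code> header, the function names are illustrative, and <code>fetch</code> is defined but not called here because the result depends on the network and on the server (many sites now redirect port 80 traffic to HTTPS).
</p>

```python
import socket


def build_request(host: str, path: str = "/") -> bytes:
    """Format a minimal HTTP/1.0 GET request; every line ends with CR LF."""
    lines = [
        f"GET {path} HTTP/1.0",
        f"Host: {host}",
        "",  # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, path: str = "/", port: int = 80) -> bytes:
    """Open a TCP connection, send the request, and return the raw response."""
    chunks = []
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(build_request(host, path))
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks)


def status_line(response: bytes) -> str:
    """Return the first line of a response, e.g. the protocol version and status."""
    return response.split(b"\r\n", 1)[0].decode("ascii", "replace")


# Show the exact text that would be sent; calling fetch() needs a network connection.
print(build_request("www.cs.washington.edu", "/index.html").decode("ascii"))
```

<p>
To try it against a real server, call <code>status_line(fetch("www.cs.washington.edu", "/index.html"))</code> from an interactive session.
</p>
</div>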
<div class="slide">
<h1>HTTP status (error) codes</h1>
<ul>
<li>the web server returns a numeric status code with every response; when something goes wrong, the number is a special "error code", possibly followed by an HTML document</li>
<li>common status codes:
<table class="standard">
<tr>
<th class="spaced">Number</th><th>Meaning</th>
</tr>
<tr>
<td>200</td>
<td>OK</td>
</tr>
<tr>
<td><a href="http://clsc.net/research/google-302-page-hijack.htm">301-303</a></td>
<td>page has moved (permanently or temporarily)</td>
</tr>
<tr>
<td><a href="http://www.cs.washington.edu/education/courses/cse190d/07sp/lectures/">403</a></td>
<td>you are forbidden to access this page</td>
</tr>
<tr>
<td><a href="http://www.homestarrunner.com/404.html">404</a></td>
<td>page not found</td>
</tr>
<tr>
<td>500</td>
<td>internal server error</td>
</tr>
<tr>
<td colspan="2" class="completelist"><a href="http://en.wikipedia.org/wiki/Http_error_codes">complete list</a></td>
</tr>
</table>
</li>
</ul>
</div>
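<div class="slide">
<h1>Example: reading a status code</h1>
<p>
The first digit of the number tells you the general kind of result. The Python sketch below groups the codes from the previous slide that way and looks up their standard phrases with <code>http.HTTPStatus</code> from the standard library. The <code>CLASSES</code> table and the function name <code>describe</code> are illustrative choices.
</p>

```python
from http import HTTPStatus

# General meaning of the first digit of a status code.
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code: int) -> str:
    """Return the code with its standard phrase and its general class."""
    kind = CLASSES.get(code // 100, "unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:  # a number the standard library does not list
        phrase = "unrecognized code"
    return f"{code} {phrase} ({kind})"


for code in (200, 301, 403, 404, 500):
    print(describe(code))
```
</div>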
<div class="slide">
<h1>Internet media ("<a href="http://en.wikipedia.org/wiki/Mime_type">MIME</a>") types</h1>
<ul>
<li>
sometimes when including resources in a page (style sheet, icon, multimedia object), we specify their type of data
<table class="standard">
<tr>
<th>MIME type</th>
<th class="slidetable">file extension</th>
</tr>
<tr class="code">
<td>text/html</td><td>.html</td>
</tr>
<tr class="code">
<td>text/plain</td><td>.txt</td>
</tr>
<tr class="code">
<td>image/gif</td><td>.gif</td>
</tr>
<tr class="code">
<td>image/jpeg</td><td>.jpg</td>
</tr>
<tr class="code">
<td>video/quicktime</td><td>.mov</td>
</tr>
<tr class="code">
<td>application/octet-stream</td><td>.exe</td>
</tr>
</table>
</li>
<li>Lists of MIME types: <a href="http://www.w3schools.com/media/media_mimeref.asp">by type</a>, <a href="http://www.webmaster-toolkit.com/mime-types.shtml">by extension</a></li>
</ul>
</div>
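<div class="slide">
<h1>Example: choosing a MIME type from a file name</h1>
<p>
A web server often picks the MIME type from the file extension. This small Python sketch does so using only the rows of the table on the previous slide; the dictionary and the function name <code>mime_type</code> are illustrative, and falling back to <code>application/octet-stream</code> for unknown extensions is an assumption of this sketch. Python's standard <code>mimetypes</code> module offers a much larger table.
</p>

```python
import os.path

# Only the extensions listed in the lecture's table.
MIME_BY_EXTENSION = {
    ".html": "text/html",
    ".txt": "text/plain",
    ".gif": "image/gif",
    ".jpg": "image/jpeg",
    ".mov": "video/quicktime",
    ".exe": "application/octet-stream",
}


def mime_type(filename: str) -> str:
    """Look up the MIME type by extension, ignoring upper/lower case."""
    extension = os.path.splitext(filename)[1].lower()
    # Assumption: anything unknown is treated as generic binary data.
    return MIME_BY_EXTENSION.get(extension, "application/octet-stream")


for name in ("index.html", "notes.txt", "PHOTO.JPG", "README"):
    print(name, "is", mime_type(name))
```
</div>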
<div class="slide">
<h1>
Web languages / technologies
<span class="readingsection">(1.2.4)</span>
</h1>
<ul>
<li>Hypertext Markup Language (<a href="http://en.wikipedia.org/wiki/Html">HTML</a>): used for writing web pages</li>
<li>Cascading Style Sheets (<a href="http://en.wikipedia.org/wiki/Cascading_Style_Sheets">CSS</a>): stylistic info for web pages</li>
<li>PHP: Hypertext Preprocessor (<a href="http://www.php.net/">PHP</a>): dynamically create pages on a web server</li>
<li><a href="http://en.wikipedia.org/wiki/JavaScript">JavaScript</a>: interactive and programmable web pages</li>
<li>Asynchronous JavaScript and XML (<a href="http://en.wikipedia.org/wiki/Ajax_%28programming%29">Ajax</a>): accessing data for web applications</li>
<li>eXtensible Markup Language (<a href="http://en.wikipedia.org/wiki/Xml">XML</a>): metalanguage for organizing data</li>
<li>Structured Query Language (<a href="http://en.wikipedia.org/wiki/Sql">SQL</a>): interaction with databases</li>
</ul>
</div>
<!--#include virtual="commonbottom.html" -->