{% include toc.html %}
When you are working with online sources, much of the time you will be
using files that have been marked up with HTML (Hyper Text Markup
Language). Your browser already knows how to interpret HTML, which is
handy for human readers. Most browsers also let you see the HTML source code
for any page that you visit. The two images below show a typical web
page (from the Old Bailey Online) and the HTML source used to generate
that page, which you can see with the `Tools -> Web Developer -> Page Source`
menu item in Firefox.
When you're working in the browser, you typically don't want or need to see the source for a web page. If you are writing a page of your own, however, it can be very useful to see how other people accomplished a particular effect. You will also want to study HTML source as you write programs to manipulate web pages or automatically extract information from them.
{% include figure.html filename="obo.png" caption="Old Bailey Online screenshot" %}
{% include figure.html filename="obo-page-source.png" caption="HTML Source for Old Bailey Online Web Page" %}
(To learn more about HTML, you may find it useful at this point to work through the W3Schools HTML tutorial. Detailed knowledge of HTML isn't immediately necessary to continue reading, but any time that you spend learning HTML will be amply rewarded in your work as a digital historian or digital humanist.)
HTML is what is known as a markup language. In other words, HTML is
text that has been "marked up" with tags that provide information for
the interpreter (which is often a web browser). Suppose you are
formatting a bibliographic entry and you want to indicate the title of a
work by italicizing it. In HTML you use `em` tags ("em" stands for
emphasis). So part of your HTML file might look like this:
... in Cohen and Rosenzweig's <em>Digital History</em>, for example ...
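
Because the tags are plain text, a program can find them too. The following is a minimal sketch, assuming Python and its standard-library `html.parser` module; the class name `EmphasisCollector` is made up for this illustration. It collects whatever text sits between `em` tags in the line above.

```python
from html.parser import HTMLParser


class EmphasisCollector(HTMLParser):
    """Collect the text found between <em> and </em> tags."""

    def __init__(self):
        super().__init__()
        self.inside_em = False  # True while we are between <em> and </em>
        self.emphasized = []    # text fragments found inside em tags

    def handle_starttag(self, tag, attrs):
        if tag == "em":
            self.inside_em = True

    def handle_endtag(self, tag):
        if tag == "em":
            self.inside_em = False

    def handle_data(self, data):
        # Called for the plain text between tags.
        if self.inside_em:
            self.emphasized.append(data)


snippet = "... in Cohen and Rosenzweig's <em>Digital History</em>, for example ..."
collector = EmphasisCollector()
collector.feed(snippet)
collector.close()
print(collector.emphasized)  # ['Digital History']
```

The same approach would work for other tags by changing the tag name that the two handler methods check.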
The simplest HTML file consists of tags which indicate the beginning and
end of the whole document, and tags which identify a `head` and a `body`
within that document. Information about the file usually goes into the
head, whereas information that will be displayed on the screen usually
goes into the body.
<html>
<head></head>
<body>Hello World!</body>
</html>
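
To see the nesting of this minimal file more explicitly, one option is to let a parser walk through it and record each tag as it opens. This is only a sketch using Python's standard-library `html.parser`; the class name `OutlineBuilder` and the indentation scheme are assumptions made for the illustration.

```python
from html.parser import HTMLParser


class OutlineBuilder(HTMLParser):
    """Record each opening tag and piece of text, indented by nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # how many tags are currently open
        self.lines = []  # the outline, one entry per tag or text fragment

    def handle_starttag(self, tag, attrs):
        self.lines.append("  " * self.depth + tag)
        self.depth += 1

    def handle_endtag(self, tag):
        self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text:  # skip the line breaks between tags
            self.lines.append("  " * self.depth + repr(text))


minimal_page = """<html>
<head></head>
<body>Hello World!</body>
</html>"""

outline = OutlineBuilder()
outline.feed(minimal_page)
outline.close()
print("\n".join(outline.lines))
```

The printed outline shows `head` and `body` indented one level under `html`, with the text `Hello World!` indented under `body`.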
You can try creating some HTML code. In your text editor, create
a new file. Copy the code below into the editor. The first line tells
the browser what kind of file it is. The `html` tag has the `dir` (text
direction) attribute set to `ltr` (left to right) and the `lang` (language)
attribute set to `en-US` (US English). The `title` tag in the head of the
HTML document contains material that is
usually displayed in the top bar of a window when the page is being
viewed, and in Firefox tabs.
<!doctype html>
<html dir="ltr" lang="en-US">
<head>
<title><!-- Insert your title here --></title>
</head>
<body>
<!-- Insert your content here -->
</body>
</html>
Change both `<!-- Insert your title here -->` and `<!-- Insert your content here -->` to `Hello World!`
Save the file to your `programming-historian` directory as
`hello-world.html`. Now go to Firefox and choose `File -> New Tab` and
then `File -> Open File`. Choose `hello-world.html`. Depending on your
text editor you may have a 'view page in browser' or 'open in browser'
option. Once you have opened the file, your message should appear in the
browser. Note the difference between opening an HTML file with a browser
like Firefox (which interprets it) and opening the same file with your
text editor (which does not).
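
The edit to the template described above can also be made by a program rather than by hand. The sketch below assumes Python; the function name `fill_template` is hypothetical, and it does nothing more than replace the two placeholder comments with the text you supply.

```python
# The template from above, stored as an ordinary Python string.
template = """<!doctype html>
<html dir="ltr" lang="en-US">
<head>
<title><!-- Insert your title here --></title>
</head>
<body>
<!-- Insert your content here -->
</body>
</html>
"""


def fill_template(template, title, content):
    """Return the template with both placeholder comments replaced."""
    page = template.replace("<!-- Insert your title here -->", title)
    page = page.replace("<!-- Insert your content here -->", content)
    return page


page = fill_template(template, "Hello World!", "Hello World!")
print(page)
```

To the program the page is just a string of characters, much as it is to your text editor; only the browser interprets the tags. If you wanted to save the result, writing `page` to a file named `hello-world.html` would give you the same file you created by hand.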