HTML5: Getting Started with HTML5

Elements

Elements are marked up using start tags and end tags. Tags are delimited using angle brackets with the tag name in between. The difference between start tags and end tags is that the latter include a slash before the tag name.

Example:

This example paragraph illustrates the use of start tags and end tags.

<p>The quick brown fox jumps over the lazy dog.</p>

In both tags, whitespace is permitted between the tag name and the closing right angle bracket; however, it is usually omitted because it’s redundant.

In XHTML, tag names are case sensitive and are usually defined to be written in lowercase. In HTML, however, tag names are case insensitive and may be written in all uppercase or mixed case, although the most common convention is to stick with lowercase. The case of the start and end tags does not have to be the same, but being consistent does make the code look cleaner.

HTML Example:

<DIV>...</DIV>

An empty element is any element that does not contain any content within it. In general, an empty element is just one with a start tag immediately followed by its associated end tag. In both HTML and XHTML syntaxes, this can be represented in the same way.

Example:

<span></span>

Some elements, however, are forbidden from containing any content at all. These are known as void elements. In HTML, the above syntax cannot be used for void elements. For such elements, the end tag must be omitted because the element is automatically closed by the parser. Such elements include, among others, br, hr, link and meta.

HTML Example:

<link type="text/css" rel="stylesheet" href="style.css">

In XHTML, the XML syntactic requirements dictate that the closing of a void element must be made explicit using either an explicit end tag, as above, or the empty element syntax. This is achieved by inserting a slash at the end of the start tag immediately before the right angle bracket.

Example:

<link type="text/css" rel="stylesheet" href="style.css"/>

Authors may optionally choose to use this same syntax for void elements in the HTML syntax as well. Some authors also choose to include whitespace before the slash; however, this is not necessary. (Using whitespace in that fashion is a convention inherited from the compatibility guidelines in XHTML 1.0, Appendix C.)
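
To make the three spellings concrete, the following sketch writes the same void element each way in the HTML syntax. The br element and the surrounding text are chosen only for illustration; any void element behaves the same.

```html
<!-- In the HTML syntax, all three spellings produce an identical br element. -->
<p>First line<br>
second line<br/>
third line<br />
fourth line</p>
```

Only the two forms with a slash are also well-formed XML, so those are the ones to use if the markup might be processed as XHTML.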

Attributes

Elements may contain attributes that are used to set various properties of an element. Some attributes are defined globally and can be used on any element, while others are defined for specific elements only. All attributes have a name and a value and look like this.

Example:
This example illustrates how to mark up a div element with an attribute named class using a value of "example".

<div class="example">...</div>

Attributes may only be specified within start tags and must never be used in end tags.

Erroneous Example:

<section id="example">...</section id="example">

In XHTML, attribute names are case sensitive and most are defined to be lowercase. In HTML, attribute names are case insensitive, and so they could be written in all uppercase or mixed case, depending on your own preferences. It is conventional, however, to use the same case as would be used in XHTML, which is generally all lowercase.

HTML Example:

<div CLASS="example">

In general, the values of attributes can contain any text or character references, although depending on the syntax used, some additional restrictions apply, which are outlined below.
There are four slightly different syntaxes that may be used for attributes in HTML: empty, unquoted, single-quoted and double-quoted. All four syntaxes may be used in the HTML syntax, depending on what is needed for each specific attribute. However, in the XHTML syntax, attribute values must always be quoted using either single or double quotes.

Empty Attributes

An empty attribute is one where the value has been omitted. This is a syntactic shorthand for specifying the attribute with an empty value, and is commonly used for boolean attributes. This syntax may be used in the HTML syntax, but not in the XHTML syntax.

Note: In previous editions of HTML, which were formally based on SGML, it was technically an attribute’s name that could be omitted where the value was a unique enumerated value specified in the DTD. However, due to legacy constraints, this has been changed in HTML5 to reflect the way implementations really work.

HTML Example:

<input disabled>

The previous example is equivalent to specifying the attribute with an empty string as the value.

<input disabled="">

Note: The previous example is semantically equivalent to specifying the attribute with the value "disabled", but it is not exactly the same.

Example:

<img src="decoration.png" alt>

The previous example is equivalent to specifying the attribute with an empty string as the value.

<img src="decoration.png" alt="">

Unquoted Attribute Values

In HTML, but not in XHTML, the quotes surrounding the value may also be omitted in most cases. An unquoted value must not be empty, and may contain any characters except space characters, single (') or double (") quotation marks, equals signs (=), less-than signs (<), greater-than signs (>) and grave accents (`). If you need an attribute value to contain any of those characters, either escape them using character references or use the single- or double-quoted syntax.

HTML Example:

<div class=example>
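
A short sketch of why the restriction on spaces matters. The title text is made up for illustration.

```html
<!-- Unquoted: the value ends at the first space, so this sets
     title="Hello" and adds a separate empty attribute named world. -->
<p title=Hello world>...</p>

<!-- Quoted: the whole phrase is the value of title. -->
<p title="Hello world">...</p>
```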

Double-Quoted Attribute Values

In both HTML and XHTML, attribute values may be surrounded with double quotes.
By quoting attributes, the value may contain the additional characters that can’t be used in unquoted attribute values, but for obvious reasons, these attributes cannot contain additional double quotation marks within the value.

Example:

<div class="example class names">...</div>
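
Where a double-quoted value does need to contain a double quotation mark, the mark can be written as a character reference. A minimal sketch, with made-up title text:

```html
<!-- The apostrophe needs no escaping inside double quotes;
     the inner double quotation marks are written as &quot;. -->
<p title="It's called &quot;markup&quot;">...</p>
```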

Single-Quoted Attribute Values

In both HTML and XHTML, attribute values may be surrounded with single quotes.
By quoting attributes, the value may contain the additional characters that can’t be used in unquoted attribute values, but for obvious reasons, these attributes cannot contain additional single quotation marks within the value.

Example:

<div class='example class names'>...</div>
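
Single quotes are convenient when the value itself is full of double quotation marks. The sketch below assumes a hypothetical data-settings attribute carrying a small JSON string; the attribute name and its contents are illustrative only.

```html
<!-- Double quotation marks inside a single-quoted value need no escaping. -->
<div data-settings='{"theme": "dark"}'>...</div>

<!-- A literal single quotation mark is written as a character reference. -->
<div title='It&#39;s fine'>...</div>
```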

Character References

Discuss numeric and named character reference syntax. May link to the list of entity references in a separate document, rather than trying to list them all in here.
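
As a minimal sketch of the syntaxes this section is meant to cover: a character reference begins with an ampersand and ends with a semicolon, and may be named, decimal or hexadecimal. The sample text is illustrative.

```html
<!-- Named character references -->
<p>Fish &amp; chips, where 1 &lt; 2. &copy; Example Ltd</p>

<!-- The same copyright sign (U+00A9) as a decimal and as a hexadecimal reference -->
<p>&#169; and &#xA9;</p>
```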

Understanding MIME Types

Discuss text/html, application/xhtml+xml, etc.

Character Encoding

Overview of Unicode, character repertoires, encodings, etc. Declaring the encoding with the Content-Type header, BOM, meta, etc.
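
A hedged sketch of the in-document declaration mentioned above, assuming the file is saved as UTF-8. Both forms are accepted in HTML5; a document should use only one of them, placed inside the head within the first 1024 bytes of the file.

```html
<!-- Short form introduced with HTML5 -->
<meta charset="UTF-8">

<!-- Longer form, kept for compatibility with older documents -->
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
```

The equivalent HTTP header is Content-Type: text/html; charset=UTF-8. When the server sends an encoding in that header, it takes precedence over the meta element.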

Choosing HTML or XHTML

The choice of HTML or XHTML syntax depends on a number of factors, including the needs of a given project, the skill set of the developers involved and the level of support in browsers used by the site’s target audience, or it may simply be a matter of personal preference.
The important thing to understand is that there are valid reasons to choose either, and that authors are encouraged to make an informed decision.

Need to develop guidelines to help authors make this choice.

Polyglot Documents

A polyglot HTML document is a document that conforms to both the HTML and XHTML syntactic requirements, and which can be processed as either by browsers, depending on the MIME type used. This works by using a common subset of the syntax that is shared by both HTML and XHTML.
Polyglot documents are useful where a document is intended to be served as either HTML or XHTML, depending on the support in particular browsers, or when it is not known at the time of creation which MIME type the document will ultimately be served as.
In order to successfully create and maintain polyglot documents, authors need to be familiar with both the similarities and differences between the two syntaxes. This includes not only syntactic differences, but also differences in the way stylesheets and scripts are handled, and the way in which character encodings are detected.
This section will provide the details about each of these similarities and differences, and provide guidelines on the creation of polyglot documents.
Base this on the HTML vs. XHTML article
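
As a rough sketch of the common subset described above, the document below is intended to parse the same way whether it is served as text/html or as application/xhtml+xml. The title and paragraph text are made up for illustration, and this is a minimal example rather than a complete set of polyglot guidelines.

```html
<!DOCTYPE html>
<!-- The XHTML namespace is declared explicitly, and the language is given
     with both lang and xml:lang so that each parser finds it. -->
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
 <head>
  <!-- Void elements use the empty element syntax. -->
  <meta charset="UTF-8"/>
  <title>Polyglot example</title>
 </head>
 <body>
  <!-- Lowercase names, quoted attribute values, explicit end tags. -->
  <p class="intro">First line<br/>second line</p>
 </body>
</html>
```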

Getting Started with HTML 5

The most common format for publishing documents on the web and creating web applications is HTML. From its beginning as a relatively simple language primarily designed for describing scientific documents, it has grown and adapted to a wide variety of needs, ranging from publishing news and blogs to providing the foundation for full-blown applications for email, maps, word processing and spreadsheets.
As the uses of HTML have grown, the demands placed upon it by authors have increased and the limitations of HTML have become more pronounced. HTML 5 represents the next major step in the development of HTML, introducing a wide range of new features into the language. Authors who are familiar with previous versions of HTML are advised to familiarise themselves with the differences from HTML 4 [HTML4DIFF].
This section provides an introductory tutorial to help get you started with HTML, and is suitable for beginners. Experienced authors may choose to skip this section and proceed to the syntax overview and the element reference.

A Basic Document

The goal of this section is to walk people through creating example01.html

<!DOCTYPE html>
<html>
 <head>
  <meta charset="UTF-8">
  <title>Example 01</title>
 </head>
 <body>
  <p>This is my first document.</p>
 </body>
</html>

To begin, we’re going to create a very basic HTML document, which will also serve as a useful template for future HTML documents. This document will simply contain a title and a short paragraph.
Open a text editor and create a new, empty file. I suggest you save the file as example01.html.
All HTML documents need to begin with a DOCTYPE. The DOCTYPE is a remnant from the early days of the web. For historical reasons, it is needed to ensure that web browsers interpret the document correctly, rather than using a special compatibility mode designed to replicate the behaviour of older browsers.
In your text editor, type the following on the first line, and save the file.

<!DOCTYPE html>

Because this is required for all documents, it is good practice to get in the habit of always typing that as the first line in all new HTML documents you create, so that it never gets forgotten.
An HTML document is divided into two main sections: the head, which contains document metadata such as the title, stylesheets and scripts, and the body, which contains all of the page’s content. The markup itself forms a tree structure, with the html element at the root and the head and body elements as its two children.
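
The nesting can be sketched using the same markup as example01.html, annotated with comments. The comments are additions for illustration only and are not needed in your own file.

```html
<!DOCTYPE html>
<html>                               <!-- root of the tree -->
 <head>                              <!-- metadata: first child of html -->
  <meta charset="UTF-8">
  <title>Example 01</title>
 </head>
 <body>                              <!-- content: second child of html -->
  <p>This is my first document.</p>
 </body>
</html>
```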

Understanding Semantics

In general, the purpose of writing and publishing a document is to convey information to the readers. This could be any kind of information, such as telling a story, reporting news and current affairs or describing available products and services. Whatever the information is, it needs to be conveyed to the reader in a way that can be easily understood.
A typical document, such as a book, news article, blog entry or letter, is often grouped into different sections containing a variety of headings, paragraphs, lists, tables, quotes and various other typographical structures. All of these structures are important for more easily conveying information to the reader. HTML provides the means to clearly identify each of these structures in a way that can then be easily presented to the user. In essence, this is the purpose of markup, and HTML in particular.
Markup is a machine readable language that describes aspects of a document such as its structure, semantics and/or style. Some markup languages are designed solely for the purpose of describing the presentation of the document, such as RTF (Rich Text Format). Others, such as HTML, are more generic and rather than focussing on describing the presentation, they are designed to focus on describing the meaning or purpose of the content and leave the presentation for another layer to deal with.
HTML provides a wide variety of semantic elements that can be used to mark up various common typographical structures. There are heading elements for marking up different levels of headings, a paragraph (p) element for paragraphs, various list elements for marking up different types of lists, and table elements for marking up tables.
It’s important to distinguish between the structure and semantics of content, which should be described using HTML, and its presentation. In one document, a heading may be presented visually in a large bold typeface with wide margins above and below to separate it from the surrounding content and make it stand out. In another document, a heading may be presented in a light coloured, italic, fancy script typeface. But regardless of the presentation, it’s still a heading and the markup can still use the same basic elements for identifying common structures.
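
As a small sketch of the elements just mentioned, the fragment below marks up a heading, a paragraph, a list and a table. The content is invented for illustration.

```html
<h1>Visiting the Zoo</h1>
<p>The zoo is open every day. Before you go, remember to bring:</p>
<ul>
 <li>a hat</li>
 <li>a water bottle</li>
</ul>
<table>
 <tr><th>Ticket</th><th>Price</th></tr>
 <tr><td>Adult</td><td>20</td></tr>
 <tr><td>Child</td><td>10</td></tr>
</table>
```

Nothing in this markup says how the heading or the table should look; that is left to a separate presentation layer such as a stylesheet.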

HTML5

Wednesday, 4 January 2012
