Updated: 2013-01-20
Characters not available on the keyboard can be entered using their Unicode code points (here, hexadecimal). Entering these characters directly into documents is preferred over using (X)HTML character references (of the form &#xHHHH;).
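As a small illustration of the difference, the two paragraphs below should render identically in a UTF-8 document. The sample text is made up for this sketch; the code points used are U+00E9 (é), U+201C and U+201D (curly quotes), and U+00D7 (×).

```html
<!-- Preferred: the characters themselves, saved as UTF-8 -->
<p>café, “quoted”, 10 × 2</p>

<!-- Same text using hexadecimal character references -->
<p>caf&#xE9;, &#x201C;quoted&#x201D;, 10 &#xD7; 2</p>
```

Numeric references like these are valid in both HTML and XML. Named references beyond the five predefined XML ones (such as &eacute;) are not, which is one more reason to enter the characters directly.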
Create HTML5 documents (see the HTML5 markup reference and the differences from HTML 4) that can be both HTML and XML (XHTML). See the W3C's Polyglot Markup: HTML-Compatible XHTML Documents. This is not a complete list, just select reminders.
<!DOCTYPE html>
<html lang="en-US" xml:lang="en-US" xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta charset="UTF-8"/>
<title></title>
</head>
<body></body>
</html>
Void elements (eg br) are <element/> (and not <element> or <element></element>), other empty elements (eg p) are <element></element> (not <element/>)
Prefer external style sheets and scripts (<link rel="" href=""/> and <script src=""></script>)
Put trs in a tbody, thead, or tfoot, and cols in a colgroup
Do not begin the content of <pre> or <textarea> with a newline
Use both lang and xml:lang, with the same value
Do not use noscript
When you feel like including metadata… just add the RDFa attributes (Lite (or Core)) to the document (unlike earlier).
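A minimal sketch of what adding metadata attributes might look like, assuming RDFa Lite and the schema.org Person type. The name, job title, and URL are placeholder values invented for this example.

```html
<!-- vocab sets the default vocabulary; typeof names the type being described -->
<div vocab="http://schema.org/" typeof="Person">
  <!-- property marks each piece of content as a property of that Person -->
  <span property="name">Jane Doe</span>,
  <span property="jobTitle">Technical editor</span>,
  <a property="url" href="http://example.com/">example.com</a>
</div>
```

The attributes sit on ordinary elements, so the fragment stays well-formed as XML and needs no extra namespace declarations.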
Check out schema.org for vocabulary.
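Putting the markup reminders together, a small polyglot document might look like the sketch below. The file names style.css and script.js and the table contents are placeholders. The rule about not starting pre with a newline is taken from the W3C polyglot markup document rather than spelled out above.

```html
<!DOCTYPE html>
<html lang="en-US" xml:lang="en-US" xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta charset="UTF-8"/>
<title>Polyglot sketch</title>
<!-- External resources; link is void, script needs an explicit end tag -->
<link rel="stylesheet" href="style.css"/>
<script src="script.js"></script>
</head>
<body>
<!-- Void element written as <br/>; empty non-void element written as <p></p> -->
<p>First line<br/>second line</p>
<p></p>
<!-- col inside colgroup; tr inside thead and tbody -->
<table>
<colgroup><col/><col/></colgroup>
<thead><tr><th>Name</th><th>Value</th></tr></thead>
<tbody><tr><td>a</td><td>1</td></tr></tbody>
</table>
<!-- Text starts right after the start tag, with no leading newline -->
<pre>first line of preformatted text
second line</pre>
</body>
</html>
```

One way to check a document like this is to serve or open it both as text/html and as application/xhtml+xml and confirm that it parses and renders the same way in each mode.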