HTML is the acronym for Hypertext Markup Language, the format for all web pages on the Internet. It is what gets sent to your web browser by a website server, and is decoded to show the webpage you see. You are reading HTML right now!
Because of the way the World Wide Web was cleverly added as a layer on top of the much older, very basic text-based Internet, which used to handle only newsgroups and email, HTML is restricted to a quite small number of US keyboard characters, numbers and symbols. As a Markup Language, it uses a very simple system of tags that surround content and describe (or mark up) what that content is – a heading, a paragraph, a table, or content within one of those elements that should perhaps be highlighted or treated differently in some way. Because those tags are short words or abbreviations, an HTML document is usually easily human-readable, despite being designed for a computer browser to interpret.
HTML was never designed as a language that had fine visual formatting options – it came from a point in computer history where the GUI or Graphical User Interface was still relatively new. Simple style changes like bold or italic were available, but there were few standards and how they were interpreted was left up to the browser, and the whims of the browser writers. To fix this, a supporting standard has grown up and become embedded: Cascading Style Sheets or CSS. This allows almost every HTML tag to have how it is displayed defined in a way that is entirely separate from the HTML in the webpage, and in an independent way that should match in any browser. This deliberate separation of form and function and the new features in CSS have been critical for the recent growth and jump in quality of the World Wide Web.
What you have to be aware of is that this site must convert all stories into HTML, so that we can then display them by sending them to a reader’s web browser. We do not keep the original file or format of your story, whether it was RTF or text. Additionally, if you submit stories in HTML then we check that you have not used any tags that are unsuitable for simple formatting of your story on the site.
There are thousands of excellent HTML tutorials on the web if you want to learn more, such as the W3Schools HTML Tutorial site.
In general, you do not need to know HTML to submit stories here. If you are used to writing in a word processor and getting all the features that offers, like grammar and spell checking, smart quotes and auto-correct, then there’s no need to worry about HTML. Just upload and the site parsers will do the work of converting everything. Our goal is to make your story appear as close as possible to what you originally wrote, within the limits of the overall design skin for this website. Some small formatting errors can also be fixed by the parsers.
However, since parsing is an automatic process and writers are free to write in very idiosyncratic ways, sometimes some manual intervention is required to edit the parsed results into a better rendition of the original. That is perfectly normal and we are always happy to help out if asked, if you are unable to edit the HTML yourself.
The main problem some authors may have that requires a little knowledge of HTML now is the way the parsers convert quotemarks to HTML tags, instead of them being merely text symbols. This means that quotemarks have to be written in a standardised way for the parsers to understand them clearly and tag them correctly. It also requires that typing errors such as missing quotes and incorrect punctuation/spacing around quotemarks must be robustly checked by us, and we may prevent posting if errors cannot be automatically corrected, due to the broken HTML that this would produce.
For more information on the way quotes are handled by this site, read the discussion below: Quotes in MMSA.
Yes, you can submit stories written directly in HTML. Use the upload browser, and make sure that your file is saved with a .TXT suffix. Do NOT try to upload whole web pages enclosed in <html> and <body> tags, just the story text split normally into paragraphs using <p> tags. You can also copy/paste your HTML source directly into the edit box at the bottom of the new story upload page. When previewing, the box there contains the current source for your story, which you can continue to edit until you are happy with the results.
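As a rough illustration of the advice above, an HTML submission might contain nothing more than a run of paragraphs like this. The story text and the file name are invented for the example; only the use of <p> tags and the .TXT suffix come from the guidance above.

```html
<!-- my-story.txt (hypothetical file name): no <html> or <body> wrapper, just paragraphs -->
<p>It was raining when Tom got home.</p>
<p>He left his shoes by the door and went straight upstairs.</p>
```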
Writing directly in HTML means you can be more specific about the look and layout of your story, and you can make sure that the correct entities are used for unusual characters and symbols. There are many restrictions on tags, however, which are necessary to ensure stories do not break the site page display, and a few style tags may not work as you’d normally expect. This is explained in the next section.
Most simple formatting and layout tags are allowed. For basic text style changes you can use <b>, <i>, <s> and <u> tags:
Text can be bold, italic, struck through and underlined.
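A short sketch of how the four basic style tags named above might be written in a story paragraph (the sentence itself is only an example):

```html
<p>Text can be <b>bold</b>, <i>italic</i>, <s>struck through</s> and <u>underlined</u>.</p>
```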
Note that the tags <strong> and <em>, which are often used to create bold and italic effects, have a different stylistic purpose on this site and will NOT appear with those styles here. The repurposing of <em> on this site is explained below in the section on writing quotes in your stories.
More unusual styles are subscripts and superscripts, created with <sub> and <sup>, and changing the size of text. Since this site is now aiming to be HTML5 compliant, the <big> tag which used to be matched with <small> cannot be used, so for this purpose you should use <strong> instead. For monospaced (teletype or quasi-digital output) use <code>:
H2O has a subscript, 20th Century has a superscript.
This is big, this is small,
this is monospaced.
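The lines above could be produced with markup along these lines. This is a sketch that follows the description in this section, including the site-specific use of <strong> for larger text; on other websites <strong> will normally display as bold instead.

```html
<p>H<sub>2</sub>O has a subscript, 20<sup>th</sup> Century has a superscript.</p>
<p>This is <strong>big</strong>, this is <small>small</small>,
this is <code>monospaced</code>.</p>
```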
For layout, the <blockquote> tag produces the indented text sections that you can see above. It adds no style of its own. It is a container tag, meaning that you should put text in paragraphs within it; it cannot appear within a paragraph. Paragraphs must always be delimited with <p> tags, or you may use header tags: <h2>, <h3> and <h4>. You can break lines with the <br> tag and add indented horizontal rules between story sections with the <hr> tag.
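A minimal sketch of these layout tags used together. The heading and story text are invented; note that the <blockquote> wraps a whole paragraph rather than sitting inside one.

```html
<h3>Chapter Two</h3>
<blockquote>
<p>Dear Tom,<br>
We need to talk when you get home.</p>
</blockquote>
<p>He read the note twice.</p>
<hr>
<p>The next morning was no better.</p>
```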
The <pre> tag produces monospaced output that retains white-spaces (where HTML normally discards any white-space characters past the first it encounters). If you need to set monospaced, tabular output with specific numbers of spaces between columns so that they line up vertically, or computer code that requires correct indentation, this is the tag you should use, NOT <code>, because despite the tag name that will still have spacing automatically condensed by the browser (it is effectively just a monospaced style tag). Please avoid setting long widths of text this way, however! It forces narrow browsers (like smartphones in portrait mode) to provide horizontal scrollbars, which are annoying for readers since they cannot see the full content width at once.
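To illustrate the difference described above, here is a small invented table set with <pre>, followed by a <code> fragment whose extra spaces a browser would normally collapse:

```html
<pre>
Name     Age
Tom       29
John      31
</pre>
<p>Inside a paragraph, <code>a    b</code> is displayed with a single space between the letters.</p>
```

The columns in the first block stay aligned because every space is kept; the run of spaces in the second is reduced to one.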
You may also use ordered/unordered lists, set with <ol>, <ul> and <li> tags (note the automatic indentation, line spacing and numbering):
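For example, a numbered list followed by a bulleted list could be written like this (the list items are placeholders):

```html
<ol>
<li>First item</li>
<li>Second item</li>
</ol>
<ul>
<li>A bullet point</li>
<li>Another bullet point</li>
</ul>
```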
There are three final tags allowed. The use of <q> and <em>, which is rather more complex, is discussed in the main section below on writing quotes. The last is <wbr>, which allows browsers to optionally split a long line of text at the tag if needed to wrap it within the window, where there are no hyphens or white space to do this (although it would be better practice to avoid the need for this). Without this tag, very long words would force a narrow browser to provide a horizontal scrollbar, which is bad for readers. This tag does not otherwise display anything and has been reintroduced officially in HTML5 (it was once a non-standard Internet Explorer tag) so should now be widely supported. If your browser width is narrow, you may even see it in action in the code example below, allowing the line to wrap:
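A sketch of <wbr> in use, with an invented long word. The browser may break the line at any of the three marked points, but only if the text does not fit the window width:

```html
<p>Aaaaaaaaaaaaaaaaaaaa<wbr>rrrrrrrrrrrrrrrrrrrr<wbr>gggggggggggggggggggg<wbr>hhhhhhhhhhhhhhhhhhhh!</p>
```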
Generally, NO: tag parameters (attributes) are not allowed. If a parameter was specifically requested – e.g. for the ordered list tag <ol>, to change the numbering format – then we could probably accommodate that by special arrangement. We do allow limited classing for paragraphs and headers, however, to allow simple text alignment (centering and right-justification) and to help with one quoting situation with <q> tags (described below):
This is centred.
You can set text to the right.
Use HTML entities freely. They are the only reliable way to represent many symbols and non-English characters in HTML. Keep in mind, however, that named entities are far easier to debug and edit than numbered ones.
A few you should probably become familiar with are &nbsp; the non-breaking space (very useful for keeping two words or numbers joined when you do not want them split across lines), &ndash; the long dash or en-dash, and its longer brother &mdash; the em-dash (both of which should be used with the non-breaking space after words to ensure the dash is never orphaned to the next line). Short hyphens are only used to link words, so these dashes should be used in all other text situations.
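As an illustration, the non-breaking space and the two dashes might be combined like this. The sentences are invented; the point is that the non-breaking space sits between the word and the dash, so the dash cannot wrap to the start of the next line on its own.

```html
<p>The train left at 10&nbsp;pm&nbsp;&ndash; far too late for Tom.</p>
<p>He waited&nbsp;&mdash; and waited.</p>
```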
Try to avoid using the &hellip; entity (ellipsis), as frankly it usually looks strange (and is code-wasteful) compared to just writing three full stops. The site parsers remove and replace this where possible. Writing quotes with named entities is also strongly discouraged. For the best results you should be using the <q> and <em> tags, and all Latin quote entities (even international ones like guillemets/chevrons and European lower-9/upper-9 quotemarks) will be converted to those. (See the main section below for more information.)
The only quasi-quote entities that will be left, and converted to where possible from single apostrophes and quote symbols next to numbers, are the &prime; and &Prime; symbols (′ and ″), meant for marking feet and inches or minutes/seconds, which we strongly recommend you use:
I am 5′6″ tall. I am holding a 12″ ruler.
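In HTML source, the example line above could be written with the two prime entities as follows. Note the capital P in the entity for the double prime.

```html
<p>I am 5&prime;6&Prime; tall. I am holding a 12&Prime; ruler.</p>
```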
Being a web story archive creates its own rules, just like any other library. If you walked into your local library you would expect to find some consistency – books of a normal size displayed on evenly spaced shelves, books containing black type on white pages, all following normal typographic rules so that they are easily readable, etc. This archive has to take written submissions in multiple formats and convert them into correct HTML that follows very rigidly defined rules, while providing a good facsimile of what authors originally wrote. Having to convert everything to HTML so it is readable in a web browser creates challenges, but also interesting opportunities that ultimately benefit authors and readers.
One of our goals for a long while has been to standardise the use of quotemarks in the site. What we inherited from the old archive was the use of standard non-typographic or straight quotes which were expected in text and email submissions of the time. Due to the difficulties of deciding algorithmically how to open and close quotes in a complex paragraph, or nest them, we continued that way for some time. A few years ago we experimented by adding support for smart typographic quotes into our RTF parser, but that was not an unqualified success. It also meant that the quotemarks are hard-coded into the page and story text, which makes them less flexible for the future. After all, they are just another piece of markup that browsers should be able to handle. The actual quote symbols and the way they are nested changes from language to language too, something that has to be considered on a multi-lingual site.
The oldest stories in the archive date from a time in the Internet’s history when HTML was quite rudimentary, but even back then there was support in Markup Language for quotes. Remember, Markup Language is not really for defining the stylistic look of content, such as whether it is in bold or italic, etc., it is for marking up text by defining what that content actually IS. That is what makes it so useful. Some of the features little-used outside academia are the most powerful in this regard, such as the cite attribute of <q> and <blockquote>, which records a source for a quotation. It doesn’t even display anything! It just adds the information silently into the page source for reference, perhaps to be extracted automatically later.
So HTML does in fact have a quotes tag, defining passages of quoted text. Unsurprisingly it is <q> and is used like most other markup tags, around quoted passages with a matching closing </q> tag. Using it should display the correct quotemarks suggested by the web page language around the tagged text. However, it was virtually ignored by the browser authors for years, and when it did work, you couldn’t guarantee it would display the same in the competing browsers, so actually having it in your web pages was rather pointless and difficult to work around reliably with non-supporting browsers (particularly Safari and IE).
With the advent of CSS or Cascading Style Sheets as a companion to HTML, so that HTML elements can be styled individually and flexibly, the <q> tag has finally had a little love from the W3C (World Wide Web Consortium, who define web standards including HTML and CSS) and from the major browser manufacturers. It is now a more standardised and supported tag, and most modern browsers display the tag correctly – enough to make a shift and embrace it. The beauty of the tag is that it is more compact and therefore efficient than using hard-coded quote entities, it provides useful information about what it is tagging, it can be multi-level nested so that nested quotemarks are displayed differently and correctly by the browser, but most importantly it can be styled with CSS.
Why is that important? Because by styling with CSS we can provide different quotes solutions for the different languages being used on the site, that more accurately reflect the proper typographic traditions of those countries/languages just as you’d find in a locally printed book, rather than a generic default of US or UK English tradition. It also means that we can provide a mechanism for authors to have some personal control over the quotemark style defaults for their own stories, now provided as a set and forget system through authors’ own account management pages.
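A minimal sketch of how CSS can choose quotemarks by language, using the standard `quotes` property and the `:lang()` selector. The rules and sample sentences here are illustrative assumptions, not this site's actual stylesheet.

```html
<style>
  /* Outer and inner quote pairs; the default here is English: “ ” and ‘ ’ */
  q { quotes: "\201C" "\201D" "\2018" "\2019"; }
  /* German text gets low-9 opening marks instead: „ “ and ‚ ‘ */
  q:lang(de) { quotes: "\201E" "\201C" "\201A" "\2018"; }
  q::before { content: open-quote; }
  q::after  { content: close-quote; }
</style>
<p lang="en"><q>Hello,</q> said Tom.</p>
<p lang="de"><q>Hallo,</q> sagte Tom.</p>
```

The HTML for the two paragraphs is identical apart from the `lang` attribute; only the stylesheet decides which marks are drawn.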
When we first had the idea of using <q> tags, it seemed very simple: all speech passages should go inside double quotemarks. That’s the international (English) standard. Some writers, however, and many book publishers, prefer that speech passages should appear inside single quotemarks. With the style setting now up to the author this is easy to accommodate, as that is just the appearance of the tag changing (through CSS), NOT the tag itself. Whatever it looked like, we now had speechmark tags.
But that didn’t address the problem of words or phrases that are NOT speech which are often set inside quotemarks. That might be the text of a sign, note or news article (is the sign ‘speaking’?) or, like the word inside the brackets, separated for emphasis or because it is ironic/sarcastic. In your inner voice as you read a story, or this FAQ, you would stress a word ‘emphasised’ like that. In this site, we refer to those as emphasis quotes. Since we needed a unified approach to quotes that allowed support for different national styles, which would also change the quotemarks used around those words, it was obvious that these would need to be handled with tags also.
There is no support in HTML for this idea of secondary quotes, so no official tag we can use for it. However, there is the <em> tag, which is specifically designed to mark up ‘emphasis’. Traditionally, this has been done by showing that emphasis in italics, identically to the <i> tag (which was reviled by HTML purists as being a mere style tag, not useful markup). However, since we are free to style the site any way we wish, and since we already have a simple way to italicise text, we can co-opt the <em> tag for this and change the way it displays by adding quotemarks instead of italics. This is arguably a better use and interpretation of the tag than merely duplicating the function of <i>.
Where normal quoted speech appears in double quotemarks (whatever those are for a given language), emphasis quotes will appear in single ones. They will not be styled in any other way (i.e. no italics; you would have to add that separately if wanted). If you are used to writing in HTML, remember to use the <i> tag for styling in italics, not the <em> tag. Also be aware that, like <q> tags, these will be subject to strict parser checking as they MUST be matched (have a closing tag for every opening one in a paragraph) at all times.
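One way such a restyling of <em> could be done in CSS is sketched below. The rules are an assumption for illustration (English single quotemarks, hard-coded); the site's real stylesheet is not shown in this FAQ and will differ by language.

```html
<style>
  /* Hypothetical styling: show <em> with single quotemarks instead of italics */
  em { font-style: normal; }
  em::before { content: "\2018"; }  /* ‘ */
  em::after  { content: "\2019"; }  /* ’ */
</style>
<p>The sign on the door said <em>Closed</em>, but Tom <i>really</i> needed to get in.</p>
```

Here the sign text is an emphasis quote, while the stressed word uses <i> because italics are wanted.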
While the use of emphasis quotes ought to be clear, and the types of content that demand them (generally anything that is not spoken aloud by a person or machine or reported as speech), there is a grey area where you could easily use either. If you are re-quoting words that have already been spoken, then you may wish to use full quotemarks around those words to reference the source. Consider the following example:
“You’re just a big baby!” laughed John.
I didn’t like being called “a big baby”.
“But such a cute boy!” agreed Tom.
I reminded them that this “boy” was nearly thirty.
Although they are being used for emphasis, full quotes there instead of single emphasis quotemarks reminds readers that the words being highlighted were actually spoken. This could be more useful when the reference is made much later in a story. As always, the process of writing is about making meaning clear to the reader, and this subtle reminder of the source of the words you are using might be more powerful than simple emphasis.
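A sketch of how the example above might be tagged, assuming the re-quoted words are given full <q> tags as described. Swapping the second and fourth <q> pairs for <em> would display them as single emphasis quotes instead.

```html
<p><q>You're just a big baby!</q> laughed John.</p>
<p>I didn't like being called <q>a big baby</q>.</p>
<p><q>But such a cute boy!</q> agreed Tom.</p>
<p>I reminded them that this <q>boy</q> was nearly thirty.</p>
```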
Most authors will not need to do anything special. Where stories are submitted in text or via RTF, and they contain smart quotes added by a word processor, then those will be converted directly to <q> and <em> tags. The exact form of quotes you may have used will be lost, no longer hard-coded into the story text. The mechanism used to reconstruct your preferred quotemark style and display it in a browser will now rely on the site defaults and your personal preferences, if you choose to set them differently.
Authors who are writing or submitting in HTML should start using these tags now instead of quote entities, and are encouraged to edit their own stories to fix tagging this new way. Everyone should notice that the new <q> and <em> tags will now appear in the HTML edit box when your story is parsed. PLEASE DO NOT TRY TO UNDO THEM OR ALTER THEM IF YOU DO NOT UNDERSTAND THEM. The story preview should show the browser representations of the tags using English language defaults for the story, until the story is reviewed and the correct language is set for it.
It is probable that this new system may have some teething troubles and unintended consequences. As usual the Forum Helpdesk will be the place to get advice and ask for fixes in the case of serious problems with any story submission.
As part of this major site upgrade, the story previews will now highlight for you in purple any paragraphs where the parsers have had difficulties identifying quotes. This may be an error in your writing (a quotemark missing, or spaced wrongly, or missing punctuation which misleads the parser in recognising a closing or opening quote) or where there has been some ambiguity and the parser cannot be sure about the decision it has made. The most likely reasons for ambiguity will be paragraphs missing a closing quotemark (correct for quotes spanning a number of paragraphs by the same speaker) and words front-shortened with an opening apostrophe, like ’til, or ’em as a patois for them, which appear exactly like opening single quotes. Those are now handled through a whitelisting system, and you will need to inform us of any such words you have used which are not being passed through correctly.
However, some quote errors may be impossible for the parser to guess at or fix automatically, and would produce broken HTML if posted unchanged. These will be highlighted in red, and will prevent story posting until the errors are fixed. If you are familiar with HTML, you can use the HTML box at the bottom of the preview to see how the parser has converted things and correct any mistakes there, otherwise you will need to correct them in your source file and re-upload.
Purple highlighting will NOT appear in your final story on-site and you do NOT need to remove it from the HTML box. It may appear in some circumstances even if there is no actual problem with a paragraph, just as a warning to check the parsed result very carefully. You can safely submit even when there is still purple highlighting showing, although you are strongly advised to check all highlighted paragraphs very carefully for errors. If you cannot see why a paragraph is highlighted, ask in the Forum Helpdesk.
The examples below have been set in <div> tags with their language property defined in HTML. This is the same mechanism that this site uses to define the overall language of your stories for the browser. This is also a test of your current browser. If it supports the <q> tag correctly then all the examples below should display with different styles of quotemarks.
Note carefully the difference between inner and outer nested quotes. Theoretically there may be unlimited levels of nesting, with the two quote styles alternating ad infinitum, but some browsers may not display this correctly beyond a few levels. It is strongly advised that you do not test this to destruction in your stories – keep your nesting levels LOW. Your readers will appreciate it too...
Where you would use single quotes to delimit something that is NOT dialogue or reported speech, but might be emphasis, sarcasm, the text of a note/sign or other non-speech element, this should now be set within <em> tags. On this site, from now on these will not add italic formatting but will instead provide single quotes in language specific versions.
Note that each of these lines, because of the containing <div> tags setting the language, needs just the following simple HTML to display all these different quotemarks:
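As a sketch, assuming the language is set with the standard `lang` attribute on the containing <div>, the same line of markup can be repeated unchanged for each language. Only two languages are shown here, and the wording is taken from the examples in this section.

```html
<div lang="en">
<p>These are <q>normal outer quotes containing an <q>inner quote</q></q>. This is <em>emphasised</em>.</p>
</div>
<div lang="de">
<p>These are <q>normal outer quotes containing an <q>inner quote</q></q>. This is <em>emphasised</em>.</p>
</div>
```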
English language (EN):
These are normal outer quotes containing an inner quote. This is emphasised.
(You will be able to set the common alternative reversing these if you prefer).
German language (DE):
These are normal outer quotes containing an inner quote. This is emphasised.
(You will be able to set the alternative using » « and › ‹ if you prefer).
French language (FR):
These are normal outer quotes containing an inner quote. This is emphasised.
(Note the correct additional non-break spacing is added automatically).
Spanish language (ES):
These are normal outer quotes containing an inner quote. This is emphasised.
Dutch language (NL):
These are normal outer quotes containing an inner quote. This is emphasised.
Czech (CS) and Romanian language (RO):
These are normal outer quotes containing an inner quote. This is emphasised.
Russian language (RU):
These are normal outer quotes containing an inner quote. This is emphasised.
These have been compiled from checking a number of online references. Italian and Portuguese language styles appear to be the same as English. There are a few common alternatives, so this may not be the same as your personal preference, but that will be fixable if you provide us with a reference to a different convention. Also, if you are a native speaker/reader/writer and feel there are errors in these examples, let us know.
Using <q> tags presents a small problem. They MUST be balanced (opening ones always matched by closing ones) but there is one type of quoting situation where we don’t want to have closing quotemarks on a paragraph. That is where a long quote, or single speaker’s dialogue, spans many paragraphs. In that case the CORRECT way to set the quotemarks is for the first paragraph and all subsequent ones to have only an opening quotemark, and ONLY the final paragraph ends with a closing quotemark.
However, with a clever bit of CSS we can suppress the outer final quotes in the first and intermediate paragraphs from displaying. That means we still have balanced quotes, which is correct HTML, but the browser hides the bits we don’t want to show. Unfortunately, though, you have to write the HTML a little bit differently.
Currently, we allow classing of paragraphs and headers (<p> and <h> tags) to manage alignment. To get a centred or right-justified paragraph you write:
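The exact class names are not given in this text, so the names `C` and `R` below are purely hypothetical placeholders. The sketch only shows the general form of a classed paragraph or header; check the site's own instructions for the real names.

```html
<!-- "C" and "R" are hypothetical class names used only for illustration -->
<p class="C">This paragraph is centred.</p>
<p class="R">This paragraph is right-justified.</p>
<h3 class="C">A centred heading</h3>
```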
To make the CSS work and suppress the last closing quote in a paragraph, you can use this form:
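A sketch of the markup for a quote spanning three paragraphs. The class name `M` is an assumption, inferred from the mention of M-classed paragraphs in this section; every <q> is still closed, so the HTML stays balanced.

```html
<!-- class="M" is an assumed name for the multi-paragraph quote class -->
<p class="M"><q>This is the first paragraph of a long speech.</q></p>
<p class="M"><q>This is the middle paragraph, still the same speaker.</q></p>
<p><q>And this normal paragraph finishes the speech.</q></p>
```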
Note that any nested/embedded quotes in the paragraph will be unaffected. Use these M-classed paragraphs for all except the final one in your multi-paragraph-spanning quote, and a normal one to finish. The result will appear like this:
“In this example, ‘paragraph one’ starts the multi-quote. The final closing quote is hidden.
“The ‘second paragraph’ also has a hidden quotemark at the end, even though the tag is there.
“And this is a ‘normal paragraph’ to finish.”
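One way the hiding could be implemented is with the standard CSS value `no-close-quote`, which closes the quote without drawing a mark. This is a sketch under assumptions (the class name `M` and all of the rules are illustrative), not the site's actual stylesheet.

```html
<style>
  q { quotes: "\201C" "\201D" "\2018" "\2019"; }
  q::before { content: open-quote; }
  q::after  { content: close-quote; }
  /* Assumed rule: in an M-classed paragraph, the last outer quote keeps its
     closing tag but prints no closing mark. Nested quotes are not direct
     children of the paragraph, so this selector leaves them alone. */
  p.M > q:last-of-type::after { content: no-close-quote; }
</style>
<p class="M"><q>In this example, <q>paragraph one</q> starts the multi-quote.</q></p>
<p><q>And this is a <q>normal paragraph</q> to finish.</q></p>
```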
Note that if you are NOT writing in HTML, you should submit your stories this way – with the quotemark (correctly) missing at the end of the first and intermediate paragraphs. The parsers will assume that you want to set the paragraphs as a multi-para quote and will (also correctly) add the closing <q> tag with the correct classing. However, because this could also be the result of an author typo, missing a quotemark that was in fact required, it will flag the paragraph in purple in the preview to show where this has been done. It is up to you to confirm that the parser’s guess was correct and if necessary fix any typo either in your submission file or in the HTML window.
Just check out this table of all the various possible quotemarks (shamelessly borrowed from an excellent but apparently rather old CSS resource I found). Where the webpage font does not actually contain some of these admittedly obscure symbols, they may have helpfully been inserted by the browser from its default sans-serif or serif fonts.
Note that even this table does not include the back-tick character ( `, Unicode 0060) which you can type on your keyboard, often mistakenly used as an opening single quote, nor the Prime and Double Prime characters which look like quotemarks ( ′ and ″, HTML &prime; and &Prime; or Unicode 2032 and 2033) meant for representing feet and inches or minutes and seconds, usually written incorrectly as apostrophes and double quotes.
It might seem like the tip of a Babel iceberg... but fortunately we only have European (Latin) languages to consider right now.
|Symbol name||HTML / Unicode||Symbol|
|APOSTROPHE or SINGLE QUOTATION MARK (keyboard character)||&#x27; 0027||'|
|DOUBLE QUOTATION MARK (keyboard character)||&quot; 0022||"|
|LEFT SINGLE QUOTATION MARK||&lsquo; 2018||‘|
|RIGHT SINGLE QUOTATION MARK||&rsquo; 2019||’|
|LEFT DOUBLE QUOTATION MARK||&ldquo; 201C||“|
|RIGHT DOUBLE QUOTATION MARK||&rdquo; 201D||”|
|SINGLE LOW-9 QUOTATION MARK||&sbquo; 201A||‚|
|DOUBLE LOW-9 QUOTATION MARK||&bdquo; 201E||„|
|SINGLE HIGH-REVERSED-9 QUOTATION MARK||&#x201B; 201B||‛|
|DOUBLE HIGH-REVERSED-9 QUOTATION MARK||&#x201F; 201F||‟|
|SINGLE LEFT-POINTING ANGLE QUOTATION MARK||&lsaquo; 2039||‹|
|SINGLE RIGHT-POINTING ANGLE QUOTATION MARK||&rsaquo; 203A||›|
|LEFT-POINTING DOUBLE ANGLE QUOTATION MARK||&laquo; 00AB||«|
|RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK||&raquo; 00BB||»|
|HEAVY SINGLE TURNED COMMA QUOTATION MARK ORNAMENT||&#x275B; 275B||❛|
|HEAVY SINGLE COMMA QUOTATION MARK ORNAMENT||&#x275C; 275C||❜|
|HEAVY DOUBLE TURNED COMMA QUOTATION MARK ORNAMENT||&#x275D; 275D||❝|
|HEAVY DOUBLE COMMA QUOTATION MARK ORNAMENT||&#x275E; 275E||❞|
|REVERSED DOUBLE PRIME QUOTATION MARK||&#x301D; 301D||〝|
|DOUBLE PRIME QUOTATION MARK||&#x301E; 301E||〞|
|LOW DOUBLE PRIME QUOTATION MARK||&#x301F; 301F||〟|
|LEFT CORNER BRACKET (Japanese)||&#x300C; 300C||「|
|RIGHT CORNER BRACKET (Japanese)||&#x300D; 300D||」|
|LEFT WHITE CORNER BRACKET (Japanese)||&#x300E; 300E||『|
|RIGHT WHITE CORNER BRACKET (Japanese)||&#x300F; 300F||』|
For most people, who naturally write dialogue in double quotemarks, hopefully nothing needs to change. You only need to be aware of the difference between a dialogue quote and an emphasis quote – the latter should ALWAYS be in single quotemarks. And now that is the ONLY thing that should be in single quotemarks in your stories. If you normally write dialogue in single quotes only, then we ask that you change to double quotes in order that the parser is not confused. You can easily change the LOOK afterwards. It is the MEANING that the parser has to understand.
A quick test with my Word for Windows shows that nested quotes using " characters (standard keyboard quotes) are all left as double quotemarks when they are automatically smartened. That means that you should NOT try to second-guess the parser by using different quote styles when they are nested, just use double quotes for ALL nest levels. This may of course be different in other versions of Word or in other word processors.
If you write in a text editor, plain text also needs to use normal double quotes for dialogue. Where you want emphasised (non-dialogue) text in single quotes, use single quotes for that ONLY. The parser algorithm will use the spacing and punctuation around all the quotes to decide whether they are opening or closing. Hopefully reliably!
You may also need to be careful if you are writing dialogue with patois or dialect words that are front-shortened with an apostrophe, like ’em and ’til. In order that these are not mistaken for single quotemarks, words written this way have to be whitelisted for the parser to ignore them. There are separate whitelists for each language, already set up with some common words, and these are maintained through a topic in the Authors’ area in the MMSA Forum. If you need to add words in your story that are not being recognised correctly, you will have to ask there or contact the site admins directly. You can also try replacing the front apostrophe on these types of words with a vertical bar symbol, which will get converted back to an apostrophe after parsing. Whitelisting words is a better way, however.
Otherwise just write your stories and upload them, and the stories should get converted correctly with all the quotemark nesting in place. Our web browsers will do the rest. If you have any problems or queries, ask in the Forum. There may still be some tweaking of the parsers needed, especially for automatically converting quotes to Prime symbols (where you wish to write a character’s height: eg. 5′6″). New stories will get the new <q> and <em> treatment, older ones will be back-converted over time.
Remember that despite the requirement to submit using double quotes, you are now able to control the eventual LOOK of your story by changing the personal quotemark styles on your stories if you are not happy with the typographic defaults above. That is what this change is all about: separating whether something IS a quote (and doing it reliably) from how it actually appears. The end result is better-looking stories with fewer errors, something that benefits both readers and authors.