How to handle bidirectional content in HTML.
This specification is an experimental breakup of the HTML specification. You can see the full list on the index page and take part in the discussion in the repository.
bdi
elementdir
global attribute has special semantics on this element.aria-*
attributes
applicable to the allowed roles.HTMLElement
.The bdi
element represents a span of text that is to be isolated from
its surroundings for the purposes of bidirectional text formatting. [[!BIDI]]
The dir
global attribute defaults to auto
on this element (it never inherits from the parent element like
with other elements).
This element has rendering requirements involving the bidirectional algorithm.
This element is especially useful when embedding user-generated content with an unknown directionality.
In this example, usernames are shown along with the number of posts that the user has
submitted. If the bdi
element were not used, the username of the Arabic user would
end up confusing the text (the bidirectional algorithm would put the colon and the number "3"
next to the word "User" rather than next to the word "posts").
<ul> <li>User <bdi>jcranmer</bdi>: 12 posts. <li>User <bdi>hober</bdi>: 5 posts. <li>User <bdi>إيان</bdi>: 3 posts. </ul>
bdo
elementdir
global attribute has special semantics on this element.aria-*
attributes
applicable to the allowed roles.HTMLElement
.The bdo
element represents explicit text directionality formatting
control for its children. It allows authors to override the Unicode bidirectional algorithm by
explicitly specifying a direction override. [[!BIDI]]
Authors must specify the dir
attribute on this element, with the
value ltr
to specify a left-to-right override and with the value rtl
to
specify a right-to-left override. The auto
value must not be specified.
This element has rendering requirements involving the bidirectional algorithm.
dir
attributeThe dir
attribute specifies the element's text directionality.
The attribute is an enumerated attribute with the following keywords and states:
ltr
keyword, which maps to the ltr stateIndicates that the contents of the element are explicitly directionally isolated left-to-right text.
rtl
keyword, which maps to the rtl stateIndicates that the contents of the element are explicitly directionally isolated right-to-left text.
auto
keyword, which maps to the auto stateIndicates that the contents of the element are explicitly directionally isolated text, but that the direction is to be determined programmatically using the contents of the element (as described below).
The heuristic used by this state is very crude (it just looks at the first character with a strong directionality, in a manner analogous to the Paragraph Level determination in the bidirectional algorithm). Authors are urged to only use this value as a last resort when the direction of the text is truly unknown and no better server-side heuristic can be applied. [[!BIDI]]
For textarea
and pre
elements, the heuristic is
applied on a per-paragraph level.
The attribute has no invalid value default and no missing value default.
The directionality of an element (any element, not just an HTML element) is either 'ltr' or 'rtl', and is determined as per the first appropriate set of steps from the following list:
dir
attribute is in the ltr statedir
attribute is not in a defined state (i.e. it is not present or has an invalid value)input
element whose type
attribute is in the Telephone state, and the dir
attribute is not in a defined state (i.e. it is not present or has an invalid value)The directionality of the element is 'ltr'.
dir
attribute is in the rtl stateThe directionality of the element is 'rtl'.
input
element whose type
attribute is in the Text, Search,
Telephone, URL, or E-mail
state, and the dir
attribute is in the auto statetextarea
element and the dir
attribute is in the auto stateIf the element's value contains a character of bidirectional character type AL or R, and there is no character of bidirectional character type L anywhere before it in the element's value, then the directionality of the element is 'rtl'. [[!BIDI]]
Otherwise, if the element's value is not the empty string, or if the element is a root element, the directionality of the element is 'ltr'.
Otherwise, the directionality of the element is the same as the element's parent element's directionality.
dir
attribute is in the auto statebdi
element and the dir
attribute is not in a defined state (i.e. it is not present or has an invalid value)Find the first character in tree order that matches the following criteria:
The character is from a Text
node that is a descendant of the element whose
directionality is being determined.
The character is of bidirectional character type L, AL, or R. [[!BIDI]]
The character is not in a Text
node that has an ancestor element that is a
descendant of the element whose directionality is
being determined and that is either:
If such a character is found and it is of bidirectional character type AL or R, the directionality of the element is 'rtl'.
If such a character is found and it is of bidirectional character type L, the directionality of the element is 'ltr'.
Otherwise, if the element is a root element, the directionality of the element is 'ltr'.
Otherwise, the directionality of the element the same as the element's parent element's directionality.
dir
attribute is
not in a defined state (i.e. it is not present or has an invalid value)The directionality of the element is the same as the element's parent element's directionality.
Since the dir
attribute is only defined for
HTML elements, it cannot be present on elements from other namespaces. Thus, elements
from other namespaces always just inherit their directionality from their parent element, or, if they don't have one,
default to 'ltr'.
This attribute has rendering requirements involving the bidirectional algorithm.
The directionality of an attribute of an HTML element, which is used when the text of that attribute is to be included in the rendering in some manner, is determined as per the first appropriate set of steps from the following list:
dir
attribute is in the auto
stateFind the first character (in logical order) of the attribute's value that is of bidirectional character type L, AL, or R. [[!BIDI]]
If such a character is found and it is of bidirectional character type AL or R, the directionality of the attribute is 'rtl'.
Otherwise, the directionality of the attribute is 'ltr'.
The following attributes are directionality-capable attributes:
abbr
on th
elementsalt
on area
,
img
, and
input
elementscontent
on meta
elements, if the name
attribute specifies a metadata name whose value is primarily intended to be human-readable rather than machine-readablelabel
on menuitem
,
menu
,
optgroup
,
option
, and
track
elementsplaceholder
on input
and
textarea
elementstitle
on all HTML elementsdir
[ = value ]Returns the html
element's dir
attribute's value, if any.
Can be set, to either "ltr
", "rtl
", or "auto
" to replace the html
element's dir
attribute's value.
If there is no html
element, returns the empty string and ignores new values.
The dir
IDL attribute on an element must
reflect the dir
content attribute of that element,
limited to only known values.
The dir
IDL attribute on Document
objects must reflect the dir
content attribute of
the html
element, if any, limited to only known values. If
there is no such element, then the attribute must return the empty string and do nothing on
setting.
Authors are strongly encouraged to use the dir
attribute to indicate text direction rather than using CSS, since that way their documents will
continue to render correctly even in the absence of CSS (e.g. as interpreted by search
engines).
This markup fragment is of an IM conversation.
<p dir=auto class="u1"><b><bdi>Student</bdi>:</b> How do you write "What's your name?" in Arabic?</p> <p dir=auto class="u2"><b><bdi>Teacher</bdi>:</b> ما اسمك؟</p> <p dir=auto class="u1"><b><bdi>Student</bdi>:</b> Thanks.</p> <p dir=auto class="u2"><b><bdi>Teacher</bdi>:</b> That's written "شكرًا".</p> <p dir=auto class="u2"><b><bdi>Teacher</bdi>:</b> Do you know how to write "Please"?</p> <p dir=auto class="u1"><b><bdi>Student</bdi>:</b> "من فضلك", right?</p>
Given a suitable style sheet and the default alignment styles for the p
element,
namely to align the text to the start edge of the paragraph, the resulting rendering could
be as follows:
As noted earlier, the auto
value is not a panacea. The
final paragraph in this example is misinterpreted as being right-to-left text, since it begins
with an Arabic character, which causes the "right?" to be to the left of the Arabic text.
Text content in HTML elements with Text
nodes in their
contents, and text in attributes of HTML
elements that allow free-form text, may contain characters in the ranges U+202A to U+202E
and U+2066 to U+2069 (the bidirectional-algorithm formatting characters). However, the use of
these characters is restricted so that any embedding or overrides generated by these characters do
not start and end with different parent elements, and so that all such embeddings and overrides
are explicitly terminated by a U+202C POP DIRECTIONAL FORMATTING character. This helps reduce
incidences of text being reused in a manner that has unforeseen effects on the bidirectional
algorithm. [[!BIDI]]
The aforementioned restrictions are defined by specifying that certain parts of documents form bidirectional-algorithm formatting character ranges, and then imposing a requirement on such ranges.
The strings resulting from applying the following algorithm to an HTML element element are bidirectional-algorithm formatting character ranges:
Let output be an empty list of strings.
Let string be an empty string.
Let node be the first child node of element, if any, or null otherwise.
Loop: If node is null, jump to the step labeled end.
Process node according to the first matching step from the following list:
Text
nodeAppend the text data of node to string.
br
elementIf string is not the empty string, push string onto output, and let string be empty string.
Let node be node's next sibling, if any, or null otherwise.
Jump to the step labeled loop.
End: If string is not the empty string, push string onto output.
Return output as the bidirectional-algorithm formatting character ranges.
The value of a namespace-less attribute of an HTML element is a bidirectional-algorithm formatting character range.
Any strings that, as described above, are bidirectional-algorithm formatting character
ranges must match the string
production in the following ABNF, the
character set for which is Unicode. [[!ABNF]]
string = *( plaintext ( embedding / override / isolation ) ) plaintext embedding = ( lre / rle ) string pdf override = ( lro / rlo ) string pdf isolation = ( lri / rli / fsi ) string pdi lre = %x202A ; U+202A LEFT-TO-RIGHT EMBEDDING rle = %x202B ; U+202B RIGHT-TO-LEFT EMBEDDING lro = %x202D ; U+202D LEFT-TO-RIGHT OVERRIDE rlo = %x202E ; U+202E RIGHT-TO-LEFT OVERRIDE pdf = %x202C ; U+202C POP DIRECTIONAL FORMATTING lri = %x2066 ; U+2066 LEFT-TO-RIGHT ISOLATE rli = %x2067 ; U+2067 RIGHT-TO-LEFT ISOLATE fsi = %x2068 ; U+2068 FIRST STRONG ISOLATE pdi = %x2069 ; U+2069 POP DIRECTIONAL ISOLATE plaintext = *( %x0000-2029 / %x202F-2065 / %x206A-10FFFF ) ; any string with no bidirectional-algorithm formatting characters
While the U+2069 POP DIRECTIONAL ISOLATE character implicitly also ends open embeddings and overrides, text that relies on this implicit scope closure is not conforming to this specification. All strings of embeddings, overrides, and isolations need to be explicitly terminated to conform to this section's requirements.
Authors are encouraged to use the dir
attribute, the
bdo
element, and the bdi
element, rather than maintaining the
bidirectional-algorithm formatting characters manually. The bidirectional-algorithm formatting
characters interact poorly with CSS.
User agents must implement the Unicode bidirectional algorithm to determine the proper ordering of characters when rendering documents and parts of documents. [[!BIDI]]
The mapping of HTML to the Unicode bidirectional algorithm must be done in one of three ways. Either the user agent must implement CSS, including in particular the CSS 'unicode-bidi', 'direction', and 'content' properties, and must have, in its user agent style sheet, the rules using those properties given in this specification's rendering section, or, alternatively, the user agent must act as if it implemented just the aforementioned properties and had a user agent style sheet that included all the aforementioned rules, but without letting style sheets specified in documents override them, or, alternatively, the user agent must implement another styling language with equivalent semantics. [[!CSSWM]] [[!CSSGC]]
The following elements and attributes have requirements defined by the rendering section that, due to the requirements in this section, are requirements on all user agents (not just those that support the suggested default rendering):