Version: 3.x.x 🚧

@yozora/tokenizer-emphasis

GitHub Flavored Markdown spec:

First, some definitions. A delimiter run is either a sequence of one or more * characters that is not preceded or followed by a non-backslash-escaped * character, or a sequence of one or more _ characters that is not preceded or followed by a non-backslash-escaped _ character.
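The definition above can be sketched as a small scanner. This is a minimal illustration, not the tokenizer's actual implementation: the `DelimiterRun` shape and the `findDelimiterRuns` name are hypothetical, and the sketch ignores code spans, links, and other constructs that a real parser would handle first.

```typescript
// A delimiter run found in the source text (hypothetical shape, for illustration).
interface DelimiterRun {
  marker: string // either '*' or '_'
  start: number // index of the first marker character
  length: number // number of consecutive marker characters
}

// Collect maximal runs of unescaped `*` or `_` characters.
function findDelimiterRuns(text: string): DelimiterRun[] {
  const runs: DelimiterRun[] = []
  let i = 0
  while (i < text.length) {
    const ch = text.charAt(i)
    if (ch === '\\') {
      // A backslash escapes the next character, so skip both.
      i += 2
      continue
    }
    if (ch === '*' || ch === '_') {
      // Extend the run while the same marker character repeats.
      let end = i + 1
      while (end < text.length && text.charAt(end) === ch) end += 1
      runs.push({ marker: ch, start: i, length: end - i })
      i = end
      continue
    }
    i += 1
  }
  return runs
}

// Finds three runs: `**` at index 0, `**` at index 5, and `_` at index 13.
// The backslash-escaped `*` is not part of any run.
console.log(findDelimiterRuns('**foo** \\*bar_'))
```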

A left-flanking delimiter run is a delimiter run that is:

not followed by Unicode whitespace, and either

a) not followed by a punctuation character, or

b) followed by a punctuation character and preceded by Unicode whitespace or a punctuation character.

For purposes of this definition, the beginning and the end of the line count as Unicode whitespace.

A right-flanking delimiter run is a delimiter run that is:

not preceded by Unicode whitespace, and either

a) not preceded by a punctuation character, or

b) preceded by a punctuation character and followed by Unicode whitespace or a punctuation character.

For purposes of this definition, the beginning and the end of the line count as Unicode whitespace.
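The two flanking definitions translate almost directly into predicates over the characters before and after a run. The following is a sketch under stated assumptions: the function names are hypothetical, `undefined` stands for the beginning or end of the line, `\s` is used as an approximation of Unicode whitespace, and characters are taken as single UTF-16 code units.

```typescript
const ASCII_PUNCTUATION = '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
// Built at runtime; matches characters in the Unicode punctuation categories.
const UNICODE_PUNCTUATION = new RegExp('^\\p{P}$', 'u')

// `undefined` represents the beginning or end of the line, which counts as whitespace.
function isWhitespace(ch: string | undefined): boolean {
  return ch === undefined || /^\s$/.test(ch)
}

function isPunctuation(ch: string | undefined): boolean {
  if (ch === undefined) return false
  return ASCII_PUNCTUATION.indexOf(ch) >= 0 || UNICODE_PUNCTUATION.test(ch)
}

// `before` / `after` are the characters adjacent to the delimiter run.
function isLeftFlanking(before: string | undefined, after: string | undefined): boolean {
  if (isWhitespace(after)) return false
  // (a) not followed by punctuation, or
  // (b) followed by punctuation and preceded by whitespace or punctuation.
  return !isPunctuation(after) || isWhitespace(before) || isPunctuation(before)
}

function isRightFlanking(before: string | undefined, after: string | undefined): boolean {
  if (isWhitespace(before)) return false
  // (a) not preceded by punctuation, or
  // (b) preceded by punctuation and followed by whitespace or punctuation.
  return !isPunctuation(before) || isWhitespace(after) || isPunctuation(after)
}

// The run in `***abc` is left-flanking only.
console.log(isLeftFlanking(undefined, 'a'), isRightFlanking(undefined, 'a')) // true false
// The first run in `a**"foo"**` is right-flanking only.
console.log(isLeftFlanking('a', '"'), isRightFlanking('a', '"')) // false true
```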

The following rules define emphasis and strong emphasis:

1. A single * character can open emphasis iff (if and only if) it is part of a left-flanking delimiter run.
2. A single _ character can open emphasis iff it is part of a left-flanking delimiter run and either

a) not part of a right-flanking delimiter run or

b) part of a right-flanking delimiter run preceded by a punctuation character.

3. A single * character can close emphasis iff it is part of a right-flanking delimiter run.
4. A single _ character can close emphasis iff it is part of a right-flanking delimiter run and either

a) not part of a left-flanking delimiter run or

b) part of a left-flanking delimiter run followed by a punctuation character.

5. A double ** can open strong emphasis iff it is part of a left-flanking delimiter run.
6. A double __ can open strong emphasis iff it is part of a left-flanking delimiter run and either

a) not part of a right-flanking delimiter run or

b) part of a right-flanking delimiter run preceded by a punctuation character.

7. A double ** can close strong emphasis iff it is part of a right-flanking delimiter run.
8. A double __ can close strong emphasis iff it is part of a right-flanking delimiter run and either

a) not part of a left-flanking delimiter run or

b) part of a left-flanking delimiter run followed by a punctuation character.

9. Emphasis begins with a delimiter that can open emphasis and ends with a delimiter that can close emphasis, and that uses the same character (_ or *) as the opening delimiter. The opening and closing delimiters must belong to separate delimiter runs. If one of the delimiters can both open and close emphasis, then the sum of the lengths of the delimiter runs containing the opening and closing delimiters must not be a multiple of 3 unless both lengths are multiples of 3.
10. Strong emphasis begins with a delimiter that can open strong emphasis and ends with a delimiter that can close strong emphasis, and that uses the same character (_ or *) as the opening delimiter. The opening and closing delimiters must belong to separate delimiter runs. If one of the delimiters can both open and close strong emphasis, then the sum of the lengths of the delimiter runs containing the opening and closing delimiters must not be a multiple of 3 unless both lengths are multiples of 3.
11. A literal * character cannot occur at the beginning or end of *-delimited emphasis or **-delimited strong emphasis, unless it is backslash-escaped.
12. A literal _ character cannot occur at the beginning or end of _-delimited emphasis or __-delimited strong emphasis, unless it is backslash-escaped.
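The opening and closing conditions above, together with the multiple-of-3 restriction, can be expressed as three small predicates. This is a sketch rather than the tokenizer's real code: the `RunContext` shape and the function names are hypothetical, and the flanking and punctuation flags are assumed to have been computed beforehand.

```typescript
// Context of one delimiter run (hypothetical shape, for illustration).
interface RunContext {
  marker: '*' | '_'
  length: number
  leftFlanking: boolean
  rightFlanking: boolean
  precededByPunctuation: boolean
  followedByPunctuation: boolean
}

// A run can open (strong) emphasis if it is left-flanking; `_` has extra conditions.
function canOpen(run: RunContext): boolean {
  if (!run.leftFlanking) return false
  if (run.marker === '*') return true
  return !run.rightFlanking || run.precededByPunctuation
}

// A run can close (strong) emphasis if it is right-flanking; `_` has extra conditions.
function canClose(run: RunContext): boolean {
  if (!run.rightFlanking) return false
  if (run.marker === '*') return true
  return !run.leftFlanking || run.followedByPunctuation
}

// Whether two separate runs may be paired as opener and closer.
function mayPair(opener: RunContext, closer: RunContext): boolean {
  if (opener.marker !== closer.marker) return false
  if (!canOpen(opener) || !canClose(closer)) return false
  // The multiple-of-3 restriction applies only if one run can both open and close.
  const oneCanDoBoth = canClose(opener) || canOpen(closer)
  if (oneCanDoBoth && (opener.length + closer.length) % 3 === 0) {
    return opener.length % 3 === 0 && closer.length % 3 === 0
  }
  return true
}
```

For example, in `*foo**bar*` the leading `*` cannot pair with the middle `**`, because the middle run can both open and close and the lengths sum to 3. The leading `*` pairs with the trailing `*` instead.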

Where rules 1-12 above are compatible with multiple parsings, the following principles resolve ambiguity:

13. The number of nestings should be minimized. Thus, for example, an interpretation <strong>...</strong> is always preferred to <em><em>...</em></em>.
14. An interpretation <em><strong>...</strong></em> is always preferred to <strong><em>...</em></strong>.
15. When two potential emphasis or strong emphasis spans overlap, so that the second begins before the first ends and ends after the first ends, the first takes precedence. Thus, for example, *foo _bar* baz_ is parsed as <em>foo _bar</em> baz_ rather than *foo <em>bar* baz</em>.
16. When there are two potential emphasis or strong emphasis spans with the same closing delimiter, the shorter one (the one that opens later) takes precedence. Thus, for example, **foo **bar baz** is parsed as **foo <strong>bar baz</strong> rather than <strong>foo **bar baz</strong>.
17. Inline code spans, links, images, and HTML tags group more tightly than emphasis. So, when there is a choice between an interpretation that contains one of these elements and one that does not, the former always wins. Thus, for example, *[foo*](bar) is parsed as *<a href="bar">foo*</a> rather than as <em>[foo</em>](bar).

See the GitHub Flavored Markdown spec for details.
See Live Examples for an intuitive impression.

Install

npm:

npm install --save @yozora/tokenizer-emphasis

Yarn:

yarn add @yozora/tokenizer-emphasis

pnpm:

pnpm add @yozora/tokenizer-emphasis

Usage

Tip: @yozora/tokenizer-emphasis has been integrated into @yozora/parser / @yozora/parser-gfm-ex / @yozora/parser-gfm, so you can use YozoraParser / GfmExParser / GfmParser directly.

Basic Usage

@yozora/tokenizer-emphasis cannot be used alone; it must be registered in a parser as a plugin before it can be used.

import { DefaultParser } from '@yozora/core-parser'
import ParagraphTokenizer from '@yozora/tokenizer-paragraph'
import TextTokenizer from '@yozora/tokenizer-text'
import EmphasisTokenizer from '@yozora/tokenizer-emphasis'

const parser = new DefaultParser()
  .useFallbackTokenizer(new ParagraphTokenizer())
  .useFallbackTokenizer(new TextTokenizer())
  .useTokenizer(new EmphasisTokenizer())

// parse source markdown content
parser.parse(`

**foo bar**

__foo bar__

_foo bar_

*foo bar*

__**__foo__**__

`)

YozoraParser

import YozoraParser from '@yozora/parser'

const parser = new YozoraParser()

// parse source markdown content
parser.parse(`

**foo bar**

__foo bar__

_foo bar_

*foo bar*

__**__foo__**__

`)

GfmParser

import GfmParser from '@yozora/parser-gfm'

const parser = new GfmParser()

// parse source markdown content
parser.parse(`

**foo bar**

__foo bar__

_foo bar_

*foo bar*

__**__foo__**__

`)

GfmExParser

import GfmExParser from '@yozora/parser-gfm-ex'

const parser = new GfmExParser()

// parse source markdown content
parser.parse(`

**foo bar**

__foo bar__

_foo bar_

*foo bar*

__**__foo__**__

`)

Options

Name	Type	Required	Default
`name`	`string`	`false`	`"@yozora/tokenizer-emphasis"`
`priority`	`number`	`false`	`TokenizerPriority.CONTAINING_INLINE`

name: The unique name of the tokenizer. It binds each generated token to this tokenizer, so that the parser knows which tokenizer to call at each life-cycle stage of the token throughout the matching and parsing phases.
priority: The priority of the tokenizer, which determines the processing order: higher-priority tokenizers run first. In addition, in the match-block stage, a high-priority tokenizer can interrupt the matching process of a low-priority tokenizer.

Exception: Delimiters of type full are always processed before delimiters of other types.

Types

@yozora/tokenizer-emphasis produces nodes of type Emphasis and Strong. See @yozora/ast for the full base types.

Emphasis

import type { Parent } from '@yozora/ast'

export const EmphasisType = 'emphasis'
export type EmphasisType = typeof EmphasisType

/**
 * Emphasis represents stress emphasis of its contents.
 * @see https://github.com/syntax-tree/mdast#emphasis
 * @see https://github.github.com/gfm/#emphasis-and-strong-emphasis
 */
export type Emphasis = Parent<EmphasisType>

Strong

import type { Parent } from '@yozora/ast'

export const StrongType = 'strong'
export type StrongType = typeof StrongType

/**
 * Strong represents strong importance, seriousness, or urgency for its
 * contents.
 * @see https://github.com/syntax-tree/mdast#strong
 * @see https://github.github.com/gfm/#emphasis-and-strong-emphasis
 */
export type Strong = Parent<StrongType>
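To show how these node types might be consumed, here is a minimal sketch that renders an inline tree to HTML. The interfaces are local stand-ins written for this example, not imports from @yozora/ast. They assume that parent nodes carry a `children` array and that text nodes carry a `value` string. The `toHtml` name is hypothetical, and the sketch does no HTML escaping.

```typescript
// Local stand-ins for the node shapes (assumed, not imported from @yozora/ast).
interface TextNode {
  type: 'text'
  value: string
}
interface EmphasisNode {
  type: 'emphasis'
  children: InlineNode[]
}
interface StrongNode {
  type: 'strong'
  children: InlineNode[]
}
type InlineNode = TextNode | EmphasisNode | StrongNode

// Recursively render a node; the `type` field discriminates the union.
function toHtml(node: InlineNode): string {
  if (node.type === 'text') return node.value
  const inner = node.children.map(toHtml).join('')
  if (node.type === 'emphasis') return `<em>${inner}</em>`
  return `<strong>${inner}</strong>`
}

// A hand-built tree matching the spec's reading of `***foo***`.
const tree: InlineNode = {
  type: 'emphasis',
  children: [{ type: 'strong', children: [{ type: 'text', value: 'foo' }] }],
}
console.log(toHtml(tree)) // <em><strong>foo</strong></em>
```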

Live Examples

Rule 1, #360: *foo bar*
Rule 2, #366: _foo bar_
Rule 3, #374: _foo*
Rule 4, #380: _foo bar _
Rule 5, #387: **foo bar**
Rule 6, #391: __foo bar__
Rule 7, #400: **foo bar **
Rule 8, #406: __foo bar __
Rule 9, #413: *foo [bar](/url)*
Rule 10, #431: **foo [bar](/url)**
Rule 11, #445: foo ***
Rule 12, #457: foo ___
Rule 13, #475: ******foo******
Rule 14, #476: ***foo***
Rule 15, #478: *foo _bar* baz_
Rule 16, #480: **foo **bar baz**
Rule 17, #482: *[bar*](/url)
