For a finite set of symbols $\Sigma$, we call $\Sigma$ an alphabet. We also define:
- A string (or sentence) is a finite sequence of symbols from $\Sigma$.
- $\Sigma^n$ is the set of all strings over $\Sigma$ that have length $n$.
- $\Sigma^+$ is the set of all strings over $\Sigma$ with length at least 1 (i.e., excluding the empty string $\varepsilon$), called the positive closure.
- $\Sigma^*$ is the set of all strings over $\Sigma$, called the Kleene closure. This includes the empty string $\varepsilon$. Practically speaking, the star means "zero or more repetitions" of symbols from $\Sigma$.
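- For a language $L$, the Kleene star $L^*$ is the set of all strings formed by concatenating zero or more strings from $L$, so it always contains $\varepsilon$.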
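The definitions above can be made concrete with a small Python sketch. It is only an illustration: the function names and the two-symbol alphabet are assumptions introduced here, and because $\Sigma^*$ is infinite for any nonempty alphabet, the closures are enumerated only up to a chosen maximum length.

```python
from itertools import product


def strings_of_length(alphabet, n):
    """Return Sigma^n: every string over `alphabet` of length exactly n."""
    return {"".join(symbols) for symbols in product(sorted(alphabet), repeat=n)}


def kleene_closure_up_to(alphabet, max_len):
    """Return the strings of Sigma^* whose length is at most max_len."""
    result = set()
    for n in range(max_len + 1):
        result |= strings_of_length(alphabet, n)
    return result


def positive_closure_up_to(alphabet, max_len):
    """Return the strings of Sigma^+ whose length is at most max_len."""
    # Sigma^+ is Sigma^* without the empty string.
    return kleene_closure_up_to(alphabet, max_len) - {""}


sigma = {"a", "b"}  # illustrative alphabet, not taken from the text
sigma_2 = strings_of_length(sigma, 2)            # {"aa", "ab", "ba", "bb"}
sigma_star_2 = kleene_closure_up_to(sigma, 2)    # 1 + 2 + 4 = 7 strings, including ""
sigma_plus_2 = positive_closure_up_to(sigma, 2)  # the same set without ""
```

Note that `strings_of_length(sigma, 0)` returns `{""}`: the only string of length zero is the empty string, which is why $\Sigma^*$ contains it and $\Sigma^+$ does not.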
In discrete mathematics and computer science, a formal language over the alphabet $\Sigma$ is any set $L \subseteq \Sigma^*$, i.e., any set of strings of characters of the alphabet. Defining formal languages gives us the setting in which to explore regular expressions.
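Since a language is just a set of strings, an infinite language can be represented in code by a membership test rather than by listing its elements. The sketch below is one way to do that, assuming a hypothetical language $L = \{(ab)^n \mid n \ge 0\}$ over the alphabet $\{a, b\}$; the names `SIGMA`, `over_alphabet`, and `in_language` are illustrative only.

```python
import re

SIGMA = {"a", "b"}  # illustrative alphabet

# Hypothetical example language: zero or more repetitions of "ab".
AB_STAR = re.compile(r"(ab)*")


def over_alphabet(s, alphabet=SIGMA):
    """True if every character of s is a symbol of the alphabet."""
    return all(ch in alphabet for ch in s)


def in_language(s):
    """True if s is a string over SIGMA that belongs to the example language."""
    # fullmatch requires the whole string to match, not just a prefix.
    return over_alphabet(s) and AB_STAR.fullmatch(s) is not None


print(in_language(""))      # the empty string belongs to this language
print(in_language("abab"))
print(in_language("aba"))
```

Here the regular expression plays the role of a finite description of an infinite set, which is the connection between formal languages and regular expressions.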
The concatenation of two languages $L_1$ and $L_2$ is the set:

$$L_1 L_2 = \{\, xy \mid x \in L_1,\ y \in L_2 \,\}$$
The union of two languages $L_1$ and $L_2$ is:

$$L_1 \cup L_2 = \{\, x \mid x \in L_1 \text{ or } x \in L_2 \,\}$$
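For finite languages, both operations map directly onto Python sets. The following is a minimal sketch with illustrative function names and example languages that are assumptions, not part of the text. It also includes a length-bounded version of the Kleene star of a language, built by repeated concatenation.

```python
def concatenate(l1, l2):
    """Return every string xy with x taken from l1 and y taken from l2."""
    return {x + y for x in l1 for y in l2}


def union(l1, l2):
    """Return every string that is in l1 or in l2."""
    return l1 | l2


def kleene_star_up_to(language, max_len):
    """Return the strings of L^* whose length is at most max_len."""
    result = {""}    # zero concatenations give the empty string
    frontier = {""}  # strings discovered in the previous round
    while frontier:
        extended = concatenate(frontier, language)
        # Keep only new strings that respect the length bound.
        frontier = {w for w in extended if len(w) <= max_len} - result
        result |= frontier
    return result


l1 = {"a", "ab"}  # illustrative languages
l2 = {"b", ""}

print(concatenate(l1, l2))           # the three strings "a", "ab", "abb"
print(union(l1, l2))                 # the four strings "", "a", "ab", "b"
print(kleene_star_up_to({"ab"}, 4))  # the three strings "", "ab", "abab"
```

The concatenation has three elements rather than four because "a" followed by "b" and "ab" followed by the empty string produce the same string, and a set stores it once.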