String handling functions.
Category | Functions |
---|---|
Searching | column indexOf indexOfAny indexOfNeither lastIndexOf lastIndexOfAny lastIndexOfNeither |
Comparison | isNumeric |
Mutation | capitalize |
Pruning and Filling | center chomp chompPrefix chop detabber detab entab entabber leftJustify outdent rightJustify strip stripLeft stripRight wrap |
Substitution | abbrev soundex soundexer succ tr translate |
Miscellaneous | assumeUTF fromStringz lineSplitter representation splitLines toStringz |
Objects of types string, wstring, and dstring are value types and cannot be mutated element-by-element. To mutate characters while building a string, use char[], wchar[], or dchar[]. The xxxstring types are preferable because they don't exhibit undesired aliasing, thus making code more robust.
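The pattern above can be sketched as follows: edit a mutable char[] copy, then convert back to string once the edits are done. The helper name upperFirst is hypothetical and only handles ASCII letters.

```d
import std.stdio : writeln;

/// Upper-cases a leading ASCII letter by editing a mutable copy.
/// (Hypothetical helper, for illustration only.)
string upperFirst(string s)
{
    if (s.length == 0 || s[0] < 'a' || s[0] > 'z')
        return s;
    char[] buf = s.dup;                      // mutable copy; s itself is immutable
    buf[0] = cast(char)(buf[0] - 'a' + 'A');
    return buf.idup;                         // immutable copy that does not alias buf
}

void main()
{
    writeln(upperFirst("hello"));
    assert(upperFirst("hello") == "Hello");
    assert(upperFirst("") == "");
}
```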
The following functions are publicly imported:
Module | Functions |
---|---|
Publicly imported functions | |
std.algorithm | cmp count endsWith startsWith |
std.array | join replace replaceInPlace split empty |
std.format | format sformat |
std.uni | icmp toLower toLowerInPlace toUpper toUpperInPlace |
Functions related to Unicode and ASCII are found in std.uni and std.ascii, respectively. Other functions that have a wider generality than just strings can be found in std.algorithm and std.range.
See also: std.algorithm and std.range for generic range algorithms; std.ascii for functions that work with ASCII strings; std.uni for functions that work with Unicode strings.
Exception thrown on errors in std.string functions.
import std.exception : assertThrown; auto bad = " a\n\tb\n c"; assertThrown!StringException(bad.outdent);
Char* cString | A null-terminated C-style string. |
Returns a D-style array of char, wchar or dchar referencing the same string. The returned array will retain the same type qualifiers as the input. Important Note: The returned array is a slice of the original buffer. The original data is not changed and not copied.
writeln(fromStringz("foo\0"c.ptr)); // "foo"c writeln(fromStringz("foo\0"w.ptr)); // "foo"w writeln(fromStringz("foo\0"d.ptr)); // "foo"d writeln(fromStringz("福\0"c.ptr)); // "福"c writeln(fromStringz("福\0"w.ptr)); // "福"w writeln(fromStringz("福\0"d.ptr)); // "福"d
const(char)[] s | A D-style string. |
Returns a C-style null-terminated string equivalent to s. s must not contain embedded '\0's, as any C function will treat the first '\0' that it sees as the end of the string. If s.empty is true, then a string containing only '\0' is returned. Important Note: When passing a char* to a C function, and the C function keeps it around for any reason, make sure that you keep a reference to it in your D code. Otherwise, it may become invalid during a garbage collection cycle and cause a nasty bug when the C code tries to use it.
import core.stdc.string : strlen; import std.conv : to; auto p = toStringz("foo"); writeln(strlen(p)); // 3 const(char)[] foo = "abbzxyzzy"; p = toStringz(foo[3 .. 5]); writeln(strlen(p)); // 2 string test = ""; p = toStringz(test); writeln(*p); // 0 test = "\0"; p = toStringz(test); writeln(*p); // 0 test = "foo\0"; p = toStringz(test); assert(p[0] == 'f' && p[1] == 'o' && p[2] == 'o' && p[3] == 0); const string test2 = ""; p = toStringz(test2); writeln(*p); // 0
Flag indicating whether a search is case-sensitive.
Searches for character in range.
Range s | string or InputRange of characters to search in correct UTF format |
dchar c | character to search for |
size_t startIdx | starting index to a well-formed code point |
CaseSensitive cs | Yes.caseSensitive or No.caseSensitive |
Returns the index of the first occurrence of c in s with respect to the start index startIdx. If c is not found, then -1 is returned. If c is found, the value of the returned index is at least startIdx. If the parameters are not valid UTF, the result will still be in the range [-1 .. s.length], but will not be reliable otherwise.
If the sequence starting at startIdx does not represent a well-formed code point, then a std.utf.UTFException may be thrown.
See also: std.algorithm.searching.countUntil
import std.typecons : No; string s = "Hello World"; writeln(indexOf(s, 'W')); // 6 writeln(indexOf(s, 'Z')); // -1 writeln(indexOf(s, 'w', No.caseSensitive)); // 6
import std.typecons : No; string s = "Hello World"; writeln(indexOf(s, 'W', 4)); // 6 writeln(indexOf(s, 'Z', 100)); // -1 writeln(indexOf(s, 'w', 3, No.caseSensitive)); // 6
Searches for a substring in s.
Range s | string or ForwardRange of characters to search in correct UTF format |
const(Char)[] sub | substring to search for |
size_t startIdx | the index into s to start searching from |
CaseSensitive cs | Yes.caseSensitive or No.caseSensitive |
Returns the index of the first occurrence of sub in s with respect to the start index startIdx. If sub is not found, then -1 is returned. If sub is found, the value of the returned index is at least startIdx. If the arguments are not valid UTF, the result will still be in the range [-1 .. s.length], but will not be reliable otherwise.
If the sequence starting at startIdx does not represent a well-formed code point, then a std.utf.UTFException may be thrown.
import std.typecons : No; string s = "Hello World"; writeln(indexOf(s, "Wo", 4)); // 6 writeln(indexOf(s, "Zo", 100)); // -1 writeln(indexOf(s, "wo", 3, No.caseSensitive)); // 6
import std.typecons : No; string s = "Hello World"; writeln(indexOf(s, "Wo")); // 6 writeln(indexOf(s, "Zo")); // -1 writeln(indexOf(s, "wO", No.caseSensitive)); // 6
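Because indexOf reports a miss with -1, the result should be checked before it is used as a slice bound. A minimal sketch, assuming a hypothetical helper valueAfter that returns the text following the first separator:

```d
import std.stdio : writeln;
import std.string : indexOf;

/// Returns the part of line after the first sep, or null when sep is absent.
/// (Hypothetical helper, for illustration only.)
string valueAfter(string line, string sep)
{
    immutable idx = line.indexOf(sep);   // ptrdiff_t; -1 signals "not found"
    if (idx < 0)
        return null;
    return line[cast(size_t) idx + sep.length .. $];
}

void main()
{
    assert(valueAfter("key=value", "=") == "value");
    assert(valueAfter("no separator", "=") is null);
    writeln(valueAfter("path: /tmp", ": "));
}
```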
const(Char)[] s | string to search |
dchar c | character to search for |
size_t startIdx | the index into s to start searching from |
CaseSensitive cs | Yes.caseSensitive or No.caseSensitive |
Returns the index of the last occurrence of c in s. If c is not found, then -1 is returned. startIdx slices s as s[0 .. startIdx] and represents a code unit index in s.
If the sequence ending at startIdx does not represent a well-formed code point, then a std.utf.UTFException may be thrown.
cs indicates whether the comparisons are case sensitive.
import std.typecons : No; string s = "Hello World"; writeln(lastIndexOf(s, 'l')); // 9 writeln(lastIndexOf(s, 'Z')); // -1 writeln(lastIndexOf(s, 'L', No.caseSensitive)); // 9
import std.typecons : No; string s = "Hello World"; writeln(lastIndexOf(s, 'l', 4)); // 3 writeln(lastIndexOf(s, 'Z', 1337)); // -1 writeln(lastIndexOf(s, 'L', 7, No.caseSensitive)); // 3
const(Char1)[] s | string to search |
const(Char2)[] sub | substring to search for |
size_t startIdx | the index into s to start searching from |
CaseSensitive cs | Yes.caseSensitive or No.caseSensitive |
Returns the index of the last occurrence of sub in s. If sub is not found, then -1 is returned. startIdx slices s as s[0 .. startIdx] and represents a code unit index in s.
If the sequence ending at startIdx does not represent a well-formed code point, then a std.utf.UTFException may be thrown.
cs indicates whether the comparisons are case sensitive.
import std.typecons : No; string s = "Hello World"; writeln(lastIndexOf(s, "ll")); // 2 writeln(lastIndexOf(s, "Zo")); // -1 writeln(lastIndexOf(s, "lL", No.caseSensitive)); // 2
import std.typecons : No; string s = "Hello World"; writeln(lastIndexOf(s, "ll", 4)); // 2 writeln(lastIndexOf(s, "Zo", 128)); // -1 writeln(lastIndexOf(s, "lL", 3, No.caseSensitive)); // -1
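A common use of lastIndexOf is splitting at the final occurrence of a character. The sketch below uses a hypothetical helper extensionOf; it also shows that the startIdx overload only looks at s[0 .. startIdx].

```d
import std.stdio : writeln;
import std.string : lastIndexOf;

/// Returns the text after the last '.', or "" when there is no dot.
/// (Hypothetical helper, for illustration only.)
string extensionOf(string name)
{
    immutable dot = name.lastIndexOf('.');
    if (dot < 0)
        return "";
    return name[cast(size_t) dot + 1 .. $];
}

void main()
{
    assert(extensionOf("archive.tar.gz") == "gz");
    assert(extensionOf("README") == "");
    // Only "a.b" (the first three code units) is searched here.
    assert("a.b.c".lastIndexOf('.', 3) == 1);
    writeln(extensionOf("archive.tar.gz"));
}
```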
Returns the index of the first occurrence of any of the elements in needles in haystack. If no element of needles is found, then -1 is returned. startIdx slices haystack as haystack[startIdx .. $] and represents a code unit index in haystack. If the sequence ending at startIdx does not represent a well-formed code point, then a std.utf.UTFException may be thrown.
const(Char)[] haystack | String to search for needles in. |
const(Char2)[] needles | Characters to search for in haystack. |
size_t startIdx | Slices haystack as haystack[startIdx .. $]. If startIdx is greater than or equal to the length of haystack, the function returns -1. |
CaseSensitive cs | Indicates whether the comparisons are case sensitive. |
import std.conv : to; ptrdiff_t i = "helloWorld".indexOfAny("Wr"); writeln(i); // 5 i = "öällo world".indexOfAny("lo "); writeln(i); // 4
import std.conv : to; ptrdiff_t i = "helloWorld".indexOfAny("Wr", 4); writeln(i); // 5 i = "Foo öällo world".indexOfAny("lh", 3); writeln(i); // 8
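Since needles is treated as a set of characters, indexOfAny is a convenient way to cut a string at the first of several delimiters. A minimal sketch with a hypothetical helper firstToken:

```d
import std.stdio : writeln;
import std.string : indexOfAny;

/// Returns the text before the first space, comma, or semicolon.
/// (Hypothetical helper, for illustration only.)
string firstToken(string s)
{
    immutable end = s.indexOfAny(" ,;");
    return end < 0 ? s : s[0 .. cast(size_t) end];
}

void main()
{
    assert(firstToken("alpha,beta gamma") == "alpha");
    assert(firstToken("single") == "single");
    // With startIdx the search begins at "b;c", but the result is
    // still an index into the whole haystack.
    assert("a,b;c".indexOfAny(",;", 2) == 3);
    writeln(firstToken("alpha,beta gamma"));
}
```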
Returns the index of the last occurrence of any of the elements in needles in haystack. If no element of needles is found, then -1 is returned. stopIdx slices haystack as haystack[0 .. stopIdx] and represents a code unit index in haystack. If the sequence ending at stopIdx does not represent a well-formed code point, then a std.utf.UTFException may be thrown.
const(Char)[] haystack | String to search for needles in. |
const(Char2)[] needles | Characters to search for in haystack. |
size_t stopIdx | Slices haystack as haystack[0 .. stopIdx]. If stopIdx is greater than or equal to the length of haystack, the function returns -1. |
CaseSensitive cs | Indicates whether the comparisons are case sensitive. |
ptrdiff_t i = "helloWorld".lastIndexOfAny("Wlo"); writeln(i); // 8 i = "Foo öäöllo world".lastIndexOfAny("öF"); writeln(i); // 8
import std.conv : to; ptrdiff_t i = "helloWorld".lastIndexOfAny("Wlo", 4); writeln(i); // 3 i = "Foo öäöllo world".lastIndexOfAny("öF", 3); writeln(i); // 0
Returns the index of the first occurrence of any character in haystack that is not an element of needles. If all elements of haystack are elements of needles, -1 is returned.
const(Char)[] haystack | String to search for needles in. |
const(Char2)[] needles | Characters that do not count as a match. |
size_t startIdx | Slices haystack as haystack[startIdx .. $]. If startIdx is greater than or equal to the length of haystack, the function returns -1. |
CaseSensitive cs | Indicates whether the comparisons are case sensitive. |
writeln(indexOfNeither("abba", "a", 2)); // 2 writeln(indexOfNeither("def", "de", 1)); // 2 writeln(indexOfNeither("dfefffg", "dfe", 4)); // 6
writeln(indexOfNeither("def", "a")); // 0 writeln(indexOfNeither("def", "de")); // 2 writeln(indexOfNeither("dfefffg", "dfe")); // 6
Returns the index of the last occurrence of any character in haystack that is not an element of needles. If all elements of haystack are elements of needles, -1 is returned.
const(Char)[] haystack | String to search for needles in. |
const(Char2)[] needles | Characters that do not count as a match. |
size_t stopIdx | Slices haystack as haystack[0 .. stopIdx]. If stopIdx is greater than or equal to the length of haystack, the function returns -1. |
CaseSensitive cs | Indicates whether the comparisons are case sensitive. |
writeln(lastIndexOfNeither("abba", "a")); // 2 writeln(lastIndexOfNeither("def", "f")); // 1
writeln(lastIndexOfNeither("def", "rsa", 3)); // -1 writeln(lastIndexOfNeither("abba", "a", 2)); // 1
Returns the representation of a string, which has the same type as the string except that the character type is replaced by ubyte, ushort, or uint depending on the character width.
Char[] s | The string to return the representation of. |
string s = "hello"; static assert(is(typeof(representation(s)) == immutable(ubyte)[])); assert(representation(s) is cast(immutable(ubyte)[]) s); writeln(representation(s)); // [0x68, 0x65, 0x6c, 0x6c, 0x6f]
Capitalize the first character of input and convert the rest of input to lowercase.
S input | The string to capitalize. |
See also: std.uni.asCapitalized for a lazy range version that doesn't allocate memory.
writeln(capitalize("hello")); // "Hello" writeln(capitalize("World")); // "World"
Split s into an array of lines according to the Unicode standard using '\r', '\n', "\r\n", std.uni.lineSep, std.uni.paraSep, U+0085 (NEL), '\v' and '\f' as delimiters. If keepTerm is set to KeepTerminator.yes, then the delimiter is included in the strings returned.
Does not throw on invalid UTF; such input is simply passed unchanged to the output.
Allocates memory; use lineSplitter for an alternative that does not.
Adheres to Unicode 7.0.
C[] s | a string of chars, wchars, or dchars, or any custom type that casts to a string type |
KeepTerminator keepTerm | whether delimiter is included or not in the results |
Returns an array of strings, each of which is a line that is a slice of s.
string s = "Hello\nmy\rname\nis"; writeln(splitLines(s)); // ["Hello", "my", "name", "is"]
Split an array or sliceable range of characters into a range of lines using '\r', '\n', '\v', '\f', "\r\n", std.uni.lineSep, std.uni.paraSep and '\u0085' (NEL) as delimiters. If keepTerm is set to Yes.keepTerminator, then the delimiter is included in the slices returned.
Does not throw on invalid UTF; such input is simply passed unchanged to the output.
Adheres to Unicode 7.0.
Does not allocate memory.
Range r | array of chars, wchars, or dchars or a sliceable range |
keepTerm | whether delimiter is included or not in the results |
Returns a range of slices of the input range r.
import std.array : array; string s = "Hello\nmy\rname\nis"; /* notice the call to 'array' to turn the lazy range created by lineSplitter comparable to the string[] created by splitLines. */ writeln(lineSplitter(s).array); // splitLines(s)
auto s = "\rpeter\n\rpaul\r\njerry\u2028ice\u2029cream\n\nsunday\nmon\u2030day\n"; auto lines = s.lineSplitter(); static immutable witness = ["", "peter", "", "paul", "jerry", "ice", "cream", "", "sunday", "mon\u2030day"]; uint i; foreach (line; lines) { writeln(line); // witness[i++] } writeln(i); // witness.length
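The difference between the eager splitLines and the lazy lineSplitter, and the effect of keeping terminators, can be sketched as below. The sample text is made up for illustration.

```d
import std.algorithm.comparison : equal;
import std.string : lineSplitter, splitLines;
import std.typecons : Yes;

void main()
{
    string text = "first\r\nsecond\nthird";

    // Eager: allocates a string[] whose elements are slices of text.
    string[] eager = text.splitLines();
    assert(eager == ["first", "second", "third"]);

    // Lazy: yields the same slices one at a time, without building an array.
    assert(text.lineSplitter().equal(eager));

    // Keeping terminators preserves the original line endings exactly.
    assert(text.lineSplitter!(Yes.keepTerminator)()
               .equal(["first\r\n", "second\n", "third"]));
}
```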
Strips leading whitespace (as defined by std.uni.isWhite) or the characters specified in the second argument.
Range input | string or forward range of characters |
const(Char)[] chars | string of characters to be stripped |
Returns input stripped of leading whitespace or of the characters specified in the second argument. input and the returned value will share the same tail (see std.array.sameTail).
See also: std.algorithm.mutation.stripLeft
import std.uni : lineSep, paraSep; assert(stripLeft(" hello world ") == "hello world "); assert(stripLeft("\n\t\v\rhello world\n\t\v\r") == "hello world\n\t\v\r"); assert(stripLeft("hello world") == "hello world"); assert(stripLeft([lineSep] ~ "hello world" ~ lineSep) == "hello world" ~ [lineSep]); assert(stripLeft([paraSep] ~ "hello world" ~ paraSep) == "hello world" ~ [paraSep]); import std.array : array; import std.utf : byChar; assert(stripLeft(" hello world "w.byChar).array == "hello world ");
assert(stripLeft(" hello world ", " ") == "hello world "); assert(stripLeft("xxxxxhello world ", "x") == "hello world "); assert(stripLeft("xxxyy hello world ", "xy ") == "hello world ");
import std.array : array; import std.utf : byChar, byWchar, byDchar; assert(stripLeft(" xxxyy hello world "w.byChar, "xy ").array == "hello world "); assert(stripLeft("\u2028\u2020hello world\u2028"w.byWchar, "\u2028").array == "\u2020hello world\u2028"); assert(stripLeft("\U00010001hello world"w.byWchar, " ").array == "\U00010001hello world"w); assert(stripLeft("\U00010001 xyhello world"d.byDchar, "\U00010001 xy").array == "hello world"d); writeln(stripLeft("\u2020hello"w, "\u2020"w)); // "hello"w writeln(stripLeft("\U00010001hello"d, "\U00010001"d)); // "hello"d writeln(stripLeft(" hello ", "")); // " hello "
Strips trailing whitespace (as defined by std.uni.isWhite) or the characters specified in the second argument.
Range str | string or random access range of characters |
const(Char)[] chars | string of characters to be stripped |
Returns str stripped of trailing whitespace or of the characters specified in the second argument.
See also: std.algorithm.mutation.stripRight
import std.uni : lineSep, paraSep; assert(stripRight(" hello world ") == " hello world"); assert(stripRight("\n\t\v\rhello world\n\t\v\r") == "\n\t\v\rhello world"); assert(stripRight("hello world") == "hello world"); assert(stripRight([lineSep] ~ "hello world" ~ lineSep) == [lineSep] ~ "hello world"); assert(stripRight([paraSep] ~ "hello world" ~ paraSep) == [paraSep] ~ "hello world");
assert(stripRight(" hello world ", "x") == " hello world "); assert(stripRight(" hello world ", " ") == " hello world"); assert(stripRight(" hello worldxy ", "xy ") == " hello world");
Strips both leading and trailing whitespace (as defined by std.uni.isWhite) or the characters specified in the second argument.
Range str | string or random access range of characters |
const(Char)[] chars | string of characters to be stripped |
const(Char)[] leftChars | string of leading characters to be stripped |
const(Char)[] rightChars | string of trailing characters to be stripped |
Returns str stripped of leading and trailing whitespace or of the characters specified in the second argument.
See also: std.algorithm.mutation.strip
import std.uni : lineSep, paraSep; assert(strip(" hello world ") == "hello world"); assert(strip("\n\t\v\rhello world\n\t\v\r") == "hello world"); assert(strip("hello world") == "hello world"); assert(strip([lineSep] ~ "hello world" ~ [lineSep]) == "hello world"); assert(strip([paraSep] ~ "hello world" ~ [paraSep]) == "hello world");
assert(strip(" hello world ", "x") == " hello world "); assert(strip(" hello world ", " ") == "hello world"); assert(strip(" xyxyhello worldxyxy ", "xy ") == "hello world"); writeln(strip("\u2020hello\u2020"w, "\u2020"w)); // "hello"w writeln(strip("\U00010001hello\U00010001"d, "\U00010001"d)); // "hello"d writeln(strip(" hello ", "")); // " hello "
writeln(strip("xxhelloyy", "x", "y")); // "hello" assert(strip(" xyxyhello worldxyxyzz ", "xy ", "xyz ") == "hello world"); writeln(strip("\u2020hello\u2028"w, "\u2020"w, "\u2028"w)); // "hello"w assert(strip("\U00010001hello\U00010002"d, "\U00010001"d, "\U00010002"d) == "hello"d); writeln(strip(" hello ", "", "")); // " hello "
If str ends with delimiter, then str is returned without delimiter on its end. If str does not end with delimiter, then it is returned unchanged.
If no delimiter is given, then one trailing '\r', '\n', "\r\n", '\f', '\v', std.uni.lineSep, std.uni.paraSep, or std.uni.nelSep is removed from the end of str. If str does not end with any of those characters, then it is returned unchanged.
Range str | string or indexable range of characters |
const(C2)[] delimiter | string of characters to be sliced off end of str[] |
import std.uni : lineSep, paraSep, nelSep; import std.utf : decode; writeln(chomp(" hello world \n\r")); // " hello world \n" writeln(chomp(" hello world \r\n")); // " hello world " writeln(chomp(" hello world \f")); // " hello world " writeln(chomp(" hello world \v")); // " hello world " writeln(chomp(" hello world \n\n")); // " hello world \n" writeln(chomp(" hello world \n\n ")); // " hello world \n\n " writeln(chomp(" hello world \n\n" ~ [lineSep])); // " hello world \n\n" writeln(chomp(" hello world \n\n" ~ [paraSep])); // " hello world \n\n" writeln(chomp(" hello world \n\n" ~ [nelSep])); // " hello world \n\n" writeln(chomp(" hello world")); // " hello world" writeln(chomp("")); // "" writeln(chomp(" hello world", "orld")); // " hello w" writeln(chomp(" hello world", " he")); // " hello world" writeln(chomp("", "hello")); // "" // Don't decode pointlessly writeln(chomp("hello\xFE", "\r")); // "hello\xFE"
If str starts with delimiter, then the part of str following delimiter is returned. If str does not start with delimiter, then it is returned unchanged.
Range str | string or forward range of characters |
const(C2)[] delimiter | string of characters to be sliced off front of str[] |
writeln(chompPrefix("hello world", "he")); // "llo world" writeln(chompPrefix("hello world", "hello w")); // "orld" writeln(chompPrefix("hello world", " world")); // "hello world" writeln(chompPrefix("", "hello")); // ""
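The chomp family and the two-argument strip family are easy to confuse: chomp and chompPrefix remove one exact delimiter string, while stripLeft and stripRight treat their second argument as a set of characters. A small sketch with made-up inputs:

```d
import std.string : chomp, chompPrefix, stripLeft, stripRight;

void main()
{
    // chomp removes the exact delimiter, and only once.
    assert(chomp("file.txt.txt", ".txt") == "file.txt");
    // stripRight keeps removing while the last character is '.', 't' or 'x'.
    assert(stripRight("file.txt.txt", ".tx") == "file");

    // chompPrefix removes the exact prefix.
    assert(chompPrefix("ssh-server", "ssh-") == "server");
    // stripLeft also eats the leading 's' of "server", since 's' is in the set.
    assert(stripLeft("ssh-server", "ssh-") == "erver");
}
```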
Returns str without its last character, if there is one. If str ends with "\r\n", then both are removed. If str is empty, then it is returned unchanged.
Range str | string (must be valid UTF) |
writeln(chop("hello world")); // "hello worl" writeln(chop("hello world\n")); // "hello world" writeln(chop("hello world\r")); // "hello world" writeln(chop("hello world\n\r")); // "hello world\n" writeln(chop("hello world\r\n")); // "hello world" writeln(chop("Walter Bright")); // "Walter Brigh" writeln(chop("")); // ""
Left justify s in a field width characters wide. fillChar is the character that will be used to fill up the space in the field that s doesn't fill.
S s | string |
size_t width | minimum field width |
dchar fillChar | used to pad end up to width characters |
See also: leftJustifier, which does not allocate.
writeln(leftJustify("hello", 7, 'X')); // "helloXX" writeln(leftJustify("hello", 2, 'X')); // "hello" writeln(leftJustify("hello", 9, 'X')); // "helloXXXX"
Left justify r in a field width characters wide. fillChar is the character that will be used to fill up the space in the field that r doesn't fill.
Range r | string or range of characters |
size_t width | minimum field width |
dchar fillChar | used to pad end up to width characters |
See also: rightJustifier
import std.algorithm.comparison : equal; import std.utf : byChar; assert(leftJustifier("hello", 2).equal("hello".byChar)); assert(leftJustifier("hello", 7).equal("hello  ".byChar)); assert(leftJustifier("hello", 7, 'x').equal("helloxx".byChar));
Right justify s in a field width characters wide. fillChar is the character that will be used to fill up the space in the field that s doesn't fill.
S s | string |
size_t width | minimum field width |
dchar fillChar | used to pad the front up to width characters |
See also: rightJustifier, which does not allocate.
writeln(rightJustify("hello", 7, 'X')); // "XXhello" writeln(rightJustify("hello", 2, 'X')); // "hello" writeln(rightJustify("hello", 9, 'X')); // "XXXXhello"
Right justify r in a field width characters wide. fillChar is the character that will be used to fill up the space in the field that r doesn't fill.
Range r | string or forward range of characters |
size_t width | minimum field width |
dchar fillChar | used to pad the front up to width characters |
See also: leftJustifier
import std.algorithm.comparison : equal; import std.utf : byChar; assert(rightJustifier("hello", 2).equal("hello".byChar)); assert(rightJustifier("hello", 7).equal("  hello".byChar)); assert(rightJustifier("hello", 7, 'x').equal("xxhello".byChar));
Center s in a field width characters wide. fillChar is the character that will be used to fill up the space in the field that s doesn't fill.
S s | The string to center |
size_t width | Width of the field to center s in |
dchar fillChar | The character to use for filling excess space in the field |
To avoid GC allocation, use centerJustifier instead.
writeln(center("hello", 7, 'X')); // "XhelloX" writeln(center("hello", 2, 'X')); // "hello" writeln(center("hello", 9, 'X')); // "XXhelloXX"
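The allocating justify functions compose naturally for fixed-width text output. A minimal sketch, assuming a hypothetical helper row and made-up column widths of 8 and 4:

```d
import std.stdio : writeln;
import std.string : center, leftJustify, rightJustify;

/// Formats a name column padded with dots and a right-aligned quantity.
/// (Hypothetical helper, for illustration only.)
string row(string name, string qty)
{
    return leftJustify(name, 8, '.') ~ rightJustify(qty, 4);
}

void main()
{
    assert(row("apples", "12") == "apples..  12");
    assert(row("kiwi", "7") == "kiwi....   7");
    assert(center("hi", 6, '*') == "**hi**");
    writeln(row("apples", "12"));
}
```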
Center justify r in a field width characters wide. fillChar is the character that will be used to fill up the space in the field that r doesn't fill.
Range r | string or forward range of characters |
size_t width | minimum field width |
dchar fillChar | used to pad up to width characters |
See also: leftJustifier, rightJustifier
import std.algorithm.comparison : equal; import std.utf : byChar; assert(centerJustifier("hello", 2).equal("hello".byChar)); assert(centerJustifier("hello", 8).equal(" hello  ".byChar)); assert(centerJustifier("hello", 7, 'x').equal("xhellox".byChar));
Replace each tab character in s with the number of spaces necessary to align the following character at the next tab stop.
Range s | string |
size_t tabSize | distance between tab stops |
writeln(detab(" \n\tx", 9)); // " \n         x"
Replace each tab character in r with the number of spaces necessary to align the following character at the next tab stop.
Range r | string or forward range |
size_t tabSize | distance between tab stops |
import std.array : array; writeln(detabber(" \n\tx", 9).array); // " \n         x"
Replaces spaces in s with the optimal number of tabs. All spaces and tabs at the end of a line are removed.
Range s | String to convert. |
size_t tabSize | Tab columns are tabSize spaces apart. |
Returns a GC-allocated string with spaces replaced by tabs; use entabber to not allocate.
See also: entabber
writeln(entab("        x \n")); // "\tx\n"
Replaces spaces in range r with the optimal number of tabs. All spaces and tabs at the end of a line are removed.
Range r | string or forward range |
size_t tabSize | distance between tab stops |
See also: entab
import std.array : array; writeln(entabber("        x \n").array); // "\tx\n"
Replaces the characters in str which are keys in transTable with their corresponding values in transTable. transTable is an AA whose keys are dchar and whose values are either dchar or some type of string. Also, if toRemove is given, the characters in it are removed from str prior to translation. str itself is unaltered. A copy with the changes is returned.
See also: tr, std.array.replace, std.algorithm.iteration.substitute
C1[] str | The original string. |
dchar[dchar] transTable | The AA indicating which characters to replace and what to replace them with. |
const(C2)[] toRemove | The characters to remove from the string. |
dchar[dchar] transTable1 = ['e' : '5', 'o' : '7', '5': 'q']; writeln(translate("hello world", transTable1)); // "h5ll7 w7rld" writeln(translate("hello world", transTable1, "low")); // "h5 rd" string[dchar] transTable2 = ['e' : "5", 'o' : "orange"]; writeln(translate("hello world", transTable2)); // "h5llorange worangerld"
This is an overload of translate which takes an existing buffer to write the contents to.
C1[] str | The original string. |
dchar[dchar] transTable | The AA indicating which characters to replace and what to replace them with. |
const(C2)[] toRemove | The characters to remove from the string. |
Buffer buffer | An output range to write the contents to. |
import std.array : appender; dchar[dchar] transTable1 = ['e' : '5', 'o' : '7', '5': 'q']; auto buffer = appender!(dchar[])(); translate("hello world", transTable1, null, buffer); writeln(buffer.data); // "h5ll7 w7rld" buffer.clear(); translate("hello world", transTable1, "low", buffer); writeln(buffer.data); // "h5 rd" buffer.clear(); string[dchar] transTable2 = ['e' : "5", 'o' : "orange"]; translate("hello world", transTable2, null, buffer); writeln(buffer.data); // "h5llorange worangerld"
This is an ASCII-only overload of translate. It will not work with Unicode. It exists as an optimization for the cases where Unicode processing is not necessary.
Unlike the other overloads of translate, this one does not take an AA. Rather, it takes a string generated by makeTransTable.
The array generated by makeTransTable is 256 elements long such that the index is equal to the ASCII character being replaced and the value is equal to the character that it's being replaced with. Note that translate does not decode any of the characters, so you can actually pass it Extended ASCII characters if you want to (ASCII only actually uses 128 characters), but be warned that Extended ASCII characters are not valid Unicode and therefore will result in a UTFException being thrown from most other Phobos functions.
Also, because no decoding occurs, it is possible to use this overload to translate ASCII characters within a proper UTF-8 string without altering the other, non-ASCII characters. UTF validation issues arise only when a code unit greater than 127 is replaced with another code unit, or when any code unit is replaced with a code unit greater than 127.
See also: tr, std.array.replace, std.algorithm.iteration.substitute
const(char)[] str | The original string. |
const(char)[] transTable | The string indicating which characters to replace and what to replace them with. It is generated by makeTransTable. |
const(char)[] toRemove | The characters to remove from the string. |
auto transTable1 = makeTrans("eo5", "57q"); writeln(translate("hello world", transTable1)); // "h5ll7 w7rld" writeln(translate("hello world", transTable1, "low")); // "h5 rd"
Does the same thing as makeTransTable, but allocates the translation table on the GC heap. Use makeTransTable instead.
auto transTable1 = makeTrans("eo5", "57q"); writeln(translate("hello world", transTable1)); // "h5ll7 w7rld" writeln(translate("hello world", transTable1, "low")); // "h5 rd"
Constructs a 256-character translation table in which characters in from[] are replaced by the corresponding characters in to[].
const(char)[] from | array of chars, less than or equal to 256 in length |
const(char)[] to | corresponding array of chars to translate to |
writeln(translate("hello world", makeTransTable("hl", "q5"))); // "qe55o wor5d" writeln(translate("hello world", makeTransTable("12345", "67890"))); // "hello world"
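As an illustration of the ASCII-only overload, the sketch below builds a ROT13 table with makeTransTable. The helper name rot13 is hypothetical; the round trip on a UTF-8 string shows that code units above 127 pass through untouched when the table only maps ASCII letters.

```d
import std.stdio : writeln;
import std.string : makeTransTable, translate;

/// Rotates ASCII letters by 13 places; every other code unit is left as is.
/// (Hypothetical helper, for illustration only.)
string rot13(string s)
{
    static immutable lower = "abcdefghijklmnopqrstuvwxyz";
    static immutable upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    // from and to must have the same length (52 characters each here).
    auto table = makeTransTable(
        lower ~ upper,
        lower[13 .. $] ~ lower[0 .. 13] ~ upper[13 .. $] ~ upper[0 .. 13]);
    return translate(s, table);
}

void main()
{
    assert(rot13("Hello") == "Uryyb");
    assert(rot13(rot13("Hello, World!")) == "Hello, World!");
    // The non-ASCII bytes of the UTF-8 string survive the round trip.
    assert(rot13(rot13("naïve")) == "naïve");
    writeln(rot13("Hello"));
}
```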
This is an ASCII-only overload of translate which takes an existing buffer to write the contents to.
const(char)[] str | The original string. |
const(char)[] transTable | The string indicating which characters to replace and what to replace them with. It is generated by makeTransTable. |
const(char)[] toRemove | The characters to remove from the string. |
Buffer buffer | An output range to write the contents to. |
import std.array : appender; auto buffer = appender!(char[])(); auto transTable1 = makeTransTable("eo5", "57q"); translate("hello world", transTable1, null, buffer); writeln(buffer.data); // "h5ll7 w7rld" buffer.clear(); translate("hello world", transTable1, "low", buffer); writeln(buffer.data); // "h5 rd"
Return string that is the 'successor' to s[]. If the rightmost character is a-zA-Z0-9, it is incremented within its case or digits. If it generates a carry, the process is repeated with the one to its immediate left.
writeln(succ("1")); // "2" writeln(succ("9")); // "10" writeln(succ("999")); // "1000" writeln(succ("zz99")); // "aaa00"
Replaces the characters in str which are in from with the corresponding characters in to and returns the resulting string.
tr is based on Posix's tr, though it doesn't do everything that the Posix utility does.
C1[] str | The original string. |
const(C2)[] from | The characters to replace. |
const(C3)[] to | The characters to replace with. |
const(C4)[] modifiers | String containing modifiers. |

Modifier | Description |
---|---|
'c' | Complement the list of characters in from |
'd' | Removes matching characters with no corresponding replacement in to |
's' | Removes adjacent duplicates in the replaced characters |

If the modifier 'd' is present, then the number of characters in to may be only 0 or 1. If the modifier 'd' is not present, and to is empty, then to is taken to be the same as from. If the modifier 'd' is not present, and to is shorter than from, then to is extended by replicating the last character in to. Both from and to may contain ranges using the '-' character (e.g. "a-d" is synonymous with "abcd"). Neither accepts a leading '^' as meaning the complement of the string (use the 'c' modifier for that).
writeln(tr("abcdef", "cd", "CD")); // "abCDef" writeln(tr("1st March, 2018", "March", "MAR", "s")); // "1st MAR, 2018" writeln(tr("abcdef", "ef", "", "d")); // "abcd" writeln(tr("14-Jul-87", "a-zA-Z", " ", "cs")); // " Jul "
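The rules for ranges, the 'd' and 's' modifiers, and the extension of a short to string can each be exercised in isolation. The inputs below are made up for illustration.

```d
import std.string : tr;

void main()
{
    // Ranges: "a-z" and "A-Z" are expanded, mapping lowercase to uppercase.
    assert(tr("hello", "a-z", "A-Z") == "HELLO");

    // 'd': characters in from with no counterpart in to are deleted.
    assert(tr("(555) 1234", "()", "", "d") == "555 1234");

    // 's': runs of the same replaced character are squeezed into one.
    assert(tr("a   b    c", " ", "_", "s") == "a_b_c");

    // to is shorter than from, so its last character is replicated:
    // both ',' and ';' become a space.
    assert(tr("a,b;c", ",;", " ") == "a b c");
}
```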
Takes a string s and determines if it represents a number. This function also takes an optional parameter, bAllowSep, which will accept the separator characters ',' and '_' within the string. These characters should be stripped from the string before using any of the conversion functions such as to!int() or to!float(); otherwise an error will occur.
Also note that no spaces are allowed anywhere within the string, whether leading, trailing, or embedded, so they too must be stripped from the string before using this function or any of the conversion functions.
S s | the string or random access range to check |
bool bAllowSep | accept separator characters or not |
Returns a bool.
assert(isNumeric("123")); assert(isNumeric("123UL")); assert(isNumeric("123L")); assert(isNumeric("+123U")); assert(isNumeric("-123L"));
assert(isNumeric("+123")); assert(isNumeric("-123.01")); assert(isNumeric("123.3e-10f")); assert(isNumeric("123.3e-10fi")); assert(isNumeric("123.3e-10L")); assert(isNumeric("nan")); assert(isNumeric("nani")); assert(isNumeric("-inf"));
assert(isNumeric("-123e-1+456.9e-10Li")); assert(isNumeric("+123e+10+456i")); assert(isNumeric("123+456"));
enum a = isNumeric("123.00E-5+1234.45E-12Li"); enum b = isNumeric("12345xxxx890"); static assert( a); static assert(!b);
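Because isNumeric rejects any whitespace, input read from a user or a file usually needs to be stripped first. A minimal sketch with a made-up input value:

```d
import std.conv : to;
import std.string : isNumeric, strip;

void main()
{
    string raw = " 42 ";

    // Leading and trailing spaces make the check fail.
    assert(!isNumeric(raw));

    // After stripping, the check succeeds and conversion is safe for this value.
    string cleaned = raw.strip;
    assert(isNumeric(cleaned));
    assert(cleaned.to!int == 42);
}
```

Note that isNumeric also accepts forms such as "123UL" or "nan", so a true result does not by itself guarantee that a particular to!T conversion will succeed.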
Soundex algorithm.
The Soundex algorithm converts a word into 4 characters based on how the word sounds phonetically. The idea is that two spellings that sound alike will have the same Soundex value, which means that Soundex can be used for fuzzy matching of names.
Range str | String or InputRange to convert to Soundex representation. |
See also: soundex
writeln(soundexer("Gauss")); // "G200"
writeln(soundexer("Ghosh")); // "G200"
writeln(soundexer("Robert")); // "R163"
writeln(soundexer("Rupert")); // "R163"
writeln(soundexer("0123^&^^**&^")); // ['\0', '\0', '\0', '\0']
Like soundexer
, but with different parameters and return value.
const(char)[] str
| String to convert to Soundex representation. |
char[] buffer
| Optional 4 char array to put the resulting Soundex characters into. If null, the return value buffer will be allocated on the heap. |
soundexer
writeln(soundex("Gauss")); // "G200"
writeln(soundex("Ghosh")); // "G200"
writeln(soundex("Robert")); // "R163"
writeln(soundex("Rupert")); // "R163"
writeln(soundex("0123^&^^**&^")); // null
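As a rough illustration of the fuzzy name matching mentioned above, two names can be compared by their Soundex codes. The helper soundsAlike is hypothetical and only sketches one possible use; it treats inputs with no Soundex code (a null result) as never matching.

```d
import std.string : soundex;

/// Hypothetical helper (not part of std.string): true when both names have
/// a Soundex code and the two codes are equal.
bool soundsAlike(string a, string b)
{
    auto codeA = soundex(a);
    auto codeB = soundex(b);

    // soundex returns null when the input has no encodable letters.
    if (codeA is null || codeB is null)
        return false;
    return codeA == codeB;
}

void main()
{
    assert(soundsAlike("Robert", "Rupert")); // both "R163"
    assert(soundsAlike("Gauss", "Ghosh"));   // both "G200"
    assert(!soundsAlike("Robert", "Gauss"));
    assert(!soundsAlike("123", "456"));      // no letters, no code
}
```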
Construct an associative array consisting of all abbreviations that uniquely map to the strings in values.
This is useful in cases where the user is expected to type in one of a known set of strings, and the program will helpfully auto-complete the string once sufficient characters have been entered that uniquely identify it.
import std.string; static string[] list = [ "food", "foxy" ]; auto abbrevs = abbrev(list); assert(abbrevs == ["fox": "foxy", "food": "food", "foxy": "foxy", "foo": "food"]);
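The auto-completion use case can be sketched like this. The function resolveCommand and its inputs are assumptions made for illustration; only abbrev comes from std.string.

```d
import std.string : abbrev;

/// Hypothetical helper (not part of std.string): expands a possibly
/// abbreviated input to the full command, or returns null when the input
/// is ambiguous or unknown.
string resolveCommand(string input, string[] commands)
{
    // Maps every unambiguous prefix (and each full string) to its command.
    auto table = abbrev(commands);

    if (auto full = input in table)
        return *full;
    return null;
}

void main()
{
    auto commands = ["food", "foxy"];
    assert(resolveCommand("fox", commands) == "foxy");
    assert(resolveCommand("foo", commands) == "food");
    assert(resolveCommand("fo", commands) is null);  // ambiguous prefix
    assert(resolveCommand("bar", commands) is null); // unknown
}
```

A real program would likely build the table once and reuse it rather than calling abbrev on every lookup.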
Compute column number at the end of the printed form of the string, assuming the string starts in the leftmost column, which is numbered starting from 0.
Tab characters are expanded into enough spaces to bring the column number to the next multiple of tabsize. If there are multiple lines in the string, the column number of the last line is returned.
Range str
| string or InputRange to be analyzed |
size_t tabsize
| number of columns a tab character represents |
import std.utf : byChar, byWchar, byDchar;
writeln(column("1234 ")); // 5
writeln(column("1234 "w)); // 5
writeln(column("1234 "d)); // 5
writeln(column("1234 ".byChar())); // 5
writeln(column("1234 "w.byWchar())); // 5
writeln(column("1234 "d.byDchar())); // 5
// Tab stops are set at 8 spaces by default; tab characters insert enough
// spaces to bring the column position to the next multiple of 8.
writeln(column("\t")); // 8
writeln(column("1\t")); // 8
writeln(column("\t1")); // 9
writeln(column("123\t")); // 8
// Other tab widths are possible by specifying it explicitly:
writeln(column("\t", 4)); // 4
writeln(column("1\t", 4)); // 4
writeln(column("\t1", 4)); // 5
writeln(column("123\t", 4)); // 4
// New lines reset the column number.
writeln(column("abc\n")); // 0
writeln(column("abc\n1")); // 1
writeln(column("abcdefg\r1234")); // 4
writeln(column("abc\u20281")); // 1
writeln(column("abc\u20291")); // 1
writeln(column("abc\u00851")); // 1
writeln(column("abc\u00861")); // 5
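One possible use of the tab expansion described above is placing a marker under a position in a line that contains tabs. The helper caretLine is hypothetical, and the sketch assumes offset is a code-unit index that falls on a character boundary.

```d
import std.array : replicate;
import std.string : column;

/// Hypothetical helper (not part of std.string): builds a line of spaces
/// followed by '^' so the caret sits under line[offset] when tabs are
/// expanded to tabsize columns.
string caretLine(string line, size_t offset, size_t tabsize = 8)
{
    // The column at the end of the prefix is the display column of line[offset].
    immutable col = column(line[0 .. offset], tabsize);
    return " ".replicate(col) ~ "^";
}

void main()
{
    // A leading tab occupies 8 columns by default.
    assert(caretLine("\tx = 1", 1) == "        ^");
    // With 4-column tabs, "ab\t" ends at column 4.
    assert(caretLine("ab\tcd", 3, 4) == "    ^");
    assert(caretLine("abc", 0) == "^");
}
```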
Wrap text into a paragraph.
The input text string s is formed into a paragraph by breaking it up into a sequence of lines, delineated by \n, such that the number of columns is not exceeded on each line. The last line is terminated with a \n.
S s
| text string to be wrapped |
size_t columns
| maximum number of columns in the paragraph |
S firstindent
| string used to indent first line of the paragraph |
S indent
| string to use to indent following lines of the paragraph |
size_t tabsize
| column spacing of tabs in firstindent[] and indent[] |
writeln(wrap("a short string", 7)); // "a short\nstring\n"
// wrap will not break inside of a word, but at the next space
writeln(wrap("a short string", 4)); // "a\nshort\nstring\n"
writeln(wrap("a short string", 7, "\t")); // "\ta\nshort\nstring\n"
writeln(wrap("a short string", 7, "\t", " ")); // "\ta\n short\n string\n"
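The firstindent and indent parameters can be combined to produce a hanging indent, for example for a bulleted list item. The helper bulletItem, the sample text, and the column width are illustrative assumptions.

```d
import std.string : wrap;

/// Hypothetical helper (not part of std.string): wraps text as a bullet
/// item whose continuation lines align with the first word.
string bulletItem(string text, size_t columns)
{
    // First line starts with "* "; following lines are indented two spaces.
    return wrap(text, columns, "* ", "  ");
}

void main()
{
    auto item = bulletItem("alpha beta gamma delta", 12);
    assert(item == "* alpha beta\n  gamma\n  delta\n");
}
```

The indent strings count toward the column limit, so the usable text width on each line is columns minus the width of the indent.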
Removes one level of indentation from a multi-line string.
This uniformly outdents the text as much as possible. Whitespace-only lines are always converted to blank lines.
Does not allocate memory if it does not throw.
S str
| multi-line string |
enum pretty = q{
   import std.stdio;
   void main() {
       writeln("Hello");
   }
}.outdent();

enum ugly = q{
import std.stdio;
void main() {
    writeln("Hello");
}
};

writeln(pretty); // ugly
Removes one level of indentation from an array of single-line strings.
This uniformly outdents the text as much as possible. Whitespace-only lines are always converted to blank lines.
S[] lines
| array of single-line strings |
auto str1 = [
    "    void main()\n",
    "    {\n",
    "        test();\n",
    "    }\n"
];
auto str1Expected = [
    "void main()\n",
    "{\n",
    "    test();\n",
    "}\n"
];
writeln(str1.outdent); // str1Expected

auto str2 = [
    "void main()\n",
    "    {\n",
    "    test();\n",
    "    }\n"
];
writeln(str2.outdent); // str2
Assume the given array of integers arr
is a well-formed UTF string and return it typed as a UTF string.
ubyte
becomes char
, ushort
becomes wchar
and uint
becomes dchar
. Type qualifiers are preserved.
When compiled in debug mode, this function performs an extra check to make sure the return value is a valid Unicode string.
T[] arr
| array of bytes, ubytes, shorts, ushorts, ints, or uints |
representation
string a = "Hölo World"; immutable(ubyte)[] b = a.representation; string c = b.assumeUTF; writeln(a); // c
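The type mapping and qualifier preservation described above can be checked at compile time. This is a small sketch using made-up byte values; it assumes the result refers to the same memory as the input rather than a copy, which the last assertion exercises.

```d
import std.string : assumeUTF;

void main()
{
    // ubyte[] becomes char[]; the array stays mutable.
    ubyte[] bytes = [0x48, 0x69]; // "Hi" in UTF-8
    auto text = bytes.assumeUTF;
    static assert(is(typeof(text) == char[]));
    assert(text == "Hi");

    // immutable(ushort)[] becomes immutable(wchar)[], i.e. wstring.
    immutable(ushort)[] units16 = [0x48, 0x69];
    static assert(is(typeof(units16.assumeUTF) == wstring));
    assert(units16.assumeUTF == "Hi"w);

    // const(uint)[] becomes const(dchar)[].
    const(uint)[] units32 = [0x48, 0x69];
    static assert(is(typeof(units32.assumeUTF) == const(dchar)[]));

    // Assumption being checked: the result aliases the input, so a write
    // through one view is visible through the other.
    text[0] = 'h';
    assert(bytes[0] == 'h');
}
```

Because nothing is validated outside debug builds, data from untrusted sources would normally be checked (for example with std.utf.validate) before being handed to code that expects well-formed UTF.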
© 1999–2018 The D Language Foundation
Licensed under the Boost License 1.0.
https://dlang.org/phobos/std_string.html