Searching in Strings (GNU Octave (version 8.3.0))

5.3.4 Searching in Strings

Since a string is a character array, comparisons between strings work element by element, as the following example shows:

GNU = "GNU's Not UNIX";
spaces = (GNU == " ")
     ⇒ spaces =
       0   0   0   0   0   1   0   0   0   1   0   0   0   0

To determine if two strings are identical it is necessary to use the strcmp function. It compares complete strings and is case sensitive. strncmp compares only the first N characters (with N given as a parameter). strcmpi and strncmpi are the corresponding functions for case-insensitive comparison.

: tf = strcmp (str1, str2)

Return 1 if the character strings str1 and str2 are the same, and 0 otherwise.

If either str1 or str2 is a cell array of strings, then an array of the same size is returned, containing the values described above for every member of the cell array. The other argument may also be a cell array of strings (of the same size or with only one element), char matrix or character string.
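As an illustration of the behavior described above, the following minimal sketch compares strings of different lengths and a string against a cell array of strings. The variable names are illustrative only. Unlike the element-by-element == operator, which requires operands of compatible size, strcmp accepts strings of any length.

```octave
## Whole-string comparison; the operands may differ in length.
a = "octave";
b = "octave";
c = "matlab!";                     # not the same length as a

same_ab = strcmp (a, b);           # true: the strings are identical
same_ac = strcmp (a, c);           # false, and no size error is raised

## One string compared against every member of a cell array of strings.
names = {"octave", "Octave", "matlab"};
hits = strcmp ("octave", names);   # one result per member of names
```

Here hits has the same size as names, and only the first element is true because the comparison is case sensitive.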

Caution: For compatibility with MATLAB, Octave’s strcmp function returns 1 if the character strings are equal, and 0 otherwise. This is just the opposite of the corresponding C library function.

See also: strcmpi, strncmp, strncmpi.

: tf = strncmp (str1, str2, n)

Return 1 if the first n characters of strings str1 and str2 are the same, and 0 otherwise.

strncmp ("abce", "abcd", 3)
      ⇒ 1

strncmp ("abce", {"abcd", "bca", "abc"}, 3)
     ⇒ [1, 0, 1]

Caution: For compatibility with MATLAB, Octave’s strncmp function returns 1 if the character strings are equal, and 0 otherwise. This is just the opposite of the corresponding C library function.

See also: strncmpi, strcmp, strcmpi.

: tf = strcmpi (str1, str2)

Return 1 if the character strings str1 and str2 are the same, disregarding case of alphabetic characters, and 0 otherwise.
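A short sketch of the case-insensitive comparison, with strcmp shown for contrast. The strings and variable names are made up for illustration, and only ASCII letters are used, in line with the caution about national alphabets.

```octave
tf_nocase = strcmpi ("Hello", "HELLO");   # true: letter case is ignored
tf_case = strcmp ("Hello", "HELLO");      # false: strcmp is case sensitive

## A cell array argument gives one result per member.
exts = strcmpi (".TXT", {".txt", ".Txt", ".ods"});
```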

Caution: National alphabets are not supported.

See also: strcmp, strncmp, strncmpi.

: tf = strncmpi (str1, str2, n)

Return 1 if the first n characters of str1 and str2 are the same, disregarding case of alphabetic characters, and 0 otherwise.
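For illustration, a minimal sketch that compares only a leading prefix while ignoring case. The file names are hypothetical.

```octave
## Only the first 7 characters ("Report_" and "REPORT_") are compared.
tf = strncmpi ("Report_2023.txt", "REPORT_final.txt", 7);

## With a cell array, each member is compared over its first 3 characters.
tfs = strncmpi ("abce", {"ABCD", "BCA", "abc"}, 3);
```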

Caution: For compatibility with MATLAB, Octave’s strncmpi function returns 1 if the character strings are equal, and 0 otherwise. This is just the opposite of the corresponding C library function.

Caution: National alphabets are not supported.

See also: strncmp, strcmp, strcmpi.

Besides these comparison functions, there are more specialized functions to find the index position of a search pattern within a string.

: retval = startsWith (str, pattern)

: retval = startsWith (str, pattern, "IgnoreCase", ignore_case)

Check whether string(s) start with pattern(s).

Return an array of logical values that indicates which string(s) in the input str (a single string or cell array of strings) begin with the input pattern (a single string or cell array of strings).

If the value of the parameter "IgnoreCase" is true, then the function will ignore the letter case of str and pattern. By default, the comparison is case sensitive.

Examples:

## one string and one pattern while considering case
startsWith ("hello", "he")
      ⇒  1

## one string and one pattern while ignoring case
startsWith ("hello", "HE", "IgnoreCase", true)
      ⇒  1

## multiple strings and multiple patterns while considering case
startsWith ({"lab work.pptx", "data.txt", "foundations.ppt"},
            {"lab", "data"})
      ⇒  1  1  0

## multiple strings and one pattern while considering case
startsWith ({"DATASHEET.ods", "data.txt", "foundations.ppt"},
            "data", "IgnoreCase", false)
      ⇒  0  1  0

## multiple strings and one pattern while ignoring case
startsWith ({"DATASHEET.ods", "data.txt", "foundations.ppt"},
            "data", "IgnoreCase", true)
      ⇒  1  1  0

See also: endsWith, regexp, strncmp, strncmpi.

: retval = endsWith (str, pattern)

: retval = endsWith (str, pattern, "IgnoreCase", ignore_case)

Check whether string(s) end with pattern(s).

Return an array of logical values that indicates which string(s) in the input str (a single string or cell array of strings) end with the input pattern (a single string or cell array of strings).

If the value of the parameter "IgnoreCase" is true, then the function will ignore the letter case of str and pattern. By default, the comparison is case sensitive.

Examples:

## one string and one pattern while considering case
endsWith ("hello", "lo")
      ⇒  1

## one string and one pattern while ignoring case
endsWith ("hello", "LO", "IgnoreCase", true)
      ⇒  1

## multiple strings and multiple patterns while considering case
endsWith ({"tests.txt", "mydoc.odt", "myFunc.m", "results.pptx"},
          {".docx", ".odt", ".txt"})
      ⇒  1  1  0  0

## multiple strings and one pattern while considering case
endsWith ({"TESTS.TXT", "mydoc.odt", "result.txt", "myFunc.m"},
          ".txt", "IgnoreCase", false)
      ⇒  0  0  1  0

## multiple strings and one pattern while ignoring case
endsWith ({"TESTS.TXT", "mydoc.odt", "result.txt", "myFunc.m"},
          ".txt", "IgnoreCase", true)
      ⇒  1  0  1  0
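
Because the result is a logical array, it can be used directly as an index. The following sketch selects the matching entries from a cell array; the variable names are illustrative.

```octave
files = {"TESTS.TXT", "mydoc.odt", "result.txt", "myFunc.m"};

## Logical mask of the entries ending in ".txt", ignoring case.
is_txt = endsWith (files, ".txt", "IgnoreCase", true);

## Keep only the matching file names.
txt_files = files(is_txt);
```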

See also: startsWith, regexp, strncmp, strncmpi.

: v = findstr (s, t)

: v = findstr (s, t, overlap)

This function is obsolete. Use strfind instead.

Return the vector of all positions in the longer of the two strings s and t where an occurrence of the shorter of the two starts.

If the optional argument overlap is true (default), the returned vector can include overlapping positions. For example:

findstr ("ababab", "a")
     ⇒ [1, 3, 5]
findstr ("abababa", "aba", 0)
     ⇒ [1, 5]

Caution: findstr is obsolete. Use strfind in all new code.
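
As a sketch of the migration, the two findstr calling forms can be written with strfind and its "overlaps" option. Note one difference: findstr searches in the longer of its two arguments, whereas strfind always searches for pattern in str, so the argument order matters.

```octave
## Counterpart of findstr ("abababa", "aba"): overlapping matches allowed.
pos_all = strfind ("abababa", "aba");

## Counterpart of findstr ("abababa", "aba", 0): no overlapping matches.
pos_sep = strfind ("abababa", "aba", "overlaps", false);
```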

See also: strfind, strmatch, strcmp, strncmp, strcmpi, strncmpi, find.

: idx = strchr (str, chars)

: idx = strchr (str, chars, n)

: idx = strchr (str, chars, n, direction)

: [i, j] = strchr (…)

Search through the string str for occurrences of characters from the set chars.

The return value(s), as well as the n and direction arguments, behave as they do in find.

This will be faster than using regexp in most cases.
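
A minimal sketch of the three calling forms, using an arbitrary example string. As with find, n limits the number of results and direction selects which end of the string the search starts from.

```octave
str = "hello world";

## Every position that holds an "l" or an "o".
idx_all = strchr (str, "lo");

## Only the first two such positions.
idx_first2 = strchr (str, "lo", 2);

## Only the last such position.
idx_last = strchr (str, "lo", 1, "last");
```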

See also: find.

: n = index (s, t)

: n = index (s, t, direction)

Return the position of the first occurrence of the string t in the string s, or 0 if no occurrence is found.

s may also be a string array or cell array of strings.

For example:

index ("Teststring", "t")
    ⇒ 4

If direction is "first", return the first element found. If direction is "last", return the last element found.
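
The effect of direction, and the return value when there is no match, can be sketched as follows. The variable names are illustrative.

```octave
first_t = index ("Teststring", "t");           # 4: "T" does not match "t"
last_t = index ("Teststring", "t", "last");    # 6: the final "t"
no_hit = index ("Teststring", "x");            # 0: no occurrence found
```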

See also: find, rindex.

: n = rindex (s, t)

Return the position of the last occurrence of the character string t in the character string s, or 0 if no occurrence is found.

s may also be a string array or cell array of strings.

For example:

rindex ("Teststring", "t")
     ⇒ 6

The rindex function is equivalent to index with direction set to "last".

See also: find, index.

: idx = unicode_idx (str)

Return an array with the indices for each UTF-8 encoded character in str.

unicode_idx ("aäbc")
     ⇒ [1, 2, 2, 3, 4]
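
One possible use, sketched under the assumption that str holds valid UTF-8, is to count characters rather than bytes. The string is built from explicit byte values here so that the example does not depend on the encoding of the source file.

```octave
## UTF-8 bytes of "aäbc": the "ä" occupies two bytes (195, 164).
str = char ([97, 195, 164, 98, 99]);

idx = unicode_idx (str);

n_bytes = numel (str);           # number of elements in the char array
n_chars = max (idx);             # number of UTF-8 encoded characters

## All bytes that belong to the second character.
second_char = str(idx == 2);
```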

: idx = strfind (str, pattern)

: idx = strfind (cellstr, pattern)

: idx = strfind (…, "overlaps", val)

: idx = strfind (…, "forcecelloutput", val)

Search for pattern in the string str and return the starting index of every such occurrence in the vector idx.

If there is no such occurrence, or if pattern is longer than str, or if pattern itself is empty, then idx is the empty array [].

The optional argument "overlaps" determines whether the pattern can match at every position in str (true), or only for unique occurrences of the complete pattern (false). The default is true.

If a cell array of strings cellstr is specified then idx is a cell array of vectors, as specified above.

The optional argument "forcecelloutput" forces idx to be returned as a cell array of vectors. The default is false.

Examples:

strfind ("abababa", "aba")
     ⇒ [1, 3, 5]

strfind ("abababa", "aba", "overlaps", false)
     ⇒ [1, 5]

strfind ({"abababa", "bebebe", "ab"}, "aba")
     ⇒
        {
          [1,1] =

             1   3   5

          [1,2] = [](1x0)
          [1,3] = [](1x0)
        }

strfind ("abababa", "aba", "forcecelloutput", true)
     ⇒
        {
          [1,1] =

             1   3   5
        }
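
Two common uses of the returned indices, sketched with illustrative variable names: counting occurrences, and testing which members of a cell array contain the pattern at all.

```octave
s = "abababa";

## Number of occurrences, with and without overlapping matches.
count_overlap = numel (strfind (s, "aba"));
count_distinct = numel (strfind (s, "aba", "overlaps", false));

## A member contains the pattern when its index vector is not empty.
words = {"abababa", "bebebe", "ab"};
contains_aba = ! cellfun (@isempty, strfind (words, "aba"));
```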

See also: regexp, regexpi, find.

: idx = strmatch (s, A)

: idx = strmatch (s, A, "exact")

This function is obsolete. Use an alternative such as strncmp or strcmp instead.

Return indices of entries of A which begin with the string s.

The second argument A must be a string, character matrix, or a cell array of strings.

If the third argument "exact" is not given, then s only needs to match A up to the length of s. Trailing spaces and nulls in s and A are ignored when matching.

For example:

strmatch ("apple", "apple juice")
     ⇒ 1

strmatch ("apple", ["apple  "; "apple juice"; "an apple"])
     ⇒ [1; 2]

strmatch ("apple", ["apple  "; "apple juice"; "an apple"], "exact")
     ⇒ [1]

Caution: strmatch is obsolete (and can produce incorrect results in MATLAB when used with cell arrays of strings). Use strncmp (normal case) or strcmp ("exact" case) in all new code. Other replacement possibilities, depending on application, include regexp or validatestring.
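
A minimal sketch of the suggested replacements for a cell array of strings, with illustrative variable names. It is not an exact drop-in: unlike strmatch, strncmp and strcmp do not ignore trailing spaces or nulls, and find returns the indices here as a row vector.

```octave
A = {"apple", "apple juice", "an apple"};
s = "apple";

## Normal case: entries of A that begin with s.
idx_prefix = find (strncmp (s, A, numel (s)));

## "exact" case: entries of A that are identical to s.
idx_exact = find (strcmp (s, A));
```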

See also: strncmp, strcmp, regexp, strfind, validatestring.