Functions
	utf8_strlen ($string)
	Returns the number of Unicode code points in a string.

	utf8_substr ($string, $offset, $length=null)
	Returns part of a string given character offset and optionally length.

	utf8_strtolower ($string)
	Makes a string lowercase.

	utf8_strtoupper ($string)
	Makes a string uppercase.

	utf8_strpos ($haystack, $needle, $offset=0)
	Finds position of first occurrence of a string within another, case sensitive.

	utf8_stripos ($haystack, $needle, $offset=0)
	Finds position of first occurrence of a string within another, case insensitive.

	utf8_ucfirst ($string)
	Makes a string's first character uppercase.

	utf8_is_valid ($string)
	Tests whether a string is valid UTF-8 and supported by the Unicode standard.

	utf8_bad_replace ($string, $replace='?')
	Replaces bad bytes with an alternative character; an ASCII character is recommended as the replacement.

Detailed Description

UTF-8 related string functions.

Author: Harry Fuecks <hfuecks@gmail.com>; The CMSimple_XH developers <devs@cmsimple-xh.org>

Copyright: 2006-2007 Harry Fuecks; 2009-2023 The CMSimple_XH developers https://www.cmsimple-xh.org/?About-CMSimple_XH/The-XH-Team; GNU GPLv3 http://www.gnu.org/licenses/gpl-3.0.en.html

Function Documentation

◆ utf8_bad_replace()

utf8_bad_replace ( $string, $replace = '?' )

Replaces bad bytes with an alternative character; an ASCII character is recommended as the replacement.

Uses a PCRE pattern to locate bad bytes in a UTF-8 string. The pattern comes from the W3C FAQ: Multilingual Forms.

Note: the pattern was modified to include the full ASCII range, including control characters.

Parameters

string	$string	A string to search.
string	$replace	A string to replace bad bytes with - use ASCII.

Returns: string

See also: http://www.w3.org/International/questions/qa-forms-utf-8
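
The approach described above can be sketched as follows. This is a minimal illustration, not the library's own code: the name sketch_bad_replace is hypothetical, and the pattern is the W3C FAQ pattern with the ASCII range widened to \x00-\x7F as the note describes. Each byte that does not belong to a well-formed UTF-8 sequence is replaced individually.

```php
<?php
// Hypothetical helper for illustration only; not part of the documented API.
function sketch_bad_replace(string $string, string $replace = '?'): string
{
    $pattern = '/'
        . '([\x00-\x7F]'                        // ASCII, including control chars
        . '|[\xC2-\xDF][\x80-\xBF]'             // non-overlong 2-byte
        . '|\xE0[\xA0-\xBF][\x80-\xBF]'         // 3-byte, excluding overlongs
        . '|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}'  // straight 3-byte
        . '|\xED[\x80-\x9F][\x80-\xBF]'         // 3-byte, excluding surrogates
        . '|\xF0[\x90-\xBF][\x80-\xBF]{2}'      // planes 1-3
        . '|[\xF1-\xF3][\x80-\xBF]{3}'          // planes 4-15
        . '|\xF4[\x80-\x8F][\x80-\xBF]{2})'     // plane 16
        . '|(.)'                                // anything else: one bad byte
        . '/s';                                 // no "u" flag: match raw bytes

    return preg_replace_callback(
        $pattern,
        function (array $m) use ($replace) {
            // Group 2 is only set when the fallback branch matched a bad byte.
            return isset($m[2]) ? $replace : $m[0];
        },
        $string
    );
}
```

With this sketch, sketch_bad_replace("a\xFFb") yields "a?b", while a well-formed string is returned unchanged.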

◆ utf8_is_valid()

utf8_is_valid ( $string )

Tests whether a string is valid UTF-8 and supported by the Unicode standard.

Parameters

string $string A UTF-8 encoded string.

Returns: boolean
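
A comparable check can be sketched with PHP's PCRE extension. This is an assumption about one possible technique, not a description of how utf8_is_valid() is implemented; the name sketch_is_valid is hypothetical.

```php
<?php
// Hypothetical helper for illustration only; not part of the documented API.
function sketch_is_valid(string $string): bool
{
    // With the "u" modifier, PCRE checks that the subject is well-formed
    // UTF-8 before matching. preg_match() returns false (not 0) when that
    // check fails, and 1 when the empty pattern matches.
    return preg_match('//u', $string) === 1;
}
```

Under this sketch, overlong encodings such as "\xC0\xAF" and encoded surrogates such as "\xED\xA0\x80" are rejected along with stray or truncated bytes.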

◆ utf8_stripos()

utf8_stripos ( $haystack, $needle, $offset = 0 )

Finds position of first occurrence of a string within another, case insensitive.

Returns false if needle is not found.

Parameters

string	$haystack	A haystack.
string	$needle	A needle.
int	$offset	An offset in Unicode code points.

Returns: int|false

◆ utf8_strlen()

utf8_strlen ( $string )

Returns the number of Unicode code points in a string.

Note: this function does not count bad bytes in the string - these are simply ignored.

Parameters

string $string A UTF-8 encoded string.

Returns: int
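
The difference between byte length and code point count, and the effect of ignoring bad bytes, can be illustrated with a short sketch. The name sketch_strlen is hypothetical and the technique (counting well-formed UTF-8 sequences with a byte-level PCRE pattern) is an assumption, not necessarily what utf8_strlen() does internally.

```php
<?php
// Hypothetical helper for illustration only; not part of the documented API.
function sketch_strlen(string $string): int
{
    // Each alternative matches exactly one well-formed UTF-8 sequence.
    // Bytes that fit none of the alternatives are skipped, so they are
    // not counted.
    $valid = '/[\x00-\x7F]'
        . '|[\xC2-\xDF][\x80-\xBF]'
        . '|\xE0[\xA0-\xBF][\x80-\xBF]'
        . '|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}'
        . '|\xED[\x80-\x9F][\x80-\xBF]'
        . '|\xF0[\x90-\xBF][\x80-\xBF]{2}'
        . '|[\xF1-\xF3][\x80-\xBF]{3}'
        . '|\xF4[\x80-\x8F][\x80-\xBF]{2}/';

    return (int) preg_match_all($valid, $string);
}
```

For the string "Gr\xC3\xBC\xC3\x9Fe" ("Grüße"), PHP's strlen() reports 7 bytes, whereas this sketch reports 5 code points; for "a\xFFb" it reports 2, because the stray byte is ignored.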

◆ utf8_strpos()

utf8_strpos ( $haystack, $needle, $offset = 0 )

Finds position of first occurrence of a string within another, case sensitive.

Returns false if needle is not found.

Parameters

string	$haystack	A haystack.
string	$needle	A needle.
int	$offset	An offset in Unicode code points.

Returns: int|false
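
Because both the offset and the result are measured in code points rather than bytes, a position search has to translate between the two. The following is a minimal sketch of that translation, assuming valid UTF-8 input, a non-empty needle and a non-negative offset; sketch_strpos is a hypothetical name and not the library's implementation.

```php
<?php
// Hypothetical helper for illustration only; not part of the documented API.
// Assumes valid UTF-8 strings, a non-empty $needle and $offset >= 0.
function sketch_strpos(string $haystack, string $needle, int $offset = 0)
{
    // Translate the code point offset into a byte offset by matching
    // exactly $offset characters from the start of the haystack.
    if (!preg_match('/^.{' . $offset . '}/su', $haystack, $m)) {
        return false; // offset lies beyond the end of the haystack
    }

    // Byte-wise search; UTF-8 is self-synchronizing, so a match of a valid
    // needle in a valid haystack always starts on a character boundary.
    $bytePos = strpos($haystack, $needle, strlen($m[0]));
    if ($bytePos === false) {
        return false;
    }

    // Count the code points that precede the match.
    return preg_match_all('/./su', substr($haystack, 0, $bytePos));
}
```

For the haystack "na\xC3\xAFve caf\xC3\xA9" ("naïve café"), PHP's byte-oriented strpos() finds "caf" at 7, whereas this sketch returns 6. As with strpos(), the result should be compared using === false, since 0 is a valid position.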

◆ utf8_strtolower()

utf8_strtolower ( $string )

Makes a string lowercase.

Note: The concept of a character's "case" exists only in some alphabets, such as Latin, Greek, Cyrillic, Armenian and archaic Georgian; it does not exist in the Chinese writing system, for example. See Unicode Standard Annex #21: Case Mappings.

Parameters

string $string A UTF-8 encoded string.

Returns: string

◆ utf8_strtoupper()

utf8_strtoupper ( $string )

Makes a string uppercase.

Note: The concept of a character's "case" exists only in some alphabets, such as Latin, Greek, Cyrillic, Armenian and archaic Georgian; it does not exist in the Chinese writing system, for example. See Unicode Standard Annex #21: Case Mappings.

Parameters

string $string A UTF-8 encoded string.

Returns: string

◆ utf8_substr()

utf8_substr ( $string, $offset, $length = null )

Returns part of a string given character offset and optionally length.

Parameters

string	$string	A UTF-8 encoded string.
int	$offset	An offset in Unicode code points.
int	$length	A length in Unicode code points, counted from the offset.

Returns: string
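
Extracting by code point rather than by byte can be sketched with a PCRE pattern in UTF-8 mode. This sketch handles only non-negative offsets and lengths and assumes valid UTF-8 input; the name sketch_substr is hypothetical, and returning an empty string for an out-of-range offset is an assumption, not documented behavior of utf8_substr().

```php
<?php
// Hypothetical helper for illustration only; not part of the documented API.
// Assumes valid UTF-8 input, $offset >= 0 and $length >= 0 (or null).
function sketch_substr(string $string, int $offset, ?int $length = null): string
{
    // Take everything after the offset, or at most $length characters.
    $take = $length === null ? '.*' : '.{0,' . $length . '}';

    // Skip exactly $offset characters, then capture the requested part.
    if (!preg_match('/^.{' . $offset . '}(' . $take . ')/su', $string, $m)) {
        return ''; // offset lies beyond the end of the string
    }

    return $m[1];
}
```

For "na\xC3\xAFve caf\xC3\xA9" ("naïve café"), sketch_substr($s, 2, 3) returns "ïve", whereas the byte-oriented substr($s, 2, 3) returns "ïv" because "ï" occupies two bytes.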

◆ utf8_ucfirst()

utf8_ucfirst ( $string )

Makes a string's first character uppercase.

Parameters

string $string A UTF-8 encoded string.

Returns: string
