What is an ideal way to detect whether a character is uppercase or lowercase, regardless of the current locale?
Is there a more direct function?
Assumptions: the internal character encoding is set to UTF-8, the local browser session is en-US,en;q=0.5, and the Multibyte String extension is installed. Do not use ctype_lower or ctype_upper.
See the test code below, which should be multibyte compatible.
$character = 'A'; // sample input; replace with the character under test
$encodingtype = 'UTF-8';
// Code point of the input and of its lowercase and uppercase forms
$charactervalue = mb_ord($character, $encodingtype);
$characterlowercase = mb_strtolower($character, $encodingtype);
$characterlowercasevalue = mb_ord($characterlowercase, $encodingtype);
$characteruppercase = mb_strtoupper($character, $encodingtype);
$characteruppercasevalue = mb_ord($characteruppercase, $encodingtype);
// Diag Info
echo 'Input: ' . $character . "<br />";
echo 'Input Value: ' . $charactervalue . "<br />" . "<br />";
echo 'Lowercase: ' . $characterlowercase . "<br />";
echo 'Lowercase Value: ' . $characterlowercasevalue . "<br />" . "<br />";
echo 'Uppercase: ' . $characteruppercase . "<br />";
echo 'Uppercase Value: ' . $characteruppercasevalue . "<br />" . "<br />";
// Diag Info
// Lowercase: unchanged by mb_strtolower() but changed by mb_strtoupper()
if ($charactervalue == $characterlowercasevalue and $charactervalue != $characteruppercasevalue) {
    $uppercase = 0;
    $lowercase = 1;
    echo 'Character is lowercase' . "<br />" . "<br />";
}
// Uppercase: unchanged by mb_strtoupper() but changed by mb_strtolower()
elseif ($charactervalue == $characteruppercasevalue and $charactervalue != $characterlowercasevalue) {
    $uppercase = 1;
    $lowercase = 0;
    echo 'Character is uppercase' . "<br />" . "<br />";
}
// Neither conversion changes the character, so it has no case
else {
    $uppercase = 0;
    $lowercase = 0;
    echo 'Character is neither lowercase or uppercase' . "<br />" . "<br />";
}
- // Test 1 A // Output-> Character is uppercase
- // Test 2 z // Output-> Character is lowercase
- // Test 3 + // Output-> Character is lowercase
- // Test 4 0 // Output-> Character is neither lowercase or uppercase
- // Test 5 ǻ // LATIN SMALL LETTER A WITH RING ABOVE AND ACUTE // Output-> Character is lowercase
- // Test 6 Ͱ // GREEK CAPITAL LETTER HETA // Output-> Character is uppercase
- // Test 7 '' NULL // Output-> Character is neither lowercase or uppercase
I feel the most direct way would be to write a regex pattern to determine the character type.
In the following snippet, I'll search for uppercase letters (including unicode) in the first capture group, or lowercase letters in the second capture group. If the pattern makes no match, the character is not a letter.
A good reference for unicode letters in regex: https://regular-expressions.mobi/unicode.html
Writing two capture groups separated by a pipe means each type of letter will be slotted into a different indexed element in the output array.
[0] is the fullstring match (never used in this case, but its generation is unavoidable).
[1] will hold the uppercase match (or be empty when there is a lowercase match -- as a placeholding element).
[2] will hold the lowercase match -- it will only be generated if there is a lowercase match.
For this reason, we can assume the highest key in the matches array will determine the casing of the letter.
If the input character is a non-letter, preg_match() will return the falsey result of 0 to represent the number of matches; when this happens, 0 is used with the lookup to access neither.
Code: (Demo) (Pattern Demo)
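The original snippet is not reproduced in this copy, so what follows is only a minimal sketch of the approach as described, not the author's exact code. The function name letterCase, the labels in $lookup, and the exact pattern are assumptions; array_key_last() requires PHP 7.3 or later, and the input is assumed to be a single character.

```php
<?php
// Hypothetical helper (name and labels are illustrative assumptions).
function letterCase(string $character): string
{
    // Index 0 = no match, 1 = uppercase group, 2 = lowercase group.
    $lookup = ['neither', 'upper', 'lower'];

    // \p{Lu} = Unicode uppercase letter, \p{Ll} = Unicode lowercase letter.
    // The u modifier makes PCRE treat the pattern and subject as UTF-8.
    $found = preg_match('~(\p{Lu})|(\p{Ll})~u', $character, $matches);

    // preg_match() returns 0 (no match) or false (error); both map to index 0.
    // Otherwise the highest key in $matches shows which group matched.
    $index = $found ? array_key_last($matches) : 0;

    return $lookup[$index];
}

// Classify the sample characters from the question.
foreach (['A', 'z', '+', '0', 'ǻ', 'Ͱ', ''] as $test) {
    echo "'$test': " . letterCase($test) . PHP_EOL;
}
```

Because the pattern is not anchored, a longer string would be classified by the first letter found in it, so this sketch is only meaningful for single-character input.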
Output:
For anyone who is not yet on php7.3, you can call end() then key() like this:
Code: (Demo)
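As above, the original code is missing from this copy. A rough sketch of the end() then key() variant might look like the following; the function name letterCaseLegacy and the lookup labels are assumptions made for illustration.

```php
<?php
// Hypothetical helper for PHP versions before 7.3 (no array_key_last()).
function letterCaseLegacy(string $character): string
{
    // Index 0 = no match, 1 = uppercase group, 2 = lowercase group.
    $lookup = ['neither', 'upper', 'lower'];
    $index = 0;

    if (preg_match('~(\p{Lu})|(\p{Ll})~u', $character, $matches)) {
        end($matches);          // move the internal pointer to the last element
        $index = key($matches); // read the key at that position
    }

    return $lookup[$index];
}

// Classify a few of the sample characters from the question.
foreach (['A', 'z', '+', 'ǻ', 'Ͱ'] as $test) {
    echo "'$test': " . letterCaseLegacy($test) . PHP_EOL;
}
```

end() takes its argument by reference, which is why $matches has to be a variable rather than an inline expression.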
My first approach makes a minimum of one function call per test, and a maximum of two calls. My solution can be made into a one-liner by writing the preg_ call inside of $lookup[ and ], but I'm aiming for readability.
p.s. Here is another variation that I dreamed up. The difference is that preg_match() always makes a match because of the final empty "alternative" (empty branch).
Update:
Change all ctype_ functions to direct comparisons... and add support for Cyrillic characters...
I like to use a function:
but you need to have a dictionary of special characters; and it does not depend on the language setting...
Update
more short: with gateres List of character accent:
Added and Style function: