
Why does PHP trim() not really remove all whitespace?

Posted 2020-08-10 09:30

Question:

I am grabbing input from a file with the following code:

$jap= str_replace("\n","",addslashes(strtolower(trim(fgets($fh), " \t\n\r"))));

I had also previously tried these while troubleshooting:

$jap= str_replace("\n","",addslashes(strtolower(trim(fgets($fh)))));
$jap= addslashes(strtolower(trim(fgets($fh), " \t\n\r")));

When I echo $jap it looks fine, and later in the code it is inserted into the DB without any other alterations. However, I noticed that a comparison that checks whether this jap is already in the DB returned false, even though I can plainly see a seemingly identical entry in the DB. When I copy the inserted jap entry, either from phpMyAdmin or from my site where it is displayed, and paste it into Notepad, it pastes like this (this is an exact paste between the quotes below):

"

バスにのって、うみへ行きました"

Obviously I need it without that whitespace, those line breaks, or whatever it is.

So as far as I can tell, trim is not doing what it says it will do, or I'm missing something here. If so, what is it?

UPDATE: with regard to Jack's answer

The preg_replace did not help, but here is what I did: I used bin2hex() to determine that the part I don't want is efbbbf. I did this by running $jap through str_replace to remove the Japanese I was expecting to find and passing what was left to bin2hex(). The result was the "efbbbf" above.

echo bin2hex(str_replace("どちらがあなたの本ですか","",$jap));

The output of the above was efbbbf, but what is it? Can I use str_replace to remove it somehow?

Answer 1:

The trim function doesn't know about Unicode white spaces. You could try this:

preg_replace('/^\p{Z}+|\p{Z}+$/u', '', $str);

As taken from: Trim unicode whitespace in PHP 5.2

Otherwise, you can do a bin2hex() to find out what characters are being added at the front.
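A minimal sketch of that inspection, assuming the line read from the file starts with three stray bytes. The sample string here is made up for illustration and stands in for what fgets() returned:

```php
<?php
// Hypothetical sample: three unwanted leading bytes followed by text and a
// newline, standing in for the first line returned by fgets().
$line = "\xEF\xBB\xBFabc\n";

// With its default character list, trim() strips only " \t\n\r\0\x0B",
// so the leading bytes survive and only the trailing newline is removed.
$trimmed = trim($line);

// Dump the first three bytes as hex to see what is really there.
$leadingHex = bin2hex(substr($trimmed, 0, 3));
echo $leadingHex, "\n"; // efbbbf
```

Looking at the raw bytes this way is more reliable than echoing the string, since these bytes are invisible in most browsers and editors.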

Update

Your file contains a UTF-8 BOM (the bytes EF BB BF at the very start of the file); to remove it:

$f = fopen("file.txt", "r");
$s = fread($f, 3);
if ($s !== "\xef\xbb\xbf") {
    // bom not found, rewind file
    fseek($f, 0, SEEK_SET);
}
// continue reading here
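If the line has already been read into a string, as with the fgets() call in the question, the same check can be applied to the string itself. This is only a sketch: stripUtf8Bom is a hypothetical helper name, not something from the original answer or from PHP itself.

```php
<?php
// Hypothetical helper: strip a leading UTF-8 BOM from a string that has
// already been read, and leave the string untouched otherwise.
function stripUtf8Bom(string $s): string
{
    $bom = "\xEF\xBB\xBF";
    // strncmp() compares only the first 3 bytes of both strings.
    if (strncmp($s, $bom, 3) === 0) {
        return (string) substr($s, 3);
    }
    return $s;
}

$first = stripUtf8Bom("\xEF\xBB\xBFabc\n");
echo bin2hex($first), "\n"; // 6162630a
```

A plain str_replace("\xEF\xBB\xBF", "", $jap) would also remove the bytes, but it would remove every occurrence rather than only the one at the start. The earlier preg_replace had no effect because the BOM character (U+FEFF) belongs to Unicode category Cf (format), not to the separator category that \p{Z} matches.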


Tags: php