UTF8 error after pg_escape – Postgresql and PHP

I have the following piece of code that reads a txt file and imports it to a postgresql db. I used pg_escape_string for all the text fields that could possibly contain special characters. Then I import it to the db. All successful. Later when I run a select query on the same data, I get a UTF8 error. I checked the hex for the error code and it is a character that is nowhere in my data. When I manually remove all the ‘ in the text file and run the import and select query again, all is working fine.

I have checked stack overflow for similar errors but could not find an answer that specifically works with the pg_escape_string and then get this error.

Any advice on how to run the import without manually removing ‘ and not getting the error please?

$val9 = pg_escape_string(trim($meterArr[$row][9]));
$val10 = pg_escape_string(trim($meterArr[$row][10]));
INSERT INTO "Meters"("AgentCode", "BuildingNr", "MeterNr", "Premises", "SortCode", "PrevReading", "DatePrev", "ConsumptionPrev", "Variance", "MeterName", "CurrOccupant", "MeterType", "MeterNoR","currReading","lastReadingDate","percConsumption","dailyAvg")
                VALUES ('$val0', '$val1', '$val2', '$val3', '$val4', '$val5', '$val6', '$val7', '$val8', '$val9', '$val10', '$val11', '$val12','$val13','$val14','$val15','$val16')
                ON CONFLICT ("AgentCode","BuildingNr","MeterNr") DO UPDATE SET 
                "Premises" = '$val3',
                "SortCode" = '$val4',
                "PrevReading" = '$val5',
                "DatePrev" = '$val6',
                "ConsumptionPrev" = '$val7',
                "Variance" = '$val8',
                "MeterName" = '$val9',
                "CurrOccupant" = '$val10',
                "MeterType" = '$val11',
                "MeterNoR" = '$val12',
                "currReading" = '$val13',
                "lastReadingDate" = '$val14',
                "percConsumption"='$val15',
                "dailyAvg"='$val16';

This is just a snippet of the code.

I tried manual removal of special characters and I tried pg_escape_string. With manual removal everything is working but with pg_escape string it isn’t.

Here is the output I get with urlencode(pg_escape_string($data[5])) with the problematic data.
Bergen%27%27s+Appliance+R

And then here is the actual data row from the txt file.
” 728″,” 298″,” 12″,” 6998″,”00008″,”Bergen’s Appliance R”,” 62066.00″,” 62066.00″,” .00″,” .00″,” .00″,”2019- 1-25″,”25- 1-2019″,” .00″,” .00″,”N”,”N”,”N”,”N”,”E”,”N”,”N”,” .00″,” .00″,” 0″,”N”,” “,” 1″,”Bergen’s Appliance R “,”Bergen’s Appliance R”,”?? “,” 0″

  • It makes little sense that the presence or absence of a ' should actually make any difference in this regard. Could it be that those are not actually ' – but something similarly looking, like an actual backtick ` or a ´? Problems with those would be to be expected, if your input text file is not actually encoded in proper UTF-8.

    – 

  • I thought about this but I opened the txt file in VS Code and specifically searched for the apostrophe when I removed it.

    – 

  • So what does pg_escape_string return, for one of those problematic values? (Please apply something like urlencode when you make a debug output, so that we can see what actual byte values are in there, even for “invisible” characters.)

    – 

  • 2

    Never used this function, I always found it much easier and clearer to use the function pg_query_params(). Clean PHP code, clean SQL, and no SQL injection: php.net/pg_query_params

    – 

  • 2

    It is so much easier to read code and data if you place it into the question.

    – 

Leave a Comment