First, please read the whole article, then do whatever you want. Do not run any command or take any action before reading all of it. After changing hosts or switching to MariaDB, weird characters may appear in your posts. Here are the details of removing non-UTF-8 characters like Â from WordPress. First, you must understand what UTF-8 is: it is the standard character encoding for websites. The initial versions of WordPress created their databases with the latin1 character set and the latin1_swedish_ci collation. WordPress started as a fork of another free software project.
Weird Latin Character Removal From WordPress : You Did It Versus a Bug Did It
Some of these characters have a MEANING. Some have none (they stand for a blank space). Â and â€‹ have no meaning; Â is usually what is left of a non-breaking space. When they appear without any reason, it is basically not your error. Such characters simply need removal. Meaningful characters, however, need replacement.
    â€“ means – (en dash, U+2013)
    â€¢ means • (bullet, or a plain - if you prefer)
    â€¦ means …
    â€œ means “
    â€ means ”
    â€˜ means ‘
    â€™ means ’
    â€” means the em dash (U+2014)
Sequences of the â€“ type mean that either you did it or some software did it. Finding the reason may be important in such a case, as it can be a symptom pointing towards a breach in security. Characters like Â and â€‹ are innocent by comparison.
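How such sequences come about can be reproduced in a few lines. The following is a minimal Python sketch, assuming the garbling happens when UTF-8 bytes are read as Windows-1252 (cp1252); that is the usual cause, but not necessarily what happened on your site. The function name `garble` and the sample list are made up for illustration.

```python
# Sketch: reproduce the garbling by reading UTF-8 bytes as Windows-1252.
# Assumption: cp1252 is the wrong encoding involved; bytes that cp1252
# does not define (such as 0x9D) are simply dropped here.
def garble(text: str) -> str:
    """Encode as UTF-8, then wrongly decode the bytes as cp1252."""
    return text.encode("utf-8").decode("cp1252", errors="ignore")


samples = {
    "en dash": "\u2013",
    "right double quote": "\u201d",
    "right single quote": "\u2019",
    "non-breaking space": "\u00a0",
}

for name, char in samples.items():
    # repr() makes invisible leftovers (like the non-breaking space) visible
    print(f"{name}: {garble(char)!r}")
```

Running it on your own suspicious characters is a quick way to check whether a sequence has a meaning or is just leftover debris.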
---
There is a very difficult article in the WordPress documentation:
https://codex.wordpress.org/Converting_Database_Character_Sets
Weird Latin Character Removal From WordPress : Commenting Out 2 Lines From wp-config Is Not an Easy Matter
The most commonly advised method to get rid of the weird Latin characters in old posts is simply to comment out two lines (the 2nd and 4th lines in the snippet below) in the wp-config.php file:
    /** Database Charset to use in creating database tables. */
    define('DB_CHARSET', 'utf8');
    /** The Database Collate type. Don't change this if in doubt. */
    define('DB_COLLATE', '');
As you can see, two things are involved: the collation and the character set (UTF-8). The file itself even warns, “Don’t change this if in doubt.” Yes, the change does affect the whole database, like a sci-fi story. The wp-config.php file is not exactly an easy thing. Such a change may empty all the fields of a plugin that you used to customize the theme; Genesis Simple Hooks is one plugin that gets affected. The collation part is not easy.
You can test this by simply commenting out these two lines in the wp-config.php file, but you must have a full database backup first.
It is better to create a second database from the backup of your database with the weird characters, using SSH or whatever you use to administer the backend. In case of devastation, first turn the two commented-out lines in wp-config.php back into working lines, and then use the backup database's details. It is very easy to take a backup and create a new database over SSH:
    # for localhost database
    mysqldump -h localhost -u root -p yourdatabasename > backedupdatabase.sql
    # for database over network/other server; 10.0.0.91 is an example IP
    mysqldump -h 10.0.0.91 -u root -p yourdatabasename > backedupdatabase.sql
    # login for localhost mysql/mariadb
    mysql -u root -h localhost -p
    # username is the database username
    mysql -u username -h 10.0.0.91 -p
    CREATE DATABASE backUPdb2015;
    use backUPdb2015;
    \. backedupdatabase.sql
    exit;
WE ARE CLEARLY AGAINST THIS METHOD OF COMMENTING OUT. We cannot see or control what is happening. The problem is only with the content of posts and pages.
Weird Latin Character Removal From WordPress : Checking php.ini and Nginx declaration
In the typical case, the HTML source of the faulty web page lacks the UTF-8 declaration. If so, you should set the default character set to UTF-8 in the php.ini file (search for default_charset in the php.ini of both cli and fpm) and in Nginx's nginx.conf file. We tested these steps and, for WordPress, they are probably a waste of time. The problem is more difficult in nature.
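For completeness, checking whether a page declares a character set at all can be sketched as below. This is only a rough Python illustration that looks for a meta tag in HTML you have already saved or fetched; the function name `declared_charset` is hypothetical, and the check does not look at HTTP headers, which may also carry the declaration.

```python
import re

# Matches both <meta charset="utf-8"> and the older
# <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> form.
META_CHARSET = re.compile(r'<meta[^>]+charset=["\']?([\w-]+)', re.IGNORECASE)


def declared_charset(html: str):
    """Return the charset named in a meta tag, lowercased, or None if absent."""
    match = META_CHARSET.search(html)
    return match.group(1).lower() if match else None


page = '<html><head><meta charset="UTF-8"><title>Test</title></head></html>'
print(declared_charset(page))
```

A result of None only says that the HTML itself carries no declaration; as noted above, adding one is unlikely to repair characters that are already stored wrongly in the database.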
Weird Latin Character Removal From WordPress : Best is Running Query on MySQL/MariaDB Database
First you need to find out which characters are appearing. You can search your own website with Google: copy the funny character from the post where you noticed it and search for it. It is practical. Yes, Google may not crawl everything in one day, but it is the best way. You may allow more days to find more funny characters. When we say to run:
    UPDATE wp_posts SET post_content = REPLACE(post_content, 'Â', '');
it means: replace Â with nothing. But if you have â€, then you should run:
    UPDATE wp_posts SET post_content = REPLACE(post_content, 'â€', '”');
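One detail is easy to miss: â€ is also the beginning of longer sequences such as â€œ and â€™, so the order of the replacements matters. The following Python sketch, working on a plain string rather than on the database, shows one way to handle it by replacing the longest sequences first. The mapping is an illustrative subset and the function name `clean` is made up; treat it as an assumption-laden example, not a complete list.

```python
# Sketch: apply several replacements, longest garbled sequence first,
# so that the short prefix "â€" cannot break up "â€œ" or "â€™".
REPLACEMENTS = {
    "\u00e2\u20ac\u0153": "\u201c",  # â€œ -> left double quote
    "\u00e2\u20ac\u2122": "\u2019",  # â€™ -> right single quote
    "\u00e2\u20ac": "\u201d",        # â€  -> right double quote
    "\u00c2": "",                    # Â   -> nothing
}


def clean(text: str) -> str:
    """Replace each garbled sequence, handling longer sequences first."""
    for bad in sorted(REPLACEMENTS, key=len, reverse=True):
        text = text.replace(bad, REPLACEMENTS[bad])
    return text


garbled = "\u00e2\u20ac\u0153Hi\u00e2\u20ac \u00c2"
print(repr(clean(garbled)))
```

The same ordering applies to the UPDATE statements: run the ones for the longer sequences before the one for the bare â€. Note also that removing every Â would damage text that legitimately contains that letter, for example some French or Portuguese words.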
Log in to the database server:
    # login for localhost mysql/mariadb
    mysql -u root -h localhost -p
    # username is the database username for remote db with 10.0.0.91 IP
    mysql -u username -h 10.0.0.91 -p
    # commands
    show databases;
    use backUPdb2015;
    # maybe you alter the table once to utf8
    ALTER TABLE wp_posts CHARACTER SET utf8;
    # Replace - we used Â as an example funny Latin character
    UPDATE wp_posts SET post_content = REPLACE(post_content, 'Â', '');
    # change Â to your own abnormal character and run the command
    # Replace - we used â€ as an example to convert it to ”
    UPDATE wp_posts SET post_content = REPLACE(post_content, 'â€', '”');
    exit;
The full output will look like this:
    mysql -u backUPdb2015 -h 10.0.0.91 -p
    Enter password:
    Welcome to the MariaDB monitor. Commands end with ; or \g.
    Your MariaDB connection id is 2629
    Server version: 5.5.44-MariaDB-1ubuntu0.14.04.1-log (Ubuntu)

    Copyright (c) 2000, 2015, Oracle, MariaDB Corporation Ab and others.

    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

    MariaDB [(none)]> use backUPdb2015;
    Reading table information for completion of table and column names
    You can turn off this feature to get a quicker startup with -A

    Database changed
    MariaDB [backUPdb2015]> ALTER TABLE wp_posts CHARACTER SET utf8;
    Query OK, 6694 rows affected (4.02 sec)
    Records: 6694 Duplicates: 0 Warnings: 0

    MariaDB [backUPdb2015]> UPDATE wp_posts SET post_content = REPLACE(post_content, 'Â', '');
    Query OK, 3125 rows affected (3.62 sec)
    Rows matched: 6694 Changed: 3125 Warnings: 0

    MariaDB [backUPdb2015]> UPDATE wp_posts SET post_content = REPLACE(post_content, 'â€', '”');
    Query OK, 430 rows affected (1.46 sec)
    Rows matched: 6694 Changed: 430 Warnings: 0
Notice this line:
    Rows matched: 6694 Changed: 3125 Warnings: 0
Simply commenting out those two lines in the wp-config.php file would force the change on everything, ignoring any warnings.
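The value of seeing those counts can be tried out safely on a throwaway table. The sketch below uses Python's built-in sqlite3 module purely as a stand-in so that it runs without a server; SQLite is not MySQL/MariaDB, and the table contents are invented for illustration. It counts the rows that contain the sequence before changing anything, then restricts the UPDATE to those rows.

```python
import sqlite3

# Stand-in for wp_posts; SQLite is used only so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY, post_content TEXT)")
conn.executemany(
    "INSERT INTO wp_posts (post_content) VALUES (?)",
    [("Hello\u00c2 world",), ("Clean post",), ("Another\u00c2 one",)],
)

bad = "\u00c2"  # the character Â

# Dry run: count all rows, and the rows that actually contain the sequence.
total = conn.execute("SELECT COUNT(*) FROM wp_posts").fetchone()[0]
to_change = conn.execute(
    "SELECT COUNT(*) FROM wp_posts WHERE instr(post_content, ?) > 0", (bad,)
).fetchone()[0]

# Real run, limited to the rows found above.
cur = conn.execute(
    "UPDATE wp_posts SET post_content = REPLACE(post_content, ?, '') "
    "WHERE instr(post_content, ?) > 0",
    (bad, bad),
)
changed = cur.rowcount
conn.commit()

print(f"Rows in table: {total}  Expected to change: {to_change}  Changed: {changed}")
```

If the expected and changed counts disagree, or the expected count is far larger than you thought, that is a signal to stop and look at the data before touching the real database.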
There is no other shortcut. You may need to repeat it later or manually edit 1-2 old posts.
Looking for more shortcuts may make some posts vanish, and some posts may start doing paranormal 301 redirects to their category. With few posts (say 500) that can be easy to handle. But with a mammoth database like ours, with 5K posts, it is like buying a few pounds of RDX and keeping it in your own kitchen for fun.
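To decide which sequences are still left before repeating the queries, post bodies exported as plain text can be scanned and the suspects counted. This is a rough Python sketch under stated assumptions: the pattern only looks for Â and for â€ followed by at most one more non-space character, the function name `count_suspects` is hypothetical, and a right double quote glued directly to a following letter would be grouped with that letter.

```python
import re
from collections import Counter

# Â, or â€ followed by at most one more non-space character.
SUSPECT = re.compile("\u00c2|\u00e2\u20ac\\S?")


def count_suspects(posts):
    """Count each suspicious sequence across an iterable of post bodies."""
    counts = Counter()
    for body in posts:
        counts.update(SUSPECT.findall(body))
    return counts


posts = [
    "It\u00e2\u20ac\u2122s fine\u00c2 ",
    "\u00e2\u20ac\u0153quote\u00e2\u20ac end",
    "clean",
]
for sequence, count in count_suspects(posts).most_common():
    print(repr(sequence), count)
```

The most frequent sequences are the ones worth a REPLACE query; the rare ones may be quicker to fix by editing the one or two posts by hand.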