I am getting this symbol � when copy & pasting text into my website. It seems to happen with é characters and others too. The font is Arial so it should be supported with that font type.
Is it due to the content type that is set or is this a character encoding issue in PHP?
It looks like a mismatch of encoding types, but it could well be that the text was copied from different programs and picked up another encoding than the ‘original source’ during the copy-and-paste.
Hence, when the text is input into the textarea it may keep that former encoding, and the PHP script then probably attempts to parse it using its own encoding or the input rules.
In other words, if there is a conflict between multiple encoding declarations within XHTML, the following order of precedence applies (highest first):
HTTP Content-Type header
byte-order mark (BOM)
XML declaration
meta element
link charset attribute
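To make that order concrete, here is a rough sketch in Python (not PHP, purely for illustration). The key names and the `effective_encoding` helper are made up for this example; it only assumes the precedence listed above and picks the first declaration that is present.

```python
from typing import Dict, Optional

# Precedence as listed above, highest priority first.
# The key names are hypothetical labels for this example only.
PRECEDENCE = [
    "http_content_type",
    "bom",
    "xml_declaration",
    "meta_element",
    "link_charset",
]

def effective_encoding(declared: Dict[str, str]) -> Optional[str]:
    """Return the encoding from the highest-priority declaration present."""
    for source in PRECEDENCE:
        encoding = declared.get(source)
        if encoding:
            return encoding.lower()
    return None  # nothing declared anywhere
```

So if the server sends one charset in the HTTP header and the page declares another in a meta element, this sketch returns the header's value, which is one way the two can end up out of sync.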
The ‘replacement character’ � (often shown as a black diamond with a white question mark) is a symbol found in the Unicode standard at code point U+FFFD, in the Specials block.
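As a small illustration of where that symbol comes from, the sketch below (Python, used here only to show the byte-level behaviour) takes an é stored as a single Windows-1252 byte and decodes it as if it were UTF-8. The function name `decode_as_utf8` and the sample word are made up for the example.

```python
def decode_as_utf8(raw: bytes) -> str:
    """Decode bytes as UTF-8, substituting U+FFFD for invalid sequences."""
    return raw.decode("utf-8", errors="replace")

# "é" in Windows-1252 is the single byte 0xE9.
legacy_bytes = "café".encode("windows-1252")

# In UTF-8, 0xE9 announces a multi-byte sequence, but no valid
# continuation bytes follow, so the decoder emits U+FFFD instead.
shown = decode_as_utf8(legacy_bytes)
print(shown == "caf\ufffd")  # True
```

The font is not involved at this point: the byte sequence is simply not valid UTF-8, so the decoder has no character to hand to the font.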
The content in question is encoded as windows-1252, which is incompatible with utf-8. Paste it into a text editor and save it with the character encoding set to utf-8. Then you can copy/paste from the editor to the page buffer.
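The same conversion can be done in code rather than by hand. The following is a minimal sketch in Python, assuming the incoming bytes are either already valid UTF-8 or Windows-1252; the `to_utf8` helper is hypothetical, and a PHP site would need the equivalent step in its own form handler.

```python
def to_utf8(raw: bytes, source_encoding: str = "windows-1252") -> bytes:
    """Return the input as UTF-8 bytes.

    Assumption: the input is either valid UTF-8 already or is in
    `source_encoding`. This is a guess, not real encoding detection.
    """
    try:
        raw.decode("utf-8")
        return raw  # already valid UTF-8, leave untouched
    except UnicodeDecodeError:
        # Bytes undefined in the legacy code page become U+FFFD.
        text = raw.decode(source_encoding, errors="replace")
        return text.encode("utf-8")
```

For example, `to_utf8(b"caf\xe9")` gives the two-byte UTF-8 form of é, while text that is already UTF-8 passes through unchanged.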
The content is copied from the web, usually from the publisher website with the book synopsis.
I do not see this issue with other editors, such as WordPress, so I am thinking the built-in WordPress WYSIWYG could have something to do with preserving all of the characters correctly?
(The content is pasted directly into a textarea for this website, no WYSIWYG)
You are using utf-8, so the characters should display. Where are you getting this text from? Word? It would be better to paste the text into a plain text file first.
Also, I recommend that you clean up your doctype, as you are mixing XHTML and HTML5 doctypes. If you want to use the HTML5 one, it should look like this: