Fix Encoding Artifacts
Remove Weird Unicode Characters Online
Remove encoding artifacts, strange symbols and invisible Unicode characters from copied text, PDFs, OCR exports and documents, directly in your browser.
Workspace
Paste corrupted text, repair Unicode, copy clean output
Core cleanup
Target Unicode corruption, not rewrites
Remove weird Unicode symbols
Clean replacement marks, stray bytes and odd copied symbols.
Fix encoding artifacts
Repair mojibake such as ’, “, â€ and stray Â marks.
Normalize quotation marks
Convert malformed smart quotes into predictable plain quotes.
Remove invisible characters
Strip zero-width spaces, joiners, soft hyphens, direction markers and BOM markers.
Remove non-breaking spaces
Replace NBSPs with normal spaces for editors and forms.
Repair corrupted punctuation
Normalize broken apostrophes, quotes, dashes and ellipses.
Quick cleanup modes
Choose the Unicode problem you see
Before and after
Real Unicode cleanup examples
Encoding corruption
Before
Don’t worry about it.
After
Don't worry about it.
Broken quotation marks
Before
“Hello Worldâ€
After
"Hello World"
Unicode noise
Before
Text with strange spaces
After
Text with strange spaces
OCR corruption
Before
The ﬁnancial report contains unusual ligatures.
After
The financial report contains unusual ligatures.
Invisible Unicode
Before
Text containing hidden zero-width characters
After
Clean visible text only
Background
Why strange Unicode characters appear
Strange Unicode characters usually appear when text is decoded with the wrong encoding or copied through software that preserves raw bytes, layout marks or hidden document characters. A curly apostrophe can become ’, a non-breaking space can become Â followed by a space, and an invisible BOM marker can appear as ï»¿.
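To make the mismatch concrete, here is a minimal Python sketch (not the tool's own browser code) that reproduces it by encoding text as UTF-8 and decoding the bytes as Windows-1252, then reverses it. The function names `garble` and `repair` are illustrative, and the choice of Windows-1252 as the wrong encoding is an assumption; other mismatches produce different debris.

```python
def garble(text: str) -> str:
    """Simulate the mismatch: encode as UTF-8, then decode as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def repair(text: str) -> str:
    """Reverse the mismatch; return the input unchanged if it does not round-trip."""
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not UTF-8-read-as-Windows-1252 mojibake, so leave the text alone.
        return text


# U+2019 (curly apostrophe) is the UTF-8 bytes E2 80 99, which Windows-1252
# displays as three separate characters.
broken = garble("Don\u2019t worry about it.")
fixed = repair(broken)  # curly apostrophe restored
```

The round trip only restores the original curly apostrophe; converting it to a plain `'` is a separate punctuation step. Text that was never garbled, such as `café`, fails the strict UTF-8 decode and is returned unchanged.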
PDF extraction, OCR conversion, email copy-paste, website copying and document conversions can all introduce invisible Unicode, mojibake, malformed punctuation, ligatures and non-breaking spaces that make otherwise readable text hard to edit.
Artifacts
Common Unicode artifacts
Mojibake is text corruption caused by an encoding mismatch, often visible as sequences like ’, “, Â or ï»¿. BOM markers can sit at the start of a file, zero-width spaces can hide between words, and smart quotes or ligatures can break search, forms and data imports.
This tool focuses on those practical artifacts: invisible markers, non-breaking spaces, malformed smart quotes, corrupted punctuation and OCR Unicode noise.
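Hidden characters are easier to reason about once they are listed by name. The sketch below, in Python, scans a string and reports each hidden code point with its position. The `HIDDEN` set and the `find_hidden` helper are assumptions made for illustration; the tool's actual character list is not documented here.

```python
import unicodedata

# Code points treated as "hidden" in this sketch (an assumed, non-exhaustive list).
HIDDEN = {
    "\u00a0",  # no-break space
    "\u00ad",  # soft hyphen
    "\u200b",  # zero width space
    "\u200c",  # zero width non-joiner
    "\u200d",  # zero width joiner
    "\u200e",  # left-to-right mark
    "\u200f",  # right-to-left mark
    "\ufeff",  # byte order mark (named ZERO WIDTH NO-BREAK SPACE in Unicode)
}


def find_hidden(text: str) -> list[tuple[int, str, str]]:
    """Return (index, code point, Unicode name) for each hidden character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ch in HIDDEN
    ]


report = find_hidden("\ufeffa\u200bb")
# report[0] is (0, "U+FEFF", "ZERO WIDTH NO-BREAK SPACE")
# report[1] is (2, "U+200B", "ZERO WIDTH SPACE")
```

A report like this is useful before stripping anything, because some of these characters are legitimate in certain scripts and emoji sequences.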
Method
How Unicode cleanup works
The browser-side cleanup pipeline repairs common encoding artifacts, applies Unicode normalization, removes invisible characters, strips BOM markers, standardizes punctuation and normalizes spacing without sending your text to a server.
Use it before pasting text into CMS fields, spreadsheets, tickets, forms, search indexes, plain-text files or document cleanup workflows.
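The pipeline steps described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the tool's implementation: the `MOJIBAKE` table, the `INVISIBLE` pattern, the `PUNCT` map and the `clean` function are hypothetical names, NFKC is an assumed choice of normalization form, and treating a bare "â€" as a closing double quote is a guess that will not suit every input.

```python
import re
import unicodedata

# UTF-8 sequences misread as Windows-1252, mapped back to the intended character.
# Longer sequences come first so the bare two-character fallback runs last.
MOJIBAKE = {
    "\u00e2\u20ac\u2122": "\u2019",  # right single quote (bytes E2 80 99)
    "\u00e2\u20ac\u02dc": "\u2018",  # left single quote (bytes E2 80 98)
    "\u00e2\u20ac\u0153": "\u201c",  # left double quote (bytes E2 80 9C)
    "\u00e2\u20ac\u009d": "\u201d",  # right double quote (bytes E2 80 9D)
    "\u00e2\u20ac\u201c": "\u2013",  # en dash (bytes E2 80 93)
    "\u00e2\u20ac\u201d": "\u2014",  # em dash (bytes E2 80 94)
    "\u00e2\u20ac\u00a6": "\u2026",  # ellipsis (bytes E2 80 A6)
    "\u00e2\u20ac": "\u201d",        # assumption: third byte lost, treat as closing quote
    "\u00ef\u00bb\u00bf": "",        # byte order mark shown as three characters
    "\u00c2\u00a0": " ",             # non-breaking space shown as A-circumflex + NBSP
}

# Soft hyphen, zero-width characters, direction marks, word joiner, BOM.
# Note: removing U+200D also breaks joined emoji sequences.
INVISIBLE = re.compile(r"[\u00ad\u200b-\u200f\u2060\ufeff]")

PUNCT = str.maketrans({
    "\u2018": "'", "\u2019": "'",
    "\u201c": '"', "\u201d": '"',
    "\u2013": "-", "\u2014": "-",
    "\u2026": "...",
})


def clean(text: str) -> str:
    for bad, good in MOJIBAKE.items():          # 1. repair encoding artifacts
        text = text.replace(bad, good)
    text = unicodedata.normalize("NFKC", text)  # 2. normalize (NBSP to space, ligatures split)
    text = INVISIBLE.sub("", text)              # 3. drop invisible characters and BOM
    text = text.translate(PUNCT)                # 4. standardize punctuation
    text = re.sub(r"[ \t]+", " ", text)         # 5. collapse runs of spaces and tabs
    return text.strip()


result = clean("Don\u00e2\u20ac\u2122t worry about it.")
# result is "Don't worry about it."
```

NFKC is a deliberately aggressive choice: besides splitting ligatures it also rewrites characters such as superscript digits, so a gentler form like NFC may be preferable when the text contains notation that must survive. The sketch also leaves a stray Â on its own untouched, because removing it blindly would damage languages that use that letter.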
Related tools
Continue the text repair workflow
Need broader cleanup?
Use the AI Cleanup Tool when corrupted text also includes Markdown, bullets, chat formatting or wrapped lines.
Copied text still looks messy?
Use Fix Copy-Paste Formatting for layout, spacing and formatting artifacts beyond Unicode corruption.
FAQ
Unicode cleanup questions
Why do strange characters appear in copied text?
Encoding mismatches, PDF extraction, OCR conversion, email clients, websites and document conversions can all introduce strange symbols or hidden Unicode.
How do I remove weird Unicode symbols?
Paste the text, keep the Unicode cleanup options enabled and click Clean Unicode Text to repair common artifacts.
What are invisible Unicode characters?
They are hidden marks such as zero-width spaces, joiners, direction markers, soft hyphens and BOM markers.
Can I fix encoding corruption?
Yes. The tool repairs common mojibake, corrupted punctuation, replacement symbols and stray encoding artifacts.
Does this tool remove zero-width spaces?
Yes. Enable Remove invisible Unicode to strip zero-width spaces and related hidden characters.