Replace Specific Unicode Character in whole collection

Hi, we’re managing deeply nested content (Prosemirror-Documents). Now I have realized, that a lot of such documents contain the unicode character U+00A0, which is a non breaking space, instead of a regular space. this destroys a lot of user interfaces. Can I somehow run a script that traverses through the whole collection and replaces the character with the correct one? :slight_smile:

Hi Nicolas,

I’m afraid the method to do global search/replace of a single character inside a document, deeply nested or otherwise, is not part of the server’s codebase.

A quick and dirty method I can think of is to do a mongoexport on the collection, pipe to sed or tr, then pipe to mongoimport. Something like:

mongoexport -d db -c coll | tr ... | mongoimport -d db -c coll2

Probably not ideal for a large collection. Unfortunately if you need a more granular method, you’d have to write a specialized script.

Best regards,
Kevin

2 Likes