I have an SVG file with some HTML embedded deep inside it. I need to remove this HTML so that I can upload the file to an image database that doesn’t allow SVG files to contain HTML.
Viewed in a source code editor, the file is so complex, dense and machine-generated that I don’t think removing the HTML by hand is the right approach.
Oddly enough, the HTML in that file is not wrapped in <html> and <body> tags.
I only mean the HTML tags themselves, assuming that there is an easy way to match them.
I am in a sticky situation: Mermaid chart SVGs contain embedded HTML, and because of that HTML I can’t upload them to the media libraries of MediaWiki websites.
The exported SVG appears to use embedded HTML for the chart labels. Removing it would remove your labels. If all you want to do is upload the image to a site, it’d probably be easier to just use the PNG export. If you want to keep using SVG, you’d have to convert the labels to SVG <text> elements.
Perhaps I misunderstand what you are saying, but if I understand correctly, it seems possible to replace the HTML with just the text. I assume that would require finding each relevant element and then running an HTML parser on it. This question has the regex tag, but I would not use a regex for this.
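Here is a minimal sketch of that approach in Python. It assumes the embedded HTML sits inside <foreignObject> elements (which is where Mermaid exports usually put the label markup) and that the file parses as well-formed XML. The function name replace_foreign_objects is made up for this example, and it relies on the standard library's XML parser rather than a full HTML parser, so treat it as a starting point and not a finished tool.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def replace_foreign_objects(svg_source: str) -> str:
    """Swap each <foreignObject> for an SVG <text> holding only its plain text.

    Assumption: the embedded HTML lives in <foreignObject> elements and the
    whole file is well-formed XML (unclosed tags such as <br> would fail to parse).
    """
    # Serialize SVG elements without a namespace prefix.
    ET.register_namespace("", SVG_NS)
    root = ET.fromstring(svg_source)
    foreign_tag = f"{{{SVG_NS}}}foreignObject"

    # ElementTree keeps no parent pointers, so visit every element and
    # inspect its direct children. Snapshots (list(...)) make it safe to
    # replace children while looping.
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag != foreign_tag:
                continue
            # Collect all nested text and collapse runs of whitespace.
            label = " ".join("".join(child.itertext()).split())
            text = ET.Element(f"{{{SVG_NS}}}text")
            text.text = label
            # Carry over the position attributes if they are present.
            for attr in ("x", "y"):
                if attr in child.attrib:
                    text.set(attr, child.attrib[attr])
            text.tail = child.tail
            parent[index] = text

    return ET.tostring(root, encoding="unicode")
```

You would read the file into a string, pass it to the function and write the result back out. The labels will probably not land exactly where the HTML versions were, because a <text> element is positioned by its baseline and does not wrap or center the way a <div> does, so expect to adjust x, y or text-anchor afterwards.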