Hey guys!
So I’m trying to do the following:
1.) Grab all elements from the page
2.) Grab their inner text
3.) Replace the inner text with some custom text
Sounds easy? I thought so too!
Everything works perfectly until I come across an element that has child elements. Silly me thought I could fix it like this:
var all = document.getElementsByTagName('*');
for (var i = 0, max = all.length; i < max; i++) {
  var el = all[i];
  // Only handle elements that have no child elements
  if (el.children.length === 0) {
    // Skip scripts and elements without any text
    if (el.nodeName !== 'SCRIPT' && el.innerText !== '') {
      // Do stuff here...
    }
  }
}
But of course, I totally forgot about elements that contain some text of their own plus a nested element, e.g.:
<p>This part here would totally get ignored <a href="#">while this part would be captured since it has no child nodes!</a></p>
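As far as I can tell, the reason is that `children` only lists element children, while `childNodes` also includes text nodes, so the leading text of the <p> sits in a text node that the `children.length === 0` check never looks at. Here is a minimal sketch of reading only the text an element holds directly. `ownTextSegments` is a made-up helper name, and the plain objects at the bottom are just a stand-in for real DOM nodes so the snippet also runs outside a browser:

```javascript
// Hypothetical helper: returns the text held directly by `el`,
// ignoring any text that lives inside nested elements.
function ownTextSegments(el) {
  var segments = [];
  for (var i = 0; i < el.childNodes.length; i++) {
    var child = el.childNodes[i];
    // nodeType 3 is a text node; `children` would not list these at all.
    if (child.nodeType === 3 && child.nodeValue.trim() !== '') {
      segments.push(child.nodeValue);
    }
  }
  return segments;
}

// Stand-in for <p>leading text <a href="#">link text</a></p>
// (plain objects, NOT real DOM nodes; in a browser pass a real element).
var link = { nodeType: 1, childNodes: [{ nodeType: 3, nodeValue: 'link text' }] };
var p = { nodeType: 1, childNodes: [{ nodeType: 3, nodeValue: 'leading text ' }, link] };

console.log(ownTextSegments(p));    // text directly inside the <p>
console.log(ownTextSegments(link)); // text directly inside the <a>
```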
I could maybe even solve the above problem, but imagine the following HTML:
<p>This is a p tag <a href="#">and this is a link <span>that has a span in <i>it</i></span> and it is</a> pretty hard to extract segments like this.</p>
So can anyone recommend an efficient way of traversing the DOM and grabbing the inner text of nested elements? Especially elements that are nested a few times in a row?
Btw, I also tried a recursive function:
var test = [];
// Start once from the root; calling this for every element
// would collect each text node once per ancestor.
traverseChildNodes(document.body);
function traverseChildNodes(node) {
  var next;
  if (node.nodeType === 1) {
    // Element node: recurse into each child node
    if ((node = node.firstChild)) {
      do {
        // Remember the sibling before recursing
        next = node.nextSibling;
        traverseChildNodes(node);
      } while ((node = next));
    }
  } else if (node.nodeType === 3) {
    // Text node: keep a reference to the node itself
    test.push(node);
  }
}
This does return all the nested text, but I can't figure out where in the DOM each piece of text comes from, so I can't replace it with my custom text.
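One thing that might help here: the collected items are the text nodes themselves, not copies of their strings, so each one should still expose `parentNode`, and assigning to `nodeValue` should change the text in place. Below is a rough sketch under that assumption. `replaceText` and the `transform` callback are hypothetical names, and the `el`/`txt` helpers only build a plain-object stand-in for the DOM so the snippet runs outside a browser:

```javascript
// Hypothetical helper: walks every descendant of `root` and rewrites each
// non-empty text node in place with the result of transform(oldText, parent).
function replaceText(root, transform) {
  for (var i = 0; i < root.childNodes.length; i++) {
    var child = root.childNodes[i];
    if (child.nodeType === 1) {
      // Element node: recurse, but leave script and style contents alone.
      if (child.nodeName !== 'SCRIPT' && child.nodeName !== 'STYLE') {
        replaceText(child, transform);
      }
    } else if (child.nodeType === 3 && child.nodeValue.trim() !== '') {
      // Text node: `root` is its parent (child.parentNode in a real DOM).
      child.nodeValue = transform(child.nodeValue, root);
    }
  }
}

// Stand-in tree for <p>Hello <a>nested <i>text</i></a> here</p>
// (plain objects, NOT real DOM nodes).
function el(name, kids) { return { nodeType: 1, nodeName: name, childNodes: kids }; }
function txt(s) { return { nodeType: 3, nodeValue: s }; }
var p = el('P', [txt('Hello '), el('A', [txt('nested '), el('I', [txt('text')])]), txt(' here')]);

// Tag every segment with the name of the element it belongs to.
replaceText(p, function (text, parent) {
  return '[' + parent.nodeName + ':' + text + ']';
});
console.log(p.childNodes[0].nodeValue); // first segment, owned by the <p>
```

In a browser the call would presumably look like replaceText(document.body, function (text) { return text.toUpperCase(); }); with the real text nodes updated where they sit.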