Extracting node elements from the DOM

silic5494 · April 15, 2019, 5:50pm

Hey guys!
So I’m trying to do the following:
1.) Grab all elements from the page
2.) Grab their inner text
3.) Replace the inner text with some custom text

Sounds easy? I thought so too!
Everything works perfectly until I come across a node that has child nodes. So stupid me thought I could fix this like this:

var all = document.getElementsByTagName('*');
for (var i = 0, max = all.length; i < max; i++) {
	if (all[i].children.length === 0) {
		// Entries in the collection are never null, so only skip scripts and empty text
		if (all[i].nodeName !== 'SCRIPT' && all[i].innerText !== '') {
			//Do stuff here...
		}
	}
}

But of course, I totally forgot about elements that contain some text of their own plus another element nested inside them, e.g.:

<p>This part here would totally get ignored <a href="#">while this part would be captured since it has no child nodes!</a></p>
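To make the gap concrete: the leading text in that paragraph sits in a text node that is a direct child of the <p>, so it appears in childNodes even though children only lists the <a>. A minimal sketch of reading just that part (directText is a made-up helper name, not an existing DOM method):

```javascript
// Hypothetical helper: returns the text that belongs directly to `element`,
// ignoring anything inside nested elements.
function directText(element) {
  var TEXT_NODE = 3; // same value as Node.TEXT_NODE
  return Array.from(element.childNodes)
    .filter(function (child) { return child.nodeType === TEXT_NODE; })
    .map(function (child) { return child.nodeValue; })
    .join('');
}

// In a browser, something like:
// directText(document.querySelector('p'));
```

This only reads the element's own text; the text inside the nested link still has to be handled when the loop reaches the <a> itself.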

I could maybe even solve the above problem. But imagine the following HTML:

<p>This is a p tag <a href="#">and this is a link <span>that has a span in <i>it</i></span> and it is</a> pretty hard to extract segments like this.</p>

So can anyone recommend an efficient way of traversing the DOM and grabbing the inner text of nested elements? Especially elements that are nested a few times in a row?

Btw also tried a recursive function:

var test = [];
// Start once at the root. Calling this for every element on the page
// would collect each text node again for every ancestor it has.
traverseChildNodes(document.body);

function traverseChildNodes(node) {
	var next;

	if (node.nodeType === 1) {
		// (Element node)

		if ((node = node.firstChild)) {
			do {
				// Recursively call traverseChildNodes
				// on each child node
				next = node.nextSibling;
				traverseChildNodes(node);
			} while ((node = next));
		}
	} else if (node.nodeType === 3) {
		// (Text node)
		test.push(node);
	}
}

This does return all the nested text, but I can’t see how to tell where in the DOM each piece of text came from, so I can’t replace it with my custom text.
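One thing that may help here: a text node found during such a traversal is a live reference into the page, so its parentNode says where it sits and assigning to its nodeValue changes the text in place. A rough sketch under that assumption (replaceTextNodes and transform are illustrative names, and skipping SCRIPT and STYLE is a guess at what is wanted):

```javascript
// Sketch: walk a subtree once and rewrite every non-empty text node in place.
// `transform` receives the current text and the node's parent element,
// and returns the replacement text.
function replaceTextNodes(root, transform) {
  var ELEMENT_NODE = 1; // same value as Node.ELEMENT_NODE
  var TEXT_NODE = 3;    // same value as Node.TEXT_NODE
  var skipped = { SCRIPT: true, STYLE: true };

  if (root.nodeType === TEXT_NODE) {
    // Leave whitespace-only nodes (indentation, line breaks) untouched
    if (root.nodeValue.trim() !== '') {
      root.nodeValue = transform(root.nodeValue, root.parentNode);
    }
  } else if (root.nodeType === ELEMENT_NODE && !skipped[root.nodeName]) {
    // Copy the child list first so the loop is not affected by DOM changes
    Array.from(root.childNodes).forEach(function (child) {
      replaceTextNodes(child, transform);
    });
  }
}

// In a browser, something like:
// replaceTextNodes(document.body, function (text, parent) {
//   return text.toUpperCase();
// });
```

Because only text nodes are touched, nested markup such as a link inside a paragraph keeps its structure and each segment is replaced separately.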

SamuelCalifornia · April 15, 2019, 7:32pm

Start by looking at what others have done. There are very many useful articles there.

I know there is a relatively new set of functions that are designed for this purpose but I cannot find them now.
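It isn’t clear which functions are meant. One DOM API that fits the task, though it may not be the one meant here, is document.createTreeWalker, which can be asked to visit only text nodes. A minimal sketch (textNodesUnder is a made-up name, and the optional doc parameter exists only so the function can run outside a browser):

```javascript
// Sketch: collect every text node under `root` using a TreeWalker.
function textNodesUnder(root, doc) {
  doc = doc || document;  // falls back to the page's document in a browser
  var SHOW_TEXT = 4;      // same value as NodeFilter.SHOW_TEXT
  var walker = doc.createTreeWalker(root, SHOW_TEXT);
  var found = [];
  var node;
  // nextNode() returns null once the walk is finished
  while ((node = walker.nextNode())) {
    found.push(node);
  }
  return found;
}

// In a browser, something like:
// textNodesUnder(document.body).forEach(function (node) {
//   node.nodeValue = node.nodeValue.toUpperCase();
// });
```

Collecting the nodes first and changing them afterwards keeps the walk itself free of modifications.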

silic5494 · April 15, 2019, 9:00pm

Every single link on that Google search is purple xD
I read a bunch of stuff, including Stack Overflow questions, but couldn’t find anything that actually helped.

