I am trying to create a script that deletes any word on a webpage if that word contains a special character (a colon), whether the colon is at the start of the word, somewhere in the middle, or at the end.
Thank you very much for the example code with a regex; however, words that include colons, such as recipes:chinese_stir_fried_vegetables, still appear.
Thanks for pointing that out.
I understand the point behind what you wrote, and indeed, if I put the list container instead of "body", I still get everything in the list deleted (i.e. also items without a colon prefix).
Okay, thanks to the helpful comments above, I have understood what I actually need to do:
After specifying the area I want to work in (in the case above, the entire body area of the document),
I just need to delete all <li> elements if they have a colon, so all I did for a successful test of the example was to change ("*") to ("li").
(Then, by default, all <li> elements in the document that contain a colon would be deleted.)
Hi @bendqh1, yes, you can use includes(), but for a more robust approach you might iterate not over HTML elements (as returned by querySelectorAll()) but over the actual text nodes in the document; if a given text node includes(':'), hide its parent element. You can use a tree walker for this like so:
const walker = document.createTreeWalker(
  // The root node
  document.body,
  // Only walk text nodes
  NodeFilter.SHOW_TEXT,
  // Further filter to only those text
  // nodes containing a colon
  {
    acceptNode (node) {
      return node.textContent.includes(':')
        ? NodeFilter.FILTER_ACCEPT
        : NodeFilter.FILTER_REJECT
    }
  }
)
// Hide the parent elements for all these nodes
let node
while ((node = walker.nextNode())) {
  node.parentElement.style.display = 'none'
}
This way you can be assured only to hide the direct parent elements of the text nodes found.
The goalposts keep moving! The title of this thread is about deleting elements, the text of the original post is about deleting words, and now this thread is about deleting <li> elements.
My post #4 is for deleting words containing a colon.
I find that my code does delete words containing underscores and one colon. For the regular expression to work as required, words can contain only letters, numbers, underscores and one colon.
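For readers who need to handle words with other characters or more than one colon, here is a minimal sketch of a looser alternative. It is not the regex from post #4; the helper name removeColonWords is made up for illustration, and it assumes that a "word" is any run of non-whitespace characters. The browser part reuses the tree walker idea from earlier in the thread and skips <script> and <style> content.

```javascript
// Hypothetical helper (not the regex from post #4): removes every
// whitespace-delimited word that contains at least one colon,
// together with the whitespace directly in front of it.
function removeColonWords(text) {
  return text.replace(/\s*\S*:\S*/g, '');
}

// Browser-only part: rewrite text nodes in place, skipping <script>
// and <style> so that code and CSS are left alone.
if (typeof document !== 'undefined') {
  const walker = document.createTreeWalker(document.body, NodeFilter.SHOW_TEXT);
  let node;
  while ((node = walker.nextNode())) {
    const tag = node.parentElement ? node.parentElement.tagName : '';
    if (tag === 'SCRIPT' || tag === 'STYLE') continue;
    if (node.nodeValue.includes(':')) {
      node.nodeValue = removeColonWords(node.nodeValue);
    }
  }
}
```

Be aware that this also removes things like "Time:" or URLs such as "https://…", since those are whitespace-delimited tokens containing a colon too; tighten the pattern if that is not wanted.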
Here's one way of deleting <li> elements containing one or more colons:
var el = document.body.querySelectorAll("LI");
for (let z = 0; z < el.length; z++) {
  if (el[z].innerText.includes(":")) el[z].parentElement.removeChild(el[z]);
}
(This often happens with “quotation marks” as well… Macs (and forums, apparently!) in particular are fond of using “fancy quotes”, which aren't actually the quotation mark symbol.)
I'm confused, but if the aim is to delete any word starting with a semicolon within a string:
if (/(^|\s);\w+/.test(str)) {…}
Within the block
let result = str.replace(/(^|\s);\w+/g, "$1");
remove double spaces with
str = result.replace(/ {2,}/g, " ");
str is now the original string minus words which start with a ;.
If the offending words only occur within paragraph elements then select those P elements with
document.getElementsByTagName("P");
Then iterate over the resulting collection, sending each element to a function containing the above code.
Because the collection holds references to the actual elements, changing an element's innerText (or textContent) inside the function changes it in the document.
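The iteration described above could look roughly like the sketch below. The function names stripSemicolonWords and cleanElements are invented for illustration, and the sketch assumes a word is a run of letters, digits and underscores that follows a semicolon at the start of the string or after whitespace.

```javascript
// Hypothetical helper: deletes words that start with ';' and then
// collapses any run of spaces left behind into a single space.
function stripSemicolonWords(str) {
  return str
    .replace(/(^|\s);\w+/g, '$1')
    .replace(/ {2,}/g, ' ');
}

// Hypothetical helper: applies the cleanup to every element in an
// array-like collection. Array.from() takes a snapshot first, so a
// live collection cannot shift under the loop.
function cleanElements(elements) {
  for (const el of Array.from(elements)) {
    el.textContent = stripSemicolonWords(el.textContent);
  }
}

// Browser-only part: run it over all <p> elements.
if (typeof document !== 'undefined') {
  cleanElements(document.getElementsByTagName('P'));
}
```

Note that assigning textContent replaces all child nodes of the paragraph with a single text node, so inline markup such as links or bold text inside the paragraph would be flattened. If that matters, walking the text nodes as shown earlier in the thread is the safer route.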
Or, thinking outside the box, copy the HTML file and save it as a text file.
Then filter that file through the replace commands, but add the multiline (m) flag along with the global (g) flag.
The m and g flags are from memory; in JavaScript regular expressions both are lowercase.
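As a small illustration of what the two flags do together, here is a sketch that operates on a plain string standing in for the saved text file. The helper name stripLeadingSemicolonWords and the sample text are made up for the example.

```javascript
// Hypothetical helper: deletes a ';'-prefixed word at the start of
// every line. The m flag makes ^ match after each line break, and
// the g flag makes replace() process every match, not just the first.
function stripLeadingSemicolonWords(text) {
  return text.replace(/^;\w+[ \t]*/gm, '');
}

const sample = ';first keep\nstay ;mid\n;third end';
console.log(stripLeadingSemicolonWords(sample));
```

Without the m flag, ^ would only match at the very start of the whole string, so only the first line would be cleaned. The word in the middle of the second line is left alone either way, because this pattern is anchored to line starts.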
This info should put you on the right track.
Anything to do with text is most easily accomplished with regular expressions.