  1. #1

    find text in source page with JavaScript

    Hello all,
    I'm wondering how I can find specific text in the source of a web page.
    I know I can access DOM elements with JavaScript, using statements like
    "color = document.body.style.background"
    but I need to find text that is not part of the DOM objects.

    I know JavaScript has some functions for searching the page, but I need to search the page source, not the rendered HTML.

    To be more specific, when I read news on a site like this: http://www.bbc.co.uk/news/technology-16382648
    I want to write a script (a bookmarklet or a Greasemonkey script) that extracts the date, so I need to search for the text "<span class="date">".

    In other words: how can I write a script or a Firefox add-on to "grep" for certain text on a website?

    I am using Firefox, so a solution that only works in that browser is enough for me.

    thanks

  2. #2
    I found the solution, in case anyone else needs something similar:

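    // innerHTML is the browser's serialized markup for the body.
    var html = document.body.innerHTML;
    // Capture the text between class="date"> and the next tag.
    var m = html.match(/class=["']date["'][^>]*>([^<]*)</);
    // match() returns null when nothing is found, so check before indexing.
    alert(m ? m[1] : "date not found");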
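
In case it helps anyone adapting this, the same idea can be wrapped in small functions that take the markup as a string, so they work on document.body.innerHTML or on source text obtained some other way. This is only a rough sketch: the names extractByClass and grepLines are made up for illustration, the sample markup is invented, and a regular expression is not a real HTML parser, so it assumes the span carries exactly that one class and has no nested tags.

```javascript
// Hypothetical helper: return the text of the first <span class="NAME">...</span>
// found in a string of markup, or null if there is none.
// Assumes className contains no regex metacharacters and the span has no nested tags.
function extractByClass(html, className) {
  var re = new RegExp('<span[^>]*class=["\']' + className + '["\'][^>]*>([^<]*)</span>', 'i');
  var m = html.match(re);
  return m ? m[1].trim() : null;
}

// Hypothetical helper: grep-style search that returns every line containing `needle`.
function grepLines(text, needle) {
  return text.split(/\r?\n/).filter(function (line) {
    return line.indexOf(needle) !== -1;
  });
}

// Sample markup standing in for the page source (made-up data).
var sample = '<div id="main-content">\n<span class="date">4 January 2012</span>\n</div>';
console.log(extractByClass(sample, 'date'));
console.log(grepLines(sample, 'class="date"'));
```

In a bookmarklet you would pass document.body.innerHTML as the first argument. Note that innerHTML is the browser's serialization of the parsed DOM, not the exact text the server sent; to search the original source you would have to request the page again (for example with XMLHttpRequest on location.href) and pass the response text instead.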

  3. #3
    AussieJohn
    As long as you're using a modern browser, you could use document.getElementsByClassName() to get the date element. If you need to narrow the search to a specific area (to avoid looking through the whole document), you could use something like

    Code javascript:
    var el = document.getElementById("main-content");
    var articleDate = el.getElementsByClassName("date")[0].innerHTML; // [0] is there to return the first element only.
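
If the container or the date element might be missing on some pages, a slightly more defensive version could look like the sketch below. The helper name firstTextByClass is hypothetical, and falling back to the whole document when "main-content" is absent is my assumption, not something the page guarantees.

```javascript
// Hypothetical helper: text of the first element with the given class inside `root`,
// or null when `root` is missing or nothing matches.
function firstTextByClass(root, className) {
  if (!root) return null;
  var matches = root.getElementsByClassName(className);
  if (matches.length === 0) return null;
  // textContent gives the text without nested markup; trim stray whitespace.
  return matches[0].textContent.trim();
}

// Browser usage (bookmarklet or Greasemonkey script). The guard lets the
// snippet load outside a browser without throwing.
if (typeof document !== 'undefined') {
  // Assumption: fall back to the whole document if the container is not found.
  var container = document.getElementById('main-content') || document;
  console.log(firstTextByClass(container, 'date'));
}
```

Using textContent rather than innerHTML returns only the text, which is usually what you want for a date.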

