Extract img link from img tag

Hi all,

I’m having a problem extracting images to the server.

I want to extract to the server all the images referenced in my database records, which were saved from a WYSIWYG editor.

The data is stored in a format like this (record 1):

<img src='book_cover.jpg'>the description is here.<img src='image2.jpg'>continue the story here

How can I get the image links out of the records?

  1. book_cover.jpg
  2. image2.jpg
    ...

A regular expression will do the job:

<?php
$ptn = "/<img src='([^']+)'>/";
$str = "<img src='book_cover.jpg'>the description is here.<img src='image2.jpg'>continue the story here";
preg_match_all($ptn, $str, $matches);
print_r($matches);
?>

Test PHP regular expressions here


<?php

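// Parse the HTML fragment into a DOM tree
$doc = new DOMDocument();
$doc->loadHTML('<img src="book_cover.jpg">the description is here.<img src="image2.jpg">continue the story here');
// Convert to SimpleXML and select every <img> element
$xml = simplexml_import_dom($doc);
$images = $xml->xpath('//img');
foreach ($images as $img)
{
    // Print each image's src attribute
    echo $img['src'] . "<br />";
}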


Thanks for your suggestion, uringgogo…

As the image tag might sometimes be <img src="">, sometimes <img alt="" src=""> etc., preg_match_all is a bit difficult to apply here (I’m not very familiar with regular expressions).

Dear Ernie,

Thank you for your suggestion. I’ve tried it out.

I got this warning:

PHP Warning: DOMDocument::loadHTML(): AttValue: " expected in Entity, line: 5 in /home/abcde/public_html/directory/img.php on line 14

Do you have any idea?

Will $doc->loadHTML($description) cause an error if $description does not contain any <img> tag?

Is there any way to get just the big images from a page without using getimagesize?
I have code that works wonderfully, but it calls getimagesize on remote images, and if there are many images the script hangs or takes more than 4-5 minutes.

Can this be done with jQuery?

I haven’t really checked whether it’s possible with jQuery, prodac, but if the records come from a database, the sample I give below is a much more practical way to do it.
jQuery/JavaScript/PHP/Ajax is a good combination for the task if you really need to do it with jQuery.

I realize this is an old thread, but someone may still need some possible answers, so here’s my share.


/**
 * @function: Extract img link from img tag
 * @param: $string
 * @returns: image src value
 */

function get_image($string){
	preg_match('/<img([ alt="alt"]*?) src="([a-zA-Z0-9._:\\-\\/]+)"([ alt="alt"]*?)(\\/?)>/i',$string,$matches);
	
	foreach($matches AS $match){
		// Return the first captured group that ends in a known image extension
		if(preg_match('/\.(jpe?g|gif|png)$/i',$match)){
			return $match;
		}
	}
}


//sample usage
$string = '<img src="http://www.rnel.net/book_cover.jpg" />';
echo get_image($string);
//returns http://www.rnel.net/book_cover.jpg

$string = '<img src="book_cover.jpg" />';
echo get_image($string);
//returns book_cover.jpg

It will match any of the following combinations:


$string = '<img src="book_cover.jpg" />';
$string = '<img src="book_cover.jpg" alt="alt" />';
$string = '<img alt="alt" src="book_cover.jpg" />';
$string = '<img src="book_cover.jpg">';
$string = '<img src="book_cover.jpg" alt="alt">';
$string = '<img alt="alt" src="book_cover.jpg">';

It will look for the following extensions: .jpg, .jpeg, .gif and .png

Hi RNEL,

The records do indeed come from a database. And your script works great, btw :)

But the problem I’m facing now is that the website on which I need to use this is not on a server that supports PHP. It’s on an Adobe BC solution, so I really need to figure out how to do it with just JavaScript.

My string, which is “{module_webapps,5007,l,1}”, returns the following:

<a href="http://..." title="title"><img src="http://..." height="" width="" /></a>

All I need is the actual img src url.

This post might actually belong in the JavaScript section of the forum, so I’ll post it there as well.

Thanks.
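
For reference, if the module output is available as a plain string, the src values can also be pulled out without PHP or jQuery. What follows is only a rough sketch: the function name extractImageSources is hypothetical, and the sample string is made up to resemble the output quoted above. A regex like this is fine for simple, predictable markup, but it is not a general HTML parser.

```javascript
// Hypothetical helper: collects the src value of every <img> tag in an HTML string.
function extractImageSources(html) {
  const sources = [];
  // Match each <img ...> tag, whatever attributes it carries.
  const tagPattern = /<img\b[^>]*>/gi;
  // Inside a tag, find src="..." or src='...' regardless of attribute order.
  const srcPattern = /\ssrc\s*=\s*(["'])(.*?)\1/i;
  for (const tag of html.match(tagPattern) || []) {
    const found = srcPattern.exec(tag);
    if (found) {
      sources.push(found[2]);
    }
  }
  return sources;
}

// Made-up sample modelled on the module output; the real URLs are not known.
const moduleOutput =
  '<a href="somelink.html" title="title"><img alt="" src="/images/image.png" height="" width="" /></a>';
console.log(extractImageSources(moduleOutput));
```

The function returns an array, so the first image is extractImageSources(moduleOutput)[0], and a string without any <img> tag gives an empty array rather than an error.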

With jQuery it’s easy:



<img src="book_cover.jpg" />
<img src="book_cover.jpg" alt="alt" />
<img alt="alt" src="book_cover.jpg" />
<img src="book_cover.jpg">
<img src="book_cover.jpg" alt="alt">
<img alt="alt" src="book_cover.jpg">

<div id="output"></div>

<script>
    $(function() {
        $('img').each(function() {
            $('#output').append($(this).attr('src'));
            $('#output').append($('<br>'));
        });
    });
</script>

That will loop through all the images in the page and write the ‘src’ attribute into the div with id ‘output’.

Thanks Immerse! :)

It works, but what I want to achieve is a bit more complex I think…

To put it in context, here is my source code:

<body style="background: #010103 url({module_webapps,5007,l,1}) no-repeat fixed right top;">

When it runs on the server the output becomes:

<body style="background: #010103 url(<a href="linktossomewhere"><img src="/images/image.png" width="" height=""></a>) no-repeat fixed right top;">

What I want to achieve with jQuery is:

<body style="background: #010103 url(/images/image.png) no-repeat fixed right top;">

Btw, I’m somewhat of a newbie when it comes to jQuery ;)

Well, that second bit of code you show is invalid: it puts HTML code inside a CSS declaration.

Maybe you can extract that code using jQuery’s .css() function:


alert($('body').css('background-image'));

But I think that won’t work either, as the browser may simply ignore the invalid style declaration.

I know nothing about Adobe BC, but maybe there’s some documentation that might help fix the style declaration?

Thanks,

Yeah, the problem seems to be that {module_webapps,5007,l,1} writes its output in this format:

<a href="linktossomewhere"><img src="/images/image.png" width="" height="" /></a>

This can’t be changed, because it’s just the way the module writes its output. I guess I’ll have to figure out another way to do it.

Thanks anyway ;)

Well, what you could do (but it’d be VERY ugly) is…

  • render those links & images in a hidden div somewhere in your document.
  • read the image source from the img tag
  • apply that to the body as background-image…

<body style="background: #010103 url(blank.png) no-repeat fixed right top;">

    <div style="display: none;" id="background-src">
        {module_webapps,5007,l,1}
    </div>

    <script>

        $(function() {
            var imgSrc = $('#background-src a img:first').attr('src');
            // background-image expects a url(...) value, not a bare path
            $('body').css({
                backgroundImage: 'url("' + imgSrc + '")'
            });
        });

    </script>

</body>

Well, this is the output I get:

<body style="background: #010103 url(/images/blank.gif) no-repeat fixed right top;">

   <div style="display: none;" id="background-src">
       <a   href="/CustomContentRetrieve.aspx?ID=1089029"><img src="/images/backgrounds/fall2010.jpg" border="0" alt="" /></a>

   </div>

   <script>

       $(function() {
           var imgSrc = $('#background-src a img:first').attr('src');
           $('body').css({
               backgroundImage: imgSrc
           });
       });

   </script>

</body> 

Am I missing something? Because the background image is nowhere to be seen…

Try alerting the imgSrc to see if it’s finding it properly:



       $(function() {
           var imgSrc = $('#background-src a img:first').attr('src');
           alert(imgSrc);

           // background-image expects a url(...) value, not a bare path
           $('body').css({
               backgroundImage: 'url("' + imgSrc + '")'
           });
       });


If that’s not alerting anything, then maybe jQuery isn’t loaded?
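
One more thing worth checking in the script from the output posted above: it hands the bare path to backgroundImage, whereas CSS expects the path wrapped in url(...), and browsers drop a background-image value they cannot parse. Here is a minimal sketch of the wrapping. The helper name toCssUrl is hypothetical and not part of jQuery.

```javascript
// Hypothetical helper: wraps an image path in the url("...") syntax
// that the CSS background-image property expects.
function toCssUrl(src) {
  // Escape backslashes and double quotes so the path stays inside the quoted url().
  const escaped = String(src).replace(/\\/g, '\\\\').replace(/"/g, '\\"');
  return 'url("' + escaped + '")';
}

console.log(toCssUrl('/images/backgrounds/fall2010.jpg'));
// prints: url("/images/backgrounds/fall2010.jpg")

// With jQuery loaded, the call from the thread would then read:
// $('body').css('background-image', toCssUrl(imgSrc));
```

If alert(imgSrc) shows the right path but the background still does not appear, the missing url(...) wrapper is a likely cause.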

Nice JavaScript/jQuery discussion :)

I would like to join. Please, let’s continue it here: http://www.sitepoint.com/forums/showthread.php?t=695453

What does {module_webapps,5007,l,1} translate to in plain English: the module, image ID, size and ??

Maybe a mod or admin can merge the threads?

Hi Salathe,

In this particular case {module_webapps,5007,l,1} translates to:

<a href="somelink.html"><img src="/images/someimage.png" width="" height=""></a>

The {module_webapps,5007,l,1} string is from Adobe Business Catalyst and breaks down to my web app ID 5007, l for list view, and 1 to display one item per list view page. The only thing that is important, though, is the output that it writes, which in this case is the one above.