What is the correct URL structure for a DVD copy of a website?

Hi. My company just decommissioned a blog (xyz.com) and my boss asked me to create an archive copy on the intranet. I ran wget and got all the files in one directory, which I named xyz.com. I used sed to replace all instances of “http://xyz.com” with “http://intranet.mycompany.com/xyz.com/” in all *.html files because this was the path to the site’s new home. This worked fine.

He then asked me to put it on DVD and make it navigable. I’m having a little trouble coming up with the URL to use as a replacement. I know that I have to use “file:///” instead of “http://” because it will be an offline copy on a DVD. I put it in a top-level directory named “xyz.com” and replaced “http://xyz.com” with “file:///xyz.com”. But when I created the DVD, the actual path when reading the HTML files was “/media/CDRom/xyz.com/”. So now none of the links work.

Has anyone ever done this before? What structure do my URLs need to be in for the site to be navigable on a DVD? Thanks in advance.
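
To illustrate why the “file:///xyz.com” replacement breaks, here is a small sketch using Python's standard urljoin, which resolves links the same way a browser does. The page location is the mount path from the question; “about.html” is a made-up page name used only for illustration.

```python
from urllib.parse import urljoin

# Where a page ends up once the DVD is mounted (path from the question).
page_on_dvd = "file:///media/CDRom/xyz.com/index.html"

# An absolute file:// link is resolved from the filesystem root,
# no matter where the page containing it lives.
absolute = urljoin(page_on_dvd, "file:///xyz.com/about.html")

# A document-relative link is resolved against the page's own folder.
relative = urljoin(page_on_dvd, "about.html")

print(absolute)  # file:///xyz.com/about.html
print(relative)  # file:///media/CDRom/xyz.com/about.html
```

The absolute form only works if the archive sits at /xyz.com on the viewing machine, and a DVD mount point is never guaranteed to match that.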

What you want here is “relative” URLs, rather than absolute ones. Each internal link is written according to where the linked page sits in the folder structure relative to the page that contains the link.

For example, a link on the home page (index.html) to example.html in the root folder would look like:

<a href="example.html">Example</a>

But if example.html were in a subfolder called /docs/, then the link would be

<a href="docs/example.html">Example</a>

If you wanted to link from example.html in this example back to the home page, the link would look like this:

<a href="../index.html">Home</a>

(Those two dots and a slash mean “go up one folder to find index.html”.)

And if example.html were in a folder within the /docs/ folder, say /private/, then the link from the home page would look like this:

<a href="docs/private/example.html">Example</a>

And a link from that page back to the index.html page would look like this:

<a href="../../index.html">Home</a>

… that is, “go up two folders to find index.html”.
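
The rule described above can be computed rather than worked out by hand. This is a minimal sketch, assuming both pages are given as paths relative to the site root with forward slashes; relative_href is a hypothetical helper name, not part of any library.

```python
import posixpath


def relative_href(from_page: str, to_page: str) -> str:
    """Return the document-relative link from one page to another.

    Both arguments are paths relative to the site root, e.g. "docs/example.html".
    """
    # Links are resolved against the folder that contains the linking page.
    start = posixpath.dirname(from_page) or "."
    return posixpath.relpath(to_page, start=start)


print(relative_href("index.html", "example.html"))               # example.html
print(relative_href("index.html", "docs/example.html"))          # docs/example.html
print(relative_href("docs/example.html", "index.html"))          # ../index.html
print(relative_href("index.html", "docs/private/example.html"))  # docs/private/example.html
print(relative_href("docs/private/example.html", "index.html"))  # ../../index.html
```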

So, this probably means that you’ll have a lot of work ahead of you to reformat all of the links in the site, page by page, but I’m afraid that’s the only way I know how to do this.

Hope that helps!

Here are some more links that will help to clarify this:


Thanks for the reply. I should have mentioned that relative links were the first thing I tried. They didn’t work.

Try them again. They do work. :slight_smile:

Site-relative paths did not work on my company intranet and they would not work on a DVD.

Document-relative paths are not an option. I work for a large company and there are hundreds of pages involved in this task. I couldn’t possibly go through and calculate the relationship of every link within the site architecture.

I thought perhaps there was another way to accomplish this, with localhost or file:///, but I guess not. :frowning:
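
For what it's worth, the per-link calculation does not have to be done by hand. wget itself has a -k (--convert-links) option that rewrites links for local viewing after a download, which may be worth trying if the site can still be fetched. For files that are already on disk, something like the sketch below could do the rewrite. It assumes the pages are UTF-8, that every internal link starts with “http://xyz.com”, and that a link to a folder means that folder's index.html. The function names are made up for this example.

```python
import os
import posixpath
import re

OLD_PREFIX = "http://xyz.com"  # prefix used by the original site (from the question)

# The prefix, an optional path, then a character that ends the URL.
# The lookahead keeps look-alike hosts such as xyz.com.au from matching.
LINK = re.compile(re.escape(OLD_PREFIX) + r"""(/[^"'\s>#?]*)?(?=["'\s>#?]|$)""")


def relativize(html: str, page_path: str) -> str:
    """Rewrite absolute links in one page into document-relative ones.

    page_path is the page's location relative to the archive root,
    written with forward slashes, e.g. "docs/example.html".
    """
    page_dir = posixpath.dirname(page_path) or "."

    def replace(match: re.Match) -> str:
        target = (match.group(1) or "/").lstrip("/")
        if target == "" or target.endswith("/"):
            target += "index.html"  # assumption: folders are served as index.html
        return posixpath.relpath(target, start=page_dir)

    return LINK.sub(replace, html)


def relativize_tree(root: str) -> None:
    """Rewrite every .html file under root in place (run it on a copy)."""
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(".html"):
                continue
            full = os.path.join(folder, name)
            page_path = os.path.relpath(full, root).replace(os.sep, "/")
            with open(full, encoding="utf-8") as f:  # assumption: UTF-8 pages
                html = f.read()
            with open(full, "w", encoding="utf-8") as f:
                f.write(relativize(html, page_path))


print(relativize('<a href="http://xyz.com/index.html">Home</a>',
                 "docs/private/example.html"))
# <a href="../../index.html">Home</a>
```

Calling relativize_tree("xyz.com") on a copy of the download would process the whole archive. Links built by JavaScript or buried in CSS would need separate handling.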

It’s true that relative paths would be a big job to set up. That’s the only real impediment, though, as they should work well on an intranet and/or DVD.

I know it doesn’t help much at this stage, but this is the reason to build sites with relative paths from the get-go.

Hmmm, I don’t like relative links much, as they’re a lot more work. Most sites don’t have to be translated like this … though, of course, if there’s a likelihood of that ever happening, relative links are better from the outset.

I tend to develop sites on a test domain, so I need relative links to make the move easy once I’ve finished. So within templates I am very much a supporter of relative links, but once the site is live I’m less fussed about the links within the content.

Me too, so I just use root-relative links, rather than include the whole domain.



That link works from any page.

Unfortunately, links like that are no good on your desktop or on a DVD.
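
A quick way to see the difference is to resolve the same root-relative link against a page on a web server and against the same page on a DVD. This sketch uses Python's urljoin; the DVD path is the mount point mentioned in the question, and the page names come from the earlier examples.

```python
from urllib.parse import urljoin

link = "/index.html"  # root-relative: starts at the top of whatever "site" the page is on

# On a web server the root is the domain, so the link lands where intended.
on_server = urljoin("http://xyz.com/docs/example.html", link)

# On disk the root is the filesystem root, not the archive folder.
on_dvd = urljoin("file:///media/CDRom/xyz.com/docs/example.html", link)

print(on_server)  # http://xyz.com/index.html
print(on_dvd)     # file:///index.html
```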

Oops. Yeah sorry that is what I meant.
I didn’t know they didn’t work properly on DVDs - my mistake.

So can y’all come to Raleigh and tell my boss that relative links won’t work on a DVD??

I’ll treat for Carolina-style BBQ and beers. What do you say?? Melbourne is right around the corner, no? :wink:

Maybe you could package a lightweight server(s) on the DVD so it won’t depend on the OS?
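
As a rough idea of what that could look like, here is a sketch using the web server from Python's standard library. It assumes Python 3.7 or later is already installed on the viewing machine, which is exactly the kind of dependency you would be trying to avoid, so treat it as an illustration rather than a packaging solution. serve_archive is a made-up name.

```python
import functools
import http.server
import threading


def serve_archive(root: str, port: int = 8000) -> http.server.ThreadingHTTPServer:
    """Serve the folder `root` at http://127.0.0.1:<port>/ in a background thread."""
    # SimpleHTTPRequestHandler accepts a `directory` argument since Python 3.7.
    handler = functools.partial(http.server.SimpleHTTPRequestHandler, directory=root)
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

With the archive served this way, root-relative links resolve against http://127.0.0.1:8000/ instead of the filesystem root. The caller has to keep the process alive while the site is being browsed, and can call shutdown() on the returned server when done.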

I’m curious how you get this done now, as I can see it coming up in the future.

Have you looked at HTTrack or SpiderZilla (https://addons.mozilla.org/en-US/firefox/addon/spiderzilla/)?
I’ve seen them come up in discussions, and they seem to do all the link fixing for you…

I’ll come and tell him they DO work, but that he’d have to pay you lots to set it up. :lol:

But HTTrack does look interesting, as masm50 said.