Importing Content into Drupal

One of the most painful steps in any web system is importing content, and Drupal is no exception. It is something that I have always dreaded. Images of custom CSV scripts, line-by-line parsing of data and awkward SQL commands come to mind. Yuck!

That is all in the past. I am happy now. I recently discovered a module that is a crown jewel in the Drupal Contrib community — Feeds.

Feeds Module

The Feeds module is useful for many things (RSS/ATOM/OPML aggregation and importing, PubSubHubbub support, CSV importing and more), but for now I will focus only on the CSV import functions. It is quite straightforward to set up.

The very first step is to install the module into your Drupal site. In the upcoming Drupal 7 release this will be as easy as clicking, since Drupal 7 supports web-based installation of modules. For Drupal 6 we don’t have such luxuries; it is manual all the way. So first, download the module. A quick note: the module does not yet have an official release (it is at 6.x-1.0-alpha14), but it works very well. Once downloaded, unpack it and move or upload it to your /sites/all/modules directory. Then visit your Admin module area (/admin/build/modules/) to enable it.

Now it is installed, and that was the easy part! There is a bit of clicking ahead, but trust me, it is worth the effort.

First, quickly create a new Feed Importer (/admin/build/feeds/create) and give it a name (this will become the URL to access the importer). Now there are a number of steps which are highlighted with the following image:

Steps to CSV importing

  1. Basic Settings: Set ‘Attach to content type’ to ‘stand alone form’ and ‘minimum refresh’ to ‘never’.
  2. Fetcher: Set this either to direct file upload or HTTP retrieval.
  3. Parser: Choose CSV parser.
  4. Processor: Choose Node processor.
  5. Node Processor Settings: Choose the content type you are importing into. You can optionally have existing nodes updated, and choose whether the imported content should expire.
  6. Create Nodes Settings: Here is where you map the CSV column header to the field name of the Node.
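
Since step 6 maps CSV column headers to node fields, the file needs a header row whose names match what you enter in the mapping. Here is a minimal Python sketch of producing such a file. The column names (title, body, author) and the sample rows are hypothetical; substitute the headers you actually map.

```python
import csv
import io

# Hypothetical column names; use whatever headers you map in step 6.
COLUMNS = ["title", "body", "author"]


def build_csv(rows, columns=COLUMNS):
    """Return CSV text: one header row, then one line per row dict."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue()


# Made-up sample content, including a comma and quotes that need escaping.
sample_rows = [
    {"title": "First post", "body": "Hello, world", "author": "david"},
    {"title": "Second post", "body": 'He said "hi"', "author": "andrew"},
]

csv_text = build_csv(sample_rows)
print(csv_text)
```

Letting the csv module handle the quoting means commas and quotation marks inside the body text do not break the columns. Save the text to a .csv file and it is ready for the importer.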

That is it for setting up the importing process. It takes a number of steps, but they really are quite simple. Now you need to actually import the content. Visit the importing page (/import) and click on the name of the importer you created above. Upload the CSV file and the Feeds module will immediately start creating your nodes. If you made a mistake in any of your mappings, you can easily delete the last imported content and start again. This has saved me a number of times while testing.
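
Some mistakes can be caught before uploading at all. The following Python sketch is not part of Feeds; it is just a hypothetical pre-flight check that confirms the header row contains the columns you intend to map and that every record has the same number of fields as the header.

```python
import csv
import io


def check_csv(text, expected_columns):
    """Return a list of problems; an empty list means the file looks consistent."""
    reader = csv.reader(io.StringIO(text, newline=""))
    header = next(reader, None)
    if header is None:
        return ["file is empty"]

    problems = []
    missing = [name for name in expected_columns if name not in header]
    if missing:
        problems.append("missing columns: " + ", ".join(missing))

    # Records are counted from 2 because the header is record 1.
    for record_no, row in enumerate(reader, start=2):
        if not row:
            continue  # ignore completely blank lines
        if len(row) != len(header):
            problems.append(
                f"record {record_no}: expected {len(header)} fields, found {len(row)}"
            )
    return problems


# Made-up example: the second record is missing a field.
sample = "title,body\nFirst post,Hello\nSecond post\n"
for problem in check_csv(sample, ["title", "body"]):
    print(problem)
```

An empty result does not guarantee the import will map correctly, but it rules out the most common spreadsheet export slips.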

This only scratches the surface of what this module can do, so I encourage you to try it out.

  • http://www.clearwind.nl peach

    To be honest I haven’t tried the Feeds module yet, but I’ve always used this module for imports:
    http://drupal.org/project/node_import
    It supports many CCK field types; nowadays it even supports image field, which has always been tricky.

    And also the new kid on the block, which has gained a lot of credit:

    http://drupal.org/project/backup_migrate

  • koyama

    I was astonished some time ago that there were no really good tools for importing content from an old site into Drupal.
    Accidentally, I discovered the impressive “migrate” module by mikeryan and moshe weitzman which came to the rescue for a large part.
    http://drupal.org/project/migrate
    Will definitely check out the “feeds” module to see what it offers.
    @peach: Did you mean “migrate” rather than “backup_migrate”?

  • http://www.lunadesign.org awasson

    David,
    Thank you for the article about Drupal. Currently I’ve got several (six) fairly large Drupal sites nearing completion and once finished, I’m planning to contribute a few articles to SP on Drupal because I really think it has been overlooked at the SP community.

    This sounds like it might come in handy but I’m not sure. It seems the websites I’m working on are such drastically different beasts from the original websites that I might not be able to even use it. Lately when I’ve been tasked with creating the architecture for the site, I’ll stub out my pages (blank) and then the copy comes in, in Word or ODF. The only CSV stuff I deal with is contact & membership information which gets handled by the CRM or whatever member system we build to handle it. On the rare occasion that I can use original copy, I’ll scrape it off, cleanse it and paste it in the new site which isn’t as painful as it sounds as it’s usually limited to less than 100 pages. I can manage transferring the raw content within an hour or two.

    Anyway, thanks for contributing a Drupal article and I’ll definitely have to have a look at this module.
    Cheers,
    Andrew

  • mmatsoo

    Two questions spring to mind:
    Is there taxonomy support to categorise/organise the imported content?
    Is there some kind of filter to clean the content of “unnecessary” HTML?
    I agree with Andrew that often the original content isn’t exactly what you want in the new site anyway. At least from the perspective that originally a lot of it was probably copy&pasted from Word and there can be tons of that msoNormal and inline styling that shouldn’t be there.
    I’ll have to give the module a look.

  • Dan Carr

    I too support using Node Import: http://drupal.org/project/node_import

    Like “peach” said, the image import and cck import features are great and work well. I’ve also managed to get node_import working with importing product to my ubercart drupal site that I’m developing.

    I’ve used feeds a little bit before, but just for populating my twitter feed through the rss link.

  • http://davidseth.net/ David Peterson

    @peach – Feeds also supports all CCK fields, but I am not sure how it specifically handles the CCK Image Field.

    @koyama Yes, the Migrate module is great, especially if you tie it together with the Table Wizard module. An excellent post on importing data with this combination can be found at: http://www.lullabot.com/articles/drupal-data-imports-migrate-and-table-wizard

    @awasson Copying and pasting into Drupal can work, but it is still slow. You might try creating a quick Google Spreadsheet with a few headers and copy/paste your data into that. Then save it as a CSV and whack that into Feeds and it will create your nodes in a flash. I look forward to your Drupal contributions!

    @mmatsoo Yes, Feeds supports taxonomy importing. I don’t believe there is any built-in filtering, but it is *very* easy to create a filter. Of course, you do need some Drupal dev experience.

    @Dan Carr I have created a small (tiny really) module that adds the functionality of importing CSV data into Ubercart products, but out of the box it is not supported.

  • http://www.lunadesign.org awasson

    Thanks David,
    I’ll try that technique (spreadsheet + Feeds) with my next site. I can see how a little setup would make it ultra quick to whack it into Drupal.

  • Nadir H

    Hi,
    Currently it’s only mapping one by one. Is there any way to map the various fields at the same time?

    Thanks for the help.