  1. #1
    SitePoint Guru
    Join Date
    May 2003
    Location
    virginia
    Posts
    988

    Flat File or Database

    Hi,

    I'm working on a small cms. Originally I thought I'd just store everything in files and take advantage of the fact that a filesystem is tree based. The content would be stored in the same location as the url. A folder is a page, and the files are the properties. If there is a sub-folder, it's a sub-page.

    It seems very logical that the filesystem would hold this info. But there are, of course, disadvantages. One that comes to mind is limited search capability compared to an RDB.

    Also, one of the goals at first was to create something that didn't need a database.

    Creating an application to handle a tree structure in a database seems overkill when the file system just does it. Am I wrong?

    Any feedback?

    - matt
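
    A minimal sketch of the layout described above, assuming each property lives in a `.txt` file and each sub-folder is a sub-page. The function name `loadPage()` and the `.txt` convention are assumptions made for illustration, not part of any existing code.

```php
<?php
// Hypothetical helper: read one "page" from a folder.
// Assumes one .txt file per property and one sub-folder per sub-page.
function loadPage(string $dir): array
{
    $page = ['properties' => [], 'children' => []];
    foreach (scandir($dir) as $entry) {
        if ($entry === '.' || $entry === '..') {
            continue;
        }
        $path = $dir . DIRECTORY_SEPARATOR . $entry;
        if (is_dir($path)) {
            // A sub-folder is a sub-page: recurse into it.
            $page['children'][$entry] = loadPage($path);
        } elseif (pathinfo($entry, PATHINFO_EXTENSION) === 'txt') {
            // title.txt becomes the "title" property.
            $name = pathinfo($entry, PATHINFO_FILENAME);
            $page['properties'][$name] = file_get_contents($path);
        }
    }
    return $page;
}
```

    Calling `loadPage('/data')` would return the whole tree as nested arrays, so the URL path and the storage path stay the same thing.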

  2. #2
    Non-Member
    Join Date
    Jan 2003
    Posts
    5,748
    Quote Originally Posted by mwmitchell
    Originally I thought I'd just store everything in files and take advantage of the fact that a filesystem is tree based. The content would be stored in the same location as the url. A folder is a page, and the files are the properties. If there is a sub-folder, it's a sub-page.
    This is how I do it at the moment actually, and it works without any real trouble. There are scalability issues in some regards, I suppose, but for the type of work that I have to do, I don't face those issues...

    I know what to expect, so I work within those limits. That said, provided that your hierarchical structure is loosely coupled, I suspect you could implement the alternative without too much trouble. You would still need to decide which approach to use (Adjacency List or Nested Sets, for example, or another option).

    But going by your original specification (a small content management system), I would start with the file system until you get something concrete working, just to make sure the premise holds. On the face of it, provided the implementation is designed and scripted well enough, there is little, if any, difficulty in making the switch from a file system hierarchy to a database hierarchy...

    I use the same base interfaces for the file system, database and XML structures, which are all hierarchical in nature, you see?
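
    To illustrate the idea of one interface over several hierarchical stores, here is a rough sketch. The names `HierarchyStore`, `FileHierarchy` and `AdjacencyListHierarchy` are hypothetical, and a plain array stands in for the database table that an Adjacency List would normally live in.

```php
<?php
// Hypothetical interface; the names are illustrative only.
interface HierarchyStore
{
    /** @return string[] names of the direct children of $path */
    public function children(string $path): array;
}

// Backend 1: directories on disk are the nodes.
class FileHierarchy implements HierarchyStore
{
    private $root;

    public function __construct(string $root)
    {
        $this->root = rtrim($root, '/');
    }

    public function children(string $path): array
    {
        $dir = $this->root . '/' . trim($path, '/');
        $names = [];
        foreach (scandir($dir) as $entry) {
            if ($entry !== '.' && $entry !== '..' && is_dir($dir . '/' . $entry)) {
                $names[] = $entry;
            }
        }
        return $names;
    }
}

// Backend 2: an adjacency list, where each row stores its parent path.
// An array stands in for the table here (an assumption for brevity).
class AdjacencyListHierarchy implements HierarchyStore
{
    private $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    public function children(string $path): array
    {
        $parent = trim($path, '/');
        $names = [];
        foreach ($this->rows as $row) {
            if ($row['parent'] === $parent) {
                $names[] = $row['name'];
            }
        }
        return $names;
    }
}
```

    Code that only depends on `HierarchyStore` does not need to know which backend it is talking to, which is what makes a later switch cheap.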

  3. #3
    SitePoint Guru
    Join Date
    May 2003
    Location
    virginia
    Posts
    988
    That's what I was thinking. If I use a factory and common interface for all data objects, then switching from db to file etc shouldn't be hard. What about searching? Do you just provide query builder methods?:

    $pages = Page::findAll($where = 'title', $is_like = 'Company');

    ?

    Thanks!

    Matt

    ps, want to give me more details on what you are doing?
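
    For the file-backed case, a finder like the one asked about above could look roughly like this. It is only a sketch: `findAllPages()` is a hypothetical name, it assumes one folder per page with one `.txt` file per property, and it does a plain substring match rather than anything index-based.

```php
<?php
// Hypothetical file-backed finder in the spirit of Page::findAll().
// Assumes the one-folder-per-page, one-.txt-file-per-property layout.
function findAllPages(string $root, string $property, string $needle): array
{
    $matches = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        // Only look at files named after the requested property.
        if ($file->getFilename() !== $property . '.txt') {
            continue;
        }
        // Case-insensitive substring match, roughly SQL's LIKE '%needle%'.
        if (stripos(file_get_contents($file->getPathname()), $needle) !== false) {
            $matches[] = $file->getPath();
        }
    }
    sort($matches);
    return $matches;
}
```

    Every call reads every matching property file, so this only stays pleasant while the site is small.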
    Last edited by mwmitchell; Aug 24, 2005 at 17:34.

  4. #4
    SitePoint Addict timvw's Avatar
    Join Date
    Jan 2005
    Location
    Belgium
    Posts
    354
    In my interface it would be like:

    PHP Code:
    $model->setFilter('title LIKE ? AND city=?', array('company%', 'chicago'));
    $data = $model->get();

  5. #5
    SitePoint Enthusiast
    Join Date
    Oct 2003
    Location
    norway
    Posts
    92
    @Dr Livingston: "provided that your hierarchical structure is loosely coupled", okay..? And since when is a database "hierarchical in nature"?

    mwmitchell, using files like this works fine for small things. But are you saying the pages shouldn't have anything dynamic in them? Dynamic things are a lot easier with a fast database.

    Anyway, one quick fix for your search problem is to store all the data a second time in a database (keeping it in sync on inserts, updates and deletes), stripping out the HTML and so on as you go. Each entry would have a link associated with it that says which page it belongs to. All you need is two fields: the text, and the link to the page it represents. Of course, you could put all this in a file (eew) instead if you really can't use a database.
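
    A rough sketch of that two-field index, kept in an array and written to a single file for the no-database case. All function names here are hypothetical, and JSON is just one convenient file format, not something the approach requires.

```php
<?php
// Hypothetical index: one row per page with two fields,
// the tag-stripped text and the link to the page it represents.
function indexPage(array &$index, string $link, string $html): void
{
    // Strip markup and collapse whitespace so only searchable text remains.
    $text = trim(preg_replace('/\s+/', ' ', strip_tags($html)));
    $index[$link] = ['text' => $text, 'link' => $link];
}

// Call this when a page is deleted so the index stays in sync.
function removePage(array &$index, string $link): void
{
    unset($index[$link]);
}

// Case-insensitive substring search over the stripped text.
function searchIndex(array $index, string $term): array
{
    $links = [];
    foreach ($index as $row) {
        if (stripos($row['text'], $term) !== false) {
            $links[] = $row['link'];
        }
    }
    return $links;
}

// Fallback when no database is available: keep the index in one file.
function saveIndex(array $index, string $file): void
{
    file_put_contents($file, json_encode(array_values($index)), LOCK_EX);
}
```

    Calling `indexPage()` again for the same link overwrites the old row, which covers the update case.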

  6. #6
    SitePoint Guru BerislavLopac's Avatar
    Join Date
    Sep 2004
    Location
    Zagreb, Croatia
    Posts
    830
    Quote Originally Posted by Aphenitry
    ...if you really can't use a database.
    With PHP5, this excuse is about to become extinct.

  7. #7
    SitePoint Enthusiast
    Join Date
    Mar 2005
    Posts
    64
    Since you said files, I'd just be careful that you don't become a victim of success and start suffering once there are thousands of files that the kernel has to navigate. This won't bother MySQL, but a few thousand files will start bogging down the kernel.

  8. #8
    SitePoint Zealot
    Join Date
    Mar 2004
    Location
    Australia
    Posts
    101
    You may need to lock the files during writes, otherwise they can be corrupted. This comes free with a database. If possible, maybe try SQLite, which is a relatively simple database.
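
    A minimal sketch of locked writes and reads using PHP's `flock()`. The helper names `writeLocked()` and `readLocked()` are hypothetical. Note that `flock()` is advisory: it only protects against other code that also asks for the lock.

```php
<?php
// Hypothetical helper: write a property file under an exclusive lock.
function writeLocked(string $path, string $contents): bool
{
    // 'c' opens for writing without truncating, so the file is not
    // emptied before we actually hold the lock.
    $handle = fopen($path, 'c');
    if ($handle === false) {
        return false;
    }
    $ok = false;
    if (flock($handle, LOCK_EX)) {      // block until we own the file
        ftruncate($handle, 0);
        $ok = fwrite($handle, $contents) !== false;
        fflush($handle);                // push data out before unlocking
        flock($handle, LOCK_UN);
    }
    fclose($handle);
    return $ok;
}

// Readers take a shared lock so they do not read a half-written file.
function readLocked(string $path): ?string
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return null;
    }
    $data = null;
    if (flock($handle, LOCK_SH)) {
        $data = stream_get_contents($handle);
        flock($handle, LOCK_UN);
    }
    fclose($handle);
    return $data;
}
```

    For simple cases, `file_put_contents($path, $contents, LOCK_EX)` takes the same exclusive lock in one call.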

  9. #9
    SitePoint Guru
    Join Date
    May 2003
    Location
    virginia
    Posts
    988
    Quote Originally Posted by Aphenitry
    @Dr Livingston: "provided that your hierarchical structure is loosely coupled", okay..? And since when is a database "hierarchical in nature"?

    mwmitchell, using files like this works fine for small things. But are you saying the pages shouldn't have anything dynamic in them? Dynamic things are a lot easier with a fast database.
    Hi, there will be dynamic data and it'll be using a database if one is available. But for pages and templates etc, it'll probably be file system.

    Quote Originally Posted by Aphenitry
    Anyway, one quick fix for your search problem is to store all the data a second time in a database (keeping it in sync on inserts, updates and deletes), stripping out the HTML and so on as you go. Each entry would have a link associated with it that says which page it belongs to. All you need is two fields: the text, and the link to the page it represents. Of course, you could put all this in a file (eew) instead if you really can't use a database.
    That's a great idea, and I was thinking about something very similar. But I hadn't thought about putting it in a file if a DB was not available.

    Thanks!

  10. #10
    SitePoint Guru
    Join Date
    May 2003
    Location
    virginia
    Posts
    988
    Quote Originally Posted by BerislavLopac
    With PHP5, this excuse is about to become extinct.
    Yes, can't wait. MySQLite is so tempting for this project. It fits the bill perfectly. But we can't do PHP5!

  11. #11
    SitePoint Guru
    Join Date
    May 2003
    Location
    virginia
    Posts
    988
    Quote Originally Posted by river_jetties
    Since you said files, I'd just be careful that you don't become a victim of success and start suffering once there are thousands of files that the kernel has to navigate. This won't bother MySQL, but a few thousand files will start bogging down the kernel.
    Can you tell me what you mean by success/start? You mean opening and closing files? Or just reading?

  12. #12
    SitePoint Guru
    Join Date
    May 2003
    Location
    virginia
    Posts
    988
    Quote Originally Posted by wei
    You may need to lock the files during writes, otherwise they can be corrupted. This comes free with a database. If possible, maybe try SQLite, which is a relatively simple database.
    Thanks, haven't implemented that yet but will now.

  13. #13
    SitePoint Enthusiast
    Join Date
    Mar 2005
    Posts
    64
    Quote Originally Posted by mwmitchell
    Can you tell me what you mean by success/start? You mean opening and closing files? Or just reading?

    Depending on a lot of different parameters, if you put 10,000 text files in a directory and start doing things like grepping through all of them as a search, you will run into scalability problems pretty fast.

    Even doing simple includes will start to get taxing.

  14. #14
    SitePoint Guru BerislavLopac's Avatar
    Join Date
    Sep 2004
    Location
    Zagreb, Croatia
    Posts
    830
    Quote Originally Posted by mwmitchell
    Yes, can't wait. MySQLite is so tempting for this project. It fits the bill perfectly. But we can't do PHP5!
    Just a note: Don't get confused, it's not MySQLite (as in "a lite version of MySQL"). SQLite is a completely different database from MySQL -- all they have in common is that they both recognize a subset of the SQL standard.

  15. #15
    SitePoint Guru
    Join Date
    May 2003
    Location
    virginia
    Posts
    988
    Right. Something I guess I really didn't realize. Maybe I need to start using PHP5 now?

  16. #16
    SitePoint Addict
    Join Date
    May 2005
    Posts
    255
    Quote Originally Posted by mwmitchell
    Right. Something I guess I really didn't realize. Maybe I need to start using PHP5 now?
    I can think of plenty of better reasons why you should start using PHP5 now, but it's a start ;-).

  17. #17
    SitePoint Wizard dreamscape's Avatar
    Join Date
    Aug 2005
    Posts
    1,080
    uhhhhhhhhh...... in case you didn't know, you don't need PHP 5 for SQLite.

  18. #18
    SitePoint Guru
    Join Date
    May 2003
    Location
    virginia
    Posts
    988
    Dr Livingston...

    You said that you use the same approach. What happens if you change the name of a property? Do you go through all of the directories and change the file names to match the new property name? Example:

    name (old)
    title (new)

    /data/about/name.txt (old)
    /data/about/title.txt (new)

    ?

    Matt

  19. #19
    Non-Member
    Join Date
    Jan 2003
    Posts
    5,748
    Nope

    What I do is replace the file if anything changes. I work with two directories, a source and a target: I drop (delete) the file to be replaced from the target directory, then move the new replacement file from the source directory into the target.

    That way you don't have the problem of renaming files (and therefore of maintaining the changes, and all that implies). To me, it made more sense just to start all over again.

  20. #20
    Non-Member
    Join Date
    Jan 2003
    Posts
    5,748
    All you need to do is feed the source and target directory pathnames into your common-or-garden file recursion method, along with the directory pathname of the new file.

    Once you recurse to the given directory, you drop one file and move the other.
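
    The drop-and-move step could be sketched like this. `replaceFile()` is a hypothetical name, and the sketch skips the recursion by assuming the caller already knows the file's path relative to both directories.

```php
<?php
// Hypothetical helper for the drop-and-move step. $relative is the path of
// the file below both directories, e.g. "about/title.txt".
function replaceFile(string $sourceDir, string $targetDir, string $relative): bool
{
    $from = rtrim($sourceDir, '/') . '/' . $relative;
    $to   = rtrim($targetDir, '/') . '/' . $relative;
    if (!is_file($from)) {
        return false;                       // nothing to move
    }
    if (!is_dir(dirname($to))) {
        mkdir(dirname($to), 0777, true);    // create missing parent folders
    }
    if (is_file($to)) {
        unlink($to);                        // drop the file being replaced
    }
    return rename($from, $to);              // move the replacement into place
}
```

    After a successful call the file no longer exists in the source directory, so the source acts as a staging area.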

  21. #21
    SitePoint Wizard
    Join Date
    Jan 2004
    Location
    3rd rock from the sun
    Posts
    1,005
    If you are looking for a pragmatic solution and search is the problem, consider sending a SOAP request to Google. It searches <whatever domain> via Google and outputs the results into one of your own pages. Just get an API key. After that there is some work to do on paging (the <next 10> kind of thing). I know Google can pull the rug out from under it whenever they want, but I used it for nearly 2 years with no problems.

  22. #22
    SitePoint Guru
    Join Date
    May 2003
    Location
    virginia
    Posts
    988
    Thanks for your replies, Dr. Would you mind posting some code for me?

    So far what I'm thinking is that each folder will contain a config file that states what is needed for the content. The folder may contain more folders, etc. The config file will be manipulated via the browser. So there is no content in the structure, only pointers to content objects somewhere else. The content objects will be objects like Article, User, Nav etc., and each will have an associated class/object, a template, and attribute objects that will be overridable depending on the context.

    The content objects will have persistable attribute objects (title, body etc.). The content objects know how to manage themselves and will have a common API along with overridable templates. The attribute objects within the content objects will also have a common API and know how to manage themselves. The content objects will manage rules/validation for the attribute objects.

    There will be 2 main states for the content objects: viewing the standard output of the object and viewing the form output. The attribute objects are rendered in the same manner, and the content object takes care of telling the attributes which state they should display.

    Does this seem reasonable so far?
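
    A rough sketch of that design, with one content object and two attribute objects. Everything here is hypothetical: the class names, the `'view'` and `'form'` state strings, and the non-empty validation rule are assumptions chosen only to mirror the description, and templates and persistence are left out.

```php
<?php
// All names here are illustrative; they only mirror the description above.
class Attribute
{
    public $name;
    public $value;

    public function __construct(string $name, string $value = '')
    {
        $this->name = $name;
        $this->value = $value;
    }

    // Common API for every attribute: render for a given state.
    public function render(string $state): string
    {
        $safe = htmlspecialchars($this->value);
        if ($state === 'form') {
            return '<input name="' . $this->name . '" value="' . $safe . '">';
        }
        return '<span class="' . $this->name . '">' . $safe . '</span>';
    }
}

class Article
{
    private $attributes = [];

    public function __construct(string $title, string $body)
    {
        $this->attributes = [
            new Attribute('title', $title),
            new Attribute('body', $body),
        ];
    }

    // The content object owns the rules for its attributes
    // (assumed rule: no attribute may be empty).
    public function isValid(): bool
    {
        foreach ($this->attributes as $attribute) {
            if (trim($attribute->value) === '') {
                return false;
            }
        }
        return true;
    }

    // The content object tells each attribute which state to display.
    public function render(string $state = 'view'): string
    {
        $out = '';
        foreach ($this->attributes as $attribute) {
            $out .= $attribute->render($state) . "\n";
        }
        return $out;
    }
}
```

    Other content types such as User or Nav would follow the same shape with their own attribute lists and rules.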

    Matt

  23. #23
    SitePoint Guru
    Join Date
    May 2003
    Location
    virginia
    Posts
    988
    A little inspiration here:

    http://www.gadgetopia.com/post/4247
    http://www.gadgetopia.com/post/4237

    And EzPublish. But EzPublish is way too slow, and uses a database!

    Matt

  24. #24
    SitePoint Wizard REMIYA's Avatar
    Join Date
    May 2005
    Posts
    1,351
    There are some database classes that implement a tree abstraction approach to relational databases. I have seen some at phpclasses.org.

