SitePoint Sponsor

User Tag List

Results 1 to 5 of 5
  1. #1
    SitePoint Enthusiast
    Join Date
    Jul 1999
    Location
    Jacksonville, FL
    Posts
    28
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Loading Massive Fixed Width File

    Here's the deal. I have a fixed width file with roughly 1500 columns (it's from a vendor). I need to load this file into a database (I'll worry about normalizing later -- I just need to get this puppy into a DB so I can do some analysis). I have the file layout.

    How in the heck can I use the file layout to help me create the table and import the data? I know there must be a way, I just don't have any clue where to start. I have access to MySQL and can script in VBScript if that helps -- but I'm sure there must be an easier way.

    An example of a line from the file layout:
    Code:
    AA	 0025	0130	Organization Name
    0025 represents the start position and 0130 represents the field length.

    Any help would be ENORMOUSLY appreciated!!!

  2. #2
    Drupaler bronze trophy greg.harvey's Avatar
    Join Date
    Jul 2002
    Location
    London, UK
    Posts
    3,258
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Could you not import it in to MS Excel? Then you could spit it out again as a CSV and MySQL will swallow a CSV easily.

  3. #3
    The doctor is in... silver trophy MarcusJT's Avatar
    Join Date
    Jan 2002
    Location
    London
    Posts
    3,509
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Baldrick
    I have a plan so cunning, you could stick a tail on it and call it a weasel!
    Here's how it goes...

    - Create a new table using ADOX
    - Read the layout file and split each line into the four constituent parts using RegEx
    - Assuming no extra carriage returns, the number of lines in the layout file will equal the number of fields (which I will call "N"), so dimension 4 arrays to be of size N-1 (minus one because of the 0-based array index). I'll call these arrays "name array", "length array", and "data array".
    - For each line, use ADOX with the captured values to dynamically add each field to the table with the correct name and length (data types might need some thought, but if they're all text then there's no problem!). At the same time, add the field name and field length to the correct position in the name and length arrays as created above.
    - We should now have an empty table ready to accept data
    - Load the (empty) table into an (updateable) recordset.
    - Open the input data file (use an ADO.Stream object for maximum efficiency) and read a line at a time into a variable.
    - For each line of data, set a position counter (C) to 0, loop from P = 0 to N-1, grab the length of the field (L) from the length array (at position P), read that number of characters from the current position (using MID), and store the substring in the data array (at position P), add L to C, and then use the Recordset.AddNew method to add the data to the table in one fell swoop by using the field name and field data arrays as parameters.
    - At this point we should have reached the end of the data file, so just clean up/close your used objects and you're done!

    Any questions?

    Quote Originally Posted by Blackadder
    Baldrick, you wouldn't know a cunning plan if it danced naked on a table shouting "Cunning plans are here again!"
    Last edited by M@rco; Jun 3, 2003 at 12:42.
    MarcusJT
    - former ASP web developer / former SPF "ASP Guru"
    - *very* old blog with some useful ASP code

    - Please think, Google, and search these forums before posting!

  4. #4
    SitePoint Enthusiast
    Join Date
    Jul 1999
    Location
    Jacksonville, FL
    Posts
    28
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Marcus,

    Thanks -- I was afraid that's what you were going to tell me to do.

    Wish me luck (I'll be wishing for an easier way)!

    Shawn

  5. #5
    The doctor is in... silver trophy MarcusJT's Avatar
    Join Date
    Jan 2002
    Location
    London
    Posts
    3,509
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    It's really not that hard. I could (probably) write it in about an hour, in about 100 lines (or less). It just sounds complicated because it's in English, not code. I've already done what is (arguably) the hardest part for you, so give it a go - we're all here to help.

    PS - Good luck!
    Last edited by M@rco; Jun 4, 2003 at 05:14.
    MarcusJT
    - former ASP web developer / former SPF "ASP Guru"
    - *very* old blog with some useful ASP code

    - Please think, Google, and search these forums before posting!


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •