How to insert array of inputs from a csv into a table

Hi,

I am confused as to how i need to do this. Basically i am inserting the contents of a csv file into a table successfully. But i have the last bit left where i need to also insert a list of “authors”…

I have managed to do it in the back end of the website as the “authors” are just an array of input elements. However i need a bulk method so a large amount of records can be inserted… But if you look at this csv file:

http://www.prima.cse.salford.ac.uk:8080/~ibrarhussain/ConfPaper.csv

You will see “Author” in the end column. How can i run this in a for loop or something on the same lines and insert the author?

My csv code works to insert all but the Author columns as i don’t quite know how to go about doing this…

Can anyone point me in the right direction?

Thanks

Well, as far as my knowledge goes, you won’t be able to avoid an inner loop to deal with the authors. But you could probably save some performance by building a query that will insert all authors in a single query, instead of doing them one by one.

The function you already have should be possible to use as far as I understand… you just need access to it from the current script.

Are you inserting these into a SQL table?

As I understand it, your problem is that there can be many authors associated to any one line in the CSV-file?

As long as you know the number if ‘fields’ before the authors, and there is nothing coming after the authors on the line in the CSV, you should be able to fetch the different ‘fields’ that are not authors, and then be confident that the rest of the ‘fields’ are authors.

At this point you can go about putting them in the database in a few different ways.

The simplest way is to just put all the authors as a string in a single column in the same table as the rest of the CSV-file. This would not be very practical if you wish to search for the author names or edit them later though.

The second option is to put every author name in a new table, together with a number pointing to the ID of the row in which their associated line (article I’m guessing?) is.
In case any one author has written many articles in the CSV-file, then he will appear many times in this second table. That is not optimal, but might be acceptable.

The third solution is the most advanced, but probably the best:

Make one table for the articles, one for the authors, and one for the connections between them.

Each row in the articles table will have a unique ID.
Each row in the authors table will have a unique ID.
In the table for the ‘connections’ you will have one column for author ID and one for article ID.

For every row in the CSV-file, you insert a new row in the articles table, and fetch it’s new ID. Then for every author, you check to see whether he or she is already present in the authors table. If so, get the ID, otherwise, insert the author and get the ID.

Now, insert a new row in the ‘connections’ table, tying the author ID to the article ID.

I hope this helps somewhat, and that I did not misunderstand your question completely… :slight_smile:

Hey,

Thanks for your reply. :wink: I will show you the code i have currently for the csv. I already have 3 separate tables, one for the authors, one for the articles, and one for the connections…

Anyway, this is my csv code which currently inserts everything apart from the authors:


public function InsertCSVFileToDB(){

	$has_title_row = true;
	$not_done = array();

        if(is_uploaded_file($_FILES['csvfile']['tmp_name'])){
            $filename = basename($_FILES['csvfile']['name']);

            if(substr($filename, -3) == 'csv'){
                $tmpfile = $_FILES['csvfile']['tmp_name'];
                if (($fh = fopen($tmpfile, "r")) !== FALSE) {
                    $i = 0;
                    while (($items = fgetcsv($fh, 10000, ",")) !== FALSE) {

                        if($has_title_row === true && $i == 0){ // skip the first row if there is a tile row in CSV file
                            $i++;
                            continue;
                        }

                        $sql = "INSERT INTO ConfPaper SET
                                CPRid = ".$items[0].",
                                Pid = ".$items[1].",
                                CPtitle = '".mysql_real_escape_string($items[2])."',
                                CPabstract = '".mysql_real_escape_string($items[3])."',
                                CPspage = ".mysql_real_escape_string($items[4]).",
                                CPepage = ".mysql_real_escape_string($items[5]).",
                                CPfile = '".mysql_real_escape_string($items[6])."',
                                CPlastedited = now(),
                                CPUid = ".$_SESSION['Uid']."";
                        //die($sql);
                        if(!mysql_query($sql)){
                            $not_done[] = $items;
                        }
                        $i++;
                    }
                }
                // if there are any not done records found:
                if(!empty($not_done)){
                    echo "<strong>There are some records could not be inserted</strong><br />";
                    print_r($not_done);
                }
            }
            else{
                die('Invalid file format uploaded. Please upload CSV.');
            }
        }
        else{
            die('Please upload a CSV file.');
        }
    }

Now this works fine, but i now need to insert the authors. The authors WILL always be the last columns. The way i insert authors from the administration section is with this code:


public function insertAuthor($authArray, $PCorder=0)
    {
        $query = sprintf('SELECT Pid, Pname FROM People WHERE Pname IN(\\'%s\\') ORDER BY Pid ASC', implode('\\',\\'', $authArray));
        $result = mysql_query($query);

        $maxquery = "SELECT MAX(CPid) as max FROM ConfPaper WHERE CPRid = ".$_GET['CPRid'];
        $maxresult = mysql_query($maxquery);
        $max = mysql_fetch_array($maxresult);
        $CPid = $max['max'];

        if($result && mysql_num_rows($result) > 0)
        {
            $sqlValues = array();
            while(list($PId, $PName) = mysql_fetch_row($result))
            {
                if(in_array($PName, $authArray))
                {
                    $sqlValues[] = sprintf("(%d, 1, ".$CPid.", %d, now(), 0)", $PId
                                                               , $PCorder++ );
                    // Author already exists within the Pname table
                    // remove user from $authArray
                    $key = array_search($PName, $authArray);
                    unset($authArray[$key]);
                }
            }
        $sql  = "INSERT INTO PeopleCon(Person_id, PCHid, Pid, PCorder, PCdateadded, PCdeleted) VALUES \
";
        $sql .= implode(",\
", $sqlValues);
        $result = mysql_query($sql);
        }
        // If there are Authors left within the $authArray
        // Add them to the Pname table
        if(count($authArray) > 0)
        {
            People::insertPersons($authArray); // call insertPersons method for remaining authors
            $this->insertAuthor($authArray, $PCorder); // insert the remaining auhtors into PeopleCon
        }
    }

This checks to see if the Authors name already exists (Only a small number of people will be using the site), if the author exists it will insert the author ID into PeopleCon, if not, it inserts the new author into the People table AND also into the PeopleCon. So i just need to use this method and apply it to the CSV file upload… I don’t think i can call the same method can i?

I’m pretty sure i would need to use it within the csv insertion…

Can you help?

Thanks