Hi,

For those of you that have built bulk importing functionality into applicaitons, do you normally report to users non-mandatory but important data that is missing? If so, do you report the specific fields that are missing data or do you simply query the database for only those records that are missing important data?

I have import functionality that I have built in an individual incomplete data filter that inserts each missing data column per imported user into a flag_this_new_incomplete_data table, but my concern is that when using this filter causes the import time to triple, where the time is not prohibitive just not as responsive than without the filter. The filter checks each row of imported data after each column of data is inserted into their respective relational tables so this method:
PHP Code:
        $this->o_LeadManage->group_id $this->checkType($this->group_id);
        if(empty(
$this->o_LeadManage->group_id)){
          
$primary_incomplete_data '';
          
$primary_incomplete_data "$this->first_name $this->last_name";
          
$primary_incomplete_data 'Lead: ' .$primary_incomplete_data ' group';
          
$this->incompleteData('Should this lead be assigned to a group(s)?'$primary_incomplete_data, array('group'));
        } 
And the method:
PHP Code:
  protected function incompleteData($name$primary_data$incomplete_column_names){
      if(!
is_array($incomplete_column_names)){
        throw new 
exception('Third argument needs to be an array, 'typeof($incomplete_column_names) . ' given.');
      }
      if(empty(
$primary_data)){
        throw new 
exception('Second argument was empty you need to supply a string. Typically 
            this is the most important data in a database row. For example, if the table was 
            postal_codes then a good fields primary data would be that of postal_code'
);
      }
      
$col_count count($incomplete_column_names);
      if(
$col_count 30){
        throw new 
exception('You have exceeded 30 incomplete columns. Maximum columns allowed are 30.');
      }
      
$sql="
        INSERT INTO flag_this_new_incomplete_data
        (
          primary_data_field
          ,name
          ,col1 
          ,col2 
          ,col3 
          ,col4 
          ,col5 
          ,col6 
          ,col7 
          ,col8 
          ,col9 
          ,col10 
          ,col11 
          ,col12 
          ,col13 
          ,col14 
          ,col15 
          ,col16 
          ,col17 
          ,col18 
          ,col19 
          ,col20 
          ,col21 
          ,col22 
          ,col23 
          ,col24 
          ,col25 
          ,col26 
          ,col27 
          ,col28 
          ,col29 
          ,col30
        ) VALUES ( 
          :primary_data
          , :name
          
      "
;
      for(
$i=0$i<= 29$i++){
        if(!empty(
$incomplete_column_names[$i])){
          
$sql .= ", '" $incomplete_column_names[$i] ."'
          "
;
        } else {
          
$sql .= ", NULL
          "
;
        }
      }
      
$sql .= " ) ON DUPLICATE KEY UPDATE primary_data_field = :primary_data";
      
$stmt $this->o_Db->prepare($sql);
      
$stmt->bindParam(':primary_data'$primary_data);
      
$stmt->bindParam(':name'$name);
      
$result $stmt->execute();
    } 
This generated results like:

User: Gloria Jones, Generally for mail, email and phone, first name is important. Should this lead have their first name?
User: Gordon Mathes , For reporting purposes is having this leads gender set important?


The alternatives to this approach is to rely more on running an SQL SELECT to find all records that have missing x, y, z however this approach does not leaned itself to getting the more detailed 'missing data' results like the current filter; however it is faster.

The other alternative is to not report any missing data and let every user be updated when it is decided that they are missing information.

Your thoughts on how you handle this type of information is important.

Steve