Naturally the real csv file has many more columns than this, I just wondered if there was an easy way of doing it, rather than re-assigning the values by hand?
Sometimes there is a clash between SRP, JFDI and the clock and sometimes the wrong one wins, and I know it and I feel bad but at least I know why and I don’t care - so shoot me.
I have never needed to do this to a CSV file before; in the unlikely event that I ever do, I will go back and refactor.
(just seeing that makes me more inclined to go back NOW… the force … is strong … I feel weak …I can resist…)
I mean I just know, know, know in my bones that I should be unit testing this too, but it has to get out the door and onto a website like, yesterday.
ps Thanks to all for your input on this one by the way, really interesting list of solutions.
I still think additional filtering should be separate from the object that composes the associative entry.
This particular method (or rather any method) should ‘do one thing and one thing only’ to quote Uncle Bob. In this case, it’s determining whether or not the entry is acceptable for the associative array.
It’s a code smell, definitely.
How would you change that filter? You have to alter the object internals, and well, you know how that goes.
Create a filter class for each filter criterion, which sounds kinda obvious when you say it like that, from an OO POV.
I’ll probably abstract away the hard coded match between item 14 and ABC, but am I accessing the 15th array item in the best way, or is there something obvious I missed?
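One possible way to abstract that hard-coded match, sketched under my own assumptions (the RowFilter name and the predicate are made up for illustration, not anything from the actual class), is to inject the test as a callable so the column index and value are no longer baked into the method:

```php
<?php
// Hypothetical sketch: the hard-coded "item 14 must equal ABC" test
// becomes an injected predicate, so swapping the filter no longer
// means editing the object's internals.
class RowFilter
{
    private $predicate;

    public function __construct(callable $predicate)
    {
        $this->predicate = $predicate;
    }

    public function accepts(array $row)
    {
        return call_user_func($this->predicate, $row);
    }
}

// The original rule, now just one predicate among many.
$filter = new RowFilter(function (array $row) {
    return isset($row[14]) && $row[14] === 'ABC';
});

var_dump($filter->accepts(array_fill(0, 15, 'ABC'))); // bool(true)
```

As for the access itself, `$row[14]` with an `isset()` guard is about as direct as it gets for the 15th element of a zero-indexed array.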
There’s nothing stopping the class from being rewritten to accept an SplFileObject in the constructor, but then you’d have to keep track of whether we’re looking at a CSV file and behave accordingly, since pretty much everything else in there is tailored towards this particular, individual use case. That’s a detail for Cups to ponder, to see which he prefers.
I was thinking of doing something similar, but didn’t like the idea of creating the CSV and flags internally. However, the combination of the LimitIterator and FilterIterator is something I hadn’t considered.
I like that a lot.
Thanks again.
Anthony.
ps. If you weren’t so awesome, I think you’d be my current nemesis.
For what it’s worth, a minimalist version of Anthony’s AssociativeSplCsvFileObjectIterator might be of use. It uses a FilterIterator (to accept only rows with count(headers) fields), which extends IteratorIterator (which handles the iteration of our SplFileObject without us having to redefine the basic iteration methods).
(The class name is just poking fun, probably best not to keep it for yourself.)
class MinimalistFilterIteratorVersionOfAnthonysAssociativeSplCsvFileObjectIterator extends FilterIterator
{
    protected $_headers, $_count;

    public function __construct($path)
    {
        // Build CSV file iterator
        $csv = new SplFileObject($path, 'r');
        $csv->setFlags(SplFileObject::READ_CSV);

        // Remember column names and count
        $this->_headers = $csv->current();
        $this->_count = count($this->_headers);

        // LimitIterator allows us to always skip the headers
        parent::__construct(new LimitIterator($csv, 1));
    }

    // Returns a nice assoc. array
    public function current()
    {
        return array_combine($this->_headers, parent::current());
    }

    // Skip this line if it does not have $_count number of fields
    public function accept()
    {
        return $this->_count === count(parent::current());
    }
}

$csv = new MinimalistFilterIteratorVersionOfAnthonysAssociativeSplCsvFileObjectIterator('data.csv');
foreach ($csv as $row) {
    print_r($row);
}
I did not really go any further down the page, but revisiting it I just noticed that there are some refs to the SPL, but they just use fgetcsv rather than an SplFileObject.
It would have been interesting if there were some test cases for all this stuff.
That parsecsv class seems to want to emulate really simple sql statements, and may have been born from frustrations with formatting and encoding - and just seemed overkill to me.
I thought I’d mention it in case anyone else was searching on this subject.
Whoa, 3 very different answers there, thanks people.
@Anthony I was hoping there was an SPL -based solution after seeing your pastebin-housed SPL experiment a few weeks ago.
The initial target block of code you set works fine and seems to handle empty values as expected.
As I have a series of operations to do, that target code will allow me to quickly extract the initial values from a large csv file, to create the smaller one.
I might as well explain the whole thing.
1. I wget a largish CSV file nightly (~4k rows) and cache it.
2. I extract only those rows of interest to me, around 100 rows.
3. Some of those text fields will then be checked for “transformations”, i.e. turning an email address into an <a></a> link etc., and squirted into a template.
4. The result is then duly cached for 24 hrs.
So you see, between steps 2 and 3 it’ll make much more sense for me, or anyone else using the data to be able to reference the array-elements by name.
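A rough sketch of one such step-3 transformation, assuming the row is already keyed by column name by that point (the 'email' key and the markup are my guesses, not Cups’s actual code):

```php
<?php
// Hypothetical transformation: turn a plain email field into a
// mailto link, leaving anything that doesn't validate untouched.
function linkifyEmail(array $row)
{
    if (isset($row['email']) && filter_var($row['email'], FILTER_VALIDATE_EMAIL)) {
        $row['email'] = sprintf(
            '<a href="mailto:%1$s">%1$s</a>',
            htmlspecialchars($row['email'], ENT_QUOTES)
        );
    }
    return $row;
}

$row = linkifyEmail(array('name' => 'Cups', 'email' => 'cups@example.com'));
echo $row['email']; // <a href="mailto:cups@example.com">cups@example.com</a>
```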
Regarding the filtering, see FilterIterator, use it to wrap AssociativeSplCsvFileObjectIterator and only return the rows meeting the criteria you set within FilterIterator::accept().
Other than that, AssociativeSplCsvFileObjectIterator still needs some refactoring but the premise is there.
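A minimal sketch of that wrapping, assuming the inner iterator’s current() already returns rows keyed by header name, as AssociativeSplCsvFileObjectIterator does (the class name, the 'status' column and the 'ABC' value here are all placeholders of mine):

```php
<?php
// Hypothetical FilterIterator wrapper: keeps only rows whose given
// column matches a given value, leaving the CSV/associative work to
// whatever iterator it wraps.
class ColumnValueFilterIterator extends FilterIterator
{
    protected $column;
    protected $value;

    public function __construct(Iterator $inner, $column, $value)
    {
        parent::__construct($inner);
        $this->column = $column;
        $this->value  = $value;
    }

    public function accept()
    {
        $row = $this->current();
        return isset($row[$this->column]) && $row[$this->column] === $this->value;
    }
}

// Stand-in for the associative CSV iterator, for demonstration only.
$rows = new ArrayIterator(array(
    array('status' => 'ABC', 'name' => 'keep'),
    array('status' => 'XYZ', 'name' => 'drop'),
));

foreach (new ColumnValueFilterIterator($rows, 'status', 'ABC') as $row) {
    echo $row['name'], "\n"; // prints "keep"
}
```

The point being: the criteria live in accept(), so changing the filter means writing a new FilterIterator, not touching the CSV object’s internals.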
I’d extend or implement the SPL Iterator interface, assign the keys on the first iteration and return the set key from Iterator::key() on subsequent requests.
You can then use this with the standard SplFileObject (which reads CSVs) and read the data as you go.
P.S. Unless you really know and trust your CSV file, it’s probably worth checking that the length of $data is the same as the number of headers before combining.
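A minimal illustration of that check, with made-up headers and a made-up short row: array_combine() warns and returns false on unequal-length arrays in PHP 5/7 (and throws a ValueError in PHP 8), so it pays to guard it explicitly.

```php
<?php
// Guard array_combine() against a short row, e.g. a trailing blank
// line or a malformed record in the CSV.
$headers = array('id', 'name', 'email');
$data    = array('1', 'Cups'); // one field short

if (count($data) === count($headers)) {
    $row = array_combine($headers, $data);
} else {
    $row = null; // or skip / log the malformed line
}

var_dump($row); // NULL — the short row was rejected
```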