Bit of background first. I have a custom IRC bot which logs conversations in a channel. It stores the logs as .txt files in a directory on a web server. It’s simple, and it makes it easy to quickly review events from a known date and time.
What I’d like to do is create a very simple way of searching through this collection of text files, so users and staff can quickly and easily search for a specific word and get a list of results. This whole thing is very much a hobby project, so a pre-existing solution would be preferred, though I’m not afraid to cobble something together if I have to. Would anyone know of a solution that might be viable?
I don’t know about other languages, but ColdFusion (and most likely Railo and BlueDragon) has “collections” that will do exactly what you want.
If CF9 or earlier, you’ll have a choice of Verity or Solr collections. As of CF10, Adobe let the Verity contract expire and switched to strictly Solr.
You can build a collection in the CFAdmin panel, index it, optimize it, and make it available to your application or website via the CFSEARCH tag. A collection allows full-text searching of documents like txt, pdf, doc, xls, etc., and can also be used on a database.
Lucene will do this pretty easily. The trick would be to store the data in Lucene too, so you can show the matching text when you retrieve a result instead of having to go back to the original file.
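To illustrate the idea of indexing words while also keeping the original text around for display, here is a minimal sketch in plain Python. It is not Lucene and does not use Lucene's API; it is just a toy inverted index. The names `LogIndex`, `index_directory` and the `logs` directory are made up for the example, and the tokenizer is a deliberately naive assumption (lowercased runs of letters, digits and apostrophes).

```python
import re
from collections import defaultdict
from pathlib import Path

# Naive tokenizer (an assumption): lowercase runs of letters, digits, apostrophes.
TOKEN = re.compile(r"[a-z0-9']+")


class LogIndex:
    """Toy inverted index that also stores each line so it can be shown in results."""

    def __init__(self):
        self.lines = []                   # stored data: (filename, line number, text)
        self.postings = defaultdict(set)  # word -> ids of the lines containing it

    def add(self, filename, text):
        """Index every line of one log file."""
        for line_no, line in enumerate(text.splitlines(), start=1):
            line_id = len(self.lines)
            self.lines.append((filename, line_no, line))
            for word in TOKEN.findall(line.lower()):
                self.postings[word].add(line_id)

    def search(self, word):
        """Return the stored (filename, line number, text) for each matching line."""
        ids = sorted(self.postings.get(word.lower(), ()))
        return [self.lines[i] for i in ids]


def index_directory(directory):
    """Build an index from every .txt file in a directory (hypothetical layout)."""
    index = LogIndex()
    for path in sorted(Path(directory).glob("*.txt")):
        index.add(path.name, path.read_text(encoding="utf-8", errors="replace"))
    return index
```

Usage would look something like `index_directory("logs").search("netsplit")`, which returns the file name, line number and full line for every hit. A real search library adds things this sketch leaves out, such as stemming, phrase queries, ranking and an index that persists on disk.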
Friends don’t let friends use ColdFusion. That said, Solr on its own could also work, though it is a pain to set up. Elasticsearch might be easier to get going. Long ago, for a SitePoint contest, I built a Lucene-based PM searcher that might give you some ideas. See https://vbulletinpm.codeplex.com/ for the sources.