How to search very very fast using PHP & MySQL fast?

Hi.

I want to create my own search engine using PHP & MySQL. I have 10 gigabytes of files, and I want the search to be very, very fast.

How do you do this? What theory, what techniques, etc.?
Do you have samples or links to show me?

Thank you very much in advance.

-warren

very very fast using PHP & MySQL fast?

so you want it fast then? lol.

MySQL runs as fast as it runs. If you want it to be fast, you should keep your queries to a minimum and run MySQL on a server with a lot of memory and processing power.

It depends.

Are you just going to store keywords, chunks of text, word counts, etc.? Once you decide on that, it’ll be easier to help you decide on a basis for your search algorithm.

This should probably be in the MySQL forum since that is where the performance is really going to be defined.

Fast… let’s see.
C, file mapping, pointers, threads.
That should give you a good searching speed.

I believe he is asking what types of search algorithms and the like he should use in order to get faster searches. I think it’s appropriate for it to be in this forum.

Someone correct me if I am wrong because my experience is mostly limited to MySQL, but if you have lots of data and want FAST search, MySQL is certainly not the fastest?

What database would be the fastest?

Oracle could be argued to be a bit faster, but not significantly. Definitely not enough for a small company to pay the large cost to purchase an Oracle license.

I’ve always used MySQL and it performs extremely well. We used to run a server for our database and it would perform a crazy amount of queries with very good speeds.

It depends on what you’re searching for. Optimization can go quite far, but only as far as the hardware resources allow. If this is at a commercial level, consider hardware upgrades as well as optimization.

Depends on what you want to search. E.g., for text searches have a look at MySQL fulltext and Sphinx (http://www.sphinxsearch.com/), the latter being harder to set up and use, but faster.

The way you pre-process and index the data will be critical. For example, when you do a Google search, the Googlebot doesn’t do a lap of the internet checking every page for your search term in real time. If it did, your search would take a month to complete.

Instead Google has a massive index of data and a bucket load of servers to give you sub-second results.

When you add content to your system, it’ll need to be analyzed and have metadata produced so the search doesn’t really have to wade through 10GB.
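
To make the "index first, search later" idea concrete, here is a minimal sketch of an inverted index. It is written in Python purely for illustration, not as the way you would have to do it in PHP; the function names (tokenize, build_index, search) and the three sample documents are made up for this example. In a PHP & MySQL setup the same structure could plausibly live in a table of (word, document id) rows that gets filled in whenever content is added.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document ids that contain it.

    This is the slow step, done once when content is added.
    """
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    # Start from the first word's documents, then narrow with each further word.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample data standing in for the 10GB of files.
documents = {
    1: "MySQL loves RAM",
    2: "Sphinx is a full-text search engine",
    3: "MySQL fulltext search is easier to set up",
}
index = build_index(documents)
print(sorted(search(index, "mysql search")))  # prints [3]
```

The point is that a search only touches the index entries for the query words and never re-reads the documents themselves, which is why the lookup stays fast as the amount of data grows. A real engine would also want ranking, stop words, and so on, which this sketch leaves out.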

I also have a perception that PostgreSQL might be faster for big data sets, but I can’t back that up with any evidence, just my impression.

Having a lot of RAM will definitely help though. MySQL (and presumably other RDBMSs) loves RAM.