The industry has been changed to protect the innocent. Any similarities are coincidental as I made them up. The numbers are real. The issue is using two separate databases to run a web application.
An ambitious entrepreneur has seen a business opportunity to help organize angry protesters gather for protesting. He hired developer A to develop a detailed scope of the project. It is a 95 page .pdf document. Developer A is gone. The project is already at least 6 months late. We will call it proteststuff.com .
Developer A sold the entrepreneur on the idea of using Sugar CRM as the backbone of their business operations. For those who aren't familiar with sugar its a high power php/mysql customer resource management tool. The idea is similar to Highrise. It has allot of powerful features for tracking customers, customer relations and goals.
The project manager hired Developer B to set up sugar and integrate it with their promotional website. The promotional website outlines what proteststuff.com can do for you. One of the sugar tables has 100k protesting groups targeted for sales already populated. The sales team uses this to track customer relations with the targeted groups.
Developer B and a Developer C integrated a small portion of the promotional site to sugar. A feature where the protesting group can sign up by searching their group. When they sign up the information is pushed to sugar. Meanwhile the entrepreneur is traveling the country talking to protest leaders about his services tracking everything in sugar.
Developer D was hired to kick off the protester specific site. When a group signs up for service they recieve their own sub domain name like cuboulder.proteststuff.com . The protester specific site was built with a custom framework. All that was laid out was was the basic navigation and user permissions. Some protestors are admins some are organizers some are volunteers.
An admin can do anything, the organizers can set up protests and ask for volunteers. The volunteers can choose what protests they want to attend. Everything is tracked including protesting hours, locations, dates. The custom framework is much like what you would see with smarty/zend. Its a templating MVC framework. Very clean, OOP. Everything is done on a LAMP platform.
I am developer E. My job is to take what developer D started and add all the features in the project documentation. I have been pretty isolated from the other developers and their portions of the project. Developer D started the project as its own independent application. It is not tied to the database table with 100k groups.
Halfway into the completion of the protester specific site I find out that the entrepreneur wants to track certain pieces of information from sugar. They don't want to track all the protesters, just the leaders. This isn't too much of a problem I can push any changes to the sugar database inside my classes. However sugar is so huge that there is not a reasonable way to modify it push changes back to the protester specific site. Not a show stopper.
Each protester group has its own profile table. Every call in the framework uses the subdomain to keep any information to the specific group. Every row in every table has a protester_group_id associated with it to keep the groups separated in their own domain. The protester_group table contains the profile and contact information for any group that has an active account.
I have found out that they want the protester specific site to essentially run from their already populated table of 100k groups. The framework was built in such a way that every single page call is instantiated by parsing the subdomain and getting the group id from the subdomain.
I can go in and modify the framework to connect to sugar and essentially make queries to two databases for each page call. So two database connections at all times.
My first problem with this is now the application is coupled with sugar. One cannot run without the other. Portability is down the drain. More importantly at what point does this create a huge overhead on the server? Realistically each group might have 50 users using the site at peak time. However there may be 100 groups (or more if they are successful).
Under these circumstances you could have 5000 users that have to sort through a table of 100k entries for every single page call with 2 database connections on top of the already existing queries (updating profiles, adding events, sending out emails).
This smells bad to me however I've never ran into this issue before so maybe I'm making a big deal out of nothing. Is this as bad as it sounds to me? Will they experience performance issues? Assume the tables are indexed properly.