The J2EE guy still doesn’t get PHP

Following on from Friendster’s switch to PHP, it’s interesting to see the flame war that’s played out here (Jeff does a neat round-up of links here).

After various comments to the effect of “well clearly Friendster don’t know Java”, I think this put an end to that line:

We had not one but TWO guys here who had written bestselling JSP books. Not that this necessarily means they’re great Java devs, but I actually think our guys were as good as any team.

Without wishing to add more fuel to the fire, it seems that there are some J2EE developers out there who just don’t “get” PHP. Which is interesting in itself, as what’s so hard to grasp about PHP? At some fundamental level there seems to be a difference in mindset between the J2EE guy and the PHP guy that means (at least in one direction) one can’t grasp that the other approach also works (and may actually work better).

CGI by Dummy

Perhaps the easiest way to say what PHP does on a web server (as an Apache module) is to compare it with CGI (everyone understands CGI right?).

Side Note / Disclaimer: I’m not the best qualified to talk about PHP request lifecycles, performance and scalability. PHP really needs Sterling, George and Rasmus to get together and write a detailed paper on how it works and why PHP scales so we can all live happily ever after.

My take on the CGI request lifecycle is:

1. Apache receives request for page and sees it needs processing by CGI

2. Apache forks a process to handle the request (incurring overhead)

3. Whatever CGI binary performs its startup (more overhead)

4. The CGI binary processes the request and delivers a response to the browser

5. Process / CGI binary “dies” – return to step 1

With PHP (as an Apache module) it’s almost exactly the same except that PHP runs in-process meaning there’s no overhead for forking an external process and much less work that needs to take place in terms of PHP “startup”. In other words we’re talking only steps 1 and 4 above. Attempted to explain further here.
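
To put rough shape on the difference between the two lifecycles, here is a toy cost model in Python. It is only a sketch: every timing constant is an invented, illustrative number (an assumption of mine, not a measurement), and the function names are hypothetical.

```python
# Toy cost model of the two request lifecycles described above.
# Every constant is an invented illustrative figure, not a benchmark.
FORK_MS = 5.0         # assumed cost of forking a process (CGI step 2)
CGI_STARTUP_MS = 20.0  # assumed cost of the CGI binary starting up (CGI step 3)
MODULE_INIT_MS = 1.0   # assumed (much smaller) per-request setup for an in-process module
HANDLE_MS = 30.0       # assumed cost of actually producing the response (step 4)


def cgi_request_ms() -> float:
    """CGI: fork, start the binary, handle the request, then exit."""
    return FORK_MS + CGI_STARTUP_MS + HANDLE_MS


def module_request_ms() -> float:
    """In-process module: no fork, only a little per-request setup."""
    return MODULE_INIT_MS + HANDLE_MS


if __name__ == "__main__":
    print(f"CGI:    {cgi_request_ms():.1f} ms per request")
    print(f"module: {module_request_ms():.1f} ms per request")
```

The point is only the structure of the sum: the in-process case drops the fork and most of the startup term, while the work of step 4 stays the same.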

By contrast, an application running as Java servlet is memory resident. Where a PHP developer stores session information in a file or a database, a Java developer may put it in memory. That’s an important point to understand the difference between the Java and PHP paths.
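
A minimal sketch of the “store it in a file” approach, written in Python rather than PHP purely for illustration. The directory, file naming and function names are all hypothetical, and it leaves out things a real session store needs (locking, expiry, validating the session id).

```python
import json
import os
import tempfile

# Hypothetical session directory; a real deployment would use a shared
# filesystem or a database so that any web server can read any session.
SESSION_DIR = tempfile.mkdtemp(prefix="sessions_")


def _session_path(session_id: str) -> str:
    # Assumes session_id is already validated (no path separators etc.).
    return os.path.join(SESSION_DIR, f"sess_{session_id}.json")


def load_session(session_id: str) -> dict:
    """Read the session from disk; an unknown id yields an empty session."""
    try:
        with open(_session_path(session_id), encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        return {}


def save_session(session_id: str, data: dict) -> None:
    """Write the session back to disk (no locking in this sketch)."""
    with open(_session_path(session_id), "w", encoding="utf-8") as fh:
        json.dump(data, fh)


def handle_request(session_id: str) -> int:
    """One request: load state, do the work, persist state, keep nothing in memory."""
    session = load_session(session_id)
    session["hits"] = session.get("hits", 0) + 1
    save_session(session_id, session)
    return session["hits"]
```

Because nothing survives in the process between calls to `handle_request`, any process on any box that can see the session store can serve the next request.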

I found a fairly useful paper from 2000, Performance Comparison Of Alternative Solutions For Web-To-Database Applications, which discusses this in some more detail. It comes out in favour of servlets when considering performance and, with key phrases like “Since servlets are written in the highly portable Java language and follow a standard framework, they provide a means to create sophisticated server extensions in a server and operating system independent way.”, be warned that here’s someone who’s read the label.

But here’s the big point;

Scalability != Performance (!!!)

Unfortunately the author of The PHP Scalability Myth got that wrong and I’ve been seeing comments to the same effect on Friendster.

Yes, the subjects are related, but scalability is more about what happens when you add more resources and how that increases the volume of requests your application can handle. See Wikipedia on Scalability. Typically (if you didn’t plan in advance) you start thinking about scaling when performance starts to drop off due to increased load.

That a Java servlet performs better than a PHP script, under optimal conditions (e.g. plenty of free memory) is nothing to do with scalability. The point is can your application continue to deliver consistent performance as volume increases – can you maintain performance by adding hardware, for example?

In other words, “This page takes 0.5 seconds to complete its response. Can we preserve that performance with another 500,000 hits a day?” (scalability) is what we’re interested in, not “This page takes 0.5 seconds. How can we reduce that to 0.1?” (performance).
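
As a back-of-envelope illustration of the scalability question, the Python sketch below asks how much hardware is needed to hold response time steady as volume grows. It assumes, optimistically, that each box sustains a fixed number of hits per day at the target response time and that load spreads evenly. The per-box figure of 250,000 and the starting load of 1,000,000 are invented for the example.

```python
def boxes_for_load(hits_per_day: int, hits_per_box_per_day: int) -> int:
    """Boxes needed to serve the load without exceeding per-box capacity.

    Integer ceiling division, so a partial box rounds up to a whole one.
    """
    return -(-hits_per_day // hits_per_box_per_day)


if __name__ == "__main__":
    per_box = 250_000      # assumed sustainable hits/day per box (invented)
    current = 1_000_000    # assumed current load (invented)
    extra = 500_000        # the "another 500,000 hits a day" from the text
    print(boxes_for_load(current, per_box), "boxes today")
    print(boxes_for_load(current + extra, per_box), "boxes after growth")
```

If adding boxes really does keep the 0.5 second page at 0.5 seconds, the application scales, whatever its raw per-request speed.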

Generally people talk about two types of scalability – vertical (adding new processors, disk, memory to your existing “big box” or buying a “bigger box”) and horizontal (adding extra “boxes” and distributing load between them).

Vertical scalability is easy to implement but generally more expensive long term. There’s typically a limit to how much memory, for example, you can add to your existing “box” and the “next box up” costs three times the price but only increases capacity by 20%.

Horizontal scalability takes more effort / cunning but can prove extremely successful, as mused in The Secret Source of Google’s Power – build a “super computer” out of dirt cheap parts. For filesystem (and by extension database) replication across multiple systems there’s generally a range of mature solutions out there to choose from, many Open Source. Memory replication is another story (now go back to that important point up there).
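
To make the cost argument concrete, here is a small Python sketch using the figures quoted above (a “next box up” at three times the price for 20% more capacity). Prices and capacities are normalised to the current box, the function names are hypothetical, and it assumes load distributes perfectly across cheap boxes while ignoring the cost of load balancers and replication.

```python
import math

BASE_PRICE = 1.0                   # current box, normalised
BASE_CAPACITY = 1.0
BIG_PRICE = 3.0 * BASE_PRICE       # "three times the price"
BIG_CAPACITY = 1.2 * BASE_CAPACITY # "only increases capacity by 20%"


def cost_per_unit(price: float, capacity: float) -> float:
    """Price paid per unit of capacity."""
    return price / capacity


def horizontal_cost(target_capacity: float) -> float:
    """Cost of reaching the target with whole base boxes (idealised linear scaling)."""
    boxes = math.ceil(target_capacity / BASE_CAPACITY)
    return boxes * BASE_PRICE


if __name__ == "__main__":
    print("vertical, cost per unit:  ", cost_per_unit(BIG_PRICE, BIG_CAPACITY))
    print("horizontal, same capacity:", horizontal_cost(BIG_CAPACITY))
```

Under these assumptions two cheap boxes cover the capacity of the bigger box for two thirds of its price, with headroom to spare.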

Who do you trust?
One of the comments here pointed out that “J2EE can run in a cluster”, taking care of memory replication for example.

Right here is where you need to ask “what is Java?”. Is it just a programming language? Or is the runtime + libraries an Operating System?

Reading Twelve rules for developing more secure Java code, I find tips like “Make your classes nonserializeable: Serialization is dangerous because it allows adversaries to get their hands on the internal state of your objects.”.

What?!?

This security tip simply does not compute in PHP – it’s all or nothing. Those I expose my objects to I implicitly trust (ignoring RPC for a moment). Sure there are some PHP frameworks out there but you won’t find hosts providing them as a service.

Therein lies my own fundamental problem with the “Java way” and app servers like JBoss. They seem to reinvent a whole bunch of wheels (particularly where replication is concerned) which are already well charted territory. And J2EE is only about four years old.

As Rasmus keeps saying when talking about “shared nothing”, PHP delegates all the “hard stuff” to other systems. Apache (which I think it’s safe to say can be trusted) takes care of handling requests, forking children when needed. Tools like Squid are recommended by Rasmus for balancing the load.

When it comes to session data I’m much more willing to believe in filesystem or database clustering than J2EE clustering, the key points being maturity both in the mechanism and the tools that support it (for sysadmins). There’s also some serious weight going into Linux clustering which presents another path for PHP while other alternatives like MSession or memcached (pecl::memcache) also exist.

An amusing read is Why Java Sucks For Sysadmins with some very valid points like;


java.io.FileNotFoundException: somefile (No such file or directory)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:64)
at sun.tools.jar.Main.run(Main.java:186)
at sun.tools.jar.Main.main(Main.java:904)

vs.


Warning: file(somefile): failed to open stream: No such file or directory in /home/hfuecks/scripts/reader.php on line 10

Over on O’Reilly’s Java SysAdmin(!) section the fun continues with Job Scheduling in Java.

Hmmm…


$ man cron

“4th Berkeley Distribution 20 December 1993”

Mindset Discontinuity
In responding to Rasmus’s comment on shared nothing / infinite horizontal scalability, someone called Mark wrote;

“Rasmus, your post is the very reason _not_ to use PHP. You’re pushing session state, inter-process messaging, and application state off to a database. For Friendster’s sake, I hope they have a huge clustered Oracle instance, because once they exceed the capabilities of their database, the site will fall apart.

JSP is far, far more efficient than PHP when it comes to taking load off the database, for the very reasons you mentioned above. Sandboxing every request is inherently a mistake, because to do any sort of OO would require that you load up the user’s profile every single time you hit a page.”

The typical PHP approach would be to store a user’s profile as part of the session data, which contains everything relevant to that session – you load this once per request and populate any necessary objects with it. Sure the DB call (if that’s what you’re using) is overhead but it’s manageable overhead. Further DB calls (e.g. fetching content) can be eliminated by smart use of caching.
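
As a rough sketch of what “smart use of caching” can look like in a shared nothing setup, here is a file-backed cache-aside helper in Python. The cache lives on disk rather than in the process, so it survives between requests. The directory, key scheme and function name are hypothetical, and a real cache would also need locking and cleanup.

```python
import hashlib
import json
import os
import tempfile
import time
from typing import Any, Callable

# Hypothetical cache directory shared by all request-handling processes.
CACHE_DIR = tempfile.mkdtemp(prefix="cache_")


def cached(key: str, ttl_seconds: float, compute: Callable[[], Any]) -> Any:
    """Return the cached value for key, recomputing it if missing or stale."""
    name = hashlib.sha1(key.encode("utf-8")).hexdigest() + ".json"
    path = os.path.join(CACHE_DIR, name)
    try:
        # Fresh enough? Then serve from disk and skip the expensive call.
        if time.time() - os.path.getmtime(path) < ttl_seconds:
            with open(path, encoding="utf-8") as fh:
                return json.load(fh)
    except FileNotFoundError:
        pass  # nothing cached yet
    value = compute()  # e.g. the DB query that fetches the content
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(value, fh)
    return value
```

Here `compute` stands in for the database call. It runs only when the cached copy is missing or older than `ttl_seconds`, which is how repeated content fetches stay off the database.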

George provides an excellent discussion in Scaling Oracle and PHP which I think highlights a key difference in mindset between a PHP and Java developer. Take this advice for example;

“If the average page in your Web application contains nine images, then only ten percent of the requests to your Web server actually used the persistent connections they have assigned to them. In other words, ninety percent of the requests are wasting a valuable (and expensive, from a scalability standpoint) Oracle connection handle. Your goal should be to ensure that only requests that require Oracle connectivity (or at least require dynamic content) are served off of your dynamic Web server. This will increase the amount of Oracle-related work done by each process, which in turn reduces the number of children required to generate dynamic content.

The easiest way to promote this is by off loading all of your images onto a separate Web server (or set of Web servers).”
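
The arithmetic behind George’s ten percent figure can be sketched in a few lines of Python. This is a simplified model of my own, not something from his article: it assumes one dynamic request plus N image requests per page view, that every busy child on the dynamic server holds a persistent connection, and that requests are spread evenly. The 200 busy children figure is invented.

```python
def dynamic_fraction(images_per_page: int) -> float:
    """Share of requests that are the dynamic page rather than one of its images."""
    return 1 / (1 + images_per_page)


def connections_held(busy_children: int, images_per_page: int,
                     offload_images: bool) -> int:
    """Persistent DB connections tied up for a given number of busy children.

    Without off-loading, every child holds a connection whether it is serving
    the page or an image. With images on a separate server, only the children
    doing dynamic work remain on the dynamic server.
    """
    if not offload_images:
        return busy_children
    requests_per_page = 1 + images_per_page
    return -(-busy_children // requests_per_page)  # ceiling division


if __name__ == "__main__":
    print(dynamic_fraction(9))
    print(connections_held(200, 9, offload_images=False))
    print(connections_held(200, 9, offload_images=True))
```

With nine images per page, one request in ten needs the database, so moving images elsewhere cuts the connections held by roughly a factor of ten in this model.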

This tip is specific to an application and the environment it’s running in. I’d argue that PHP developers think about applications this way; each one is unique and, when it comes to scaling, a unique set of solutions is required.

Meanwhile the Java approach is looking for a “one size fits all” solution – something that will take you away from specific solutions like this and give you an environment you can “fire and forget” in. While that’s an admirable goal, it has meant reinventing a lot of wheels and rests on the fundamental belief that software can be mass produced and still meet requirements. While thinking that way, the PHP approach seems out of mental range for the J2EE guy.

Ultimately I think this is a showdown between “Process / Fork” (LAxP) vs. “Runtime / Thread” (J2EE / .NET). When asking “does PHP scale?” you’re really asking “does Process / Fork + X persistent store scale?”. In many ways that’s questioning whether *Nix scales…

  • stephen

    I was under the impression that creating an extra process and having them talk to each other was orders of magnitude slower than creating a new lightweight thread and communicating in-process.

  • NativeMind

    I had the Java vs PHP debate at work a while back. You know what got management on my side? I told them I would have the stuff done in half the time if I developed in PHP.

    I said the same thing to a Java buff who said he could code just as fast in Java.. I told him I could do a SOAP client in about 30 seconds in less than 10 lines of code.

    His reply “Oh, that’s because you use libraries! [PEAR SOAP]” I told him so does he with every import statement.

    That was the end of the conversation. He never took me up on that challenge.

    Btw, the worst Java performance example has to be Macromedia about a year ago. Their download page even said download may take several minutes to start… (I couldn’t believe it)

    I do like Java, but most of the time it’s not the right tool for server side web development in my opinion.

  • somebody

    Java vs C++ benchmarking right here:
    http://www.sys-con.com/story/?storyid=45250

    You still really think that php faster ?

  • Jaxn

    I agree with the first poster about the time to develop. I think that is what is so exciting about PHP frameworks maturing. The frameworks are cutting development time in half or more from an already quick development cycle.

  • http://www.sitepoint.com/ mmj

    As somebody who uses PHP but who has very little experience in writing Java for web applications, the “Share Nothing” characteristic of PHP often annoys me at times, though I appreciate its benefits.

    If pressed to quote its benefits, I would say that a major benefit of it is that it suits the HTTP protocol, which is not designed for state preservation. It is easier to visualise and understand the way a PHP request works, because a PHP script has that 1:1 relationship with the HTTP request – one HTTP request causes one PHP script. Another benefit would be that shared data may make it easier to leave gross security holes in your site, though of course there are other ways of leaving gross security holes in your site with PHP.

    However, if asked before now I would have listed PHP’s share nothing characteristic as primarily a frustration, preventing me from caching data in memory. Thanks for presenting me with the benefits of this approach – you’ve made me look at things in a different way.

  • http://boyohazard.net Octal

    I’m not even going to pretend to understand half the stuff in the PHP v Java debate, I am still trying to navigate a path from competent (intermediate) to advanced PHP knowledge. Needless to say, your post here does make sense to me in many ways one of which I have found especially true; ‘Within each problem lies its solution’ or perhaps another point of view would be ‘using the right tool for the job’

  • neko

    Nice article, now I have a name to put to PHP’s “lack” of application-state – “shared nothing”!

    This really made me think more about more issues with scaling and performance, mainly not trusting entirely to a language to achieve your goals. Which is sorta consistent with unix philosophy as I know it – using lots of specialist tools together…

    I guess Java’s like a hammer, and when you’re holding a hammer, everything starts to look like a nail.

  • George Schlossnagle

    Hi Harry. My trackbacks are broken due to some operational issues here, but here’s my analysis on why PHP scales: http://www.schlossnagle.org/~george/blog/archives/269_Why_PHP_Scales__A_Cranky_Snarky_Answer.html

  • Phil T

    I agree, especially about the horizontal scaling points you make. I would extend your article to include not just PHP, but all LAMP programming. I’m primarily a Perl guy, but have used PHP quite a bit as well. Your arguments apply very well to mod_perl or plain old Perl cgi stuff as well.

    One thing that should not be underestimated, of course is the ease with which you can write insecure code in Perl or PHP. That might not be a problem if you are building Friendster, but could be a huge problem if you are building Citibank.com. I think most of the “built-in” java security is just security through obscurity, but keeping out all the script kiddies is a good start.

  • csshsh

    please? since when are stacktraces not usefull? i agree that java developers just dont get php or any other more simple language sometimes.. but the same goes for php developers with java..

  • NativeMind

    You can use Turk MM Cache, it has a shared memory function to keep things in memory between requests.

  • http://www.lastcraft.com/ lastcraft

    Hi.

    There is another way you can scale and that is application complexity. Suppose you want to build transactions over multiple databases, possibly from different vendors. You then want to push several thousand transactions through a minute. Think you can do that in PHP? PHP is great at doing a large number of fairly simple tasks. Fortunately that is also the bulk of the web market.

    The discussion here is mostly about middleware/app serving. Java is forced into this type of system because the JVM takes an age to load. You have no choice but to work with in-memory systems if your background is Java.

    Another complexity point is development complexity. Java has the best persistence libraries around for example, with PHP only just starting to enter this area. I also find that SOAP just does not currently work with PHP. We were forced to switch to XML-RPC for our project because of SOAP library flaws and poor documentation. PHP has a very long way to go with library support.

    No one way is better. I can see why a site like Friendster would do better with PHP, but it is horses for courses. Java will continue to hold the high ground here and that’s fine. Give me the mass market anyday.

    Now where does that leave ASP?

    yours, Marcus

  • http://mgaps.highsidecafe.com BDKR

    I built a cluster for a Lottery operation down in Venezuela in the summer of ’01. It was based around the then Turbo Cluster 6 from Turbo Linux. Besides Turbo Cluster, we use VB, PHP, and MySQL in a heterogenous environment.

    Anyway, I looked into employing msession sometime in late ’02 or early ’03 but found that it has no fault tolerance mechanism. As far as I can see that’s the only thing that is holding it back from being a dead serious killer app in a shared nothing environ.

    We had to employ MySQL for session management for VB and PHP. And while this worked, we would regularly see spikes in load just before post time for races and lotteries. Also keeping in mind that our business (client base) was growing dramatically and every process required a run back to the db, the writing was on the wall. The db (as everyone knows) was/is going to be the first serious bottleneck to overcome. Msession (in addition to putting the database on its own network) would’ve been a great first step in dealing with that looming performance ceiling, but without a fault tolerance mechanism, I couldn’t take it seriously for our application.

    Anyway, I agree with something a poster stated earlier. It’s something that John Lim intimated recently on his web log as well.
    “…,mainly not trusting entirely to a language to achieve your goals”. We had to use a variety of different things to achieve our goals (you don’t know the half of it!).

    Perhaps those that are talking about a certain language lacking the ability to scale have taken a “Golden Hammer” view of the language they’ve chosen to employ?

  • Andrew Phillipo

    I have to say that this discussion is really a very strange one, in which people are trying to say that PHP scales better than Java. The argument(s) listed above seem to be talking about methodology behind how people implement Java systems rather than any limitation in being able to, for example, schedule a Java process from cron.

    The problem I see is that many people think that EJB = J2EE – let me set the record straight; there are very few justifiable uses for EJB (read Expert One-on-One J2EE Design and Development by Rod Johnson for more details).

    So why can Java developers not create sites in exactly the same patterns using the same ideas above with plain old JSP files and maybe a few Objects? We can use squid and other servers to deliver our image too.

    What you’re really saying is why develop proper OO software/middleware in Java when I can use a scripting language, be it Perl or PHP, and a Database, oh and we can load balance the webserver which makes it scale GREAT! Fantastic – but you can do that in Java just fine, using exactly the same technology and enjoy JIT compiled code and FASTER speed.

    Oh, add the other great points of writing well designed object oriented code and using design patterns for maintaining code, never mind the sort of persistence frameworks that would blow a PHP programmer’s mind (see JDO or Hibernate).

    In conclusion Java can use the advantages of PHP where necessary such as load balancing and Squid and Oracle! What you are saying I believe is that there are a lot of Java features that PHP doesn’t have. Like proper threading support. And JIT compiled code. And Object relational mapping techniques. Oh and proper object support. And new tech like Aspect Oriented Programming. And XDoclet. Etc. Etc.

    I’m not saying that PHP isn’t scalable – it is – but why argue that it is more scalable than Java when any Java based site can do load balancing and any of the other techniques you seem to think are PHP only.

  • http://www.manisharma.s4u.org phpsharma

    If we know the basics, there is not much difference in using them. But I will cast my vote for PHP. All are the same but for syntax differences.
    http://www.manisharma.s4u.org

  • http://www.phppatterns.com HarryF

    Just taking responsibility…

    You still really think that php faster ?

    No – in fact I said it’s slower. PHP got a reputation for being fast from the days when Perl/CGI was the normal way to build dynamic web pages. PHP brought performance to acceptable levels (and it’s still acceptable). Other technologies will likely be faster (in fact pretty much everything except CGI; mod_perl, mod_python, ASP.NET, JSP). But repeat performance != scalability.

    I was under the impression that creating an extra process and having them talk to each other was orders of magnitude slower than creating a new lightweight thread and communicating in-process.

    Again performance != scalability. Anecdotal evidence suggests that Apache 2.x (using threads) under Unix is no faster than when forking processes. On Windows Apache 2.x’s threading has made a big difference though.

    I agree, especially about the horizontal scaling points you make. I would extend your article to include not just PHP, but all LAMP programming. I’m primarily a Perl guy, but have used PHP quite a bit as well. Your arguments apply very well to mod_perl or plain old Perl cgi stuff as well.

    and

    So why can Java developers not create sites in exactly the same patterns using the same ideas above with plain old JSP files and maybe a few Objects? We can use squid and other servers to deliver our image too.

    Very true – for “shared nothing” seems like a sane way to do things – make the receiving point for requests as lightweight as possible.

    And I’m not trying to say Java can’t scale (by no means). What’s puzzled me about some of the comments to the effect of “PHP can’t scale” is the mindset of those saying it. Was really trying to understand why “the J2EE guy” can’t see that PHP can be made to work under heavy load.

    I also find that SOAP just does not currently work with PHP.

    Likewise. NativeMind has a point on the one hand, about ease of development, but when I start to look carefully at today’s SOAP libraries, I get nervous.

    please? since when are stacktraces not usefull?

    For developers sure but not for sysadmins. They pretty much violate the philosophy of Unix tools and text based output. The philosophy I’m talking about is like;


    $ ps -ef | grep httpd | awk '{print $2}' | xargs echo

    Which gives you a space-separated list of PIDs, which can then be parsed by another tool (perhaps to check the mem use of Apache processes). Outside of Unix this approach isn’t much used or understood but it’s very powerful for sysadmins.

    Now where does that leave ASP?

    Unsupported by MS ;) Actually the “application state” in ASP (like mod_perl / python) is potentially another opportunity to shoot yourself in the foot. PHP gives developers a lot of free room to do silly things (esp. security wise) but it’s got “shared nothing” owned methinks.

  • Joel

    Scalability from a business stand point errs heavily on the side of application maintainability. If your client base escalates beyond your predictions, then it is a fairly simple and speedy task to purchase new hardware and migrate your existing application. This applies to both platforms, therefore speed issues can be ruled from the argument.
    Being the small time web app developer that I am, the issue of scalability primarily involves the addition of new features into an existing system. The object oriented paradigm seems to be the most logical tack when you’re dealing with functionally scaling systems, and therefore one would choose to use a language which has had O-O roots from the beginning. Java does this extremely well, far better than PHP in my opinion. I may be a little outdated with this assumption, but the last time my eyes crossed to how PHP implemented O-O, I was shocked to find that there was no way to enforce data encapsulation. No private member variables were allowed. If breaking one of the very first rules of O-O was not enough to bring the day’s half digested breakfast to taste, it was also noted that PHP offers no control over threading. An instance of a PHP application serving a large concurrent client base may very well result in a huge number of forked processes provided the coding was poor enough.

    Just my 2 cents.

  • Gordon

    I worked as a sysadmin for a couple years in a J2EE company. We ran on a pretty modern IBM-style architecture, with WAS (WebSphere), DB2, etc. I can confirm that the long stack traces were a big pain for me to work with. I think you really underestimate the problem. Those 5-liners aren’t that hard. It’s the stack traces that go hundreds of lines back to tell you about some wacky CORBA exception that may or may not be actually a problem in the application server. When you see one, you tend to see hundreds of those in a row. Imagine thousands upon thousands of lines of unintelligible stack traces that you want to try to search and understand…

    I worked very closely with these stack traces and stdout/stderr output from the app server to help the dev team find some really heinous system library bugs. In the real world, when the dev team is pressured to release the new product, there’s not much time for them to go poring over these things, so unless there’s a sacrificial sysadmin like myself, they just get ignored and filed away in a bug report.

    On the positive side, I believe some of the newer JVMs / WAS allowed you to tune the number of child lines in your stack traces, but there are still major readability problems in most stack traces.

  • ChrisM

    I can’t believe the quantity of cluelessness and misinformation I’m seeing posted on here!

    Saying that PHP inherently scales better than Java is a staggeringly naive statement. There are plenty of pros and cons to the language you choose but scalability isn’t one of them.

    Scalability depends on your application architecture. Neither Java nor PHP force you to build your application a particular way. If you build something that won’t scale, blame it on the design not the language. Just because Java’s servlet spec offers you the ability to store large amounts of session information in memory does NOT mean you have to use it.

    I’ve built many sites over the years using a variety of technologies. Some of those sites have had to deal with enormous loads. Sure we hit scalability problems, but without fail the source of the bottleneck was either a configuration issue (eg database connection pool settings needed tuning), or we needed to rethink our application design. We sure as hell didn’t need to switch languages!

    eBay serves over 1 BILLION very dynamic and often user-specific pages each day. It’s built entirely using J2EE.

    There’s nothing to see here, move along.

  • http://mgaps.highsidecafe.com BDKR

    PHP has shared nothing “owned” because it has no other choice! I’m not knocking PHP, but it is (as in shared nothing) the only obvious way to do it.

    Just so happens that people are figuring out that Shared Nothing is the easiest (as in KISS) way to building a system that can deal well with a high load.

    Am I knocking PHP? Not at all. I love it. I’m just stating something that’s obvious.

  • NativeMind

    I for one actually find PEAR SOAP to be very well off and so far I have not had a problem with it in enterprise development (talking to you know what, a java web service!).

    The lack of documentation led me on some discover-yourself exercises, but in the end it is working out OKAY.

    FYI, rpc encoded style works great. multiref tags don’t work, and you can’t write a client that does HTTP digest authentication (but you can do that on the server no problem). I haven’t worked with attachments yet, but if you have something big you should probably just send a SOAP message with a URL to d/l the file from anyway.

    Of course, I’m in an enterprise and we control both the client and the server… internet applications aren’t quite so lucky :|

  • http://www.phppatterns.com HarryF

    Scalability from a business stand point errs heavily on the side of application maintainability. [...] Being the small time web app developer that I am, the issue of scalability primarily involves the addition of new features into an existing system.

    Very true. That view of scalability is one I’ve entirely ignored here, mainly to keep the discussion focused. In reality think that’s far more important for most people than whether your site can cope with X million hits a day.

    I may be a little outdated with this assumption, but the last time my eyes crossed to how PHP implemented O-O, I was shocked to find that there was no way to enforce data encapsulation. No private member variables were allowed.

    With PHP4 that’s the case. PHP5 adds support for private, protected, public (see here) but that’s the future.

    My own experience has been that enforcing data encapsulation is less of an issue in PHP; other issues like use of globals and procedural code that’s not designed to be extensible are more critical. 99% of PHP code “out there” has the source code freely available so the “burden” of responsibility can reasonably be transferred to the developer extending the app. An interesting side effect of no enforced encapsulation is it encourages developers to avoid many generations of inheritance; aggregation or composition is typically preferred by those applying OOP in PHP – probably a good thing.

    An instance of a PHP application serving a large concurrent client base may very well result in a huge number of forked processes provided the coding was poor enough.

    One thing I should have perhaps made clearer is it’s actually Apache forking children, PHP running in the children. PHP itself is not responsible for managing forking (Apache already does a very good job).

    Saying that PHP inherently scales better than Java is a staggeringly naive statement.

    That’s not what I’ve been saying (of course not). Was trying to understand why certain developers with a Java background have a fundamental assumption that PHP can’t scale.

    Scalability depends on your application architecture. Neither Java nor PHP force you to build your application a particular way. If you build something that won’t scale, blame it on the design not the language. Just because Java’s servlet spec offers you the ability to store large amounts of session information in memory does NOT mean you have to use it.

    Very true. In fact that’s the point George makes.

    eBay serves over 1 BILLION very dynamic and often user-specific pages each day. It’s built entirely using J2EE.

    Well not quite. There’s still a fair amount of ebay running on that C++ dll e.g. http://cgi3.ebay.com/aw-cgi/eBayISAPI.dll?MemberSearchShow. Hopefully those migrating eBay to J2EE will publish extensive case studies.

  • Stephen Kestle

    And I’m not trying to say Java can’t scale (by no means). What’s puzzled me about some of the comments to the effect of “PHP can’t scale” is the mindset of those saying it. Was really trying to understand why “the J2EE guy” can’t see that PHP can be made to work under heavy load.

    It’s people who follow hype and have a narrow mind-set. Java has them. PHP has them. .NET has them. MS has them. MacOS have them. Pretty much anything that people like or use have them.

    For me, I would have to have a good reason to implement a solution in a database at all (see prevayler.org) – it staves off the scalability issue for a long time (why scale if your app runs 1000 times or more faster than one on a db)? Not without it’s harder side.

    But more in response to the question: the J2EE guy has a complicated system to build. He has business logic that no database is going to be able to take care of. He doesn’t just want his data going on the web. He wants to be able to unit test his business logic, and wants to be able to find out what uses some data before he refactors (an object or a database) (see eclipse.org).
    He just doesn’t get how you could do stuff of any complexity in PHP, or why you’d want to try to maintain something in that way.

    And then he thinks that all [web/business] applications are complex => PHP sucks.

    (Alternatively he takes his ideas for complex system management, and assumes they are the best for simple systems).

  • Sam Joseph

    Very interesting debate. As someone who has to maintain legacy code in both PHP and Java, I am interested to see that one important aspect of using PHP or Java is not being discussed: Maintainability.

    Naturally the discussion is about scalability, but that is not the only issue to consider when making a choice between these two approaches. As a Java programmer of 8 years I have been very pleasantly surprised about how much faster it is to get things done in PHP. Not so many worries about types, lots of simple to use support for common web site activities like handling file uploads etc.

    It seems clear to me that when it comes to a web application, it is much faster to prototype it in PHP than in Java. Although a sophisticated IDE like Eclipse or Idea can narrow the gap with things like code completion.

    However when it comes to maintainability PHP can be a nightmare. Imagine, “ah here I have a function call in PHP on some object, I want to find where this function is defined – arg! – there’s no way to find out without adding a debug statement to print out the type of object, and then searching for that class definition which could be in any file in the system!”. In Eclipse with Java, I can right click to “Open Declaration” to get immediately to the implementation of anything. Not only that, I can use the refactoring tools to make changes that don’t break the code I am maintaining – very useful when I’m moving through someone else’s unfamiliar code. Add to that the much better support for unit-testing in Java than in PHP (comparing JUnit and PHPUnit), I have to say that maintenance of PHP code is a bit of a nightmare compared to Java.

    Now this can be fixed with wide-scale adoption of aspects of PHP 5, such as the use of exception handling in a future PHPUnit and the Eclipse PHP plugin being developed further to support code refactoring and lookup, but that might take a couple of years in our open source world.

    In the meantime, if you offer me some legacy code to look after, I’ll probably choose Java over PHP any day. If you want me to build something from scratch for a deadline, then I’ll go the PHP route – at least if there is some degree of doubt about the project’s long-term future.

    Just my $0.02

  • Steve

    Interesting comments throughout. My background is large scale websites with traffic patterns of 60,000 hits a day to the homegrown website. SugarCRM is the first respectable enterprise-scale application written in PHP that I have seen. They created a very well defined pattern to extend the application via wizards, etc. I would guess they built a very extensive application just to extend their application, so that developers didn’t have to deal with the trouble of deciphering a large PHP code base and learning how it accomplishes resource management etc. In Java this is much simpler.

    3 guys built Sugar in 1 year according to the history. I have built custom J2EE applications similar in size and feature set to Sugar with teams of 5 to 30. The team size and cost of J2EE is really a factor of what the implementor wants to spend. The team of 5 was probably more effective than the larger team. What I am getting at is that a robust J2EE application deployed on an app server like BEA has many advantages in management, tuning, etc. PHP, from my experience, is good for prototypes and small apps, but the code base reaches a size where the perils of 2-tier approaches and scripting languages begin to take effect on the momentum of developing additional components.

    It wasn’t too long ago that people were big TCL fans building large scale enterprise sites in TCL, e.g. KB Toys, one of the top sites for large traffic flow on the net with one of the highest transaction volumes for a retail website. It can be done, but talk to folks there about maintaining and extending the code base. Ask them about training and ramping up team members. Large scale fits well with J2EE and lends you many more options for tuning and monitoring that just are not available in PHP.

  • John Bush

    As a developer who worked on projects with both PHP and Java, I’d say the tool should really fit the project.

    In my experience, PHP is useful for building prototypes or smaller projects with a small team where reqs are very well understood.

    Once the code base gets very large and the number of developers grows, PHP or any weakly typed language becomes unmanageable. As the reqs change, refactoring becomes very difficult without a compiler and OO.

    So, you may save time in the beginning, but if you are planning on maintaining and enhancing the system over a long period of time, you will pay in the end.

    Pay now, or pay later.

    The other major factor is the quality of the developers you use. Anyone can create crappy code in PHP or Java. To all you managers out there, a kick-ass developer can do the work of 10 crappy developers. Quality people make the difference, so fire the crappy guy and pay the kick-ass guy more.

  • Yuriy Krylov

    Here is some hearsay. PHP and Zend are moving to position PHP (5) as a Java alternative for web applications in small to medium sized projects. Odds are it will succeed. When? That’s the million dollar question.

    Please notice that I didn’t say “J2EE alternative”. And perhaps this is an important semantic distinction.

    J2EE is more than technologies written in Java. Let’s not forget that it embodies the experience and knowledge of a mature developer community. Best practices and patterns for application architecture have been brewing for years.

    Fortunately or unfortunately, the PHP community has a lot of growing left to do. John Bush is absolutely right, quality of people does make all the difference.

    The sad truth is that the PHP arena is plagued by newbies who Just Don’t Know Any Better. A loosely typed language that does not enforce good programming practices and OO, PHP allows these newbies to get away with Programming Murder. The end result is code that can be categorized as a subset of the following set {not manageable, not extensible, not scalable, not testable}.

    This has nothing to do with PHP. The Java community went through Model 1 and Model 1.5 paradigms for JSP development before realizing the benefits of an MVC implementation. You won’t meet many Java developers in the web world who are not familiar with Struts or WebWork, more recently JSF, and the list goes on and on.

    I love Java because of Hibernate, because of Spring, because of JSF, because of MVC, and because I can always hire a grey-hair Java guru who knows what’s up and understands large-scale n-tier architecture better than me. I hate Java because the code-compile-deploy-run cycle is beyond my short attention span. And yes, the learning curve for J2EE technologies is much steeper than that for PHP.

    On the other side of the railroad tracks is pure developer laziness. If you still cram all your business logic into a functions.inc.php then odds are you will never find the code you’re looking for a few months down the line. But even this is an unfair statement.

    How big can the application get?
    What is the estimated lifetime of the application?
    How often do requirements change?
    How many simultaneous users will hit the application a second?
    How many transactions do you wish to support per second?
    How dynamic is the content?
    How large of a data store are you maintaining?

    The point is that software development should be an Art Form and for each canvas one must choose the right brushes and the right paints.

    PHP can scale. Your Java can not scale. IMHO Friendster moved to PHP not because PHP scales better than Java but because the architecture they invested in did not allow their Java to scale, and I have to point a finger at the Architects, shame shame shame.

    There is absolutely no reason for a live system to require a complete language change and system overhaul for better scalability in this day and age. And yes, Friendster did not scale: a noticeable effect was performance degradation. Can you imagine how much VC money it must have cost them to throw away their Java impl and go to PHP? What a waste.

    The bottom line is that neither PHP nor Java is the culprit, you the lazy developer are. Not to say that laziness in developers is bad. The difference is whether you want to be lazy at the beginning of the project and work really really hard to clean up your crap later, or whether you want to invest in due diligence now and be lazy a few months/years down the line when your app is booming.

    I prefer the latter and I love PHP. I use it for most small-medium sized projects and am prototyping a huge project in PHP. But yes, I am unit testing. Yes, I am using an MVC implementation (Phrame). Yes, I do have application tiers: App->Controller->Biz->Integration->Datasource. Yes, I do use caching to reduce trips to the database and constant object creation. Yes, I use Domain Modeling techniques and strict OO. Yes, I use DAOs. I like loose coupling, I believe that objects should do a small number of things but do them extremely well. Yes, I read a shit load of best practices and know-how books written for Java and then apply the lessons hard-learned by the Java community to PHP development because Java and J2EE rock and in turn so do my PHP applications. Oh, and yes they will scale to the extent that my design specifies.
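
    As a rough illustration of the DAO-plus-caching idea mentioned above, here is a minimal sketch in Java, the language those best-practice books are written for. Every name in it (`User`, `UserDao`, `InMemoryUserDao`, `CachingUserDao`) is hypothetical and not taken from any real application; the in-memory map only stands in for a database, and the cache deliberately ignores invalidation and thread safety.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical domain object, for illustration only.
final class User {
    final int id;
    final String name;
    User(int id, String name) { this.id = id; this.name = name; }
}

// Integration tier: the only interface the business tier depends on.
interface UserDao {
    User findById(int id); // returns null when no such user exists
}

// Stand-in for a database-backed DAO. It counts lookups so the
// effect of the cache is visible.
final class InMemoryUserDao implements UserDao {
    private final Map<Integer, User> rows = new HashMap<>();
    int lookups = 0;

    InMemoryUserDao() {
        rows.put(1, new User(1, "alice"));
        rows.put(2, new User(2, "bob"));
    }

    @Override
    public User findById(int id) {
        lookups++; // each call represents one trip to the data source
        return rows.get(id);
    }
}

// Decorator that remembers earlier results, so repeated reads of the
// same id reach the wrapped DAO only once. No invalidation and not
// thread-safe: an assumption made to keep the sketch short.
final class CachingUserDao implements UserDao {
    private final UserDao delegate;
    private final Map<Integer, User> cache = new HashMap<>();

    CachingUserDao(UserDao delegate) { this.delegate = delegate; }

    @Override
    public User findById(int id) {
        User cached = cache.get(id);
        if (cached != null) {
            return cached;
        }
        User loaded = delegate.findById(id);
        if (loaded != null) {
            cache.put(id, loaded); // misses are not cached
        }
        return loaded;
    }
}

final class CachingDaoDemo {
    public static void main(String[] args) {
        InMemoryUserDao db = new InMemoryUserDao();
        UserDao dao = new CachingUserDao(db);
        dao.findById(1);
        dao.findById(1);
        dao.findById(1);
        System.out.println("lookups that reached the data source: " + db.lookups);
    }
}
```

    Because callers only see `UserDao`, the cache can be added or removed without touching the business tier, which is the loose coupling the comment is describing. The same decorator shape carries over to PHP classes and interfaces.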

    cheers =)

  • Rev Prez

    Anyone want to speak on PHP v. Java concerning deployability?

  • Yuriy Krylov

    As with most things in Java vs. PHP, deployment is more complex with Java simply because you are deploying into a container. Ant and Maven are the industry standards for managing deployment-related tasks for Java but have a learning curve. In other words, add Ant or Maven to your already long list of things to know when dealing with Java.

    Deploying PHP applications may be as simple as copying files to a Web-server accessible directory and perhaps modifying some configuration files.

  • alenmilk

    George also had a speech on why PHP sucks. Hehe. I think that PHP sucks in many ways, and that J2EE is a well thought out and more complete framework. But with PHP 5 things are getting better for PHP. PDO is getting there. And yes, it is so much easier to make something that scales better in PHP than in Java. And it also takes no time in comparison to do it. Is it maintainable? Well, it is scalable. It can be maintainable but you must put some effort into it. Without any frameworks and methodologies it is almost too easy to write code that is not worth saving. So if you use one week to produce something, and come back to look at it a year later, you just throw it away. But who is to blame? I would say this:

    big frameworks == bad scaling
    light or no frameworks == nice scaling

    Hmm. And reverse that for maintainability. So I think people are arguing over the fact that Java has more frameworks and abstractions and takes a lot of time to learn. This is pitted against the fact that PHP doesn’t have a million super advanced frameworks and has just simple libraries that let you develop fast and scalable solutions. So it is the Java way against the PHP way. In performance and scalability the PHP way wins, I think. Does PHP still suck? Yes, in some ways it does. Do the libraries lack maturity? Yes, some. But it is getting better. PHP is getting there for the medium sized projects. Can you do a big project in PHP? Yes, but without strict discipline and good architecture it can become hell. Is PHP better than Java? Yes. Is Java better than PHP? Yes. Is this bickering pointless? Yes.

  • jbaou

    As a computer science graduate I find it ridiculous trying to make a point about debugging output (file open example). Not to mention that the Java debugging info does NOT reveal the internal directory structure …

    I have to admit, though, that I have used J2EE in both my Bachelor’s and Master’s projects just because I knew that even my tutors wouldn’t stay up endless nights to find bugs in my code.

  • XXX

    Java, PHP or others… aren’t important. The quality of code is more important.