wwb’s statement “Friends don’t let friends use datasets” made me wish we had a “.NET Best Practices” sticky in this forum. Perhaps if we get enough responses and ideas, they will make it one.
So, what are some lessons you’ve learned or “best practices” that you would recommend to people new to .NET or some of us that have been out of the .NET loop for a bit?
A big one for me: people coming from other languages tend to use string concatenation rather than a StringBuilder.
1. StringBuilder performs much better than repeated string concatenation.
Though I would already add a caveat to number 1. A StringBuilder makes sense when the number of additions is unknown, as in a loop. With a known, small number of additions, normal concatenation is no more expensive than using a StringBuilder.
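A minimal sketch of that distinction, assuming a hypothetical CsvLine helper (the class and method names are made up for illustration): the loop with an unknown number of pieces uses a StringBuilder, while the fixed two-piece case just concatenates.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, for illustration only.
public static class CsvLine
{
    // Unknown number of pieces: append into one StringBuilder and
    // create the final string once at the end.
    public static string Join(IEnumerable<string> values)
    {
        StringBuilder sb = new StringBuilder();
        bool first = true;
        foreach (string value in values)
        {
            if (!first)
            {
                sb.Append(',');
            }
            sb.Append(value);
            first = false;
        }
        return sb.ToString();
    }

    // Known, small number of pieces: plain concatenation is fine.
    public static string FullName(string firstName, string lastName)
    {
        return firstName + " " + lastName;
    }
}
```

Calling CsvLine.Join on a list of a few thousand strings builds one result string instead of a new intermediate string per item.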
2. Thou shalt not use exceptions to manage program flow. Throwing an exception is a very expensive operation, basically the most expensive thing short of calling external resources. Think of an exception as making the computer beep twice and pause for two seconds, then ask whether you actually want to use one in that place.
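As a small sketch of what this looks like in practice, assume a hypothetical PageNumber helper that turns a query-string value into an int. The first method uses an exception for ordinary bad input; the second gets the same result with int.TryParse and never throws for malformed text.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class PageNumber
{
    // Anti-pattern: ordinary bad input is handled by throwing and catching.
    public static int ParseWithException(string raw, int fallback)
    {
        try
        {
            return int.Parse(raw);
        }
        catch (FormatException)
        {
            return fallback;
        }
    }

    // Preferred: TryParse reports failure through its return value.
    public static int ParseWithTryParse(string raw, int fallback)
    {
        int value;
        return int.TryParse(raw, out value) ? value : fallback;
    }
}
```

Both methods return the fallback for input such as "abc"; only the first one pays for a thrown exception to get there.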
3. Thou shalt wrap calls to external resources, such as configuration values or session variables, in strongly typed properties.
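One way to read this rule, sketched under assumptions: a hypothetical SiteSettings class wraps a NameValueCollection (the type that AppSettings returns) so the rest of the code sees typed properties instead of string lookups. The key names and default values are invented for the example.

```csharp
using System.Collections.Specialized;

// Hypothetical wrapper; key names and defaults are assumptions.
public class SiteSettings
{
    private readonly NameValueCollection _values;

    public SiteSettings(NameValueCollection values)
    {
        _values = values;
    }

    // Callers get an int; the string parsing lives in one place.
    public int PageSize
    {
        get
        {
            int size;
            return int.TryParse(_values["PageSize"], out size) ? size : 20;
        }
    }

    // Missing keys come back as null from the collection, so supply a default.
    public string SupportEmail
    {
        get { return _values["SupportEmail"] ?? "support@example.com"; }
    }
}
```

In a web application the constructor argument would typically be the application's AppSettings collection; the same pattern works for session variables, with the cast hidden inside the property.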
4. Thou shalt not use DataSets in web applications.
5. Thou shalt wrap any disposable items in a using statement,
e.g. using (SqlConnection cnn = new SqlConnection())
6. Thou shalt employ and ALWAYS adhere to naming conventions!
7. Thou shalt not deploy a live site when compiled in ‘Debug’ mode.
8. If you HAVE to throw an exception, thou shalt use throw, not throw ex; or thy stack trace will be reset.
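A small sketch of the difference, using a hypothetical RethrowDemo class. The NoInlining attributes are only there so the JIT keeps each method as its own stack frame; they are not part of the technique.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical demo class, for illustration only.
public static class RethrowDemo
{
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void Fail()
    {
        throw new InvalidOperationException("original failure");
    }

    // "throw;" rethrows the caught exception and keeps the original trace.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void RethrowPreserving()
    {
        try { Fail(); }
        catch (InvalidOperationException) { throw; }
    }

    // "throw ex;" restarts the trace at this line, hiding Fail().
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void RethrowResetting()
    {
        try { Fail(); }
        catch (InvalidOperationException ex) { throw ex; }
    }

    // Returns the stack trace seen by the outermost handler.
    public static string TraceOf(bool preserve)
    {
        try
        {
            if (preserve) RethrowPreserving(); else RethrowResetting();
        }
        catch (InvalidOperationException ex)
        {
            return ex.StackTrace;
        }
        return string.Empty;
    }
}
```

The trace from TraceOf(true) still names Fail as the origin, while the trace from TraceOf(false) begins at RethrowResetting.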
9. Thou shalt disable viewstate on any control that does not need it.
10. Thou shalt always check a variable for null before using it.
11. Thou shalt use HtmlTextWriter when outputting HTML from code, not Response.Write.
12. Thou shalt use Server.Transfer, not Response.Redirect when moving to another URL programmatically.
13. Thou shalt use a user control only when it is required on more than one page.
14. Thou shalt turn off session state support if session state is not required.
15. Thou shalt use caching.
16. Thou shalt not hard code paths in code, e.g. string appPath = @"c:\test\blah.txt";
17. Thou shalt never catch the general Exception; thou shalt catch the appropriate exception type,
e.g.
try
{
    SqlConnection cnn = new SqlConnection(BadCnnString);
    cnn.Open();
}
catch (SqlException ex)
{
    // now handle it gracefully
}
I’ve got plenty more, but I’d better get back to work; will post more later.
18. Thou shalt store any application data (connection strings, etc.) in the configuration/appSettings section of web.config and reference it accordingly: System.Configuration.ConfigurationSettings.AppSettings["myKey"] (from .NET 2.0 on, System.Configuration.ConfigurationManager.AppSettings["myKey"]).
19. Thou shalt not create a variable if it is only used once.
21. Thou shalt catch all exceptions before they are thrown to the end-user. After all, there is a nice HttpApplication.Error event that is just for that purpose. And give the user a nice error page, not the generic ASP.NET one.
Or dispose of disposable items in the finally section of your exception handling statement (especially for the VB.NET people). And instantiate them as late as possible and dispose of them as soon as you don’t need them anymore (mainly DB connections, images, streams).
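The two disposal styles discussed here, a using statement and an explicit try/finally, can be compared in a short sketch. TrackedResource is a hypothetical stand-in for a connection, stream or image; it only records whether Dispose was called.

```csharp
using System;

// Hypothetical stand-in for a connection, stream or image.
public sealed class TrackedResource : IDisposable
{
    public bool Disposed { get; private set; }

    public void Dispose()
    {
        Disposed = true;
    }
}

public static class DisposalDemo
{
    // Style 1: the using statement disposes the resource when the block
    // exits, even if the block throws.
    public static bool DisposedByUsing()
    {
        TrackedResource seen = null;
        try
        {
            using (TrackedResource resource = new TrackedResource())
            {
                seen = resource;
                throw new InvalidOperationException("simulated failure");
            }
        }
        catch (InvalidOperationException)
        {
            // Swallowed only so the demo can inspect the resource afterwards.
        }
        return seen.Disposed;
    }

    // Style 2: the same guarantee written out by hand with try/finally.
    public static bool DisposedByFinally()
    {
        TrackedResource resource = new TrackedResource();
        try
        {
            // ... work with the resource here ...
        }
        finally
        {
            resource.Dispose();
        }
        return resource.Disposed;
    }
}
```

Both methods report the resource as disposed. The using form is shorter and keeps the variable scoped to the block, which also nudges you toward creating the resource late and releasing it early.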
Also don’t forget to turn off debug compilation in web.config:
<compilation debug="false" />
Off Topic:
Thou shalt not overgeneralize: “12. Thou shalt use Server.Transfer, not Response.Redirect when moving to another URL programmatically.” is IMHO not good; both can be used, and each has its distinctive uses.
Now, I’m always a little provoked when faced with absolutes. wwb has several times floated the idea that datasets are generally unwanted in web applications. I assume it’s because of performance concerns.
There are situations where a dataset may actually improve both your program structure and its performance, unless you go to extreme lengths with your plain C# objects.
One such example is when you have a hierarchical entity which is stored flattened in the database. This could be an agenda in a meeting planning system with several thousand or more agendas. You wouldn’t want to go to the database to find the “root” item, then find the next level (the sub-items of the root item) and go on like that recursively. That would make for a chatty database application. Chatty apps are the most common source of scalability and performance problems, especially when the database server is accessed across a network. Rather, you would like to get all the items of the agenda at once (since you know you’ll be using them all) and arrange them into the structure within the app. The dataset is ideal for this: you can define relations, switch off index generation while loading, and then simply walk the sub-item relation, using views with orderings. In my experience this scales a lot better than the chatty alternative. Now, you could read all the items into business objects, locate the root and try custom indexing/searching/sorting, but you would have a hard time keeping up with the optimized dataset.
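A rough sketch of the agenda idea, with the rows added in memory where a real application would fill the table from a single query. The table, column and relation names (Items, Id, ParentId, Title, SubItems) are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical demo; table, column and relation names are assumptions.
public static class AgendaTree
{
    // Stands in for one database roundtrip that loads every agenda item.
    public static DataSet BuildSample()
    {
        DataSet ds = new DataSet("Agenda");
        DataTable items = ds.Tables.Add("Items");
        items.Columns.Add("Id", typeof(int));
        items.Columns.Add("ParentId", typeof(int));
        items.Columns.Add("Title", typeof(string));

        items.Rows.Add(1, DBNull.Value, "Meeting");
        items.Rows.Add(2, 1, "Budget");
        items.Rows.Add(3, 1, "Staffing");
        items.Rows.Add(4, 2, "Travel costs");

        // Self-relation: an item's Id is the ParentId of its sub-items.
        // No constraints are created; only the navigation is needed here.
        ds.Relations.Add("SubItems", items.Columns["Id"], items.Columns["ParentId"], false);
        return ds;
    }

    // Walks the hierarchy in memory and returns indented titles.
    public static List<string> Flatten(DataSet ds)
    {
        List<string> lines = new List<string>();
        foreach (DataRow row in ds.Tables["Items"].Rows)
        {
            if (row.IsNull("ParentId"))
            {
                Walk(row, 0, lines);
            }
        }
        return lines;
    }

    private static void Walk(DataRow row, int depth, List<string> lines)
    {
        lines.Add(new string(' ', depth * 2) + (string)row["Title"]);
        foreach (DataRow child in row.GetChildRows("SubItems"))
        {
            Walk(child, depth + 1, lines);
        }
    }
}
```

AgendaTree.Flatten(AgendaTree.BuildSample()) produces the whole tree without touching the database again after the initial load, which is the point being made about avoiding chatty access.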
Another, related example is when you need to display a hierarchy of records from multiple tables. In SQL Server 2000 you could have only one reader open per connection. Walking a hierarchy with only one reader can be a bit of a challenge without in effect making several roundtrips to the database. You might also be tempted to get around the single-reader problem by using multiple connections, one per concurrently open reader. With three tables participating, this effectively cuts your connection pool to a third of its size, with a profound impact on the scalability of the app. Again, the dataset can be used to load the relevant records into several tables using a command batch (a single roundtrip, no chatting). With well-crafted relations, walking the hierarchy is almost a walk in the park.
In general - if used prudently - the dataset has the capacity to actually improve the performance of a program in the scenarios where you must scramble several tables - or even datasources - and make some structure of them before displaying.
Also the dataset can be great for capturing the business activity (business transaction) before the actual update is to take place.
In short I find the outlawing of the DataSet a little unbalanced. It’s not all black and white.
Now, my post about the DataSet was not meant to encourage its use in web apps. Most web pages have much simpler display requirements, and will work just fine (and faster) using data readers.
Also, if you are using an extensive system of business objects and have coded (or code-generated) custom collections to support eventual queries, the dataset has no place. DataSets essentially promote the 2-tier approach (I’d call it 2½-tier) where the app “knows” about the database structure. This is perfectly valid for smaller apps, but for more complex apps you should always build business objects. O/R mappers are a great way to get query capability, database abstraction and automatic caching.
22. Thou shalt initialize default values of page and control members as attribute values or in Page_Init, not in Page_Load.
Initializing variables in Page_Load will add viewstate pressure, because viewstate tracking has already begun by load time.
23. Thou shalt disable viewstate for controls that do not explicitly need to maintain viewstate.
This is especially true for display-only pages with no interactivity.
24. Thou shalt set EnableSessionState of the @Page directive to false if your page does not require access to session state.
Doing so saves the app server from spending CPU cycles on deserializing and serializing the session state before and after execution of the page handler.
25. Thou shalt set EnableSessionState of the @Page directive to ReadOnly if your page does require read access to Session state but does not change any values.
Doing so will allow the app server to schedule more requests in parallel and free it from serializing session state after page rendering.
26. Thou shalt minimize the number of roundtrips to the database server
For the ultimate in scalability and low latency you should only read from the database when necessary (e.g. not reading from the db if all controls’ state can be safely restored from viewstate). When you do access the database, you should consider reading more data at once, saving later database access. This could preferably be done in a command batch (supported by SQL Server).
Yes, that would be a problem, and beginners to object modelling (and object loading) would cause major problems with this. But beginners in anything cause problems. It’s not an object vs relational/dataset issue. It’s a lack-of-experience issue. It’s also not going to extreme lengths. It may seem challenging if you haven’t done it before, but it’s very basic.
Say you have a customer who has a number of subscriptions, each subscription is to a service, and those subscriptions need to be sent out on certain days of the week (this example is from an actual system).
The customer db object has a method Load that is called with the db resultset as a parameter. It goes through each row and loads the data for the customer, or calls Load on the Subscription object and adds that to the SubscriptionList object via the subscription db load method. It also calls the Load method of the SendDays, which loads its own data. This gives you an object model instead of a relational model, which is easy to manipulate and easier to picture when thinking about the problems, as it’s a logical object model instead of a relational one. And it is loaded with a single database call.
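A simplified sketch of that loading approach, not the actual system: one flattened, joined result set is read once and turned into a Customer with Subscriptions and SendDays. The column names are assumptions, and the per-object Load methods described above are collapsed into a single method to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical classes; column names are assumptions.
public class Subscription
{
    public int Id;
    public string Service;
    public List<DayOfWeek> SendDays = new List<DayOfWeek>();
}

public class Customer
{
    public string Name;
    public List<Subscription> Subscriptions = new List<Subscription>();

    // Builds the whole object graph from one joined result set with
    // one row per customer/subscription/send-day combination.
    public static Customer Load(IDataReader reader)
    {
        Customer customer = new Customer();
        Dictionary<int, Subscription> byId = new Dictionary<int, Subscription>();

        while (reader.Read())
        {
            customer.Name = (string)reader["CustomerName"];

            int subscriptionId = (int)reader["SubscriptionId"];
            Subscription subscription;
            if (!byId.TryGetValue(subscriptionId, out subscription))
            {
                // First row for this subscription: create it once.
                subscription = new Subscription();
                subscription.Id = subscriptionId;
                subscription.Service = (string)reader["ServiceName"];
                byId.Add(subscriptionId, subscription);
                customer.Subscriptions.Add(subscription);
            }

            // SendDay is assumed to be stored as 0 (Sunday) to 6 (Saturday).
            subscription.SendDays.Add((DayOfWeek)(int)reader["SendDay"]);
        }
        return customer;
    }
}
```

Any IDataReader works as input, whether it comes from a database command or, for testing, from DataTable.CreateDataReader(). The rest of the application then works with Customer and Subscription objects rather than rows.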
I’ve never needed to open a second connection for a query while another query was running (unless to another database). But more generally, I think this goes back to the first issue, which really was the concept that going to a database shouldn’t be done for each part of a problem, it should return all the data needed for the use case scenario in one call - in the same way that any Service Oriented Architecture should avoid chatty calls over a network.
Yeah, it’s there because it makes life easier. It’s not logical to ignore something like that.
My main problem is that it returns relational data rather than in a logical object model. This makes you think in terms of data rather than in terms of behaviour. A system does stuff. It doesn’t just hold data. And using data as you do in DataSets means it is harder to understand the conceptual entities in the system because the entities are data tables, not objects with behaviour.
I use data readers when processing large numbers of records in batch, because the cost of creating millions of objects is too much of a performance hit. You also can’t use datasets for the same reason. For smaller things, I use objects for the above reasons.
I really find that datasets are only useful for very small apps where you don’t need a conceptual model of the system because it’s so simple, and where performance is not something you need to think about even though a dataset keeps all the data in memory.
Absolutely. Datasets are ideal for simple apps that you want to finish and ship super fast.
Btw, excellent point on chatty interfaces. People often talk about chatty interfaces in larger-scale architectures, but miss the point that accessing anything on another machine (including the database) is exactly the same.
Very valid points. My main reason for discouraging the use of datasets is that they tend to be overkill for most situations one sees on the web. Your two examples are very good cases where a dataset would be a reasonable compromise. You also touch on another reason: using a dataset almost by definition encourages the data tier to bleed over into the presentation layer. Finally, beginners tend to overuse them because they are so fast and easy with the drag-n-drop tools. If you have ever inherited one of these apps to fix or integrate, you know what kind of nightmares I am talking about.
So, just like every one of the commandments, there are a number of exceptions to the rules. One should just understand all of Leviticus and Deuteronomy before one starts using the exception.
As for massive data processing, one should be able to generate large SQL update/insert/delete statements to handle sweeping changes to sets. If one is calculating values on a per-row basis and requiring a cursor, one really should rethink how one is using that bit of SQL.
And I don’t ignore the dataset. I abuse the living daylights out of datasets as temporary storage for non-web projects where one tends to be dealing with less data and definitely far fewer concurrent users. In fact, in the non-web world, the disconnected data model of the data set is a very handy thing.
Thou shalt use Server.Transfer, not Response.Redirect when moving to another URL programmatically.
Any particular reason for this? In terms of user experience, redirect is more friendly. I think it really is a situational thing.
Thou shalt only use user controls when it is required on more than one page.
Again, any particular reason for this? I find user controls make sense from an architectural perspective. Also note that I have a slightly different view than most here, because I mainly work on apps for my current employer which I will have to live with for years, rather than on love-em-and-leave-em consulting gigs.
Thou shalt catch all exceptions before they are thrown to the end-user. After all there is a nice HttpApplication.Error event that is just for that purpose. And give user a nice error page, not the ASP.NET generic one.
Very good point, though a bit difficult to follow. Depending on how you read it, it could be construed to say “catch all errors and hide them”, which is about the worst thing one can do. If your data-driven web app starts having database connection issues, you really should error out before corruption occurs. In any case, I think what you really mean is:
Thou shalt bubble exceptions where appropriate to a global error handler which shall wrap the exception in a pretty way for the user and log/notify the administrators.
I would also extend #26:
Thou shalt minimize the number of roundtrips to external resources.
The database server could just as easily be a web service call (which can take 10x as long).
And I will add one (which summarizes a lot of recent points):
I must admit that I have completely left Transfer out of my arsenal, except for very, very special cases.
I know the redirect generates an extra roundtrip, but it merely incurs the overhead of the HTTP request itself. The server-side code that must be executed will be the same. This extra overhead is IMHO outweighed by the potential problems with Transfer. Among the issues are:
It confuses the address line of the browser. Bookmarking a page to which the server transferred will bookmark the previous page URL, creating usability problems and confusion.
It tends to make the browser “back” and “refresh” buttons behave in an unintuitive way, e.g. if I use the Transfer back to a “list” page after a details edit page, the refresh button will try to resubmit the last edit.
It can interfere with standard browser statistics programs like webhound. The actual url read from logfiles will not correspond to the pages actually shown.
You cannot trust the securing of a single page (by placing it in a restricted folder). Pages that transfer to another page will bypass the folder restriction. While this can certainly be used to fine-tune security, it can most certainly also result in a poor overview of your security structure.