I developed two ASP.NET applications using LINQ. One connects to MS SQL Server, the other to a proprietary in-memory structure.
Both applications work with a table of 3 fields containing 500,000 records (the memory structure is identical to the SQL Server table). The controls used are the regular GridView and ObjectDataSource.
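For reference, the paging plumbing in both applications follows the usual ObjectDataSource pattern. This is a simplified sketch, not my exact code; Record, RecordRepository, and the Source collection are placeholder names:

```csharp
using System.Collections.Generic;
using System.Linq;

// The ObjectDataSource is declared with EnablePaging="true",
// SelectMethod="GetPage" and SelectCountMethod="GetCount", so on every
// page click it calls these methods with the default parameter names.
public class Record
{
    public int Id;
    public string Name;
    public int Value;
}

public static class RecordRepository
{
    // In one application this is backed by SQL Server, in the other
    // by the proprietary in-memory structure.
    static readonly List<Record> Source = new List<Record>();

    public static List<Record> GetPage(int startRowIndex, int maximumRows)
    {
        return Source.Skip(startRowIndex).Take(maximumRows).ToList();
    }

    public static int GetCount()
    {
        return Source.Count;
    }
}
```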
In both applications I measure the average time needed to process each paging click.
The LINQ + MS SQL application takes 0.1 sec per page change.
The LINQ + memory structure application takes 0.8 sec per page change.
This is a shocking result. Why does the application handling data in memory work 8 times slower than the one using the hard drive? Can anybody tell me why that happens?
The parameters involved in gauging performance are far too complex to be reduced to a simple comparison of hard-disk and memory timings.
When reading 500,000+ records, the processor cache is largely irrelevant: you will constantly be reading new areas of memory. The processor has other tasks to perform as well, so between “pages” (requests from the client?) the cache will have been evicted and repopulated many times over with foreign data.
If the SQL server can read in keyed order, paging will be really, *really* fast. LINQ2SQL understands the .Skip() and .Take() operations, which are used when paging is enabled and the data control is bound to a LINQ2SQL source. Typically SQL Server will need at most 2-3 disk accesses if the pages are not already in memory (and the index pages will be after the first access).
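To make that concrete, here is a sketch of a paged LINQ2SQL query and, roughly, the SQL it produces; MyDataContext and Records are invented names for a context mapped to the 3-field table:

```csharp
using System.Linq;

int startRowIndex = 1000, pageSize = 20;

using (var db = new MyDataContext())
{
    var page = db.Records
                 .OrderBy(r => r.Id)   // keyed order
                 .Skip(startRowIndex)  // translated into the SQL below,
                 .Take(pageSize)       // not evaluated in memory
                 .ToList();
}

// The statement sent to the server looks roughly like this
// (SQL Server 2005 ROW_NUMBER paging):
//
//   SELECT [t1].[Id], [t1].[Name], [t1].[Value]
//   FROM (SELECT ROW_NUMBER() OVER (ORDER BY [t0].[Id]) AS [ROW_NUMBER],
//                [t0].[Id], [t0].[Name], [t0].[Value]
//         FROM [Records] AS [t0]) AS [t1]
//   WHERE [t1].[ROW_NUMBER] BETWEEN @p0 + 1 AND @p0 + @p1
//   ORDER BY [t1].[ROW_NUMBER]
```

With an index on the key, the server can seek straight to the requested rows instead of scanning.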
Contrast that with an in-memory structure (a DataTable?) where the ObjectDataSource has to iterate through a large part of the set to get to the correct selection. This is especially true if you are not using a structure with a linear layout, e.g. if you are using a List of some type.
You should be able to outperform the SQL access if you keep the records in an array (which has a linear layout): indexing a specific item in an array is considerably faster.
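A minimal sketch of what I mean, assuming the records are already loaded into a plain array:

```csharp
using System;
using System.Collections.Generic;

static class ArrayPager
{
    // Fetching a page is a direct offset into the array:
    // O(pageSize) work no matter where the page sits in the set,
    // instead of walking the collection up to the requested row.
    public static List<T> GetPage<T>(T[] all, int startRowIndex, int maximumRows)
    {
        int count = Math.Min(maximumRows, all.Length - startRowIndex);
        var page = new List<T>(count);
        for (int i = 0; i < count; i++)
            page.Add(all[startRowIndex + i]);   // direct index, no scan
        return page;
    }
}
```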
Sorry, I use LINQ to SQL when connecting to the database. I understand what you're saying: an optimized SQL statement and indexing help considerably. But (see my reply to Viflux) the performance gain would then have to be quite striking: almost 10 mln times!
How do you read from the SQL server?
No, I don't use LINQ2SQL; in both cases I use LINQ + ObjectDataSource.
I could provide anybody with both projects to look through; the code is probably a bit too large to post here.
I'm just surprised, because according to my previous knowledge, the typical read performance figures are:

Hard disk: 100-200 reads per sec
Memory: 10-20 mln reads per sec

When the processor cache comes into play, memory read performance rises to 100 mln reads per second and more.
The volume of the in-memory table I've got is about 2 MB (4 bytes × 500,000 records), so it should definitely fit in the processor cache while being read.
If I'm not mistaken, then to outperform the in-memory handling, SQL Server would have to speed up HD reading more than a million times. Is that even possible?
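(That figure is just the numbers above: 100,000,000 cached reads/sec ÷ 100 disk reads/sec = 1,000,000, so the disk-based path would need a million-fold advantage from somewhere just to break even.)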
It's impossible to say without some knowledge of the “in-memory” data structure and of how you performed the test.
Given that MSSQL has been fine-tuned for performance over 20 years, and that a lot of MS technologies have been optimized around their knowledge of MSSQL, it's not surprising that it comes out ahead.
Based on this scanty information, I would venture a guess that you are comparing LINQ2SQL with LINQ to Objects.
LINQ2SQL uses a neat trick with expression trees: it actually generates an optimized SQL statement (including a generated WHERE clause) and sends it off to the server, which returns only the rows that satisfy the criteria. In essence you have offloaded the problem to the server, and the database may even have indexes that help the query along.
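You can watch this happen: LINQ2SQL's DataContext.Log property echoes every generated statement. A sketch, again with MyDataContext/Records as invented names:

```csharp
using System;
using System.Linq;

using (var db = new MyDataContext())
{
    // Echo every generated SQL statement to the console, so you can
    // verify that the filtering really happens on the server.
    db.Log = Console.Out;

    var rows = db.Records
                 .Where(r => r.Value > 10)  // becomes a SQL WHERE clause
                 .ToList();                 // query executes here, server-side
}
```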
With LINQ to Objects, unless you optimize it yourself, LINQ has no option but to iterate through the entire collection and evaluate the criteria for each item.
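The same query shape over an in-memory collection, to illustrate the cost; a sketch with made-up data:

```csharp
using System.Collections.Generic;
using System.Linq;

static class InMemoryPager
{
    // LINQ to Objects executes this item by item: Where tests every
    // element, then Skip enumerates and discards elements one at a
    // time before Take finally yields the page.
    public static List<int> PageOfMatches(IEnumerable<int> items,
                                          int startRowIndex, int pageSize)
    {
        return items.Where(i => i > 10)   // full scan, delegate call per item
                    .Skip(startRowIndex)  // walks and throws away rows
                    .Take(pageSize)
                    .ToList();
    }
}
```

So for a page deep into 500,000 records, the in-memory version does hundreds of thousands of per-item steps, while the server version does a handful of index seeks.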