Thursday, January 28, 2010

SQL Query Optimization FAQ Part 1 (The SQL Plan)


Introduction and Goal
What is a SQL execution Plan?
Can we understand step by step how SQL Query plan is created and executed?
Ok, so where do I start from?
What is a Table Scan (Operator number 1)?
Can performance increase by adding a unique key for table scans?
What are physical and logical reads?
Table scan doesn’t look to be efficient?
What is a seek scan (Operator number 2)?
Can we see a seek scan practically?
We have 2 types of indexes, so do we have different operators for them?
What are heap tables in SQL Server?
What is RID Lookup heap in SQL query plan?
What is a bookmark lookup?
Lookups are costly operations, how can we eliminate the same?
What are covering indexes?
Do we have any other efficient way of creating covering indexes?
How does size of the index affect performance?
How does a unique column increase index performance?
How important is the order of indexes?
What is the preferred data type for indexes?
What are indexed views and how do they improve performance?
Can we see practically how indexed views increase performance?
How do we specify index hints?
Do index hints increase performance?
References
Detail links which discuss Physical and logical read

Introduction and Goal

In this article we will first try to understand what a SQL plan is and how it is created, and then we will move towards understanding how to read the SQL plan. As we read the SQL plan we will try to understand different operators like table scan, index seek, clustered index scan, RID lookup etc. We will also look into the best practices associated with clustered and non-clustered indexes and how they function internally. We will practically see how indexed views increase performance and in what scenarios we should use them.
Here’s my small gift for all my .NET friends: a complete 400-page FAQ ebook which covers various .NET technologies like Azure, WCF, WWF, Silverlight, WPF, SharePoint and lots more. Click here.
Video explanation of this article
In case you do not want to read the whole article you can view the below videos:
  1. This video runs through the basic concepts of the SQL plan, iterators and logical reads; click here to see the same.
  2. This video explains how table scan performance is improved by unique keys; click here to see the same.


What is a SQL execution Plan?

Every SQL query is broken down into a series of execution steps called operators. Each operator performs a basic operation like insertion, search, scan, update, aggregation etc. There are two kinds of operators: logical operators and physical operators. Logical operators describe the operation at a conceptual level, while physical operators are the actual logic / routines which perform the action.

On the query plan you always see physical operators. One logical operator can map to multiple physical operators. It can also go vice versa, but that’s a rare scenario. Some operators are both physical as well as logical. You can check out all the logical and physical operators for SQL Server at http://msdn.microsoft.com/en-us/library/ms191158.aspx. The table below shows some sample mappings between logical and physical operators.

Can we understand step by step how SQL Query plan is created and executed?
There are three phases through which a SQL query is parsed and executed.

Parse: - The first phase is to parse the SQL query for syntax and create a query processor tree which defines the logical steps to execute the SQL. This process is also called the ‘algebrizer’.
Optimize: - The next step is to find an optimized way of executing the query processor tree defined by the ‘algebrizer’. This task is done by the ‘optimizer’. The optimizer takes data statistics, such as how many rows there are, how many unique values exist in those rows, whether the table spans more than one page etc. In other words, it takes information about the data itself. These statistics, together with the query processor tree, are used to prepare a cost-based plan in terms of resources, CPU and I/O. The optimizer generates and evaluates many plans using the data statistics, query processor tree, CPU cost, I/O cost etc. to choose the best plan.
The optimizer arrives at an estimated plan, and for this estimated plan it tries to find an actual execution plan in the cache. The estimated plan is basically what comes out of the optimizer, while the actual plan is the one generated once the query is actually executed.
Execute: - The final step is to execute the plan which is sent by the optimizer.
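As a rough hands-on illustration of estimated versus actual plans, you can ask SQL Server for either one from a query window; the sketch below assumes the same Customers table that is used in the examples later in this article.

-- Estimated plan only: the query is compiled but NOT executed.
SET SHOWPLAN_XML ON;
GO
SELECT * FROM dbo.Customers WHERE CustomerCode = 524412;
GO
SET SHOWPLAN_XML OFF;
GO

-- Actual plan: the query is executed and the plan that was used is returned with it.
SET STATISTICS XML ON;
GO
SELECT * FROM dbo.Customers WHERE CustomerCode = 524412;
GO
SET STATISTICS XML OFF;
GO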
Ok, so where do I start from?
Till now we have understood operators and the way a SQL plan is created and executed. So the first thing is to understand the different operators and their logic. As discussed previously, operators form the basic unit of your SQL plan; if we are able to understand them we can optimize our SQL to a great extent. So let’s understand the basic and important operators.

What is a Table Scan (Operator number 1)?
Before we try to understand the table scan operator, let’s first see how to view a SQL plan. Once you are in SQL Server Management Studio, click on new query and write the SQL for which you want to see the plan. Click on the query plan icon displayed on the query toolbar as shown below. Once you hit the icon you will see the query plan.

Now that we have understood how to see a query plan, let’s try to understand the first basic operator, i.e. the table scan. In order to understand the same we have created a simple companies table with a company code and a description. Do not create any primary key or indexes. Now write a simple SQL with a select clause on one of the columns; we have selected company code currently, as shown in the above figure.
If you hit the query plan icon you will see a table scan in your query plan. Now let’s try to understand what it signifies.
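For reference, below is a minimal sketch of the sample table and query assumed in this walkthrough; the exact column names and data types are assumptions, since the original shows them only in a figure.

CREATE TABLE dbo.Companies
(
    CompanyCode INT,           -- deliberately no primary key and no index
    Description VARCHAR(100)
);

SELECT * FROM dbo.Companies WHERE CompanyCode = 500019;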

The table scan operator reads the table row by row, comparing each record against the search criteria. For instance, in our search criteria we have given ‘500019’ as the company code; the scan moves row by row until it finds that record and then sends it as output to the client.
Because the table does not have a primary key or unique constraint, SQL Server cannot assume there is only one match, so it continues scanning ahead to find more matches even after the first one is found.

Can performance increase by adding a unique key for table scans?
Yes, the performance increases if you add a unique key on the column used in the table scan search criteria. Let’s see an actual demonstration of the same. Right click on the table field and create a unique key on the customer code field as shown in the below figure.
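If you prefer T-SQL over the designer, the step above is roughly equivalent to the following; the constraint name is an assumption.

ALTER TABLE dbo.Customers
ADD CONSTRAINT UQ_Customers_CustomerCode UNIQUE (CustomerCode);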
Now let’s go to the query window and turn statistics on using ‘set statistics io on’. With the statistics on, we execute our query with the customer code criteria as shown in the below code snippet.
set statistics io on
SELECT TOP 1000 [CustomerCode]
,[CustomerName]
FROM [CustomerNew].[dbo].[Customers]
where customercode=524412

If you click on messages you will see logical reads as ‘3’.
(1 row(s) affected)
Table 'Customers'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Now let’s remove the unique key and run the query again; the logical reads are 17, which means performance has degraded as compared with the unique key.
(1 row(s) affected)
Table 'Customers'. Scan count 1, logical reads 17, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

What are physical and logical reads?


The main work of a database is to store and retrieve data; in other words, a lot of reads from and writes to the disk. Reads and writes consume a lot of resources and take a long time to complete. SQL Server therefore allocates memory for a cache to speed up I/O operations. Every instance of SQL Server has its own cache.
When data is read it is kept in the SQL cache until it is no longer referenced or the memory is needed for some other purpose. So rather than reading from the physical disk, data is read from the SQL cache where possible. In the same way for writing, data is written back to disk only if it has been modified.

So when data is fetched from the SQL cache it is termed a logical read, and when it is read from the physical disk it is termed a physical read.
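A rough way to observe both kinds of reads on a development box is sketched below, again assuming the sample Customers table; DBCC DROPCLEANBUFFERS empties the buffer cache and needs sysadmin rights, so never run it on a production server.

SET STATISTICS IO ON
CHECKPOINT                -- write any dirty pages to disk first
DBCC DROPCLEANBUFFERS     -- empty the buffer cache (development use only)
-- First run: the pages come from disk, so physical reads show up in the messages.
SELECT * FROM dbo.Customers WHERE CustomerCode = 524412
-- Second run: the pages are now cached, so only logical reads show up.
SELECT * FROM dbo.Customers WHERE CustomerCode = 524412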
Table scan doesn’t look to be efficient?

The table scan operator is fine if the number of records in the table is small. If the number of records is large, a table scan is very inefficient as it needs to scan row by row to get to the record. So for tables with a large number of rows, the seek operator is preferred over the table scan operator.

What is a seek scan (Operator number 2)?
An index seek does not scan all the rows to get to a record; it uses indexes and B-tree logic to get to the record. Below is a sample diagram which explains how the B-tree fundamentally works. The diagram shows how an index would work for the numbers 1-50:
• Let us say you want to search for 39. SQL Server will first start from the root node.
• It sees that the number is greater than 30, so it moves to the 50 node.
• Further, in the non-leaf nodes it compares whether the value is more or less than 40. As it is less than 40, it loops through the leaf nodes which belong to the 40 node.
A table scan reads all rows, while a seek touches comparatively few rows. For instance, to get to the record 39 it only scanned about 9 records rather than travelling through all 50 records.

Can we see a seek scan practically?
On the same sample as above, let’s create a clustered index on the company code field and see the SQL plan again by hitting the SQL plan icon.
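The designer step above corresponds roughly to the following T-SQL; the index and table names are assumptions.

CREATE CLUSTERED INDEX IX_Companies_CompanyCode
ON dbo.Companies (CompanyCode);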
If you view the SQL plan for the select query with company code, you will see a clustered index seek operator as shown in the below figure. In other words, it will use B-tree logic for searching rather than traversing row by row.
In case you have a non-clustered index on the company code, it will show a non-clustered index seek operator as shown below. You may be wondering what the RID lookup and nested loop are in the query plan; we will come to that later on. The below figure indicates that it is using the non-clustered index seek logical operator.

What are heap tables in SQL Server?

A table with no clustered index is termed a heap table. The data rows of a heap table are not stored in any sorted order, and because of this there is a large overhead in accessing heap tables.
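If you want to check which of your tables are heaps, one way (a sketch, not from the original article) is to look for index_id = 0 in sys.indexes:

SELECT OBJECT_NAME(object_id) AS TableName
FROM sys.indexes
WHERE index_id = 0                                   -- 0 means the table has no clustered index
  AND OBJECTPROPERTY(object_id, 'IsUserTable') = 1;  -- ignore system objects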
What is RID Lookup heap in SQL query plan? (Operator Number 3)

A RID lookup is typically seen when you use non-clustered indexes with join queries. For instance, below is a simple join between the customer and address tables on ‘custcode’.
SELECT Customers.CustomerName, Address.Address1
FROM Address INNER JOIN Customers ON Address.Custcode_fk = Customers.CustomerCode

If you do not put a clustered index on the primary table, i.e. customer, and you look at the SQL plan, you should see the RID lookup (heap) as shown below. We will shortly understand what it means; just try to get a glimpse of this operator first.


In order to understand the RID lookup we need to first understand how non-clustered indexes work with clustered indexes and heap tables. Below is a simple figure which describes the working and their relationships.

Non-clustered indexes also use the B-tree structure to search data. In a non-clustered index the leaf node holds a ‘row id’ which points to different things depending on two scenarios:
Scenario 1: - If the table which holds the primary key in the join has a clustered index on the join key (currently the join key is custcode), then the leaf node ‘row id’ will point to the index key of the clustered index, as shown below.

Scenario 2: - If the table which holds the primary key does not have a clustered index, then the non-clustered index leaf node ‘row id’ will point to the actual row on the heap. As the data is stored in a heap, it uses a lookup (i.e. the RID lookup) to get to the actual row. Below is a query plan for a join without clustered indexes.



What is a bookmark lookup?

As discussed in the previous section, when a query searches on a column which is not part of the non-clustered index, a lookup is required: either a lookup on the clustered index, or a lookup on the heap, i.e. a RID lookup. The older term for these lookups was ‘bookmark’ lookup. If you look at a SQL plan in SQL Server 2000 you will see a bookmark lookup in your query plan. In SQL Server 2000 the bookmark lookup used a dedicated iterator to determine whether the table was a heap table or an index table and change the search logic accordingly. In short, the index lookup and the RID lookup are nothing but types of bookmark lookup.

Lookups are costly operations, how can we eliminate the same?
Yes, lookups are costly operations and should be avoided as far as possible. To avoid a RID lookup or clustered (key) lookup we can use covering indexes. Below is a simple table which has two fields, customer code and customer name. A clustered index is defined on customer code and a non-clustered index is defined on customer name, as shown in the figure below.


Let’s fire the below query and see the query plan for the same. Please note we are searching on the non-clustered index column.

SELECT *
FROM [CustomerNew].[dbo].[Customers] where CustomerName='20 microns'

As the non-clustered index on customer name does not contain the other column data, the plan has to do a lookup to fetch the remaining columns. This lookup happens because the customer name non-clustered index does not have information about the clustered index data. If we can somehow make the non-clustered index aware of the clustered index data, the lookup can be avoided.



In order that the non-clustered index, i.e. customer name, is aware of the clustered index, i.e. customer code, we can create a combined non-clustered index on customer name and customer code as shown in the below figure.
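The figure’s designer step would look roughly like this in T-SQL; the index name is an assumption.

CREATE NONCLUSTERED INDEX IX_Customers_Name_Code
ON dbo.Customers (CustomerName, CustomerCode);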

Now if you generate the plan you can see that the lookup is completely eliminated. The composite index in this scenario acts as a covering index.

What are covering indexes?

As explained above, a covering index is an index which contains all the columns a query needs, so it removes the need for the lookup operation.


Do we have any other efficient way of creating covering indexes?

Creating composite non-clustered indexes has certain drawbacks:
• You cannot create composite indexes on data types like varchar(max) and XML.
• Every column added to the index increases the key size, which further impacts performance.
• The index can exceed the index key size constraint of 900 bytes.

The best practice for creating a covering index is to use the INCLUDE keyword, creating the non-clustered index as shown in the below code snippet.
You can see that we have included the customer code column as part of our non-clustered index on customer name; when you re-run the plan you can see that the bookmark lookup is removed and the non-clustered index alone covers the query.

CREATE NONCLUSTERED INDEX [MyIndex] ON [dbo].[Customers]
(
CustomerName
) INCLUDE (CustomerCode) ON [PRIMARY]



How does size of the index affect performance?

Let us make a statement before we answer this question: ‘the larger the index key, the more performance will degrade’. Let’s do a small practical example to understand the impact. Below is a simple customer table with customer code and customer name fields, with the data types shown in the table.
* Please note that customer code has a clustered index and the data type is int.


Now fire the below SQL statement. In the script, “Customer” is the database and “CustomerTable” is the table. This SQL statement gives details about the number of pages used to store the index.

Just to revise for people who are new to SQL Server: pages are the fundamental unit of storage. In other words, the disk space allocated on the hard disk is divided into pages. A page is normally 8 KB in size and holds the data rows of SQL Server. If you can fit your rows into fewer pages you do not need to jump from one page to another, which increases performance.

Now if you fire the below SQL statement it will give you the number of pages consumed to store the clustered index. You can see from the results below that the clustered index with ‘int’ uses 12 pages.
SELECT  Indextable.Name
       ,Indextable.type_desc
       ,PhysicalStat.page_count
       ,PhysicalStat.record_count
       ,PhysicalStat.index_level
FROM   sys.indexes Indextable
       JOIN sys.dm_db_index_physical_stats(DB_ID(N'Customer'),
                                           OBJECT_ID(N'dbo.CustomerTable'),NULL, NULL, 'DETAILED') AS PhysicalStat
       ON Indextable.index_id = PhysicalStat.index_id
WHERE Indextable.OBJECT_ID = OBJECT_ID(N'dbo.CustomerTable')


Now let’s change the data type to numeric as shown in the below table.


Now if you fire the above query you can see the number of pages has increased, which means the SQL engine has to jump between more pages to fetch the data rows. In other words this decreases performance.


If you change the data type to ‘char’ or ‘varchar’ the ‘page_count’ becomes worse still. The below figure shows how the number of pages has increased because of the data type.

So, concluding: the larger the index key, the more pages it needs and the worse the performance. Avoid data types like ‘char’ and ‘varchar’ for index keys; ‘int’ is the preferred data type for indexes.
If the clustered index key is small we get the following advantages:
  • Reduced I/O, as the number of 8 KB pages decreases.
  • Better use of the cache, as more index pages fit in memory.
  • Less storage space required.


How does a unique column increase index performance?


Creating indexes on columns with unique values increases index performance, so unique columns are always the best candidates for clustered indexes. Let’s demonstrate how creating a clustered index on a unique column increases performance as compared to creating the index on a column with non-unique values.

Below is a simple query fired on the customer table, which has two fields, ‘active’ and ‘customercode’. The ‘customercode’ field has unique values while the ‘active’ field has non-unique values; it holds only two values, ‘1’ and ‘0’. Below is the SQL statement which is fired with statistics IO on.


set statistics io on
SELECT *
FROM Customers where active=1 and CustomerCode=500008

If you fire the above statement with the clustered index on the ‘customercode’ field, the logical reads are 2.


(1 row(s) affected)
Table 'Customers'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

If you fire the above SQL statement with the clustered index on the ‘active’ field instead, the logical reads are 16. In other words the performance is hampered.




(1 row(s) affected)
Table 'Customers'. Scan count 1, logical reads 16, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.


How important is the order of indexes?


When you create composite indexes, the order of columns in the composite index is also of prime importance. For instance, for the below customer table we have created an index with ‘customercode’ as the first column and ‘customerid’ as the second column.



The leading (first) column of a composite index determines how the index is sorted, so it should be the column your queries filter on most often and most selectively. For example, if most of the SQL statements in your project look like the one below and are driven mainly by ‘customerid’, then ‘customerid’ should be the first column in the composite index.

Select * from Customer where Customerid=1 and CustomerCode=9898998

If ‘customercode’ is the column most of your queries are driven by, as shown below, then the composite key should have ‘customercode’ as the first column of your index key.

Select * from Customer where CustomerCode=9898998 and Customerid=1
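As a sketch of the two choices discussed above (index names are assumptions), the leading column is simply the first one listed when the index is created:

-- If most queries are driven by Customerid, lead with it:
CREATE NONCLUSTERED INDEX IX_Customer_Id_Code ON dbo.Customer (Customerid, CustomerCode);

-- If most queries are driven by CustomerCode, lead with that instead:
CREATE NONCLUSTERED INDEX IX_Customer_Code_Id ON dbo.Customer (CustomerCode, Customerid);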
What is the preferred data type for indexes?

The ‘int’ data type is the preferred data type for indexes as compared to string data types like ‘char’ and ‘varchar’. If needed you can also choose ‘bigint’, ‘smallint’ etc. Arithmetic comparisons are faster than string comparisons.
What are indexed views and how do they improve performance?
An indexed view represents the output of a select statement, just like an ordinary view, but its result set is physically stored. In general when we create a view, the view does not store any data, so when we query the view it queries the underlying base tables. But when we create an indexed view, the result set is persisted on disk, which can save a lot of overhead. So let’s understand the advantages and disadvantages of indexed views.
Advantages
By creating an indexed view you can store pre-computed values, expensive computations and joins in the indexed view so that they do not need to be recalculated again and again.



Disadvantages
Indexed views are not suitable for tables which are highly transactional, because the SQL engine also needs to update the indexed view whenever the base table changes.

Can we see practically how indexed views increase performance?

Below is a simple query which calculates a count per ‘registerdate’ from the customers table.
SELECT dbo.Customers.RegisterDate as RDate,
count(*) as CountCust
FROM dbo.Customers
group by dbo.Customers.RegisterDate

In order to understand the same we will do the following:

• We will run the above aggregate select query on the table without indexes and measure the logical reads for the same. We will also measure the logical reads for an insert on the same table.
• In the next step we will create a clustered index and again record the logical reads for the select and the insert.
• Finally we will create an indexed view and again measure the logical reads for the select and the insert.



Benchmarking without indexes
So let’s fire the above aggregate statement without any indexes, with statistics io on, as shown in the below snippet.

Set statistics io on
go
SELECT dbo.Customers.RegisterDate as RDate, count(*) as CountCust
FROM dbo.Customers
group by dbo.Customers.RegisterDate

If you see the SQL plan it uses hash match.




The total logical reads for the select query for the above operation is 16.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Customers'. Scan count 1, logical reads 16, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

For the insert, the total logical reads are 1.
set statistics io on
insert into Customers values(429,'Shiv','1/1/2001')
Table 'Customers'. Scan count 0, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1 row(s) affected)
Benchmarking with clustered indexes
Now let’s create a clustered index on the ‘customers’ table, set statistics io on and see the plan. You can see that it uses an index scan and a stream aggregate logical operator to produce the results.



The select query takes 10 logical reads to execute the above SQL query.
(6 row(s) affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Customers'. Scan count 1, logical reads 10, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

With the clustered index, the insert takes 2 logical reads.
Table 'Customers'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1 row(s) affected)

Benchmarking with indexed view

Now let’s create an indexed view and benchmark the select and insert queries again. Below is the SQL snippet to create the indexed view. Please note that ‘count’ will not work in an indexed view; we need to use ‘count_big’. Some functions like ‘avg’ are also not allowed in indexed views.

CREATE VIEW IndexedView
WITH SCHEMABINDING as
SELECT dbo.Customers.RegisterDate as RDate, COUNT_BIG(*) as CountCust
FROM dbo.Customers
group by dbo.Customers.RegisterDate

GO

CREATE UNIQUE CLUSTERED INDEX NonInx ON IndexedView(RDate);

GO

If you look at the plan you will see that there are no aggregate logical operators, because the aggregation is pre-calculated in the indexed view.
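One caveat worth adding here: on editions other than Enterprise and Developer, the optimizer does not automatically match queries to an indexed view; you have to query the view directly with the NOEXPAND hint, roughly as below.

SELECT RDate, CountCust
FROM dbo.IndexedView WITH (NOEXPAND);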



The select statistics show the logical reads as 1.
Table 'Customers'. Scan count 0, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

The logical reads for the insert increase considerably with the indexed view in place, as shown below.
Table 'IndexedView'. Scan count 0, logical reads 6, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 3, logical reads 12, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Customers'. Scan count 0, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1 row(s) affected)

So, concluding: indexed views are good for aggregate select queries, where they increase performance. On the other hand, if your table is highly transactional then indexed views can decrease your performance. Below is a simple performance impact graph plotted using the above benchmark data. You can see how the logical reads for the select decrease with the indexed view, and how the logical reads for the insert increase with the indexed view.



Performance of inserts is impacted on indexed views because any update to the base table also needs to be reflected in the indexed view, which adds overhead.



Note: - If your base table is highly transactional, indexed views will decrease the performance of your insert, update and delete queries.

How do we specify index hints?
Do index hints increase performance?
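An index hint tells the optimizer to use a specific index instead of letting it choose one. A minimal sketch, assuming the MyIndex covering index created earlier:

SELECT CustomerName, CustomerCode
FROM dbo.Customers WITH (INDEX (MyIndex))
WHERE CustomerName = '20 microns'

A hint only improves performance when you genuinely know better than the optimizer, for example when stale statistics are pushing it towards a poor plan. In most cases the optimizer’s own choice is as good or better, so use hints sparingly and re-test them as the data changes.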

References

• http://en.wikipedia.org/wiki/Query_plan
• Execution plan basics by Grant Fritchey: http://www.simple-talk.com/sql/performance/execution-plan-basics/
• The best guide on logical and physical operators: http://msdn.microsoft.com/en-us/library/ms191158.aspx
• Inside Microsoft SQL Server 2005: Query Tuning and Optimization by Kalen Delaney

Detail links which discuss Physical and logical read

http://netindonesia.net/blogs/kasim.wirama/archive/2008/04/20/logical-and-physical-read-in-sql-server-performance.aspx
http://msdn.microsoft.com/en-us/library/aa224763(SQL.80).aspx
http://msdn.microsoft.com/en-us/library/ms184361.aspx

Sunday, January 24, 2010

Simple 7 steps to run your first Azure Blob Program












Introduction
Step 1:- Ensure you have things in place
Step 2:- What will we do?
Step 3:- Create a web role
Step 4:- Set the blob connection string
Step 5:- Create the blob on webrole onstart
Step 6:- Code your ASP.NET UI
Step 7:- Run the project and enjoy


Introduction

In this article we will create our first program using Azure blobs. The article creates a simple web page where we upload image files which are stored in Azure blobs. We have also created a simple search text box which helps us search the image blobs by the image file name.

In case you are a complete newbie to Azure you can download my two Azure basics videos which explain what Azure is all about: Azure FAQ Part 1 :- Video1, Azure FAQ Part 2 :- Video2.

Please feel free to download my free 500 question and answer eBook which covers .NET, ASP.NET, SQL Server, WCF, WPF, WWF @ http://www.questpond.com/.



Step 1:- Ensure you have things in place

In case you are a complete fresher to Azure, please ensure you have all the pre-requisites in place. You can read the below article to get the basic prerequisites:
http://computerauthor.blogspot.com/2010/01/simple-5-steps-to-run-your-first-azure.html


Step 2:- What will we do?

Azure blobs help to store large items like files; in other words, it is a file storage system. In this article we will create a simple program to upload image files into the Azure blob system.


Step 3:- Create a web role

The first step is to create a web role project. In case you are a fresher in Azure, you can go through http://computerauthor.blogspot.com/2010/01/simple-5-steps-to-run-your-first-azure.html to understand how to create a web role project.
So let’s create a simple project with the name ‘BlobStorage’. Once you have created the project it creates two projects: one is the cloud service project and the other is the web role project. The cloud service project has all the necessary configuration needed for your cloud service, while the web role project is your ASP.NET project.






Step 4:- Set the blob connection string

The next step is to define a blob connection string in the service configuration file. So expand the ‘BlobStorage’ project, right click on roles and select properties.

Once you select properties, go to the settings tab and add the blob connection string as shown in the below figure. In the figure we have added a blob connection string with the name ‘BlobConnectionString’.

Click on the ellipsis button on the right-hand side and select ‘Use development storage’. All the changes done using the settings UI will be reflected in the ‘ServiceConfiguration’ file as shown above.



Step 5:- Create the blob on webrole onstart

Now it’s time to start coding. Open the web role project and open the ‘WebRole.cs’ file.

Now let’s write code in the ‘OnStart’ event to create the blob container.

public override bool OnStart()
{

}

Use the ‘CloudStorageAccount’ static class to set the configuration environment.

public override bool OnStart()
{
    // Set the configuration file
    DiagnosticMonitor.Start("DiagnosticsConnectionString");

    CloudStorageAccount.SetConfigurationSettingPublisher((configName, configSetter) =>
    {
        configSetter(RoleEnvironment.GetConfigurationSettingValue(configName));
    });

    ....
    ....
}

The next step is to get a reference to the CloudStorageAccount object using the blob connection string which was provided when you set up your web role project.

// get the blob connection string
CloudStorageAccount objStorage = CloudStorageAccount.FromConfigurationSetting("BlobConnectionString");

Once we have access to the storage account object, use the blob endpoint to create the blob client.

// get the client reference
CloudBlobClient objClient = new CloudBlobClient(objStorage.BlobEndpoint, objStorage.Credentials);

Give a nice name to the container and create the container object using the client object which you have just created with the blob endpoint. Call the ‘CreateIfNotExist’ method of the container to ensure that you create the blob container only if it does not already exist, to avoid any errors.

// Get the reference to the container
CloudBlobContainer objContainer = objClient.GetContainerReference("mycontainer");

// Create the container if it does not exist
objContainer.CreateIfNotExist();

Step 6:- Code your ASP.NET UI

The final step is to create the ASPX page which will help us upload image files into the blob container we just created in the ‘WebRole.cs’ file. You can see in the below figure that we have created a browse button which helps us upload image files, and a search text box which helps us search blob files.

So create the below defined ASPX UI.





In the code-behind of the above ASPX UI, first add references to the below specified namespaces.

using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

In the file upload button click we need to insert the below code snippet to upload the file. So get access to the container object ‘mycontainer’ and call the ‘GetBlobReference’ function to get access to the cloud blob object.

// Get the storage account reference
CloudStorageAccount objStorage = CloudStorageAccount.FromConfigurationSetting("BlobConnectionString");

// get the client reference using the storage blob endpoint
CloudBlobClient objClient = new CloudBlobClient(objStorage.BlobEndpoint, objStorage.Credentials);

// Get the container reference
CloudBlobContainer objContainer = objClient.GetContainerReference("mycontainer");

// Get the blob reference
CloudBlob obj = objContainer.GetBlobReference(FileUpload1.FileName.ToString());

Set the metadata of the cloud blob object and open a blob stream to write the file. Do not forget to close the blob stream object once you are done.

// Set meta values
obj.Metadata["MetaName"] = "meta";

// Open a stream using the cloud object
BlobStream blobstream = obj.OpenWrite();

// Write the stream to the blob database
blobstream.Write(FileUpload1.FileBytes, 0, FileUpload1.FileBytes.Count());
blobstream.Close();

Once we upload the file, we browse through the blob list to get the list of blobs present in the container.

// Browse through the blob list from the container
IEnumerable<IListBlobItem> objBlobList = objContainer.ListBlobs();
foreach (IListBlobItem objItem in objBlobList)
{
    Response.Write(objItem.Uri + "<br>");
}

In the same UI we have provided a search box to search for a blob. To search a blob, first get access to the container object and call the ‘GetBlobReference’ function with the blob name to get a reference to the cloud blob object.

// Get the blob reference using the blob name provided in the search
CloudBlob obj = objContainer.GetBlobReference(txtSearch.Text);
BlobStream blobstream = obj.OpenRead();

Read the blob stream using the blob stream object and finally attach this stream to an Image object to display the same in the HTTP response.

// Create the image object and display the same in the browser response
System.Drawing.Image objimg = null;
objimg = System.Drawing.Image.FromStream(blobstream, true);
Response.Clear();
Response.ContentType = "image/gif";
objimg.Save(Response.OutputStream, System.Drawing.Imaging.ImageFormat.Jpeg);


Step 7:- Run the project and enjoy

Finally, enjoy your first blob program. You can see in the below figure that we have uploaded some image files to the blob.

We can also search the blobs using the search blob text box, and you should be able to get the below image displayed from the blob storage.











Monday, January 11, 2010

9 simple steps to run your first Azure Table Program



Introduction

Azure provides 4 kinds of data storage: blobs, tables, queues and SQL Azure. In this article we will see how to insert a simple customer record, with code and name properties, into Azure tables.

In case you are a complete fresher you can download my two Azure basics videos which explain what Azure is all about: Azure FAQ Part 1 :- Video1, Azure FAQ Part 2 :- Video2.

Please feel free to download my free 500 question and answer eBook which covers .NET, ASP.NET, SQL Server, WCF, WPF, WWF, Silverlight, Azure @ http://tinyurl.com/4nvp9t.


What will we do in this article?

We will create a simple customer entity with customer code and customer name, add the same to Azure tables and display the same in a web role application.



Step 1:- Ensure you have things in place

In case you are a complete fresher to Azure, please ensure you have all the pre-requisites in place. You can read the below article to get the basic prerequisites:
http://computerauthor.blogspot.com/2010/01/simple-5-steps-to-run-your-first-azure.html



Step 2:- Create a web role project

The next step is to select the cloud service template, add the web role project and create your solution.









Step 3:- Specify the connection string

The third step is to specify the connection string for where your table storage currently lives. So expand the roles folder, right click on the web role and select properties as shown in the below figure.

You will then be presented with a settings tab. Select the settings section, add a new setting, give a name to your connection string and select the type as ‘connectionstring’.

We also need to specify where the storage location is, so select the value and choose ‘Use development storage’ as shown in the below figure. Development storage means your local PC, where the Azure fabric is currently installed.

If you open the ‘ServiceConfiguration.cscfg’ file you can see the setting added to the file.






Step 4:- Reference namespaces and create classes

In order to do Azure storage operations we need to add a reference to the ‘System.Data.Services.Client’ dll.

Once the dlls are referenced, let’s refer to the namespaces in our code as shown below. Currently we will store a customer record with customer code and customer name in tables. For that we need to define a simple customer class, ‘clsCustomer’. This class needs to inherit from the ‘TableServiceEntity’ class as shown in the below figure.

The second class which we need to create is the data context class. The data context class will take the entity class and enter the same into tables. You can see in the below figure that we have created one more class, ‘clsCustomerDataContext’.






Step 5:- Define partition and row key

The next step is to define the properties of the customer class. In the below figure we have defined two properties in the customer class: customer code and customer name.

Every row in the table needs to be defined with a partition key and a unique row key. In the constructor we have initialized the partition key with the text “Customers” and the row key is set to the current date-time tick count.






Step 6:- Create your ‘datacontext’ class

The next step is to create your data context class, which will insert the customer entity into Azure table storage. Below is the code snippet of the data context class.

The first noticeable thing is the constructor, which takes in the storage location and the credentials. The second is the ‘IQueryable’ property, which is used by the cloud service to create the table in Azure table storage.

In the same data context we have created an ‘AddCustomer’ method which takes in the customer entity object and calls the ‘AddObject’ method of the data context to insert the customer entity data into Azure tables.





Step 7:- Create the table structure on the ‘onstart’

The next step is to create the table on the ‘OnStart’ of the web role.

So open the ‘WebRole.cs’ file and put the below code in the ‘OnStart’ event. The last piece of code, enclosed in curly brackets, reads the configuration and creates the table structure.





Step 8:- Code your client

The final thing is to code the client. Below is the UI / ASPX file which we have created to insert the table entity values.

On the button click we need to consume the data context and the entity class.

So the first step is to get the configuration setting of the data connection.


// Gets the connection string
var customer = CloudStorageAccount.FromConfigurationSetting("DataConnectionString");

The next step is to pass these credentials to the data context class and create an object of the same.


// Create the customer datacontext object
var customerContext = new clsCustomerDataContext(customer.TableEndpoint.ToString(), customer.Credentials);

Populate the entity object with data and pass it to the data context class to add the same into the tables.


// Create the entity object
clsCustomer objCustomer = new clsCustomer();
objCustomer.CustomerCode = txtCustomerCode.Text;
objCustomer.CustomerName = txtCustomerName.Text;
// Pass the entity object to the datacontext
customerContext.AddCustomer(objCustomer);

Finally we loop through the context’s customer entity collection to see if the customer has been added to the table.


//Loop through the records to see if the customer entity is inserted in the tables
foreach (clsCustomer obj in customerContext.Customers)
{
Response.Write(obj.CustomerCode + " " + obj.CustomerName + "<br>");
}

Step 9:- Run your application

It’s time to enjoy your hard work, so run the application and enjoy your success.





Source code

You can get the source code of the above sample from here.