Maximizing the StreamReader.

StreamReader, from the System.IO namespace, reads text from a stream. It is not meant for reading binary data. In the example below, I benchmarked how fast it can get. Here's the basic setup on my local desktop.
Medium-size text file (.txt): 1382 KB
OS: Windows 7 Ultimate N, 64-bit, 12 GB RAM
I compared several FileStream constructor overloads feeding StreamReader's ReadLine, as well as File.OpenRead/File.Open, which read chunks of bytes instead of text. With ReadLine, I put each line into a reused StringBuilder instead of accumulating strings, to keep memory pressure down. Note that ReadLine itself still returns a new string for every line. Also, with StreamReader's ReadLine the complete file is NOT loaded into memory. It is read in buffer-sized batches. The plus is that you can work with large files, and the minus is more disk I/O calls.

I opened the file with FileAccess.Read and FileShare.Read, which means other processes or sessions can safely open the file for reading at the same time without hitting a sharing violation. MSDN describes FileOptions.SequentialScan as a hint that the file will be read sequentially from beginning to end, which the system can use to optimize file caching. StreamReader's ReadLine reads from an in-memory buffer and not directly from the file. By default the StreamReader buffer size is 1024, and the FileStream buffer size is 4096 bytes. A line longer than the buffer is not an error: ReadLine keeps refilling the buffer until it reaches a line terminator, and throws an OutOfMemoryException only when there is not enough memory to allocate the returned string. Both buffer sizes can be changed in the respective constructors. Here are the numbers, reflecting their relative performance, at least on my local setup.

// Requires: using System.Diagnostics; using System.IO; using System.Text;
private void UsingStreamReaderReadLine()
{
    int nLinesRead = 0;
    string path = Server.MapPath("~/App_Data/dictC2.txt");
    if (!File.Exists(path))
    {
        return;
    }

    // Variant 1: 1024-byte buffer with SequentialScan
    //FileStream fs = new FileStream(path, FileMode.Open,
    //        FileAccess.Read, FileShare.Read, 1024, FileOptions.SequentialScan);
    //Time elapsed : 15.2895 ms.
    //Read : 16732 lines.

    // Variant 2: default buffer size, no FileOptions
    //FileStream fs = new FileStream(path,
    //        FileMode.Open, FileAccess.Read, FileShare.Read);
    //Time elapsed : 16.0057 ms.
    //Read : 16732 lines.

    // Variant 3: 4096-byte buffer with SequentialScan
    //Time elapsed : 10.1082 ms.
    //Read : 16732 lines.
    using (FileStream fs = new FileStream(path,
                   FileMode.Open, FileAccess.Read,
                   FileShare.Read, 4096, FileOptions.SequentialScan))
    using (StreamReader sr = new StreamReader(fs))
    {
        StringBuilder sb = new StringBuilder();
        Stopwatch sw = Stopwatch.StartNew();
        while (!sr.EndOfStream)
        {
            sb.Append(sr.ReadLine());
            //Do whatever you want with the line read
            sb.Clear();
            nLinesRead++;
        }
        sw.Stop();
        Label1.Text = "Time elapsed : " + sw.Elapsed.TotalMilliseconds.ToString() + " ms.";
        Label1.Text += "<br>Read : " + nLinesRead.ToString() + " lines.";
    }
    // The using blocks dispose the reader and the stream,
    // so no explicit Close()/Dispose() calls are needed.
}
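
The FileStream constructor above only sets the file buffer. StreamReader keeps its own buffer, which can be sized through the StreamReader(Stream, Encoding, bool, int) overload. Here is a minimal sketch of setting both. The class name BufferedLineReader, the helper CountLines, and the temp-file test data are my own illustrative assumptions, not part of the benchmark. The sample file deliberately contains a line much longer than the buffer.

```csharp
using System;
using System.IO;
using System.Text;

static class BufferedLineReader
{
    // Hypothetical helper: counts lines, using the same explicit buffer size
    // for the FileStream (bytes) and the StreamReader.
    public static int CountLines(string path, int bufferSize)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read,
                                       FileShare.Read, bufferSize,
                                       FileOptions.SequentialScan))
        using (var sr = new StreamReader(fs, Encoding.UTF8, true, bufferSize))
        {
            int lines = 0;
            // ReadLine returns null once the end of the stream is reached.
            while (sr.ReadLine() != null)
            {
                lines++;
            }
            return lines;
        }
    }

    static void Main()
    {
        // Illustrative data: the middle line is far longer than the 1024 buffer.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "alpha\n" + new string('x', 10000) + "\ngamma\n");
        Console.WriteLine(CountLines(path, 1024)); // prints 3
        File.Delete(path);
    }
}
```

Whether a larger StreamReader buffer helps on top of a larger FileStream buffer depends on the file and the machine, so it is worth measuring both settings separately.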

Next, I tried reading the file as bytes rather than as text. Fewer I/O calls generally improve performance. Here's the data.

// Requires: using System; using System.Diagnostics; using System.IO; using System.Text;
private void UsingFileReadBytes()
{
    StringBuilder sb = new StringBuilder();
    // A stateful decoder keeps partial multi-byte characters between chunks
    Decoder decoder8 = Encoding.UTF8.GetDecoder();
    string path = Server.MapPath("~/App_Data/dictC2.txt");
    if (!File.Exists(path))
    {
        return;
    }
    Stopwatch sw;
    //Or an equivalent is
    //FileStream fStream = File.OpenRead(path);
    //From msdn -- This method is equivalent to the
    //FileStream(String, FileMode, FileAccess, FileShare) constructor overload
    //with a FileMode value of Open, a FileAccess value of Read and a FileShare value of Read.
    using (FileStream fStream = File.Open(path,
        FileMode.Open, FileAccess.Read, FileShare.Read))
    {
        int TotLen = (int)fStream.Length;
        //You may read the whole file in one shot or read in chunks as below
        Byte[] bytes = new Byte[TotLen]; //Time elapsed : 13.2992 ms.
        //Byte[] bytes = new Byte[1024];  //Time elapsed : 12.4172 ms.
        //Byte[] bytes = new Byte[4096];  //Time elapsed : 10.4671 ms.

        sw = Stopwatch.StartNew();
        while (fStream.Position < fStream.Length)
        {
            // Read may return fewer bytes than requested, so use nBytes below
            int nBytes = fStream.Read(bytes, 0, bytes.Length);
            int nChars = decoder8.GetCharCount(bytes, 0, nBytes);
            char[] chars = new char[nChars];
            nChars = decoder8.GetChars(bytes, 0, nBytes, chars, 0);
            sb.Append(new String(chars, 0, nChars));
        }
    } // the using block closes and disposes the stream, even on exceptions
    sw.Stop();
    Label1.Text = "Time elapsed : " + sw.Elapsed.TotalMilliseconds.ToString() + " ms.";
}
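
The byte-based version uses a Decoder instead of calling Encoding.UTF8.GetString on every chunk because a multi-byte UTF-8 character can be split across two chunks. A Decoder remembers the leftover bytes between calls. The sketch below illustrates the difference. The names ChunkDecoding, DecodeInChunks and NaiveDecode, and the tiny 3-byte chunk size, are hypothetical and chosen only to force a split.

```csharp
using System;
using System.Text;

static class ChunkDecoding
{
    // Hypothetical helper: decodes data chunk by chunk with ONE stateful Decoder.
    public static string DecodeInChunks(byte[] data, int chunkSize)
    {
        Decoder decoder = Encoding.UTF8.GetDecoder();
        var sb = new StringBuilder();
        // GetMaxCharCount leaves room for a character completed by bytes
        // carried over from the previous chunk.
        char[] chars = new char[Encoding.UTF8.GetMaxCharCount(chunkSize)];
        for (int offset = 0; offset < data.Length; offset += chunkSize)
        {
            int count = Math.Min(chunkSize, data.Length - offset);
            int nChars = decoder.GetChars(data, offset, count, chars, 0);
            sb.Append(chars, 0, nChars);
        }
        return sb.ToString();
    }

    // Hypothetical helper: decodes each chunk independently (stateless).
    // A character split across chunks is lost and replaced by U+FFFD.
    public static string NaiveDecode(byte[] data, int chunkSize)
    {
        var sb = new StringBuilder();
        for (int offset = 0; offset < data.Length; offset += chunkSize)
        {
            int count = Math.Min(chunkSize, data.Length - offset);
            sb.Append(Encoding.UTF8.GetString(data, offset, count));
        }
        return sb.ToString();
    }

    static void Main()
    {
        // "naive cafe" with accented i and e; each accent takes 2 bytes in UTF-8.
        string text = "na\u00EFve caf\u00E9";
        byte[] data = Encoding.UTF8.GetBytes(text);
        Console.WriteLine(DecodeInChunks(data, 3) == text); // prints True
        Console.WriteLine(NaiveDecode(data, 3) == text);    // prints False
    }
}
```

With plain ASCII text both approaches give the same result, so the difference only shows up once the file contains multi-byte characters.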

From the above, it appears that StreamReader's ReadLine, used correctly and with a little tuning, performs well in many scenarios. Thanks for reading.
