
Saturday, August 8, 2009

Making IDisposable more usable, part 1

IDisposable is used quite frequently, so here I'm going to share a few tricks that make it more convenient to use. First of all, let's create a DisposableExtensions class with a single method:

/// <summary>
/// Safely disposes an <see cref="IDisposable"/> object.
/// </summary>
/// <param name="disposable">Object to dispose (can be <see langword="null"/>).</param>
public static void DisposeSafely(this IDisposable disposable)
{
if (disposable!=null)
disposable.Dispose();
}

This lets us dispose objects without repeating the null check:

try {
writer.DisposeSafely();
}
finally {
writer = null;
}

Let's go further now. Frequently you must safely dispose two or more IDisposable objects. Note that "safely" is an essential word here. You can't write something like:

disposable1.DisposeSafely();
disposable2.DisposeSafely();

This code may fail: if the first line throws an exception, the second line is never executed, so disposable2 may be left undisposed. That's why it's a good idea to implement a few helpers that handle such cases safely.
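To make the failure concrete, here is a minimal sketch. Probe and SequentialDisposeDemo are hypothetical names introduced only for this illustration; they are not part of Xtensive.Core. The sketch contrasts the naive sequence with a manual try/finally fix.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: logs its name on disposal and optionally throws.
public sealed class Probe : IDisposable
{
    private readonly string name;
    private readonly bool fail;
    private readonly List<string> log;

    public Probe(string name, bool fail, List<string> log)
    {
        this.name = name;
        this.fail = fail;
        this.log = log;
    }

    public void Dispose()
    {
        log.Add(name);
        if (fail)
            throw new InvalidOperationException(name + " failed");
    }
}

public static class SequentialDisposeDemo
{
    // Naive version: if d1.Dispose() throws, d2 is never disposed.
    public static void DisposeNaively(IDisposable d1, IDisposable d2)
    {
        if (d1 != null) d1.Dispose();
        if (d2 != null) d2.Dispose();
    }

    // Manual fix: the finally block runs even if d1.Dispose() throws.
    public static void DisposeBoth(IDisposable d1, IDisposable d2)
    {
        try {
            if (d1 != null) d1.Dispose();
        }
        finally {
            if (d2 != null) d2.Dispose();
        }
    }
}
```

Note that with plain try/finally, if both Dispose calls throw, the second exception replaces the first one. That loss of information is what the aggregation-based helpers are meant to avoid.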

Let's add a JoiningDisposable type:

/// <summary>
/// Disposes two <see cref="IDisposable"/> objects.
/// </summary>
[Serializable]
public sealed class JoiningDisposable : IDisposable
{
private IDisposable first;
private IDisposable second;

/// <summary>
/// Gets the first object to dispose.
/// </summary>
public IDisposable First {
get { return first; }
}

/// <summary>
/// Gets the second object to dispose.
/// </summary>
public IDisposable Second {
get { return second; }
}

/// <summary>
/// Joins the <see cref="JoiningDisposable"/> and <see cref="IDisposable"/>.
/// </summary>
/// <param name="first">The first disposable to join.</param>
/// <param name="second">The second disposable to join.</param>
/// <returns>New <see cref="JoiningDisposable"/> that will
/// dispose both of them on its disposal</returns>
public static JoiningDisposable operator &(JoiningDisposable first, IDisposable second)
{
if (second==null)
return first;
return new JoiningDisposable(first, second);
}


// Constructors

/// <summary>
///   <see cref="ClassDocTemplate.Ctor" copy="true"/>
/// </summary>
/// <param name="disposable1">The first disposable.</param>
/// <param name="disposable2">The second disposable.</param>
public JoiningDisposable(IDisposable disposable1, IDisposable disposable2)
{
this.first = disposable1;
this.second = disposable2;
}

/// <inheritdoc/>
public void Dispose()
{
var d1 = first;
first = null;
try {
d1.DisposeSafely();
}
catch (Exception ex) {
using (var ea = new ExceptionAggregator()) {
ea.Execute(_this => {
var d2 = _this.second;
_this.second = null;
d2.DisposeSafely();
}, this);
ea.Execute(e => { throw e; }, ex);
}
}
d1 = second;
second = null;
d1.DisposeSafely();
}
}

As you see, the code in its Dispose method relies on ExceptionAggregator. Here it is:

/// <summary>
/// Provides exception aggregation support.
/// </summary>
[Serializable]
public class ExceptionAggregator : 
IDisposable, 
ICountable<Exception>
{
private Action<Exception> exceptionHandler;
private List<Exception> exceptions;
private string exceptionMessage;

private bool isDisposed = false;

/// <summary>
/// Gets or sets the exception handler.
/// </summary>
public Action<Exception> ExceptionHandler
{
[DebuggerStepThrough]
get { return exceptionHandler; }
[DebuggerStepThrough]
set { exceptionHandler = value; }
}

/// <summary>
/// Gets the number of caught exceptions.
/// </summary>
public long Count
{
[DebuggerStepThrough]
get { return exceptions!=null ? exceptions.Count : 0; }
}

#region Execute(...) methods

/// <summary>
/// Executes the specified action catching all the exceptions from it,
/// adding it to internal list of caught exceptions and
/// passing it to <see cref="ExceptionHandler"/> handler.
/// </summary>
/// <param name="action">The action to execute.</param>
/// <exception cref="ObjectDisposedException">Aggregator is already disposed.</exception>
public void Execute(Action action)
{
if (isDisposed)
throw Exceptions.AlreadyDisposed(null);
try {
action();
}
catch (Exception e) {
HandleException(e);
}
}

/// <summary>
/// Executes the specified action catching all the exceptions from it,
/// adding it to internal list of caught exceptions and
/// passing it to <see cref="ExceptionHandler"/> handler.
/// </summary>
/// <typeparam name="T">The type of action argument.</typeparam>
/// <param name="action">The action to execute.</param>
/// <param name="argument">The action argument value.</param>
/// <exception cref="ObjectDisposedException">Aggregator is already disposed.</exception>
public void Execute<T>(Action<T> action, T argument)
{
if (isDisposed)
throw Exceptions.AlreadyDisposed(null);
try {
action(argument);
}
catch (Exception e) {
HandleException(e);
}
}

/// <summary>
/// Executes the specified action catching all the exceptions from it,
/// adding it to internal list of caught exceptions and
/// passing it to <see cref="ExceptionHandler"/> handler.
/// </summary>
/// <typeparam name="T1">The type of the 1st action argument.</typeparam>
/// <typeparam name="T2">The type of the 2nd action argument.</typeparam>
/// <param name="action">The action to execute.</param>
/// <param name="argument1">The 1st action argument value.</param>
/// <param name="argument2">The 2nd action argument value.</param>
/// <exception cref="ObjectDisposedException">Aggregator is already disposed.</exception>
public void Execute<T1, T2>(Action<T1, T2> action, T1 argument1, T2 argument2)
{
if (isDisposed)
throw Exceptions.AlreadyDisposed(null);
try {
action(argument1, argument2);
}
catch (Exception e) {
HandleException(e);
}
}

/// <summary>
/// Executes the specified action catching all the exceptions from it,
/// adding it to internal list of caught exceptions and
/// passing it to <see cref="ExceptionHandler"/> handler.
/// </summary>
/// <typeparam name="T1">The type of the 1st action argument.</typeparam>
/// <typeparam name="T2">The type of the 2nd action argument.</typeparam>
/// <typeparam name="T3">The type of the 3rd action argument.</typeparam>
/// <param name="action">The action to execute.</param>
/// <param name="argument1">The 1st action argument value.</param>
/// <param name="argument2">The 2nd action argument value.</param>
/// <param name="argument3">The 3rd action argument value.</param>
/// <exception cref="ObjectDisposedException">Aggregator is already disposed.</exception>
public void Execute<T1, T2, T3>(Action<T1, T2, T3> action, T1 argument1, T2 argument2, T3 argument3)
{
if (isDisposed)
throw Exceptions.AlreadyDisposed(null);
try {
action(argument1, argument2, argument3);
}
catch (Exception e) {
HandleException(e);
}
}

/// <summary>
/// Executes the specified function catching all the exceptions from it,
/// adding it to internal list of caught exceptions and
/// passing it to <see cref="ExceptionHandler"/> handler.
/// </summary>
/// <typeparam name="TResult">The type of function result.</typeparam>
/// <param name="function">The function to execute.</param>
/// <returns>Function execution result, if no exception was caught;
/// otherwise, <see langword="default(TResult)"/>.</returns>
/// <exception cref="ObjectDisposedException">Aggregator is already disposed.</exception>
public TResult Execute<TResult>(Func<TResult> function)
{
if (isDisposed)
throw Exceptions.AlreadyDisposed(null);
try {
return function();
}
catch (Exception e) {
HandleException(e);
}
return default(TResult);
}

/// <summary>
/// Executes the specified function catching all the exceptions from it,
/// adding it to internal list of caught exceptions and
/// passing it to <see cref="ExceptionHandler"/> handler.
/// </summary>
/// <typeparam name="T">The type of the function argument.</typeparam>
/// <typeparam name="TResult">The type of function result.</typeparam>
/// <param name="function">The function to execute.</param>
/// <param name="argument">The function argument value.</param>
/// <returns>Function execution result, if no exception was caught;
/// otherwise, <see langword="default(TResult)"/>.</returns>
/// <exception cref="ObjectDisposedException">Aggregator is already disposed.</exception>
public TResult Execute<T, TResult>(Func<T, TResult> function, T argument)
{
if (isDisposed)
throw Exceptions.AlreadyDisposed(null);
try {
return function(argument);
}
catch (Exception e) {
HandleException(e);
}
return default(TResult);
}

/// <summary>
/// Executes the specified function catching all the exceptions from it,
/// adding it to internal list of caught exceptions and
/// passing it to <see cref="ExceptionHandler"/> handler.
/// </summary>
/// <typeparam name="T1">The type of the 1st function argument.</typeparam>
/// <typeparam name="T2">The type of the 2nd function argument.</typeparam>
/// <typeparam name="TResult">The type of function result.</typeparam>
/// <param name="function">The function to execute.</param>
/// <param name="argument1">The 1st function argument value.</param>
/// <param name="argument2">The 2nd function argument value.</param>
/// <returns>Function execution result, if no exception was caught;
/// otherwise, <see langword="default(TResult)"/>.</returns>
/// <exception cref="ObjectDisposedException">Aggregator is already disposed.</exception>
public TResult Execute<T1, T2, TResult>(Func<T1, T2, TResult> function, T1 argument1, T2 argument2)
{
if (isDisposed)
throw Exceptions.AlreadyDisposed(null);
try {
return function(argument1, argument2);
}
catch (Exception e) {
HandleException(e);
}
return default(TResult);
}

/// <summary>
/// Executes the specified function catching all the exceptions from it,
/// adding it to internal list of caught exceptions and
/// passing it to <see cref="ExceptionHandler"/> handler.
/// </summary>
/// <typeparam name="T1">The type of the 1st function argument.</typeparam>
/// <typeparam name="T2">The type of the 2nd function argument.</typeparam>
/// <typeparam name="T3">The type of the 3rd function argument.</typeparam>
/// <typeparam name="TResult">The type of function result.</typeparam>
/// <param name="function">The function to execute.</param>
/// <param name="argument1">The 1st function argument value.</param>
/// <param name="argument2">The 2nd function argument value.</param>
/// <param name="argument3">The 3rd function argument value.</param>
/// <returns>Function execution result, if no exception was caught;
/// otherwise, <see langword="default(TResult)"/>.</returns>
/// <exception cref="ObjectDisposedException">Aggregator is already disposed.</exception>
public TResult Execute<T1, T2, T3, TResult>(Func<T1, T2, T3, TResult> function, T1 argument1, T2 argument2, T3 argument3)
{
if (isDisposed)
throw Exceptions.AlreadyDisposed(null);
try {
return function(argument1, argument2, argument3);
}
catch (Exception e) {
HandleException(e);
}
return default(TResult);
}

#endregion

#region IEnumerable<...> methods

/// <inheritdoc/>
[DebuggerStepThrough]
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}

/// <inheritdoc/>
public IEnumerator<Exception> GetEnumerator()
{
if (exceptions==null)
return EnumerableUtils<Exception>.EmptyEnumerator;
else 
return exceptions.GetEnumerator();
}

#endregion

/// <summary>
/// Invoked on any exception caught by <see cref="Execute"/> methods.
/// </summary>
/// <param name="exception">The caught exception.</param>
/// <remarks>
/// If this method throws an exception, it won't be caught.
/// I.e. it will throw "through" any of <see cref="Execute"/> methods.
/// </remarks>
protected virtual void HandleException(Exception exception)
{
if (exceptionHandler!=null)
exceptionHandler(exception);
if (exceptions==null)
exceptions = new List<Exception>();
exceptions.Add(exception);
}


// Constructors

/// <summary>
/// <see cref="ClassDocTemplate.Ctor" copy="true"/>
/// </summary>
public ExceptionAggregator()
: this(null, null)
{
}

/// <summary>
/// <see cref="ClassDocTemplate.Ctor" copy="true"/>
/// </summary>
/// <param name="exceptionMessage">The message of <see cref="AggregateException"/>.</param>
public ExceptionAggregator(string exceptionMessage)
: this (null, exceptionMessage)
{
}

/// <summary>
/// <see cref="ClassDocTemplate.Ctor" copy="true"/>
/// </summary>
/// <param name="exceptionHandler">The exception handler.</param>
/// <param name="exceptionMessage">The message of <see cref="AggregateException"/>.</param>
public ExceptionAggregator(Action<Exception> exceptionHandler, string exceptionMessage)
{
this.exceptionHandler = exceptionHandler;
this.exceptionMessage = exceptionMessage;
}

// Destructor

/// <see cref="ClassDocTemplate.Dispose" copy="true"/>
/// <exception cref="AggregateException">Thrown if at least one exception was caught 
/// by <see cref="Execute"/> methods.</exception>
public void Dispose()
{
if (exceptions!=null && exceptions.Count>0) {
Exception exception = string.IsNullOrEmpty(exceptionMessage) ? 
new AggregateException(exceptions) : 
new AggregateException(exceptionMessage, exceptions);        
exceptions = null;
isDisposed = true;
throw exception;
}
}
}

Finally, both these types use AggregateException:

/// <summary>
/// Aggregates a set of caught exceptions.
/// </summary>
[Serializable]
public class AggregateException : Exception,
IHasExceptions<Exception>
{
private ReadOnlyList<Exception> exceptions;

/// <summary>
/// Gets the list of caught exceptions.
/// </summary>
public ReadOnlyList<Exception> Exceptions
{
[DebuggerStepThrough]
get { return exceptions; }
}

/// <inheritdoc/>
IEnumerable<Exception> IHasExceptions.Exceptions
{
[DebuggerStepThrough]
get { return Exceptions; }
}

/// <inheritdoc/>
IEnumerable<Exception> IHasExceptions<Exception>.Exceptions
{
[DebuggerStepThrough]
get { return Exceptions; }
}

/// <summary>
/// Gets the "flat" list with all aggregated exceptions. 
/// If other <see cref="AggregateException"/>s were aggregated, 
/// their inner exceptions are included instead of them.
/// </summary>
/// <returns>Flat list of aggregated exceptions.</returns>
public List<Exception> GetFlatExceptions()
{
var result = new List<Exception>();

foreach (var exception in exceptions) {
var ae = exception as AggregateException;
if (ae!=null)
result.AddRange(ae.GetFlatExceptions());
else
result.Add(exception);
}

return result;
}

/// <inheritdoc/>
public override string ToString()
{
StringBuilder sb = new StringBuilder(64);
sb.Append(base.ToString());
sb.AppendFormat("\r\n{0}:", Strings.OriginalExceptions);
int i = 1;
foreach (Exception exception in exceptions)
sb.AppendFormat("\r\n{0}: {1}", i++, exception);
return sb.ToString();
}

#region Private \ internal methods

private void SetExceptions(IEnumerable<Exception> exceptions)
{
var list = exceptions as IList<Exception> ?? exceptions.ToList();
this.exceptions = new ReadOnlyList<Exception>(list);
}

private void SetExceptions(Exception exception)
{
var list = new List<Exception>();
list.Add(exception);
exceptions = new ReadOnlyList<Exception>(list);
}

#endregion


// Constructors

/// <summary>
/// <see cref="ClassDocTemplate.Ctor" copy="true" />
/// </summary>
public AggregateException()
: base(Strings.ExASetOfExceptionsIsCaught)
{
}

/// <summary>
/// <see cref="ClassDocTemplate.Ctor" copy="true" />
/// </summary>
/// <param name="text">Text of message.</param>
public AggregateException(string text)
: base(text)
{
}

/// <summary>
/// <see cref="ClassDocTemplate.Ctor" copy="true" />
/// </summary>
/// <param name="message">Text of message.</param>
/// <param name="innerException">Inner exception.</param>
public AggregateException(string message, Exception innerException) 
: base(message, innerException)
{
SetExceptions(innerException);
}

/// <summary>
/// <see cref="ClassDocTemplate.Ctor" copy="true" />
/// </summary>
/// <param name="exceptions">Inner exceptions.</param>
public AggregateException(IEnumerable<Exception> exceptions) 
: base(Strings.ExASetOfExceptionsIsCaught, exceptions.First())
{
SetExceptions(exceptions);
}

/// <summary>
/// <see cref="ClassDocTemplate.Ctor" copy="true" />
/// </summary>
/// <param name="message">Text of message.</param>
/// <param name="exceptions">Inner exceptions.</param>
public AggregateException(string message, IEnumerable<Exception> exceptions) 
: base(message, exceptions.First())
{
SetExceptions(exceptions);
}


// Serialization

/// <see cref="SerializableDocTemplate.Ctor" copy="true" />
protected AggregateException(SerializationInfo info, StreamingContext context)
: base(info, context)
{
exceptions = (ReadOnlyList<Exception>)info.GetValue("Exceptions", typeof (ReadOnlyList<Exception>));
}

/// <see cref="SerializableDocTemplate.GetObjectData" copy="true" />
public override void GetObjectData(SerializationInfo info, StreamingContext context)
{
base.GetObjectData(info, context);
info.AddValue("Exceptions", exceptions);
}
}
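To show the aggregation idea in isolation, here is a trimmed-down sketch. It assumes .NET 4 or later and uses the built-in System.AggregateException instead of the AggregateException class listed above. MiniAggregator and MiniAggregatorDemo are hypothetical names, and the sketch omits the handler, the message and the extra Execute overloads.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified aggregator: one Execute overload, no handler.
public sealed class MiniAggregator : IDisposable
{
    private readonly List<Exception> exceptions = new List<Exception>();

    // Number of exceptions caught so far.
    public int Count { get { return exceptions.Count; } }

    // Runs the action and records any exception instead of propagating it.
    public void Execute(Action action)
    {
        try { action(); }
        catch (Exception e) { exceptions.Add(e); }
    }

    // Throws a single AggregateException if anything was caught.
    public void Dispose()
    {
        if (exceptions.Count == 0)
            return;
        var caught = exceptions.ToArray();
        exceptions.Clear();
        throw new AggregateException(caught);
    }
}

public static class MiniAggregatorDemo
{
    // Runs every action, then returns how many of them failed.
    public static int RunAll(params Action[] actions)
    {
        try {
            using (var ea = new MiniAggregator()) {
                foreach (var action in actions)
                    ea.Execute(action);
            }
            return 0;
        }
        catch (AggregateException ex) {
            return ex.InnerExceptions.Count;
        }
    }
}
```

The point of the pattern is that every action gets a chance to run, and all failures are reported together when the using block ends.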

Now for the final step. Let's add one more extension method to our DisposableExtensions:

/// <summary>
/// Joins the specified disposable objects by returning
/// a single <see cref="JoiningDisposable"/> that will
/// dispose both of them on its disposal.
/// </summary>
/// <param name="disposable">The first disposable.</param>
/// <param name="joinWith">The second disposable.</param>
/// <returns>New <see cref="JoiningDisposable"/> that will
/// dispose both of them on its disposal</returns>
public static JoiningDisposable Join(this IDisposable disposable, IDisposable joinWith)
{
return new JoiningDisposable(disposable, joinWith);
}

When this is done, you can use Join to safely dispose two or more IDisposables:

/// <inheritdoc/>
[DebuggerStepThrough]
public override object OnEntry(object instance)
{
// ...
var sessionScope = sessionBound.ActivateContext();
var transactionScope = Transaction.Open(sessionBound.Session, true);
return transactionScope.Join(sessionScope); // Joins 2 disposables into one
}

// ...

/// <inheritdoc/>
[DebuggerStepThrough]
public override void OnExit(object instance, object onEntryResult)
{
var d = (IDisposable) onEntryResult;
d.DisposeSafely(); // Safely disposes 2 disposables
}  
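For readers who want to try the idea without Xtensive.Core, here is a self-contained sketch under stated assumptions: it targets .NET 4 or later and uses the built-in System.AggregateException. PairDisposable, JoinWith and CallbackDisposable are hypothetical names. Unlike the JoiningDisposable above, this sketch wraps every failure in an AggregateException, even when only one of the two disposals fails.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for JoiningDisposable.
public sealed class PairDisposable : IDisposable
{
    private IDisposable first;
    private IDisposable second;

    public PairDisposable(IDisposable first, IDisposable second)
    {
        this.first = first;
        this.second = second;
    }

    // Disposes both objects; failures are collected and thrown together.
    public void Dispose()
    {
        var errors = new List<Exception>();
        DisposeOne(ref first, errors);
        DisposeOne(ref second, errors);
        if (errors.Count > 0)
            throw new AggregateException(errors);
    }

    // Clears the field first, so a repeated Dispose call does nothing.
    private static void DisposeOne(ref IDisposable target, List<Exception> errors)
    {
        var d = target;
        target = null;
        if (d == null)
            return;
        try { d.Dispose(); }
        catch (Exception e) { errors.Add(e); }
    }
}

public static class PairDisposableExtensions
{
    // Hypothetical counterpart of the Join extension method.
    public static PairDisposable JoinWith(this IDisposable disposable, IDisposable other)
    {
        return new PairDisposable(disposable, other);
    }
}

// Hypothetical helper: runs a callback on disposal.
public sealed class CallbackDisposable : IDisposable
{
    private readonly Action onDispose;

    public CallbackDisposable(Action onDispose) { this.onDispose = onDispose; }

    public void Dispose() { onDispose(); }
}
```

Usage would look like `using (a.JoinWith(b)) { ... }`: both a and b are disposed at the end of the block, even if disposing a throws.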

P.S. All the code provided here was taken from our Xtensive.Core assembly, which contains JoiningDisposable, DisposableExtensions and everything else I just described.

Next time I'll show how to make working with IDisposable safely even simpler.

Monday, July 20, 2009

Separate Microsoft Help 2.0 SDK installation

It's well known that the Microsoft Help 2.0 SDK can be installed only as part of the Visual Studio 2005/2008 SDK; there is simply no separate installer for it. Thanks to us, there is such an installer now. Moreover, it contains both the Microsoft Help 1.0 and Microsoft Help 2.0 SDKs.

You can download it here (as a .rar archive): it is just 2.8 MB instead of the more than 100 MB you would normally have to install (Visual Studio 2008 SDK).

Installation instructions:
- Extract appropriate SDK folder(s) to "C:\Program Files\"
- Use Install.bat and Uninstall.bat inside SDK folders to install or uninstall a particular one.
- If you can't use "C:\Program Files\" as installation location, update paths in .reg files.

Happy documenting ;)

P.S. Check out FiXml - quite likely it will be helpful for you as well.

Thursday, May 14, 2009

Fast expression compilation

Recently we've faced a well-known problem: LambdaExpression.Compile is quite slow. Mainly, because of:
- Huge amount of reflection
- Excessive security checks
- Absence of any caching!

We faced the issue while implementing compilation to executable providers for our RecordSet engine in DO4. We must be able to compile practically any expression, such as the predicate passed to FilterProvider here. Even quite simple LINQ queries using just a single .Where invocation normally require compilation of 2 expressions: one for the original .Where criteria, and one for the index range definition for RangeSetProvider (we build it as an expression).

Obviously, we recommend that everyone use our cached queries in DO4, since they eliminate the problem. But what if you can't cache the query, e.g. if it is built dynamically and the actual instances almost always differ?

So we've implemented a LambdaExpression compilation cache. Implemented well, it solves all of the problems above.


Raw results

Simple expression compilation test:
- Expression: (int a, int b) => a + b;
- Without caching: 15,884 K compilations/s.
- With caching: 45,158 K compilations/s.

Complex expression compilation test:
- Expression: (int a, int b) => new {Result = a + b * 2 / a}.Result + DateTime.Now.Day * a * b - a + b
- Without caching: 0,945 K compilations/s.
- With caching: 25,747 K compilations/s.

As you see, the acceleration factor varies from ~ 3x to ~ 27x! Even this result seems very good. But we decided to investigate why the results of standard .NET expression compilation differ so much here and implemented one more test:

Always new expression compilation test:
- Expression: (int a, int b) => i;
- Without caching: 1,707 K compilations/s.
- With caching: 27,960 K compilations/s.

Here i is an integer whose value increases by one on each compilation attempt. So it looks like there is already some kind of caching in the .NET expression compilation logic. But IT FAILS even in such a simple case (a difference in a constant value)! Probably, they simply cache the compilation result using the expression instance as the key.

So "true" acceleration factor is at least 16x.


Implementation

Note: From here on I'll use GenericType(of T) instead of the standard C# notation with angle brackets, because Blogger hates them and simply cuts them out.

The Xtensive.Core.Linq.ExpressionCompileExtensions class provides a set of additional .CachingCompile() methods for the Expression type. Usage is almost the same as for the original .Compile method:
- Original code: var compiledLambda = lambda.Compile();
- New code: var compiledLambda = lambda.CachingCompile();

How it could work:
- Find cached version of provided expression tree in some dictionary
- Return its compiled version, if it exists; otherwise, compile & cache it.

Actually everything is much more complex:
- Expressions aren't comparable for equality: they override neither Equals nor GetHashCode. But comparison for equality is required to use them as keys in a dictionary. We can't use the default implementation either, because Expression instances are almost always built anew rather than cached.
- Even if they were comparable for equality, they wouldn't really be equal because of closures: each time the expression tree is created again, the corresponding ConstantExpression references a new closure instance. Even if we were able to compare such constants for equality, they would almost always differ.

So we've implemented:
- ExpressionTree class implementing IEquatable(of ExpressionTree) and properly overriding the default GetHashCode & Equals. This allows us to compare expressions.
- ConstantExtractor: a visitor rewriting the original expression and removing any dependencies on constants from it. In fact, we rewrite the original expression, replacing each constant of type T with ((T) additionalLambdaParameter[constNumber]). E.g. () => 1 would be rewritten to (consts) => ((int) consts[0]), and we would cache this expression. ConstantExtractor builds the array of constant values (the actual consts value) while processing the original expression. Since this happens on every attempt to compile an expression with our caching compiler (first of all we must build the caching key, an ExpressionTree containing no constants), we always have this array of constants.
- So in the end we always have both the compiled expression (processed by ConstantExtractor) and the array of constants, and we just need to "bind" this array to the compiled expression. "Bind" means converting a pair of f(consts,a,b,c, ...) and extractedConsts to g(a,b,c, ...) = f(extractedConsts,a,b,c, ...). This is actually done by the corresponding overload of the .CachingCompile method. Its typical code is:

public static Func(of T1, T2, ..., TResult) CachingCompile(of T1, T2, ..., TResult)
(this Expression(of Func(of T1, T2, ..., TResult)) lambda)
{
var result = CachingExpressionCompiler.Instance.Compile(lambda);
return ((Func(of object[], T1, T2, ..., TResult)) result.First).Bind(result.Second);
}

.Bind is one more extension method, provided by the Xtensive.Core.Helpers.DelegateBindExtensions class.
All .Bind and .CachingCompile versions are generated by T4 templates - finally we've found the place where we could use them ;)
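The constant extraction and binding steps can be sketched as follows. This is not the Xtensive.Core implementation: ConstantLifter, Lift and BindDemo are hypothetical names, only the two-argument case is covered, and there is no cache or ExpressionTree key. The sketch assumes .NET 4 or later, where System.Linq.Expressions.ExpressionVisitor is public.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical visitor: replaces each constant with ((T) consts[i])
// and records the constant values in order of appearance.
public sealed class ConstantLifter : ExpressionVisitor
{
    private readonly ParameterExpression consts =
        Expression.Parameter(typeof(object[]), "consts");
    private readonly List<object> values = new List<object>();

    protected override Expression VisitConstant(ConstantExpression node)
    {
        values.Add(node.Value);
        // consts[index], converted back to the constant's static type.
        var slot = Expression.ArrayIndex(consts, Expression.Constant(values.Count - 1));
        return Expression.Convert(slot, node.Type);
    }

    // Rewrites (a, b) => body into (consts, a, b) => body'
    // and returns the extracted constant values via the out parameter.
    public static Expression<Func<object[], T1, T2, TResult>> Lift<T1, T2, TResult>(
        Expression<Func<T1, T2, TResult>> lambda, out object[] extracted)
    {
        var lifter = new ConstantLifter();
        var body = lifter.Visit(lambda.Body);
        var parameters = new List<ParameterExpression> { lifter.consts };
        parameters.AddRange(lambda.Parameters);
        extracted = lifter.values.ToArray();
        return Expression.Lambda<Func<object[], T1, T2, TResult>>(body, parameters);
    }
}

public static class BindDemo
{
    // "Bind": turns f(consts, a, b) plus a constants array into g(a, b).
    public static Func<T1, T2, TResult> Bind<T1, T2, TResult>(
        this Func<object[], T1, T2, TResult> f, object[] consts)
    {
        return (a, b) => f(consts, a, b);
    }
}
```

In a caching compiler, the lifted lambda (which no longer depends on constant values) would serve as the cache key and be compiled once, while Bind would be applied on every call with the freshly extracted array.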

And finally, we use our ThreadSafeDictionary as the actual cache, so cached compiled results are never purged. Initially this may look like a serious drawback, but actually it isn't: currently .NET is incapable of unloading any IL code. Even if we don't use the cache, expression compilation eats some RAM, and this RAM is never released. So our "caching compiler" just eats a bit more, and only when we don't hit the cache.


Conclusions

Pros and cons:
- We've got much faster expression compilation
- It's quite likely this solution significantly decreases the amount of RAM consumed by compiled expressions during the application lifetime, since it significantly decreases the number of actual compilations.
- The compiled expression we return is a bit slower, because we replace constants with array accessors there and add one more delegate call (the .Bind method does this). But I feel this will be acceptable in 99.9% of cases.

Possible improvements:
- Use lightweight expression adapters, such as the ones from MetaLinq, instead of Expression descendants as the result of ConstantExtractor. .NET expressions perform lots of checks during their construction, which aren't necessary here. We just need to be able to compare such a tree for equality with another one, compute its hash code, and, much more rarely, convert it to .NET expressions to get it compiled. I feel this should improve the performance at least twofold.


Usage

- Download DO4 - it contains a compiled version of Xtensive.Core.dll, as well as its source code
- Add a reference to Xtensive.Core.dll to your project
- Add "using Xtensive.Core.Linq;" to the C# file containing lambda.Compile()
- Replace lambda.Compile() with lambda.CachingCompile().

Wednesday, May 6, 2009

Built-in Visual Studio 2008 Code Generator

Have you ever heard about the code generator built into VS 2008 by default? Microsoft named it "T4: Text Template Transformation Toolkit". It is integrated into Visual Studio and very easy to use. Let me demonstrate it with an example.

Suppose we need a class with methods like the following for primitive types such as Int32, Double, Decimal, DateTime, TimeSpan and others.

  public static Double ParseDouble(string value)
  {
    if (value == "Default")
      return default(Double);
    if (value == "MinValue")
      return Double.MinValue;
    if (value == "MaxValue")
      return Double.MaxValue;
    return Double.Parse(value);
  }

  public static Double? ParseNullableDouble(string value)
  {
    if (value == "Default")
      return default(Double);
    if (value == "MinValue")
      return Double.MinValue;
    if (value == "MaxValue")
      return Double.MaxValue;
    if (value == "Null")
      return null;
    return Double.Parse(value);
  }


Because implementing all of this manually isn't very interesting, I suggest using the built-in code generator. Here is how:

1. Create class SmartParser.cs and implement those methods for one type, e.g. for Int32
2. Rename SmartParser.cs to SmartParser.tt
3. Now it is possible to use ASP.NET-like tags <# #> and <#= #> to generate this code for all types we need and Visual Studio will automatically create file SmartParser.cs

In our case SmartParser.tt will look like this:

using System;

public static class SmartParser
{
<#
Type[] types = new Type[] {
  typeof(Int32),
  typeof(Double),
  typeof(Decimal),
  typeof(DateTime),
  typeof(TimeSpan)
};
foreach (Type type in types) {
#>

  public static <#=type.Name#> Parse<#=type.Name#>(string value)
  {
    if (value == "Default")
      return default(<#=type.Name#>);
    if (value == "MinValue")
      return <#=type.Name#>.MinValue;
    if (value == "MaxValue")
      return <#=type.Name#>.MaxValue;
    return <#=type.Name#>.Parse(value);
  }

  public static <#=type.Name#>? ParseNullable<#=type.Name#>(string value)
  {
    if (value == "Default")
      return default(<#=type.Name#>);
    if (value == "MinValue")
      return <#=type.Name#>.MinValue;
    if (value == "MaxValue")
      return <#=type.Name#>.MaxValue;
    if (value == "Null")
      return null;
    return <#=type.Name#>.Parse(value);
  }
<#
}
#>

}


Now we can open the automatically generated SmartParser.cs just to ensure that everything is OK and our "huge" class has been generated successfully (-;

You can generate not only *.cs files, but *.xml, *.txt, *.html and generally text files with any extension. There are also several useful directives, for example:

<#@ output extension="txt" #>
<#@ template language="C#v3.5" #>
<#@ import namespace="System.Linq" #>
<#@ assembly name="System.Core" #>


Any additional information on T4 can be easily found on the web.

Sunday, November 9, 2008

Are internal members really internal?

Every C# developer knows the "internal" keyword: it is an access modifier for types and type members. Internal types or members are accessible only within the same assembly.

But don't be fully confident in it when dealing with microsofties. In .NET Framework 2.0 those artful guys invented a special attribute, "InternalsVisibleToAttribute", and now your internal members can become - guess what? - public! (Of course, they will be visible only to the specified assemblies.)

No one knows the real reason for such an "invention", but this innovation has been highly appreciated by test-driven development adopters, because the attribute allows your test libraries to access internal classes and methods for additional testing and coverage.

I think that any other usage of it can be considered a flaw in architectural design, but what about the microsofties? Do they use it for product-level assemblies or for testing purposes only? Let's see. Oren Eini did an interesting investigation of how the attribute is used inside the .NET Framework:
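As a minimal sketch of the testing scenario: the assembly name "MyProduct.Tests" and the PriceRules class are hypothetical and serve only as an illustration. Note that if the assemblies are strong-named, the attribute argument must also include the full public key of the friend assembly.

```csharp
using System.Runtime.CompilerServices;

// Grants the (hypothetical) test assembly access to internal members.
[assembly: InternalsVisibleTo("MyProduct.Tests")]

// Internal type: normally inaccessible from other assemblies.
internal static class PriceRules
{
    // Returns the price reduced by the given percentage.
    internal static decimal ApplyDiscount(decimal price, int percent)
    {
        return price - price * percent / 100m;
    }
}
```

With this attribute in place, code in MyProduct.Tests can call PriceRules.ApplyDiscount directly, as if it were public.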


System.Data allows:
  • System.Data.Entity
  • SqlAccess
  • System.Data.DataSetExtensions
Microsoft.NETCF.Tools allows:
  • System.Web.Services
Microsoft.Office.Tools.Common.v9.0 allows:
  • Microsoft.Office.Tools.Word.v9.0
  • Microsoft.VisualStudio.Tools.Office.Designer.Office2007
  • Microsoft.VisualStudio.Tools.Office.Designer.Office2007Tests
  • Microsoft.VisualStudio.Tools.Office.Outlook.UnitTests
Microsoft.Build.Conversion allows:
  • Microsoft.Build.Conversion.Unittest
Microsoft.Build.Engine allows:
  • Microsoft.Build.Engine.Unittest
System.Core allows:
  • Dlinq.Unittests
And so on.

As for Xtensive products, unit testing is the only application of the "InternalsVisibleTo" attribute.

Friday, November 7, 2008

Link: object disposal, finalization and resource management

This time I'm publishing a link to an article, and the article itself is really excellent: a "must know" for any .NET developer.

Here are some quotations from it to make you a bit more interested:
- Do allow your Dispose method to be called more than once. The method may choose to do nothing after the first call. It should not generate an exception.
- Consider setting disposed fields to null before actually executing Dispose when reference cycles in an object graph are possible.
- Avoid throwing an exception from within Dispose except under critical situations where the containing process has been corrupted.
- Do not assume your finalizer will always run.
- Do write finalizers that are tolerant of partially constructed instances.
- Do write finalizers that are threading-agnostic. Finalizers can execute in any order, on any thread, can occur on multiple objects concurrently, and even on the same object simultaneously.
- Do gracefully handle situations in which your finalizer is invoked more than once.
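Several of these guidelines can be sketched together in code. This is only an illustration, not code from the quoted article: the type NativeBuffer and its members are hypothetical names, and the sketch assumes a single unmanaged memory block as the resource.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

// Hypothetical example type; it owns one unmanaged memory block.
public sealed class NativeBuffer : IDisposable
{
  // Stays IntPtr.Zero if the constructor throws before allocation,
  // so the finalizer tolerates a partially constructed instance.
  private IntPtr handle;

  public NativeBuffer(int size)
  {
    if (size <= 0)
      throw new ArgumentOutOfRangeException("size");
    handle = Marshal.AllocHGlobal(size);
  }

  public bool IsDisposed {
    get { return handle == IntPtr.Zero; }
  }

  // May be called any number of times; later calls do nothing and never throw.
  public void Dispose()
  {
    Release();
    GC.SuppressFinalize(this);
  }

  // May run on any thread, and possibly never runs at all.
  ~NativeBuffer()
  {
    Release();
  }

  private void Release()
  {
    // Clear the field first, atomically, then free what was there.
    // Repeated or concurrent calls free the block at most once.
    IntPtr h = Interlocked.Exchange(ref handle, IntPtr.Zero);
    if (h != IntPtr.Zero)
      Marshal.FreeHGlobal(h);
  }
}
```

The design choice here is to make the "already released" state (IntPtr.Zero) identical to the "never allocated" state, so Dispose and the finalizer need no separate flag.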

The original article is here.

Thursday, October 30, 2008

The cost of [ThreadStatic] attribute

First of all, raw results:

Instance field: Operations: 2.650 B/s.
Static field: Operations: 2.630 B/s.
Volatile instance field: Operations: 2.649 B/s.
[ThreadStatic] field: Operations: 43.611 M/s.
Thread data slot: Operations: 4.068 M/s.

The test is actually quite simple: we read the specified field in a loop. As before, it runs on a Core 2 Duo @ 2.66 GHz. The code can be found in the DataObjects.Net 4.0 test suite, see Xtensive.Core\Xtensive.Core.Tests\DotNetFramework\FieldTypeTest.cs.

Conclusions:
- Reading a regular, static or volatile field is quite cheap: ~ 0.2x in the previously introduced metric, i.e. about 20% of the virtual method call time.
- [ThreadStatic] fields are actually quite costly in comparison to the others: ~ 14x.

Now the main question: why? It isn't obvious why a [ThreadStatic] field is ~ 60 times slower than a static one.

JITted [ThreadStatic] access code actually always consists of two parts:
- A call to a system routine returning the address of the [ThreadStatic] field by its token
- Regular field access instruction.

Obviously, the first part (the call) "eats" almost the whole execution time: there is no more efficient way to get the address of such a field by its token than using a hash table. As I've mentioned before, reading from a system hash table takes ~ 10x. So that's nearly what we have in this case.

Why are they implemented this way in .NET? I can't imagine why a faster approach wasn't used. E.g. I suspect that calculating the lowest stack boundary (as well as the upper one) from the current stack pointer value is quite a simple operation - something like a bitwise AND. Why can't we store the address of the first [ThreadStatic] location at a fixed address near that boundary, and use a constant offset for each [ThreadStatic] field relative to the address of the first one? In this case it would take ~ 1x to access it...

Ok, this is what could be, but in reality we have 14x.

Finally, there are thread data slots as well. But they're 10 times slower than [ThreadStatic] fields, so it's always better to simulate their behavior with e.g. a Dictionary stored in a [ThreadStatic] field.
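A minimal sketch of that simulation follows. ThreadLocalSlots, Set and Get are hypothetical names chosen for this illustration; they are not part of .NET or DataObjects.Net.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper simulating named thread data slots.
public static class ThreadLocalSlots
{
  // Each thread sees its own copy of this field.
  // A field initializer would run for one thread only,
  // so the dictionary is created lazily on first write.
  [ThreadStatic]
  private static Dictionary<string, object> slots;

  public static void Set(string name, object value)
  {
    if (slots == null)
      slots = new Dictionary<string, object>();
    slots[name] = value;
  }

  // Returns null if the current thread has no value under this name.
  public static object Get(string name)
  {
    object value;
    if (slots != null && slots.TryGetValue(name, out value))
      return value;
    return null;
  }
}
```

With this sketch each access costs one [ThreadStatic] read plus a dictionary lookup, and a value set on one thread is not visible on another.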

Wednesday, October 29, 2008

The cost of method calls

Here I'll talk about the cost (or performance) of various ways of method invocation in .NET.

First of all, let's assume:
- Virtual method call time is 1x. This is about 600M calls / sec. on a Core 2 Duo @ 2.66 GHz. To be precise, we're talking about an instance method taking no arguments and returning a single Int32 value (i.e. an average property getter).

So:
- Delegate method call time is 1.5x (the same method invoked via a delegate pointing to it)
- Interface method call time is 2x (the same method invoked on a reference of interface type)

A bit surprising, yes? The explanation is here. Briefly, interface method dispatch is more complex than delegate dispatch, since we must first locate the appropriate interface method table for the instance we have. If the number of interfaces implemented by a particular type is quite large (that's actually almost impossible), the only good way to do this is to use a hash table. But if it is small (the most frequent case), a search in a small (possibly ordered) list is enough. Either way, this is more complex than in the delegate case, since a delegate already contains the exact method address (this isn't true for open delegates).
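For reference, the three kinds of call compared above look like this in source code. This is only an illustration of the call sites, not the benchmark itself; IValueSource, ValueSource and CallKinds are hypothetical names.

```csharp
using System;

public interface IValueSource
{
  int GetValue();
}

public class ValueSource : IValueSource
{
  // No arguments, returns a single Int32: an "average property getter".
  public virtual int GetValue()
  {
    return 42;
  }
}

public static class CallKinds
{
  public static int CallAll()
  {
    ValueSource c = new ValueSource();
    IValueSource i = c;        // the same object seen through the interface
    Func<int> d = c.GetValue;  // closed delegate: stores target and method address

    int sum = 0;
    sum += c.GetValue();  // virtual method call
    sum += d();           // delegate call
    sum += i.GetValue();  // interface method call
    return sum;
  }
}
```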

Now one more fact to think about: all the tests I'm talking about are loop-based, performed in Release configuration, and their code is nearly the following:

for (int i = 0; i<...; i+=10) {
o = c.GetValue();
o = c.GetValue();
o = c.GetValue();
o = c.GetValue();
o = c.GetValue();
o = c.GetValue();
o = c.GetValue();
o = c.GetValue();
o = c.GetValue();
o = c.GetValue();
}


.NET is able to cache resolved interface method tables (and possibly even method addresses), so calling an interface method in a loop must be a bit faster than calling it once. So in general the cost of an interface method call in comparison to a delegate call is even bigger.

This explains why we have such types as Hasher(of T) (Comparer, Arithmetic, etc.). They cache in their fields the delegates performing a set of operations on type T, which is faster than calling a similar interface. See e.g. the Hasher(of T).GetHash field. Certainly, such an approach is used only when performance is essential - i.e. when it's well known that these operations will be invoked many times, e.g. on any index seek.
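The idea can be sketched as follows. This is a simplified illustration under assumed names (CachedHasher, CachedHasherDemo); the real Hasher(of T) may be structured differently, and the string length used as a "hash" is just a toy.

```csharp
using System;

// Hypothetical sketch of a type that caches a delegate in a field.
public sealed class CachedHasher<T>
{
  // Resolved once and stored; each use is a delegate call
  // rather than an interface method call.
  public readonly Func<T, int> GetHash;

  public CachedHasher(Func<T, int> getHash)
  {
    if (getHash == null)
      throw new ArgumentNullException("getHash");
    GetHash = getHash;
  }
}

public static class CachedHasherDemo
{
  public static int SumHashes(string[] items)
  {
    // Toy hash function for illustration only.
    CachedHasher<string> hasher = new CachedHasher<string>(s => s.Length);
    int sum = 0;
    foreach (string item in items)
      sum += hasher.GetHash(item); // invokes the cached delegate
    return sum;
  }
}
```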

Let's look at a few more metrics:
- Creating a delegate pointing to a non-virtual method takes 7.5x
- Creating a delegate pointing to a virtual method takes 50x
- Creating a delegate pointing to an interface method takes 150x

So delegate construction isn't cheap at all.

Now let's turn to virtual generic methods:
- Virtual generic method call time is 10x, regardless of whether it is called on an interface or not. There is almost no dependency on the number of generic arguments either - adding one more argument makes the call longer by ~ 0.5%.
- Creating a delegate pointing to a generic virtual method takes 1000x. I'm not sure why - it looks like a bug in .NET. Since delegates in .NET may point only to fully parameterized methods (they store the method address), calling such a delegate takes 1.5x, as before.

Why is it so costly to call a virtual generic method? Because there can generally be any number of its implementations, depending on the type argument substitution, so the .NET Framework resolves its address using an internal hash table that must be bound to the corresponding virtual (or interface) method table.

So we may also assume that:
- Internal hash table seek time is ~ 10x - I'll use this figure in my future posts to show how to estimate the cost of almost any basic operation in .NET.

And a few conclusions related to generic virtual methods:
- The smallest heap allocation time is 3.5x; Int32 boxing time is 4x (including its heap allocation). So it's almost always cheaper to have a non-generic virtual method with an argument of object type than a generic virtual method.
- If a generic virtual method still seems preferable, you might think about implementing "call caching" for it with delegates. E.g. we use the DelegateHelper.CreateDelegates and DelegateHelper.ExecuteDelegates methods to perform operations on Tuples of the same structure faster.
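A minimal sketch of such "call caching", assuming hypothetical types Processor and CountingProcessor (this is not how DelegateHelper is implemented): the expensive delegate construction is paid once, and the loop then performs plain delegate calls.

```csharp
using System;

public abstract class Processor
{
  // A generic virtual method: costly to call directly in a hot loop.
  public abstract int Process<T>(T value);
}

// Hypothetical implementation that just counts its invocations.
public class CountingProcessor : Processor
{
  private int calls;

  public override int Process<T>(T value)
  {
    calls++;
    return calls;
  }
}

public static class CallCachingDemo
{
  public static int Run(Processor p, int iterations)
  {
    // Resolve the fully parameterized method once.
    Func<int, int> process = p.Process<int>;
    int last = 0;
    for (int i = 0; i < iterations; i++)
      last = process(i); // delegate call to the already resolved address
    return last;
  }
}
```

Whether this pays off depends on how many calls follow the one-time delegate creation, given the costs listed above.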

Sunday, October 26, 2008

Open delegates

Do you know what an open delegate is? I was quite surprised when I learned about them. They have existed since .NET 2.0, but I suspect they were not documented in MSDN initially, so I found out about them much later.

Links:
- An example
- Official documentation.

So what is an open delegate? It is a delegate bound only to a method, but not to an instance, so you can use the same delegate to call the method it is bound to on many instances. That's it.
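A small sketch of an open delegate, using the getter of String.Length as the target method. OpenDelegateDemo and Lengths are hypothetical names for this illustration; Delegate.CreateDelegate is the standard .NET API.

```csharp
using System;
using System.Reflection;

public static class OpenDelegateDemo
{
  public static int[] Lengths(string[] items)
  {
    MethodInfo getter = typeof(string).GetProperty("Length").GetGetMethod();

    // No target instance is bound: the delegate's first parameter
    // takes the place of "this" on every invocation.
    Func<string, int> getLength =
      (Func<string, int>) Delegate.CreateDelegate(typeof(Func<string, int>), getter);

    int[] result = new int[items.Length];
    for (int i = 0; i < items.Length; i++)
      result[i] = getLength(items[i]); // same delegate, different instance each time
    return result;
  }
}
```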

And a few more notes:
- Open delegate invocation must take at least as long as the underlying virtual or interface method invocation, since .NET can't invoke it the same way as a regular (closed) delegate: there is no single method call address to use. We haven't tested this yet, but when we do, I'll publish the results here.
- An open delegate bound to a virtual generic method must be the slowest one, for the same reason. Do you know why virtual generic methods are the slowest ones? My next post here will explain this.
- Btw, various ways of invocation in .NET are perfectly described here. The article covers, among others, regular, virtual and interface methods, and delegates.