Wednesday, December 30, 2009

Query preprocessors, Inversion of control & Localization support

LINQ translator extension

As promised earlier, we have made the LINQ translator extensible. The extension mechanism is called “Custom LINQ query preprocessors” and is already included in the DataObjects.Net 4.1 code base.

Custom LINQ preprocessors must implement the public IQueryPreProcessor interface, which is defined in the Xtensive.Storage assembly. Here it is:

public interface IQueryPreProcessor
{
  Expression Apply(Expression query);
}

As you can see, the contract is simple and straightforward: your preprocessor receives the whole query, modifies it as needed, and returns the modified one. All preprocessors are called before the query is processed by the internal LINQ translator, so this is the right time and place to apply the necessary modifications.

Connecting preprocessors to translator (IoC)

After you have written your preprocessors, it is time to plug them into DataObjects.Net. This is done with the help of the Inversion of Control concept. Take the following steps:

1. Add a reference to the Microsoft.Practices.ServiceLocation.dll assembly. It is shipped with DataObjects.Net 4 and can be found in the %DataObjects.Net Directory%\Lib\CommonServiceLocator directory.

2. Configure the IoC container through the application configuration file.

Add this line to the configSections part:

    <section name="Services" type="Xtensive.Core.IoC.Configuration.ConfigurationSection, Xtensive.Core"/>

Add the corresponding configuration section:

<Services>
  <containers>
    <container name="domain">
      <types>
        <type type="Xtensive.Storage.IQueryPreProcessor, Xtensive.Storage" mapTo="Xtensive.Storage.Samples.Localization.QueryPreProcessor, Xtensive.Storage.Samples.Localization" singleton="true" />
      </types>
    </container>
  </containers>
</Services>

Note the usage of the named service container (“domain”), the IQueryPreProcessor type as the service interface, and how it is mapped to the concrete implementation.

3. The last step is to configure the Domain object & the above-mentioned service container.

// Building domain
domain = Domain.Build(DomainConfiguration.Load("Default"));

// Configuring domain-level services
var configurationSection = (ConfigurationSection)ConfigurationManager.GetSection("Services");
var container = new ServiceContainer();
container.Configure(configurationSection.Containers["domain"]);
domain.Services.SetLocatorProvider(() => new ServiceLocatorAdapter(container));

LINQ preprocessor in action

With these steps done, we can use non-persistent localizable properties (Domain model can be found here) in LINQ queries:

using (var ts = Transaction.Open()) {

  Console.WriteLine("Implicit join through preprocessor");
  var pages = from p in Storage.Query.All<Page>()
              where p.Title=="Welcome!"
              select p;
  Console.WriteLine(pages.ToList().Count);

  ts.Complete();
}

Note that neither the PageLocalization type nor its members participate in the query; the original p.Title expression is used in the Where clause instead. As we know, Page.Title is not a persistent property, and the regular LINQ translator doesn’t know how to translate this expression. Preprocessing the initial query with Xtensive.Storage.Samples.Localization.QueryPreProcessor makes such expressions usable. All the preprocessor does is replace the p.Title expression with something like this:

p.Localizations.Where(localization => localization.CultureName==LocalizationContext.Current.CultureName)
  .Select(localization => (string)localization["Title"])
  .FirstOrDefault();
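To make the replacement more tangible, here is a minimal in-memory sketch of such a rewrite built on the standard System.Linq.Expressions.ExpressionVisitor. It is not the sample's actual QueryPreProcessor: Page, PageLocalization, TitleRewriter and Demo are simplified stand-ins invented for illustration, the interface is redeclared locally, the culture name is passed in explicitly instead of being read from LocalizationContext.Current, and the rewritten expression is compiled and run with LINQ to Objects rather than translated to SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Redeclared here to keep the sketch self-contained; the real one lives in Xtensive.Storage.
public interface IQueryPreProcessor
{
  Expression Apply(Expression query);
}

// Hypothetical in-memory stand-ins for the sample's model.
public class PageLocalization
{
  public string CultureName { get; set; }
  public string Title { get; set; }
}

public class Page
{
  public List<PageLocalization> Localizations { get; } = new List<PageLocalization>();

  // Simulates a non-persistent property: it must never be evaluated by the query.
  public string Title { get { throw new NotSupportedException("Page.Title is not persistent."); } }
}

// Replaces every access to Page.Title with a lookup in Page.Localizations.
public class TitleRewriter : ExpressionVisitor, IQueryPreProcessor
{
  private readonly string cultureName;

  public TitleRewriter(string cultureName) { this.cultureName = cultureName; }

  public Expression Apply(Expression query) { return Visit(query); }

  protected override Expression VisitMember(MemberExpression node)
  {
    if (node.Member.DeclaringType != typeof(Page) || node.Member.Name != "Title")
      return base.VisitMember(node);

    // Template of the replacement, written against a placeholder parameter.
    Expression<Func<Page, string>> template = page => page.Localizations
      .Where(l => l.CultureName == cultureName)
      .Select(l => l.Title)
      .FirstOrDefault();

    // Inline the template body, substituting the placeholder with the original target (p).
    var target = Visit(node.Expression);
    return new ParameterReplacer(template.Parameters[0], target).Visit(template.Body);
  }

  private sealed class ParameterReplacer : ExpressionVisitor
  {
    private readonly ParameterExpression from;
    private readonly Expression to;

    public ParameterReplacer(ParameterExpression from, Expression to)
    {
      this.from = from;
      this.to = to;
    }

    protected override Expression VisitParameter(ParameterExpression node)
    {
      return node == from ? to : base.VisitParameter(node);
    }
  }
}

public static class Demo
{
  // Counts pages whose title in the given culture is "Welcome!".
  public static int CountWelcomePages(string cultureName)
  {
    var page = new Page();
    page.Localizations.Add(new PageLocalization { CultureName = "en-US", Title = "Welcome!" });
    page.Localizations.Add(new PageLocalization { CultureName = "ru-RU", Title = "Здравствуйте!" });

    Expression<Func<Page, bool>> filter = p => p.Title == "Welcome!";
    var rewritten = (Expression<Func<Page, bool>>) new TitleRewriter(cultureName).Apply(filter);
    return new[] { page }.Count(rewritten.Compile());
  }
}
```

The filter is written against p.Title only, yet the getter is never invoked: by the time the expression is compiled, the member access has already been swapped for the Localizations lookup.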

That’s it.

The source code is available in our public repository in Xtensive.Storage.Samples.Localization folder.

Happy preprocessing! =)

Monday, December 28, 2009

Logging, part 3. Configuring logging through log4net

In the previous post I demonstrated how to configure and use the internal DataObjects.Net logging capabilities. They are rather useful and flexible, but if you need more flexibility than they offer, one of the external logging frameworks could be the right choice for you.

As I already mentioned, external logging components are connected to DataObjects.Net through adapters: DataObjects.Net => adapter for logger => logger.

In case of log4net you’ll need the following assemblies:

  • log4net.dll (can be found here)
  • Xtensive.Adapters.log4net.dll (included in the DataObjects.Net installer)

The next step is to configure both DataObjects.Net & log4net.

Add these sections to the configSections block of your application configuration file:

    <section name="log4net" type="log4net.Config.Log4NetConfigurationSectionHandler,log4net"/>
    <section name="Xtensive.Core.IoC" type="Xtensive.Core.IoC.Configuration.ConfigurationSection, Xtensive.Core"/>

The Xtensive.Core.IoC namespace contains a basic Inversion of Control implementation; however, it is powerful enough to accomplish most of the relevant tasks. In this case it is used to map the Xtensive.Core.Diagnostics.ILogProvider interface to an external implementation (the Xtensive.Adapters.log4net.LogProviderImplementation type).

<Xtensive.Core.IoC>
  <containers>
    <container>
      <types>
        <type type="Xtensive.Core.Diagnostics.ILogProvider, Xtensive.Core" mapTo="Xtensive.Adapters.log4net.LogProviderImplementation, Xtensive.Adapters.log4net" singleton="true"/>
      </types>
    </container>
  </containers>
</Xtensive.Core.IoC>

The last step is log4net configuration:

<log4net>
  <appender name="FileAppender" type="log4net.Appender.FileAppender">
    <file value="log-file.txt" />
    <appendToFile value="true" />
    <lockingModel type="log4net.Appender.FileAppender+MinimalLock" />
    <layout type="log4net.Layout.PatternLayout">
      <conversionPattern value="%date %-5level %logger - %message%newline" />
    </layout>
  </appender>
  <root>
    <level value="WARN" />
    <appender-ref ref="FileAppender" />
  </root>
  <!-- To log warnings & errors from Xtensive.Storage.* loggers -->
  <logger name="Storage" additivity="false">
    <level value="WARN" />
    <appender-ref ref="FileAppender" />
  </logger>
  <!-- To log all SQL statements -->
  <logger name="Storage.Providers.Sql" additivity="false">
    <level value="ALL" />
    <appender-ref ref="FileAppender" />
  </logger>
</log4net>

With these configuration steps done, DataObjects.Net and log4net are configured and working together.

Friday, December 25, 2009

Logging, part 2. Architecture & configuration

The main goal was to make logging and its configuration easy for simple scenarios and, at the same time, highly adaptable for complex ones.

In order to achieve the required level of flexibility most logging frameworks have the following components:

  • Loggers: named instances of some public class or interface (usually ILog) that developers write diagnostic messages to.
  • Appenders: output destinations of the above-mentioned loggers.
  • Log manager or log provider: usually the central access point of a framework, which resolves loggers by their names.

The DataObjects.Net logging system follows exactly this pattern. However, because it must make it possible to plug in any logging framework, it introduces its own set of abstract components (loggers & a log manager, but not appenders), which simply wrap the plugged-in ones and redirect diagnostic messages to them. Moreover, when no standalone logging framework is plugged in, DataObjects.Net falls back to its own simple implementation of those components.

The main access point is the public static LogProvider class with a single method, LogProvider.GetLog(string logName), which resolves the required ILog instance by its name. Once an ILog instance is obtained, it can be used to log Debug, Info, Warning, Error & FatalError messages through the corresponding methods.

It is considered good practice for members of one namespace to log their messages into the same logging space. This namespace-based approach is useful because a namespace usually contains a set of closely coupled classes that execute some shared piece of logic, so merging their diagnostic output into one log is sensible. Following this approach, DataObjects.Net contains a set of predefined loggers for the most frequently used namespaces, such as Xtensive.Core, Xtensive.Storage, Xtensive.Storage.Building and so on. Each of these loggers has a name that corresponds to its namespace without the "Xtensive." prefix. For example, the logger for the Xtensive.Core namespace is named "Core".

For usability reasons, the above-mentioned namespaces contain a public static class named Log, which exposes the same set of logging methods as the ILog interface. Each of these static Log classes is no more than a connector between its consumers (the classes which use it as a logger) and the corresponding ILog instance, which is transparently constructed on demand.
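The wiring described above can be sketched roughly as follows. This is only an illustration of the pattern (a provider that resolves named logs, a pluggable implementation behind it, and a per-namespace static facade created on demand); the shapes below are hypothetical and deliberately smaller than the real ILog, ILogProvider and LogProvider types in Xtensive.Core.Diagnostics, and MemoryLog, MemoryLogProvider and StorageLog are invented names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical, reduced contracts; the real interfaces expose more methods.
public interface ILog
{
  string Name { get; }
  void Info(string message);
  void Warning(string message);
}

public interface ILogProvider
{
  ILog GetLog(string name);
}

// Built-in fallback used when no external framework is plugged in.
// It keeps messages in memory so the behavior is easy to inspect.
public sealed class MemoryLog : ILog
{
  public string Name { get; }
  public List<string> Messages { get; } = new List<string>();

  public MemoryLog(string name) { Name = name; }

  public void Info(string message) { Messages.Add("INFO " + Name + " - " + message); }
  public void Warning(string message) { Messages.Add("WARN " + Name + " - " + message); }
}

public sealed class MemoryLogProvider : ILogProvider
{
  public ILog GetLog(string name) { return new MemoryLog(name); }
}

// Central access point: resolves logs by name and caches them.
public static class LogProvider
{
  private static readonly ConcurrentDictionary<string, ILog> logs =
    new ConcurrentDictionary<string, ILog>();

  // An adapter for an external framework would be assigned here.
  public static ILogProvider Implementation { get; set; } = new MemoryLogProvider();

  public static ILog GetLog(string logName)
  {
    return logs.GetOrAdd(logName, name => Implementation.GetLog(name));
  }
}

// Per-namespace facade, analogous to the static Log classes described above.
public static class StorageLog
{
  private static readonly Lazy<ILog> log = new Lazy<ILog>(() => LogProvider.GetLog("Storage"));

  public static ILog Instance { get { return log.Value; } }

  public static void Info(string message) { log.Value.Info(message); }
  public static void Warning(string message) { log.Value.Warning(message); }
}
```

Because the facade only talks to ILog, swapping Implementation for an adapter changes the destination of messages without touching any calling code.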

Configuring internal DataObjects.Net's log output

The internal logging subsystem is not as powerful as some well-known logging monsters, but it is rather flexible and doesn't require any additional components. DataObjects.Net's logging is configured in the application configuration file (app.config or web.config).

First of all, include the Xtensive.Core.Diagnostics section in the configSections section:

  <configSections>
    <section name="Xtensive.Core.Diagnostics" type="Xtensive.Core.Diagnostics.Configuration.ConfigurationSection, Xtensive.Core" />

The second step is to configure logs (appenders in terms of log4net):

<Xtensive.Core.Diagnostics>
  <logs>
    <!-- Use these settings for Xtensive.Storage.* logs -->
    <log name="Storage" events="Warning,Error,FatalError" provider="File" fileName="Storage.log" />
  </logs>
</Xtensive.Core.Diagnostics>

Note that each log has a name equal to the namespace where it is located, without the "Xtensive." prefix. This is true for logs from DataObjects.Net only and might not be true for logs from your own application.

Types of events: Debug, Info, Warning, Error, FatalError.

Types of providers: File (you need to provide file name as well), Debug, Console, Null (no logs at all, analogue of /dev/null), Error.

The example of log:

2009-12-17 00:00:02,052 DEBUG Storage.Providers.Sql - Session 'Default, #9'. Creating connection 'sqlserver://*****'.
2009-12-17 00:00:02,052 DEBUG Storage.Providers.Sql - Session 'Default, #9'. Opening connection 'sqlserver://*****'.
2009-12-17 00:00:02,052 DEBUG Storage.Providers.Sql - Session 'Default, #9'. Beginning transaction @ ReadCommitted.
2009-12-17 00:00:02,068 DEBUG Storage.Providers.Sql - Session 'Default, #9'. SQL batch: 
SELECT [a].[Id], [a].[TypeId], [a].[Name], [a].[Code], [a].[Description], [a].[LongDescription],
 [a].[IsForChildren], [a].[BasePrice], [a].[Price], [a].[SizeString], [a].[HasNoInnerCover]
 FROM [dbo].[Product] [a] ORDER BY [a].[Id] ASC
2009-12-17 00:00:02,068 DEBUG Storage.Providers.Sql - Session 'Default, #9'. Commit transaction.
2009-12-17 00:00:02,068 DEBUG Storage.Providers.Sql - Session 'Default, #9'. Closing connection 'sqlserver://*****'.

Looks pretty good, right?

In the next post I’ll describe how to use external logging framework with DataObjects.Net.

Wednesday, December 23, 2009

Logging, part 1. Introduction

In the next posts I’m going to describe how logging in DataObjects.Net is designed, how it works, and how to configure and use it most effectively. I’m writing the same chapter for the manual at the moment, so the two efforts run in parallel, although I expect the blog version to be a bit more informal than the manual’s.

Let’s start then.

In general, logging is the feature most software engineers use to track how a system works and to analyze what happens when it starts to misbehave. It goes without saying that logging capabilities are essential for any product designed primarily for developers and software companies. But having decided that your framework must use some kind of logging, you immediately face another challenge: which logging system to use, as there are plenty of them (log4net, NLog, etc.). Moreover, you might want to invent your own super-duper logging system.

This choice is rather simple for small products or libraries: they just use one of the best-known and easiest to use, e.g. log4net, or write everything to some location that can be set in a configuration file, e.g. "C:\Debug\". But is this straightforward approach good enough for customers who use these small standalone components to build something more complex and non-trivial?

In such cases the right term is "transparent integration". What really matters is how your small library can be integrated into a large system: is its logging subsystem flexible enough to be easily integrated with the logging framework that is used there? These are the questions the DataObjects.Net development team was thinking about when the logging subsystem was about to be implemented.

To be continued…

BTW, we are going to publish the updated localization sample soon. It includes a generalized LINQ pre-processor that automatically and transparently joins localizable & localization entities and substitutes calls to localizable properties. Stay tuned!

Wednesday, December 16, 2009

Localization support, part 4. Queries

Another interesting part in the localization support story (part 1, part 2 & part 3) is how to make queries for localizable objects.

Problem

The only problem here is that localizable properties are virtual in the sense that they are not persistent:

public class Page : Entity
{
  ...
  public string Title
  {
    get { return Localizations.Current.Title; }
    set { Localizations.Current.Title = value; }
  }

  public string Content
  {
    get { return Localizations.Current.Content; }
    set { Localizations.Current.Content = value; }
  }
  ...
}

As these are not persistent properties, the domain model doesn’t contain any information about them, and the Page table doesn’t contain the corresponding Title & Content columns. Therefore, the LINQ translator simply doesn’t know what to do when it encounters them while parsing a LINQ query. The following query leads to an InvalidOperationException:

var pages = Query<Page>.All.Where(p => p.Title=="Welcome");

Exception:
“Unable to translate '$<Queryable<Page>>(Query<Page>.All).Where(p => (p.Title == "Welcome"))' expression”.

And the Exception.InnerException shows us the detailed description of the failure:
“Field 'p.Title' must be persistent (marked by [Field] attribute)”.

Therefore, to be executable, the query must be rewritten as follows:

var pages = from p in Query<Page>.All
            join pl in Query<PageLocalization>.All
              on p equals pl.Target
            where pl.CultureName==LocalizationContext.Current.CultureName && pl.Title=="Welcome"
            select p;

Certainly, it is not convenient to write such verbose queries every time you want to filter or sort by localizable properties. The only way we can simplify them is to introduce some level of abstraction.

Solution

First step is to define LocalizationPair, a pair of target entity and corresponding localization:

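public struct LocalizationPair<TTarget, TLocalization> where TTarget: Entity where TLocalization: Model.Localization<TTarget>
{
  public TTarget Target { get; private set; }
  public TLocalization Localization { get; private set; }

  // Needed by the projection that creates pairs from a target and its localization
  public LocalizationPair(TTarget target, TLocalization localization) : this()
  {
    Target = target; Localization = localization;
  }
}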

The next step is to build a class that hides the complexity of the join and filter operations. I named this class “Repository”, but frankly speaking it isn’t a real repository, as it doesn’t implement all the functionality of the well-known DDD Repository pattern. Anyway, here it is:

public static class Repository<TTarget, TLocalization> where TTarget: Entity where TLocalization: Model.Localization<TTarget>
{
  public static IQueryable<LocalizationPair<TTarget,TLocalization>> All
  {
    get
    {
      return from target in Query<TTarget>.All
        join localization in Query<TLocalization>.All
          on target equals localization.Target
        where localization.CultureName==LocalizationContext.Current.CultureName
        select new LocalizationPair<TTarget, TLocalization>(target, localization);
    }
  }
}

And this is how we can use these 2 classes in queries:

var pages = from pair in Repository<Page, PageLocalization>.All
            where pair.Localization.Title=="Welcome!"
            select pair.Target;

Simple and functional enough to use.

Conclusion

The above-mentioned sample is built on DataObjects.Net 4.1. No changes were made to the ORM itself to achieve the declared features, and I think this is quite promising in terms of maturity and flexibility.

There are 2 directions in which we are going to develop the idea of localization:

  • A LINQ extension API will be implemented. It will let us connect custom LINQ query rewriters which transparently alter queries in order to insert joins, filters and so on. In particular, this feature will eliminate the need for the LocalizationPair & Repository classes defined in the sample. As soon as this functionality appears, I’ll update the sample and make a post on this topic.
  • The actual localization support at the ORM level will be implemented together with full-text search or a bit later, because these features are interconnected.

Friday, December 11, 2009

Localization support, part 3. CRUD operations

In the previous posts we discussed the options for implementing localization at the database level & at the domain model level. Let’s continue the topic and see how CRUD operations can be applied to localized entities.

Scenarios

Generally, there are two scenarios:

1. In the first scenario you deal with localized properties as if they were regular ones, and DataObjects.Net does all the other work in the background. You act as if there were no such notion as localization and simply get and set property values as usual. In this case you work with the currently active culture and the corresponding localization object. Switching the Thread.CurrentCulture property or using a localization scope are the best ways to select the active culture.

var english = new CultureInfo("en-US");
var russian = new CultureInfo("ru-RU");
var welcomePage = new Page();

// Editing localizable properties through localization scope
using (new LocalizationScope(english)) {
  welcomePage.Title = "Welcome!";
  welcomePage.Content = "My dear guests, welcome to my birthday party!";
}

// Editing localizable properties through CurrentThread.CurrentCulture
Thread.CurrentThread.CurrentCulture = russian;
welcomePage.Title = "Здравствуйте!";
welcomePage.Content = "Добро пожаловать, дорогие гости! На базаре сейчас всё так дорого.";

Note that the code that works with localizable properties is the same as the code that works with any other regular persistent properties.

2. In the second scenario your code knows that localization takes place and wants to get or update localized properties for several cultures at a time. Say you edit a Page object in a website administration system and want to see all available localizations for it. Therefore, you need to be able to get them all and to change the chosen ones directly. The localization scope approach is not an option for this kind of task.

var goodbyePage = new Page();

// Editing localizations directly
goodbyePage.Localizations[english].Title = "Goodbye!";
goodbyePage.Localizations[english].Content = "Goodbye, my dear friends.";
goodbyePage.Localizations[russian].Title = "До свидания!";
goodbyePage.Localizations[russian].Content = "Надеюсь больше никогда вас не увидеть.";

Both scenarios are supported in the above-mentioned Localization sample, and Page instances as well as PageLocalization instances are persisted transparently.

Results

Here are localizations for 2 Page instances:

Table

Note that the first 2 columns are key columns. The first one is the string representation of CultureInfo, and the second one is a reference to the localizable object. All other columns are localized versions of properties from the Page class.

In the next post we’ll figure out what should be done to use LINQ with localized entities.

Thursday, December 10, 2009

Localization support, part 2. Domain modeling

In the previous post we discussed the theoretical possibilities of localization feature implementation in terms of physical organization on database level. Let’s continue the discussion on Domain model level.

Obviously, all these implementation approaches require some common infrastructure, such as automatic binding to Thread.CurrentCulture, temporary switching of the current culture and so on. So let’s start with this part.

Infrastructure

  1. To begin with, let’s think about what information we need.
    The first and most necessary piece is an instance of the CultureInfo class, which represents the current culture in a particular block of application code.
  2. Another point, less obvious than the first but also important, is some kind of localization policy that describes what should be done if the requested localization is not found. Say you have localizations for the “en-US” & “ru-RU” cultures but none for the “fr-FR” culture. What should happen if someone switches Thread.CurrentCulture to “fr-FR” and tries to access localized properties? Should a new localization for the specified culture be created, or should the default one be used? If the latter, which culture is the default?

To answer these questions the notion of immutable LocalizationContext class is introduced. It is defined in pseudo-code as follows:

public class LocalizationContext
{
  public CultureInfo Culture { get; }

  public string CultureName { get; }

  public LocalizationPolicy Policy { get; }

  public static LocalizationContext Current { get; }

}

LocalizationContext.Current property provides a programmer with valid current localization context everywhere it is required.

For now, LocalizationContext.Current is bound to the Thread.CurrentThread.CurrentCulture property and changes its value each time the current culture of the current thread changes. Hence, if you want to temporarily change the localization context (activate another culture), you have to change Thread.CurrentThread.CurrentCulture and revert it after doing some work, which is neither robust nor convenient. To overcome this problem, the LocalizationScope class is added. It acts as a disposable region in which the specified localization context is active; on disposal it restores the previous localization scope. Here is how it works:

// LocalizationContext.Current.Culture is en-US

using(new LocalizationScope(new CultureInfo("ru-RU"))) {
  // LocalizationContext.Current.Culture is ru-RU
  // do some work with ru-RU culture
}

// LocalizationContext.Current.Culture is en-US again

So now LocalizationContext.Current property logic must take into account the presence and configuration of currently active localization scope and fall back to Thread.CurrentCulture in case current localization scope is absent.
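One way to get this behavior is a thread-static chain of scopes, sketched below. This is an assumption about how such a scope could be built, not the sample's actual code: the class here exposes a hypothetical static ActiveCulture property in place of LocalizationContext.Current and ignores the localization policy entirely.

```csharp
using System;
using System.Globalization;
using System.Threading;

// Sketch only: a disposable scope that overrides the culture for the current thread.
public sealed class LocalizationScope : IDisposable
{
  // Innermost active scope of the current thread, or null if there is none.
  [ThreadStatic]
  private static LocalizationScope current;

  private readonly LocalizationScope outer;

  public CultureInfo Culture { get; }

  public LocalizationScope(CultureInfo culture)
  {
    Culture = culture;
    outer = current;   // remember the previous scope so it can be restored
    current = this;    // activate this scope
  }

  // Hypothetical stand-in for LocalizationContext.Current.Culture:
  // the innermost scope wins, otherwise fall back to the thread culture.
  public static CultureInfo ActiveCulture
  {
    get { return current != null ? current.Culture : Thread.CurrentThread.CurrentCulture; }
  }

  public void Dispose()
  {
    current = outer;   // restore whatever was active before this scope
  }
}
```

Because each scope remembers the one it replaced, scopes can be nested, and the thread culture itself is never modified, so there is nothing to revert if the guarded code throws.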

Modeling Domain

Say we have a Page class in the Domain model with 2 persistent properties, Title & Content, and we want to make both of them localizable. This is how we do it:

  1. We define PageLocalization, the localization class for Page, which contains the localized persistent properties. Its primary key consists of 2 fields: a reference to Page and a string representation of CultureInfo, for which CultureInfo.Name can be used.
  2. We define the localizable properties in the Page class as NOT persistent. They are no more than wrappers around the corresponding persistent properties of the localization instance.

Here is the localization class for Page:

[HierarchyRoot]
public class PageLocalization : Localization<Page>
{
  [Field(Length = 100)]
  public string Title { get; set; }

  [Field]
  public string Content { get; set; }

  public PageLocalization(CultureInfo culture, Page target)
    : base(culture, target)
  {}
}

It inherits Localization<Page> class where key fields are declared.

And here is the Page class:

[HierarchyRoot]
public class Page : Entity
{
  [Field, Key]
  public int Id { get; private set; }

  public string Title
  {
    get { return Localizations.Current.Title; }
    set { Localizations.Current.Title = value; }
  }

  public string Content
  {
    get { return Localizations.Current.Content; }
    set { Localizations.Current.Content = value; }
  }

  [Field, Association(PairTo = "Target", OnOwnerRemove = OnRemoveAction.Cascade)]
  public LocalizationSet<PageLocalization> Localizations { get; private set; }
}

Localizable properties such as Title & Content redirect all calls to the currently active PageLocalization, which is accessed through the Page.Localizations.Current property. What is LocalizationSet<PageLocalization> then?

Believe it or not, LocalizationSet<PageLocalization> is no more than a common EntitySet<T> with some additional functionality:

public class LocalizationSet<TItem> : EntitySet<TItem> where TItem : Localization
{
  public TItem this[CultureInfo culture] { get; }

  public TItem Fetch(string cultureName)

  public TItem Current { get; }

  private TItem GetCurrent()

  private TItem Create(CultureInfo culture)
}

Here the decision about what to do when the localization for the current culture is requested is made according to the localization policy of the current localization context.

In the next post we’ll try to figure out how to deal with CRUD operations, LINQ queries and localized entities.

Stay tuned.

Monday, December 7, 2009

Localization support, part 1. Theory

DataObjects.Net 3.x, the successful predecessor of DataObjects.Net 4.x, contained tons of useful features, and the feature named “Multilingual database support” was among them. Like any other feature in the 3.x branch, it was implemented at the very core of the ORM for one reason: the product was architecturally monolithic and all modules were tightly coupled with each other.

The design of version 4.x promotes low coupling: the ORM consists of a set of separate modules which are mostly independent of each other. We are going to apply this approach to the localization feature as well.

Requirements

Let’s list requirements for the feature. What do we want from it?

  1. Simplicity and power in one place. The less a developer needs to do to add localization support to an application, the better. A declarative approach with minimal coding at the Domain modeling stage is the right one.
  2. Adequate performance. Using localization should not introduce a performance penalty.
  3. Automatic integration with Thread.CurrentCulture and/or Thread.CurrentUICulture infrastructure.
  4. Possibility to add new culture(s) at runtime (optional). The fewer changes to the database schema are required to add a new culture, the better.
  5. Transparent LINQ support.
  6. The less ORM knows about localization, the better. Ideal option is standalone add-on, made on top of the ORM.
  7. There should be a way to get not only one localization for current culture for the particular entity but a set of localizations. This might be required for application administration (translation, adding new cultures and so on).

Database schema level implementation

While there are numerous ways to implement localization support at the Domain modeling level, there are only a few ways to do it at the database schema level. Personally, I see the following options:

1. Localized columns

LocalizableColumns

Every localizable persistent field is mapped to a set of columns, one per culture. This approach was used in DataObjects.Net 3.x.

Pros:

  • No performance drawbacks. No additional queries, joins or subqueries are required.
  • Simplicity and obviousness. It is easy to edit culture-dependent columns right in database.
  • Data integrity out of the box because all culture dependent columns are stored in the same row as entity itself.
  • Possibility to configure parameters of each localizable column (length, nullability, type).

Cons:

  • ORM must know about localization in order to fetch or persist to the required set of columns.
  • Database schema alteration is required in order to add new culture to application.
  • No way to retrieve column values for cultures other than the current one.
  • It is not clear how to cache localized data in the Session-level and Domain-level caches.

2. Localized tables

LocalizableTables[1]

Localized columns are moved to separate tables, one per culture. This is analogous to the approach used in .NET application localization, where localized strings and other resources are located in separate assemblies and are loaded automatically.

Pros:

  • Data integrity is provided by foreign key constraints with the ON DELETE CASCADE option.
  • Simplicity. All culture-dependent values for the particular entity are located in corresponding table.
  • Possibility to configure parameters of each localizable column (length, nullability, type).

Cons:

  • Join operation is required in order to fetch columns for appropriate culture.
  • All the cons of the “Localized columns” approach.

3. Localized entities

LocalizedEntities

Localizable entity is split into 2 parts: common part - “Page” and localized one - “PageLocalization”. Second entity contains localizable fields and its primary key consists of 2 fields: a reference to localizable entity and string representation of CultureInfo (generally, CultureInfo.Name).

Pros:

  • ORM doesn’t know anything about localization at all.
  • Database schema alteration is not required in order to add new culture.
  • It is rather easy to fetch all translations for the particular entity with 1 query.
  • Standard ORM-level caching out of the box.
  • Data integrity is provided by the foreign keys constraints and ORM.
  • Possibility to configure parameters of each localizable column (length, nullability, type).

Cons:

  • Join operation is required to fetch a set of columns for corresponding culture.

4. Localized strings

LocalizedStrings

The weirdest one. It could only have been invented by some geek for experimental purposes. Anyway, let’s investigate it.

Pros:

  • Database schema alteration is not required in order to add new culture.
  • It is rather easy to fetch all translations for the particular entity with 1 query.
  • Data integrity is provided by the foreign keys constraints.

Cons:

  • ORM must know about localization in order to fetch or persist to the required set of columns.
  • It is not clear how to cache localized data in the Session-level and Domain-level caches.
  • Separate query is required to fetch localized data for the particular entity.
  • Parameters of each localizable column (length, nullability, type) can’t be configured separately. “Strings.Value” column is used for all localizable fields, such as strings, integer types, dates and so on, hence maximal value size should be used with no constraints.
  • It is not obvious how to handle localizable field renaming.

Evaluating these options should help us choose the most appropriate one for adding localization support to DataObjects.Net 4.1.

Which option would you choose for the ORM if it were up to you?

Wednesday, December 2, 2009

Npgsql2 provider, version 2.0.7

The time has come, the long awaited version of Npgsql2 provider is released.

You may ask me about the reason, why it was being awaited for so long? The answer is: this summer our PostgreSql support team carried out a set of investigations and proposed some critical speed-related improvements to some parts of Npgsql2 provider source code and sent the patch to PgFoundry. The patch contains 2 optimizations:

  1. Parameter handling in the NpgsqlCommand.GetClearCommandText method. The parameters were stored in an array-based collection, and a linear search was used to find the requested one. This works well for a relatively small number of parameters, but as you might know, DataObjects.Net 4 uses Über-Batching Technology™, so a single command can easily carry a large number of parameters. Using a dictionary to store and look up the requested parameter sped this method up by up to 5 times (a 400% increase).
  2. NpgsqlCommand contains a static readonly Regex field that is used to split the entire command text into chunks. We suggested adding RegexOptions.Compiled to its initialization, which might increase startup time but yields faster execution.
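
To make the two optimizations concrete, here is a minimal, self-contained sketch. It is not the Npgsql source: SketchParameter, ParameterLookupSketch and the ":name" placeholder pattern are assumptions made for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical stand-in for a command parameter; this is not the Npgsql type.
public sealed class SketchParameter
{
  public string Name { get; }
  public object Value { get; }

  public SketchParameter(string name, object value)
  {
    Name = name;
    Value = value;
  }
}

public static class ParameterLookupSketch
{
  // RegexOptions.Compiled costs more at startup but speeds up repeated matching.
  private static readonly Regex ParameterPattern =
    new Regex(@":(\w+)", RegexOptions.Compiled);

  // Linear search: every lookup walks the list, O(n) per placeholder.
  public static SketchParameter FindLinear(IList<SketchParameter> parameters, string name)
  {
    foreach (var p in parameters)
      if (p.Name == name)
        return p;
    return null;
  }

  // Builds the index once, so each later lookup is O(1) on average.
  public static Dictionary<string, SketchParameter> BuildIndex(IEnumerable<SketchParameter> parameters)
  {
    var index = new Dictionary<string, SketchParameter>(StringComparer.Ordinal);
    foreach (var p in parameters)
      index[p.Name] = p;
    return index;
  }

  // Replaces every ":name" placeholder with the matching parameter value.
  public static string Substitute(string commandText, IEnumerable<SketchParameter> parameters)
  {
    var index = BuildIndex(parameters);
    return ParameterPattern.Replace(commandText, match =>
    {
      SketchParameter p;
      return index.TryGetValue(match.Groups[1].Value, out p)
        ? Convert.ToString(p.Value)
        : match.Value; // leave unknown placeholders untouched
    });
  }
}
```

With a handful of parameters both lookups behave about the same; the dictionary pays off when a batched command carries hundreds of them.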

Both of them were accepted and merged into the main branch. So now, with the recent release of Npgsql2 provider version 2.0.7 which includes the above-mentioned changes, we can safely move to Npgsql2 provider in DataObjects.Net (Npgsql provider version 0.97 is currently used).

And last but not least, we are proud to have made such a good contribution to a well-known, successful open source project such as the Npgsql2 provider.

Thursday, November 26, 2009

Nested transactions

Hi there, did you miss me? =)

I’ve got some good news about the current development phase of DataObjects.Net.
First of all, we have started to implement nested transactions. This might have required changing the current transaction API, but we managed to keep it compatible with the previous version.

In DO 4.0.5 we wrote:

using (var tx = Transaction.Open()) {

  // your code here
  tx.Complete();
}

This meant that a transaction is required for that part of the code: DO must open one if none exists; otherwise DO should do nothing.

In DO 4.1 this code means exactly the same thing: I need a transaction, please, open one if it is absent; otherwise do nothing.

And this is how we state that a new transaction is definitely required:

using (var tx = Transaction.Open(TransactionOpenMode.New)) {

  // your code here
  tx.Complete();
}

Note the new TransactionOpenMode enum. It is introduced in DO 4.1 and has 2 options: Auto and New. Auto stands for the default behavior (I don’t care which transaction I get, just provide me with one), and New stands for a new transaction (or a nested one, if an outer transaction is already open).
Rolling back a nested transaction does not harm the outer transaction, whereas committing the outer transaction automatically commits all nested transactions.

Nested transactions are implemented for SQL-based storages with the help of the Savepoint feature. It is supported by most SQL servers, such as MS SQL Server, Oracle, PostgreSQL, etc. Moreover, the Savepoint notion is included in the SQL standard.
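
The savepoint semantics can be illustrated with a toy in-memory model. This is only a sketch under simplifying assumptions (an append-only list of rows and a hypothetical SavepointSketch class); it does not show how DataObjects.Net or any SQL server implements savepoints.

```csharp
using System;
using System.Collections.Generic;

// A toy append-only store, used only to illustrate savepoint semantics.
public sealed class SavepointSketch
{
  private readonly List<string> rows = new List<string>();

  // Each entry remembers the row count at the moment a scope was opened.
  private readonly Stack<int> savepoints = new Stack<int>();

  public IReadOnlyList<string> Rows { get { return rows; } }

  // Roughly what "SAVEPOINT name" does: remember the current state.
  public void Begin() { savepoints.Push(rows.Count); }

  public void Insert(string row) { rows.Add(row); }

  // Roughly "RELEASE SAVEPOINT": keep the changes, forget the marker.
  public void Commit() { savepoints.Pop(); }

  // Roughly "ROLLBACK TO SAVEPOINT": undo only what happened after the marker.
  public void Rollback()
  {
    int count = savepoints.Pop();
    rows.RemoveRange(count, rows.Count - count);
  }
}
```

Calling Begin, Insert("outer"), Begin, Insert("inner"), Rollback, Commit leaves only the "outer" row: the nested rollback discards the inner work without touching what the outer scope did before it.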

P.S. Please remember that this is preliminary API that might be changed in final version.

Tuesday, November 24, 2009

New personal blog

Hello there!

I’m glad to present you another personal blog from a member of DataObjects.Net team – Alex Kofman’s blog.

I bet he is going to write about interesting thoughts and facts.

Good luck to him! =)

Wednesday, November 11, 2009

DataObjects.Net goes to Ohloh.net

Here is the link to project’s page.

Join our camp! We’d be glad to see you among DataObjects.Net contributors and users.

Thanks!

Tuesday, November 3, 2009

DataObjects.Net goes to Google Code

Great news!

DataObjects.Net v4 moved to the public Mercurial-based repository. From this moment anyone can join the project and participate, and even build his own DataObjects.Net version. Why not? It is so easy. I bet that Alex is going to write a post in his blog describing how to check out the source code and build your own local copy of DataObjects.Net v4.

In the update list you might notice that the major part of the DataObjects.Net team, except for one or two developers, works mainly on the manual and manual-related tasks. This part has the highest priority for now. And this work is not hidden anymore: anyone can browse the repository, see the overall progress and read the manual even though it is not completed yet. Here is the link to the Manual folder. Check it out and begin to read. Any help, suggestion or reported mistake is highly appreciated. Thanks in advance!

BTW, since the source code is now maintained in a public repository, we can exclude it from future DataObjects.Net installers.

CodeProject

Friday, October 30, 2009

DataObjects.Net v4 manual

Finally, we have started on the long-awaited, full-fledged DataObjects.Net v4 manual.

But before we started, we spent a lot of time discussing the tools to be used for creating the manual.
Three options were suggested:

Microsoft Word approach

Use Microsoft Word for editing then export as HTML and clean up the resulting HTML from Word-ish garbage.

Advantages:
  • WYSIWYG;
  • Easy editing & formatting;
  • Proofing;
  • Inserts images;
  • Inserts highlighted source code from Visual Studio.
Disadvantages:
  • You must run “Export to HTML” & make a cleanup every time you want to view page in browser;
  • Almost no version control;
  • Difficult to make references between resulting HTML pages;
  • Difficult to manage inserted images.

Wiki approach

Use Wiki to build the whole manual then grab its content and convert into local HTML pages.

Advantages:
  • No need for local editors, just your browser and you;
  • Simple version control;
  • Supports highlighting of source code fragments.
Disadvantages:
  • Grabbing & converting wiki into manual is not a trivial task;
  • No proofing (proofing can be provided by some browsers while writing text in textarea control);
  • Wiki markup (you are to know it to format your text).

Plain HTML approach

Advantages:
  • There are a lot of WYSIWYG HTML editors that support Microsoft Word’s functionality (formatting, proofing);
  • Clean HTML code (you control it);
  • You see the resulting HTML at any moment (no need for grabbing or conversion);
  • You control where and how your images are placed and named;
  • Making references between pages is simple;
  • Full version control support as HTML file is a text file.
Disadvantages:
  • Good enough HTML editor must be found;
  • No support for inserting highlighted source code fragments from Visual Studio.

 

After intense discussions we decided to use the Plain HTML approach. At first we used SharePoint Designer 2007 as the HTML editor, then moved to Visual Studio. At last we found Microsoft Expression Web 3, which is actually built on top of SharePoint Designer. It is faster than VS and more convenient. I like it except for its black theme, which is too dark for me.

Microsoft Expression Web 3

One more useful tool that we found is “Copy As HTML” add-in for Visual Studio. It allows you to copy source code from the Code Window and convert it into HTML while preserving syntax highlighting, indentation and background color.

Copy As HTML

 

And what about you? What tools do you use to write help or manuals? Are there any other, more powerful tools or more convenient ways to do this?

Thanks in advance for your ideas.


Tuesday, October 27, 2009

Arbitrary keys & hierarchies. Complete reference set

Here are references to all parts of the “Arbitrary keys & hierarchies” series gathered in one place:

  1. Introduction
  2. Hierarchies
  3. Evolution of Key
  4. Working with keys
  5. Key providers
  6. Identity fields
  7. Custom key generators

Thanks for your interest in DataObjects.Net v4.


Monday, October 26, 2009

Arbitrary keys & hierarchies, part 7. Custom key generators

DataObjects.Net v4 supports a wide variety of keys but provides default key generators only for keys with exactly one identity field, so there can be scenarios in which a custom key generator is required. To close the gap, DataObjects.Net v4 declares the following abstract class:

public abstract class KeyGenerator
{
  public KeyProviderInfo KeyProviderInfo { get; private set; }

  public abstract Tuple Next();

  public virtual void Initialize() {}

  protected KeyGenerator(KeyProviderInfo keyProviderInfo) {}
}

Custom generator type must inherit KeyGenerator type and implement at least the abstract method KeyGenerator.Next(). All necessary information concerning the structure of key, caching behavior and so on can be found in KeyProviderInfo type. Here is the custom implementation of key generator for Guid type:

public sealed class MyGuidKeyGenerator : KeyGenerator
{
  public override Tuple Next()
  {
    return Tuple.Create(KeyProviderInfo.TupleDescriptor, Guid.NewGuid());
  }

  public MyGuidKeyGenerator(KeyProviderInfo keyProviderInfo)
    : base(keyProviderInfo)
  {}
}

In order to indicate that a hierarchy must be served by a custom key generator, KeyGeneratorAttribute was introduced. Here is how it is intended to be used:

[HierarchyRoot]
[KeyGenerator(typeof(MyGuidKeyGenerator))]
public class Author : Entity
{
  [Field, Key]
  public Guid Id { get; private set; }

  [Field]
  public EntitySet<Book> Books { get; private set; }
}

In scenarios when key generator for a hierarchy is not required at all, this must be set up appropriately:

[HierarchyRoot]
[KeyGenerator(KeyGeneratorKind.None)]
[TableMapping("Metadata.Type")]
[Index("Name", Unique = true)]
public class Type : MetadataBase
{
  [Field, Key]
  public int Id { get; private set; }

  [Field(Length = 1000)]
  public string Name { get; set; }

  public Type(int id, string name) 
    : base(id)
  {
    Name = name;
  }
}

This is how the system class Metadata.Type is declared in DataObjects.Net v4. Pay attention to the KeyGeneratorAttribute usage together with the absence of a parameterless constructor. This means that identity field values are provided from the outside and there is actually no need for a key generator.

This is the last post in the “Arbitrary keys & hierarchies” series. I hope you’ll find it useful.

Part 6. Identity fields

P.S.
If you want me to blog on some particular topic concerning DataObjects.Net v4 domain – make a request in comments.


Arbitrary keys & hierarchies, part 6. Identity fields

DataObjects.Net v4 supports the following .Net types to be used in identity fields:

  • Boolean;
  • Byte, SByte, Int16, UInt16, Int32, UInt32, Int64, UInt64;
  • String, Char;
  • Double, Single, Decimal;
  • Guid, DateTime, TimeSpan;
  • Reference to Entity (is stored as Key)

Usage of the Structure and EntitySet<T> types is not allowed; however, the restriction on Structure may be lifted in future versions of DataObjects.Net.

Identity fields are set once for the whole persistent hierarchy, so all descendants of the hierarchy root share the same Key structure. To set up the structure of Key, the KeyAttribute class should be used:

[HierarchyRoot]
public class Animal : Entity
{
  [Field, Key]
  public int ID { get; private set; }

  [Field]
  public string Name { get; set; }
}

The Key attribute must be placed on each identity field. Also note that identity fields must be immutable: an identity field is not allowed to have a public, protected or internal setter.

For persistent types with complex keys (which have more than one identity field), an explicit identity field order is strongly recommended, because the order in which .NET reflection returns a type's properties is not guaranteed at all.

[HierarchyRoot]
public class BookReview : Entity
{
  [Field, Key(1)]
  public Person Reviewer { get; private set; }

  [Field, Key(0)]
  public Book Book { get; private set; }

  [Field(Length = 4096)]
  public string Text { get; set; }

  public BookReview(Book book, Person reviewer)
    : base(book, reviewer)
  {}
}

Pay attention to the KeyAttribute usage (the Key(0) and Key(1) lines). In this example it is also used to set the position of each identity field within the complex key. So for the BookReview type the structure of Key is {Book, Person}. Also note that values for both identity fields are required in the BookReview constructor and are passed to the base constructor of Entity, where the key is constructed.
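
Why an explicit position helps can be shown with a small reflection sketch. KeyPositionAttribute, SketchReview and KeyOrderSketch are hypothetical names invented for this example; DataObjects.Net's own KeyAttribute handling may work differently.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical attribute for this sketch; it is not DataObjects.Net's KeyAttribute.
[AttributeUsage(AttributeTargets.Property)]
public sealed class KeyPositionAttribute : Attribute
{
  public int Position { get; }

  public KeyPositionAttribute(int position) { Position = position; }
}

// Plain class standing in for a persistent type with a two-field key.
public class SketchReview
{
  [KeyPosition(1)] public string Reviewer { get; set; }
  [KeyPosition(0)] public string Book { get; set; }
  public string Text { get; set; }
}

public static class KeyOrderSketch
{
  // Returns key property names ordered by the declared position,
  // independent of the order in which reflection enumerates them.
  public static string[] GetKeyPropertyNames(Type type)
  {
    return type
      .GetProperties(BindingFlags.Public | BindingFlags.Instance)
      .Select(p => new { Property = p, Key = p.GetCustomAttribute<KeyPositionAttribute>() })
      .Where(x => x.Key != null)
      .OrderBy(x => x.Key.Position)
      .Select(x => x.Property.Name)
      .ToArray();
  }
}
```

For SketchReview the result is Book followed by Reviewer, even though Reviewer is declared first in the class.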

Part 5. Key providers, Part 7. Custom key generators


Friday, October 23, 2009

Arbitrary keys & hierarchies, part 5. Key providers

In the previous posts we discussed the evolution of Key, its structure and the ways to work with keys. Now it’s time to answer the question: where do keys come from? But first, let’s look at the scenarios where we need new keys.

As you know, an Entity instance is uniquely identified by its key, and the Entity.Key property is immutable during the whole lifecycle of the Entity. The only moment when the Entity.Key property can be set is the moment of Entity construction. There are two scenarios, and DataObjects.Net v4 supports both of them:

1. Values of identity fields are provided by outer code (not ORM but application is responsible for this). In this scenario user code is responsible for passing identity values directly to Entity constructor where these values are automatically transformed into key.

// Use this constructor for types with explicitly set identity values
protected Entity(params object[] values)

Example:

[HierarchyRoot]
public class Book : Entity
{
  [Field, Key]
  public string ISBN { get; private set; }

  // Accepts identity value (ISBN) and passes it to the base Entity constructor
  public Book(string isbn)
    : base(isbn) { }
}

2. Values of identity fields are provided by ORM (or with the help of ORM). This scenario might involve usage of database identity generators, tables with auto-increment column as well as any custom identity generators, e.g. Guid generator in order to get identity values for an Entity and build its key.

// Use this constructor for types with auto-generated identity values
protected Entity()

Example:

[HierarchyRoot]
public class Book : Entity
{
  [Field, Key]
  public int Id { get; private set; }

  // Nothing to be done here as base empty Entity constructor will be called automatically
  public Book() { }
}

While the first scenario is clear and doesn’t need any ORM participation at all (except creating key from values passed to constructor), the second one is not so clear because it requires ORM to know how to obtain the next unique identity values for the specified persistent type. To solve this task the concept of key providers was introduced in DataObjects.Net v4. KeyProviderInfo is a type from Domain model which contains all necessary information concerning generation of unique keys for particular persistent type(s), including:

  • type of identity generator implementation;
  • structure of key (key columns and descriptor of tuple);
  • size of key cache, if generator supports caching;
  • mapping name, if generator is mapped to a particular database entity;
  • etc.

DataObjects.Net v4 contains several key generator implementations both for In-Memory & SQL storages out of the box. They are called default key generators. They support caching and can be used for persistent types with exactly one identity field. This means that for persistent types with complex key (more than one identity field) custom key generator must be provided by user. In order to increase key generator performance and to simplify persistent interfaces support we made the following architectural decision:

We try to share single instance of every key generator type between all hierarchies it can serve.

All hierarchies which have identical key structure and identical key generator type, e.g. typeof(MyKeyGenerator), are served by the same instance of MyKeyGenerator. In other words, we create only one instance of each key generator type, registered in Domain. For example:

[HierarchyRoot]
public class Book : Entity
{
  [Field, Key]
  public int Id { get; private set; }

  public Book() { }
}

[HierarchyRoot]
public class Author : Entity
{
  [Field, Key]
  public int Id { get; private set; }

  public Author() { }
}

Here we have 2 hierarchies with the default key generator (the default key generator is used unless a custom key generator is explicitly defined via KeyGeneratorAttribute). As both hierarchies have an identical key structure (one field of type int) and the same type of key generator, the default one, only 1 key generator (the Int32 generator, actually) will be created, and it will serve both hierarchies at once. This means that neither the sequence of Book identifiers nor the sequence of Author identifiers will be strictly sequential. Say, one can get 1,2,3,6 for Book identifiers and 4,5,7,8,9 for Author identifiers. Nevertheless, all keys produced by this key generator are unique and can be used without any restrictions in any number of hierarchies that this key generator type can serve.
The only “negative effect” of this scheme is the presence of gaps in the identifier sequence of a concrete persistent type. But does it matter? Such gaps are common anyway, as objects don’t live forever; sometimes they are removed.
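
The sharing rule can be sketched in a few lines. SharedIntGeneratorSketch and GeneratorRegistrySketch are hypothetical types made up for this illustration, and describing a key structure by its field type names is an assumption; the real generators also handle caching and storage access.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; not the DataObjects.Net implementation.
public sealed class SharedIntGeneratorSketch
{
  private int current;

  // Returns the next value of the single shared sequence.
  public int Next() { return ++current; }
}

public sealed class GeneratorRegistrySketch
{
  private readonly Dictionary<string, SharedIntGeneratorSketch> generators =
    new Dictionary<string, SharedIntGeneratorSketch>();

  // Hierarchies with the same key structure receive the same generator instance.
  public SharedIntGeneratorSketch GetFor(params Type[] keyFieldTypes)
  {
    string signature = string.Join(",", Array.ConvertAll(keyFieldTypes, t => t.FullName));
    SharedIntGeneratorSketch generator;
    if (!generators.TryGetValue(signature, out generator))
    {
      generator = new SharedIntGeneratorSketch();
      generators.Add(signature, generator);
    }
    return generator;
  }
}
```

If Book and Author both ask the registry for a generator of a single int field, they get the same instance. A Book created first receives 1, an Author created next receives 2, and the following Book receives 3, which is exactly where the gaps in each per-type sequence come from.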

In the next post I’ll tell you about complex keys and custom key generators and their future.

Part 4. Working with keys, Part 6. Identity fields


Wednesday, October 21, 2009

Arbitrary keys & hierarchies, part 4. Working with keys

In the previous post I described the structure of Key. Now it’s time to discuss the ways of working with keys.

Obtaining a key

A key can be obtained from an Entity through the Entity.Key property, even if the Entity instance is marked for removal (Entity.IsRemoved == true).

Creating a key

There are 2 common scenarios in which an instance of Key needs to be created:

1. A new Entity instance is created via its constructor. In this scenario we need to get the next unique key in the corresponding key sequence for the specified persistent type. This can be done with this group of static methods:

// Active session is required. It is used to get access to Domain object.
Key.Create<Dog>();
Key.Create(typeof (Dog));

// Active session is not required as Domain is passed as argument.
Key.Create<Dog>(Domain);
Key.Create(Domain, typeof (Dog));

Note that methods don’t receive any arguments except Domain & Type. This means that we ask to generate the next unique key in key sequence.

2. An existing Entity must be referenced, for example to fetch it from storage, so we need a key that identifies it. In this case we need to construct a key with already known value(s).

// Active session is required
Key.Create<Dog>(params object[] values);
Key.Create(typeof (Dog), params object[] values);

// Active session is not required
Key.Create<Dog>(Domain, params object[] values);
Key.Create(Domain, typeof (Dog), params object[] values);

If we want to build a key for an instance of the Dog class with an identifier equal to 25, we write something like this:

var key = Key.Create<Dog>(25);

Or we can use one of three other overloads.

Another group of methods for building a key accepts instance of Tuple.

// Active session is required
Key.Create<Dog>(Tuple value);
Key.Create(typeof (Dog), Tuple value);

// Active session is not required
Key.Create<Dog>(Domain, Tuple value);
Key.Create(Domain, typeof (Dog), Tuple value);

Accordingly, to build a key from a Tuple for an instance of the Dog class with an identifier equal to 25, you write:

var value = Tuple.Create(25);
var key = Key.Create<Dog>(value);

As you see, there is no public constructor in Key class. The reason for this and the usage of Factory method pattern instead of constructors is that we have several Key implementations, 4 of them are designed and extremely optimized for short keys (Key<T1>, Key<T1,T2>, Key<T1,T2,T3>, Key<T1,T2,T3,T4>) and one is for keys with unrestricted length (LongKey).
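
The factory-method idea can be sketched as follows. SketchKey and its nested classes are hypothetical and untyped; the real Key<T1>, Key<T1,T2> and LongKey classes are generic and far more optimized, so treat this only as an illustration of why a static Create method is used instead of a public constructor.

```csharp
using System;

// Hypothetical sketch of the factory-method idea; names are not DataObjects.Net's.
public abstract class SketchKey
{
  // Constructors are not public, so callers must go through Create.
  protected SketchKey() {}

  public abstract int Length { get; }
  public abstract object GetValue(int index);

  // Picks a compact implementation for short keys and a general one otherwise.
  public static SketchKey Create(params object[] values)
  {
    if (values == null || values.Length == 0)
      throw new ArgumentException("At least one value is required.", "values");
    switch (values.Length)
    {
      case 1: return new ShortKey1(values[0]);
      case 2: return new ShortKey2(values[0], values[1]);
      default: return new LongSketchKey(values);
    }
  }

  // Stores a single value in a field; no array is allocated.
  private sealed class ShortKey1 : SketchKey
  {
    private readonly object v0;
    public ShortKey1(object v0) { this.v0 = v0; }
    public override int Length { get { return 1; } }
    public override object GetValue(int index)
    {
      if (index != 0) throw new ArgumentOutOfRangeException("index");
      return v0;
    }
  }

  // Stores two values in fields; no array is allocated.
  private sealed class ShortKey2 : SketchKey
  {
    private readonly object v0, v1;
    public ShortKey2(object v0, object v1) { this.v0 = v0; this.v1 = v1; }
    public override int Length { get { return 2; } }
    public override object GetValue(int index)
    {
      if (index == 0) return v0;
      if (index == 1) return v1;
      throw new ArgumentOutOfRangeException("index");
    }
  }

  // General case: keeps a private copy of the values in an array.
  private sealed class LongSketchKey : SketchKey
  {
    private readonly object[] values;
    public LongSketchKey(object[] values) { this.values = (object[]) values.Clone(); }
    public override int Length { get { return values.Length; } }
    public override object GetValue(int index) { return values[index]; }
  }
}
```

Because the caller only ever sees the abstract SketchKey type, the set of specialized implementations can change later without breaking calling code.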

KeyHierarchy

Serializing & deserializing a key

Key can be easily serialized into the corresponding string representation using Key.Format() and deserialized from the string using Key.Parse method.

For example:

var key = Key.Create<Dog>(25);
var str = key.Format();
Console.WriteLine(str);
// This will print: "103:25"
// where 103 is identifier of Dog type in Domain.Model
// and 25 is the value of identifier field.

var key2 = Key.Parse(Domain, str);
Assert.AreEqual(key, key2);
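
For readers who want to see the round trip without a Domain at hand, here is a standalone sketch of the "typeId:value" idea. SketchKeyRef is a hypothetical type limited to a single int value; the format that Key.Format() actually produces for complex keys is not described in this post and is not modeled here.

```csharp
using System;
using System.Globalization;

// Hypothetical sketch of a "typeId:value" textual key; not the real Key class.
public struct SketchKeyRef : IEquatable<SketchKeyRef>
{
  public int TypeId { get; }
  public int Value { get; }

  public SketchKeyRef(int typeId, int value)
  {
    TypeId = typeId;
    Value = value;
  }

  // Produces e.g. "103:25"; the invariant culture keeps the text stable across locales.
  public string Format()
  {
    return TypeId.ToString(CultureInfo.InvariantCulture) + ":" +
           Value.ToString(CultureInfo.InvariantCulture);
  }

  // Reverses Format; throws FormatException on malformed input.
  public static SketchKeyRef Parse(string source)
  {
    if (source == null) throw new ArgumentNullException("source");
    int separator = source.IndexOf(':');
    if (separator < 0) throw new FormatException("Expected 'typeId:value'.");
    int typeId = int.Parse(source.Substring(0, separator), CultureInfo.InvariantCulture);
    int value = int.Parse(source.Substring(separator + 1), CultureInfo.InvariantCulture);
    return new SketchKeyRef(typeId, value);
  }

  public bool Equals(SketchKeyRef other)
  {
    return TypeId == other.TypeId && Value == other.Value;
  }

  public override bool Equals(object obj)
  {
    return obj is SketchKeyRef && Equals((SketchKeyRef) obj);
  }

  public override int GetHashCode()
  {
    unchecked { return (TypeId * 397) ^ Value; }
  }
}
```

The type identifier travels with the value, so the parsed key identifies both the persistent type and the entity.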

That’s all about working with keys. In the next posts I’ll describe key providers and key mappings. Stay tuned.

Part 3. Evolution of Key, Part 5. Key providers


Monday, October 19, 2009

Arbitrary keys & hierarchies, part 3. Evolution of Key

As you might remember, the Int64 type was used as a globally unique identifier in DataObjects.Net v3. What were the reasons for changing this working approach? The main reason was the strong requirement to support the bottom-up development model. When a customer already has a database and wants to use an ORM to work with it, we just can’t say: “Hey, man, change all primary keys in all your tables to bigint (the analogue of Int64 in Microsoft SQL Server) in order to use our super-duper ORM!”
So if we want to support existing databases, we must support all kinds of primary keys that can potentially be implemented there.

How could we do it?

First of all, let’s enumerate possible logical structures of primary key:

  • single value key
  • multiple value key

Also note that from physical point of view primary key can contain field(s) of various types: int, long, Guid, string and so on. If so, we need some structure that can hold one or more fields of various types: something like List<object> or object[].

Taking into consideration primary key immutability, we must also make Key immutable. So some kind of ReadOnlyList<object> wrapper must be applied on top of initial List<object>.

Thus, at this moment Key class will contain List<object> and implement some interface to expose values in read-only manner.

[Serializable]
public class Key
{
  private List<object> values;

  // Safe way to expose values 
  public object[] GetValues()
  {
    return values.ToArray();
  }

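  public Key(params object[] values)
  {
    // Copy the values so that later changes to the caller's array don't affect the key
    this.values = new List<object>(values);
  }
}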

Going further.

Should two instances of Key with equal field structure and equal values be considered as equal or not? It seems that yes, they must be equal. Then in order to meet the requirement, we must override GetHashCode & Equals methods (where we are to provide field-by-field value equality check) as well as implement IEquatable<Key> interface.

[Serializable]
public class Key : IEquatable<Key>
{
  ...
  // Equality support methods 
  public bool Equals(Key other)
  public override bool Equals(object obj)
  public static bool operator ==(Key left, Key right)
  public static bool operator !=(Key left, Key right)
  public override int GetHashCode()

  ...
}

The next step is an attempt to implement an Identity map pattern, which will be responsible for correspondence between Keys and Entities. The implementation will be simple and straightforward: we’ll use Dictionary<Key, Entity> for it.

Imagine the following domain model:

We have 2 persistent types here: Dog & Cat. The structure of their identity fields is identical: one field of type int. If both classes are mapped to separate tables with a simple autoincrement identity field, then it is highly probable that identity field values from the Dog & Cat tables will coincide. Say, we could have the following keys: dogKey = new Key(25) for a Dog instance and catKey = new Key(25) for a Cat instance, where 25 is the value of the identity field.

var identityMap = new Dictionary<Key, Entity>();

var dogKey = new Key(25);
var catKey = new Key(25);

var dog = Query<Dog>.Single(dogKey);
identityMap[dogKey] = dog;

Assert.IsNotNull(identityMap[dogKey]); // Passes
Assert.IsNull(identityMap[catKey]);    // Fails: catKey equals dogKey, so the Dog instance is returned

Pay attention to the last line. Although we didn’t add any Cat instance to the identity map, it says that it has one, either for dogKey or catKey. The reason is that both keys are considered as equal. So the problem is that keys with equal values for the same type must be equal, but for other types mustn’t be. In order to solve the problem we must distinguish keys made for different types, i.e. inject some type-dependent information into Key and take this into account in Equals & GetHashCode methods implementation. The most evident approach is to add property of Type type.

[Serializable]
public class Key : IEquatable<Key>
{
  ...
  public Type Type { get; private set; }

  public Key(Type type, params object[] values)
}

Now we’ve got good chances to build a fully functional identity map.
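
A compact, self-contained sketch of such a type-aware key is shown below. TypedKeySketch, SketchDog and SketchCat are hypothetical names, and the key holds a single int for brevity; the point is only that the type takes part in both Equals and GetHashCode.

```csharp
using System;
using System.Collections.Generic;

// Placeholder entity types for the sketch.
public class SketchDog {}
public class SketchCat {}

// Hypothetical key that includes the entity type in its identity.
public sealed class TypedKeySketch : IEquatable<TypedKeySketch>
{
  public Type EntityType { get; }
  public int Value { get; }

  public TypedKeySketch(Type entityType, int value)
  {
    if (entityType == null) throw new ArgumentNullException("entityType");
    EntityType = entityType;
    Value = value;
  }

  // Keys are equal only if both the type and the value match.
  public bool Equals(TypedKeySketch other)
  {
    return !ReferenceEquals(other, null)
      && EntityType == other.EntityType
      && Value == other.Value;
  }

  public override bool Equals(object obj)
  {
    return Equals(obj as TypedKeySketch);
  }

  // Must be consistent with Equals, so the type is mixed into the hash.
  public override int GetHashCode()
  {
    unchecked { return (EntityType.GetHashCode() * 397) ^ Value; }
  }
}

public static class IdentityMapSketch
{
  // Builds a map that contains only a dog with identifier 25.
  public static Dictionary<TypedKeySketch, object> CreateWithDog()
  {
    var map = new Dictionary<TypedKeySketch, object>();
    map[new TypedKeySketch(typeof(SketchDog), 25)] = new SketchDog();
    return map;
  }
}
```

With this key, looking up a cat with identifier 25 in the map returned by CreateWithDog finds nothing, while a freshly created dog key with the same value still finds the stored dog.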

So what’s next? Let’s look at the result and calculate the cost of Key usage.

  • Every Key instance has a reference to a List<object> where the values of identity fields are stored. Identity field values are stored as objects, so creating a Key with 1 identity field creates 2 objects in the managed heap: the first is the List<object> (the container of identity field values), and the second is the boxed identity field value. More identity fields in a Key => more small objects in the heap. Not good.
    Fortunately, we can easily replace the inefficient List<object> with a quite efficient and compact Tuple implementation with typed access to fields (and yes, Xtensive made its own Tuple implementation. I’m going to describe it in detail in a separate post).
    This approach is highly scalable and can be used in a wide variety of scenarios: from the widespread scenario with only one identity field (of type int, long, Guid, etc.) to rare ones with a group of identity fields (we call such keys complex keys).
  • Every Key instance has a reference to a Type object. But before DataObjects.Net v4 can do anything with a Key instance, it must get the corresponding TypeInfo object for it. TypeInfo is a class from the Xtensive.Storage.Model namespace that fully describes a persistent type in a DataObjects.Net v4 domain. Resolving the TypeInfo for a Type costs 1 lookup in a Dictionary<Type, TypeInfo>. In order to avoid these lookups (as they are going to be rather frequent), we decided to put a reference to TypeInfo directly into Key instead of a reference to Type.

Here is how Key class is really designed:

public class Key : IEquatable<Key>
{
  public Tuple Value { get; }
  public TypeInfo Type { get; }

  // Instance methods 
  public bool Equals(Key other)
  public override bool Equals(object obj)
  public static bool operator ==(Key left, Key right)
  public static bool operator !=(Key left, Key right)
  public override int GetHashCode()
  public override string ToString()

  // Static methods 
  public static Key Parse(string source) + 1 overload
  public static Key Create() + many overloads
}

In the next posts I’m going to describe other topics related to Key: the absence of public constructor, serialization, patterns of creations, key generators (key providers), identity fields mappings, other speed and resource utilization improvements.

Stay tuned.

Part 2. Hierarchies, Part 4. Working with keys
