Saturday, November 15, 2014

Spring Data, Mongo, and Lazy Mappers

In a previous post, I mentioned two things that every developer should do when using Spring Data to access a MongoDB datastore. Specifically, you should be sure to annotate all of your persistent entities with @Document(collection="<custom-collection-name>") and @TypeAlias("<custom-type>"). This decouples your Mongo document data from your specific Java class names (which Spring Data will otherwise closely couple by default), making things like refactoring possible.

With my particular application, however, I ran into an additional problem. Let me recap. My application is a drawing application of sorts. Drawings are modeled by, well, a Drawing class. A Drawing can contain multiple instances of Page, and within each Page, multiple Shape objects. Shape, in turn, is an abstract class with a number of subclasses (Circle, Star, Rectangle, etc).

For our purposes, let's focus on the relationship between a Page and its Shapes. Here's a snippet from the Page class:

@Document(collection="page")
@TypeAlias("myapp.page")
public class Page extends BaseDocument {

    @Id 
    private String id;

    @Indexed
    private String drawingId;

    private List<Shape> shapes = new ArrayList<Shape>();

    // ....

}

First note that I've annotated this class so that I have control over the name of the collection that stores Page documents (in this case, "page"), and so that Spring Data will store an alias to the Page class (in this case, "myapp.page") along with the persisted Page documents, rather than storing the fully-qualified class name.

Also of importance here is that the Page class knows nothing about any specific Shape subclasses. This is important from an OO perspective, of course; I should be able to add any number of Shapes to my app's ecosystem, and the Page class should continue to work with no modifications.

Now let's look at my Shape class:

public abstract class Shape extends BaseDocument {

    @Id
    private String id;

    @Indexed
    private String pageId;

    // attributes
    private int x;

    private int y;

    // ...

}

Nothing surprising here. Note that Shape has neither the @Document nor the @TypeAlias annotation; that's because no concrete instance of Shape will be persisted along with any Pages. It is abstract, after all. Instead, a Page will contain instances of Shape subclasses. Let's take a look at one such subclass:

@Document(collection="shape")
@TypeAlias("myapp.shape.star")
public class Star extends Shape {

    private int numPoints;
    private float innerRadius;

    private float outerRadius;

}

The @Document(collection="shape") annotation is currently unused, because per my app design, any Shape subclass instance will always be stored as a nested collection within a Page. But it would certainly be possible to store different shapes directly into a specific collection.

The @TypeAlias annotation, however, is very important. It tells Spring Data how to map the different Shapes that it finds within a Page back into the appropriate class. If a Page containing a nine-point star is persisted, then when it's read back in, that star had better be mapped back into a Star class, not a simple Shape class. After all, Shape itself knows nothing about number of points!

Feeling pretty happy with myself, I tried out my code. Upon trying to read my drawings back in, I began getting errors of this type:

org.springframework.data.mapping.model.MappingInstantiationException: Could not instantiate bean class [com.myapp.documents.Shape]: Is it an abstract class?; nested exception is java.lang.InstantiationException

Indeed, Shape is an abstract class, and so indeed, it cannot be directly instantiated. But why was Spring Data trying to directly instantiate a Shape? I played around, tweaked a few things, but nothing fundamentally changed. I checked StackOverflow and the Spring forums. Nothing. So it was time to dig into the documentation.
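The InstantiationException itself is plain Java behaviour and can be reproduced without Spring. Here is a minimal sketch, using hypothetical stand-in classes (AbstractShape and ConcreteStar are not the post's real classes), of what happens when a generic mapper is left with only an abstract type to instantiate reflectively:

```java
// Hypothetical stand-ins for the post's Shape and Star, reduced to the minimum.
abstract class AbstractShape {
    int x;
    int y;
}

class ConcreteStar extends AbstractShape {
    int numPoints;
}

class InstantiationDemo {

    // Tries to create an instance through the no-arg constructor, roughly the
    // way a generic object mapper would, and reports the outcome as a string.
    static String tryInstantiate(Class<?> type) {
        try {
            type.getDeclaredConstructor().newInstance();
            return "ok";
        } catch (InstantiationException e) {
            // Thrown when the class is abstract.
            return "InstantiationException";
        } catch (ReflectiveOperationException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryInstantiate(AbstractShape.class)); // InstantiationException
        System.out.println(tryInstantiate(ConcreteStar.class));  // ok
    }
}
```

So the error says less about Shape than about the mapper: by the time it instantiated anything, it had not resolved the stored document to a concrete subclass.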

As with most typical Spring Data/Mongo apps, mine was configured to use a bean of type org.springframework.data.mongodb.core.convert.DefaultMongoTypeMapper to map persistence documents to and from Java classes:

    <bean id="mongoTypeMapper" class="org.springframework.data.mongodb.core.convert.DefaultMongoTypeMapper">
        <constructor-arg name="typeKey" value="_alias"></constructor-arg>
    </bean>

    <bean id="mappingMongoConverter"
          class="org.springframework.data.mongodb.core.convert.MappingMongoConverter">
        <constructor-arg ref="mongoDbFactory" />
        <constructor-arg ref="mappingContext" />
        <property name="typeMapper" ref="mongoTypeMapper"/>
    </bean>

    <bean id="mongoTemplate" class="org.springframework.data.mongodb.core.MongoTemplate">
        <constructor-arg ref="mongoDbFactory" />
        <constructor-arg ref="mappingMongoConverter" />
    </bean>

The docs indicated that DefaultMongoTypeMapper was responsible for reading and writing the type information stored with persistent data. By default, this would be a _class property pointing to com.myapp.documents.Star; with my customizations it became an _alias property pointing to myapp.shape.star. But if DefaultMongoTypeMapper wouldn't do the trick, perhaps I needed another mapper.
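To make the role of the type key concrete, here is a simplified model of the read path. It is a sketch, not Spring Data's actual code: TypeKeyReader, ShapeBase and StarShape are hypothetical names, and the fallback to the declared type is an assumption based on what the error message suggests.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for the post's Shape and Star classes.
abstract class ShapeBase { }

class StarShape extends ShapeBase { }

// Simplified model of a type mapper's read path; not Spring Data's real code.
class TypeKeyReader {

    private final String typeKey;
    private final Map<String, Class<?>> aliasToClass = new HashMap<>();

    TypeKeyReader(String typeKey) {
        this.typeKey = typeKey;
    }

    void register(String alias, Class<?> type) {
        aliasToClass.put(alias, type);
    }

    // Returns the class registered for the alias stored under typeKey, or the
    // declared type when the alias is missing or unknown (assumed fallback).
    Class<?> resolve(Map<String, Object> document, Class<?> declaredType) {
        Class<?> mapped = aliasToClass.get(document.get(typeKey));
        return mapped != null ? mapped : declaredType;
    }
}

class TypeKeyReaderDemo {
    public static void main(String[] args) {
        Map<String, Object> stored = new HashMap<>();
        stored.put("_alias", "myapp.shape.star");
        stored.put("numPoints", 9);

        TypeKeyReader reader = new TypeKeyReader("_alias");
        // Alias unknown: only the declared (abstract) type is left.
        System.out.println(reader.resolve(stored, ShapeBase.class).getSimpleName()); // ShapeBase

        reader.register("myapp.shape.star", StarShape.class);
        System.out.println(reader.resolve(stored, ShapeBase.class).getSimpleName()); // StarShape
    }
}
```

In this model the stored alias is only useful if something has already told the reader which class it stands for.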

Looking through the documentation, I found org.springframework.data.convert.MappingContextTypeInformationMapper. Here's what its Javadocs indicated:

"TypeInformationMapper implementation that can be either set up using a MappingContext or manually set up Map of String aliases to types. If a MappingContext is used the Map will be build inspecting the PersistentEntity instances for type alias information."

That seemed promising. If I could replace my DefaultMongoTypeMapper with a MappingContextTypeInformationMapper that could scan my persistent entities and build a type-to-alias mapping, then that should solve my problem. The docs also said something about manually creating a Map, but a) it wasn't readily apparent how to create a Map myself, and b) I didn't like that approach; I didn't want to have to manually configure an entry for any new Shape that might be created.

One problem. You'll notice above that my DefaultMongoTypeMapper is wired into my MappingMongoConverter by way of the latter's typeMapper property. That property is of type MongoTypeMapper, and while DefaultMongoTypeMapper implements MongoTypeMapper, MappingContextTypeInformationMapper does not. Fortunately, DefaultMongoTypeMapper allows you to chain together fallback mappers by way of an internal property, mappers, which itself is a List<? extends TypeInformationMapper>. And as luck would have it, MappingContextTypeInformationMapper implements TypeInformationMapper.
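The chaining idea can be sketched in plain Java. The names below (AliasResolver, ChainedAliasResolver, ClassNameResolver, MapResolver, ChainStar) are hypothetical and only loosely modelled on TypeInformationMapper; the assumption is that the chain asks each mapper in order and takes the first non-null answer.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified analogue of a type information mapper.
interface AliasResolver {
    // Returns the class for the alias, or null if this resolver does not know it.
    Class<?> resolve(Object alias);
}

// Treats the alias as a fully-qualified class name (the "_class" style).
class ClassNameResolver implements AliasResolver {
    public Class<?> resolve(Object alias) {
        if (!(alias instanceof String)) {
            return null;
        }
        try {
            return Class.forName((String) alias);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }
}

// Looks the alias up in a preconfigured map.
class MapResolver implements AliasResolver {
    private final Map<String, Class<?>> aliasToClass;

    MapResolver(Map<String, Class<?>> aliasToClass) {
        this.aliasToClass = aliasToClass;
    }

    public Class<?> resolve(Object alias) {
        return aliasToClass.get(alias);
    }
}

// Asks each delegate in order; the first non-null answer wins.
class ChainedAliasResolver implements AliasResolver {
    private final List<? extends AliasResolver> delegates;

    ChainedAliasResolver(List<? extends AliasResolver> delegates) {
        this.delegates = delegates;
    }

    public Class<?> resolve(Object alias) {
        for (AliasResolver delegate : delegates) {
            Class<?> type = delegate.resolve(alias);
            if (type != null) {
                return type;
            }
        }
        return null;
    }
}

// Hypothetical stand-in for the post's Star class.
class ChainStar { }

class ChainDemo {
    public static void main(String[] args) {
        Map<String, Class<?>> aliases = new HashMap<>();
        aliases.put("myapp.shape.star", ChainStar.class);
        AliasResolver chain = new ChainedAliasResolver(
                Arrays.asList(new ClassNameResolver(), new MapResolver(aliases)));

        System.out.println(chain.resolve("java.lang.String"));   // class java.lang.String
        System.out.println(chain.resolve("myapp.shape.star"));   // class ChainStar
        System.out.println(chain.resolve("myapp.shape.circle")); // null
    }
}
```

The design point is that a new mapper can be added to the list without replacing the one that already handles the common cases.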

So I would keep my DefaultMongoTypeMapper, and add a MappingContextTypeInformationMapper to its mappers list. I modified my Spring XML config like so:

    <bean id="mongoTypeMapper" class="org.springframework.data.mongodb.core.convert.DefaultMongoTypeMapper">
        <constructor-arg name="typeKey" value="_alias"></constructor-arg>
        <constructor-arg name="mappers">
            <list>
                <ref bean="mappingContextTypeMapper" />
            </list>
        </constructor-arg>
    </bean>

    <bean id="mappingContextTypeMapper" class="org.springframework.data.convert.MappingContextTypeInformationMapper">
        <constructor-arg ref="mappingContext" />
    </bean>

I redeployed and ran my app.

And I ran into the same exact error. Damn.

At this point, I became concerned that maybe all of the TypeAlias information was completely ignored by Spring Data with nested documents, such as my Shapes nested within Pages. So I decided to roll up my sleeves, fire up my debugger, and start getting intimate with the Spring Data source code.

After a bit of debugging, I learned that Spring Data was indeed attempting to determine if any TypeAlias information applied to the Shapes that were being read in for any Page. But in a lazy, half-hearted way.

When I say lazy, I mean that there was absolutely no scanning of entities to search for @TypeAlias annotations like I'd assumed there would be. Everything was done at runtime, as new data types were discovered. The MappingMongoConverter would read my base entity; i.e. a Page document. It would then discover that this document had a collection of things called shapes. Then it would examine the Page class to find the shapes property, and discover that shapes was of type List<Shape>. And finally it would examine the Shape class to determine if it had any TypeAlias data that it could cache for later.

In other words, it was completely backwards from what I needed. This mapper wouldn't work for me, either.
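The lazy behaviour described above can be modelled in a few lines. This is a simplified sketch of what I saw in the debugger, not Spring Data's implementation: @Alias is a hypothetical stand-in for @TypeAlias, and DemoPage, DemoShape, DemoStar and LazyAliasRegistry are made-up names.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for Spring Data's @TypeAlias.
@Retention(RetentionPolicy.RUNTIME)
@interface Alias {
    String value();
}

@Alias("myapp.page")
class DemoPage {
    List<DemoShape> shapes = new ArrayList<>();
}

abstract class DemoShape {
    int x;
    int y;
}

@Alias("myapp.shape.star")
class DemoStar extends DemoShape {
    int numPoints;
}

// Learns aliases only for types it reaches by following declared field types.
class LazyAliasRegistry {

    private final Map<String, Class<?>> aliasToClass = new HashMap<>();
    private final Set<Class<?>> visited = new HashSet<>();

    void visit(Class<?> type) {
        // Skip primitives, JDK types, and anything already seen.
        if (type.isPrimitive() || type.getName().startsWith("java.") || !visited.add(type)) {
            return;
        }
        Alias alias = type.getAnnotation(Alias.class);
        if (alias != null) {
            aliasToClass.put(alias.value(), type);
        }
        for (Field field : type.getDeclaredFields()) {
            Type generic = field.getGenericType();
            if (generic instanceof ParameterizedType) {
                // For List<DemoShape>, follow the element type DemoShape.
                for (Type arg : ((ParameterizedType) generic).getActualTypeArguments()) {
                    if (arg instanceof Class) {
                        visit((Class<?>) arg);
                    }
                }
            } else {
                visit(field.getType());
            }
        }
    }

    Class<?> resolve(String alias) {
        return aliasToClass.get(alias);
    }
}

class LazyAliasDemo {
    public static void main(String[] args) {
        LazyAliasRegistry registry = new LazyAliasRegistry();
        registry.visit(DemoPage.class);
        System.out.println(registry.resolve("myapp.page"));       // class DemoPage
        System.out.println(registry.resolve("myapp.shape.star")); // null
    }
}
```

Starting from the page type, the walk reaches the abstract shape type through the declared list element type, but nothing ever points it at the star subclass, so the star alias stays unknown.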

By this time, I'd developed enough of an understanding of what was going on that creating my own mapper didn't seem too tough. And that's what I did. Really, all I needed was a mapper that I could configure to scan one or more packages to discover persistent entities with TypeAlias information, and cache that information for later use.

My class was called EntityScanningTypeInformationMapper, and its full source code is at the end of this post. But the relevant parts are:

  • Its constructor takes a List<String> of packages to scan.
  • It has an init() method that scans the provided packages.
  • Scanning a package entails using reflection to read in the information for each class in the package, determining if it is annotated with @TypeAlias, and if so, mapping the alias to the class.

My Spring XML config was modified thusly:

    <bean id="mongoTypeMapper" class="org.springframework.data.mongodb.core.convert.DefaultMongoTypeMapper">
        <constructor-arg name="typeKey" value="_alias"></constructor-arg>
        <constructor-arg name="mappers">
            <list>
                <ref bean="entityScanningTypeMapper" />
            </list>
        </constructor-arg>
    </bean>

    <bean id="entityScanningTypeMapper" class="com.myapp.utils.EntityScanningTypeInformationMapper" init-method="init">
        <constructor-arg name="scanPackages">
            <list>
                <value>com.myapp.documents.shapes</value>
            </list>
        </constructor-arg>
    </bean>

I redeployed and retested, and it ran like a champ.

So my lesson is that Spring Data, out of the box, doesn't seem to work well with polymorphism, which is a shame given the schema-less nature of NoSQL data stores like MongoDB. But it doesn't take too much effort to write your own mapper to compensate.

Oh, and here's the EntityScanningTypeInformationMapper source:

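package com.myapp.utils;

import java.io.IOException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// log4j 1.x, judging by the Logger.getLogger(Class) call below
import org.apache.log4j.Logger;
import org.springframework.core.io.Resource;
import org.springframework.core.io.support.PathMatchingResourcePatternResolver;
import org.springframework.core.io.support.ResourcePatternResolver;
import org.springframework.core.type.classreading.CachingMetadataReaderFactory;
import org.springframework.core.type.classreading.MetadataReader;
import org.springframework.core.type.classreading.MetadataReaderFactory;
import org.springframework.data.annotation.TypeAlias;
import org.springframework.data.convert.TypeInformationMapper;
import org.springframework.data.util.ClassTypeInformation;
import org.springframework.data.util.TypeInformation;
import org.springframework.util.ClassUtils;
import org.springframework.util.SystemPropertyUtils;

/**
 * Scans the configured packages for classes annotated with @TypeAlias and
 * resolves stored aliases back to those classes.
 */
public class EntityScanningTypeInformationMapper implements TypeInformationMapper {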

    private Logger log = Logger.getLogger(this.getClass());
    
    private final List<String> scanPackages;
    private Map<String, Class<? extends Object>> aliasToClass;

    public EntityScanningTypeInformationMapper(List<String> scanPackages) {
        this.scanPackages = scanPackages;
    }

    public void init() {
       this.scan(scanPackages);
    }
    
    private void scan(List<String> scanPackages) {
        aliasToClass = new HashMap<>();
        for (String pkg : scanPackages) {
            try {
                findMyTypes(pkg);
            } catch (ClassNotFoundException | IOException e) {
                log.error("Error scanning package " + pkg, e);
            }
        }
    }
    
    private void findMyTypes(String basePackage) throws ClassNotFoundException, IOException {
        ResourcePatternResolver resourcePatternResolver = new PathMatchingResourcePatternResolver();
        MetadataReaderFactory metadataReaderFactory = new CachingMetadataReaderFactory(resourcePatternResolver);

        String packageSearchPath = ResourcePatternResolver.CLASSPATH_ALL_URL_PREFIX +
                                   resolveBasePackage(basePackage) + "/" + "**/*.class";
        Resource[] resources = resourcePatternResolver.getResources(packageSearchPath);
        for (Resource resource : resources) {
            if (resource.isReadable()) {
                MetadataReader metadataReader = metadataReaderFactory.getMetadataReader(resource);
                Class<? extends Object> c = Class.forName(metadataReader.getClassMetadata().getClassName());
                log.debug("Scanning package " + basePackage + " and found class " + c);
                if (c.isAnnotationPresent(TypeAlias.class)) {
                    TypeAlias typeAliasAnnot = c.getAnnotation(TypeAlias.class);
                    String alias = typeAliasAnnot.value();
                    log.debug("And it has a TypeAlias " + alias);
                    aliasToClass.put(alias, c);
                }
            }
        }
    }

    private String resolveBasePackage(String basePackage) {
        return ClassUtils.convertClassNameToResourcePath(SystemPropertyUtils.resolvePlaceholders(basePackage));
    }

    @Override
    public TypeInformation<?> resolveTypeFrom(Object alias) {
        if (aliasToClass == null) {
            scan(scanPackages);
        }
        
        if (alias instanceof String) {
            Class<? extends Object> clazz = aliasToClass.get( (String)alias );
            if (clazz != null) {
                return ClassTypeInformation.from(clazz);
            }
        }
        return null;
    }

    @Override
    public Object createAliasFor(TypeInformation<?> type) {
        log.debug("EntityScanningTypeInformationMapper asked to create alias for type: " + type);
        return null;
    }


}

4 comments:

  1. This comment has been removed by the author.

  2. That worked great.
    I had to also include the reverse mapping for my case.

  3. Found this even tho it's been inactive for a while now. I really like what you have to say, it's definitely very thought-provoking.

    A quick question I do have: isn't it bad design to use any kind of an alias on your mongo DB types? In doing so, you've just created a strong coupling between a Java class and the data. What if you need to rename one of your aliased classes? Or what if you need to make one of your aliased classes abstract so that you can specify concrete children implementations? Won't the aliases that are currently saved in the db be completely wrong after these kinds of refactors?

  4. @Joseph,

    I'd actually written a previous post (http://foreignloops.blogspot.com/2014/11/before-you-use-springdata-and-mongodb.html) where I showed how I learned the hard way not to couple explicit type information with data records, which is what Spring Data seemed to want to do by default. Creating an alias did, at least for my use case, sufficiently mitigate issues that arise out of subsequent refactoring. I'm not sure if it's possible to completely avoid saving some sort of indicator as to what type to deserialize your Mongo records into, at least when using Spring Data.
