Writing integration tests with an in-memory Mongo DB

As I mentioned in my previous post, I’ve been working closely with the document oriented NoSQL database Mongo DB lately. As an advocate of sustainable systems development (test driven development, that is), I took the lead in our team for designing tests for our business logic towards Mongo. For relational databases there are a lot of options; a common solution is to use an in-memory database such as H2. For NoSQL databases the options are not as generous from an automated test perspective. Luckily, we found an in-memory version of Mongo DB from Flapdoodle which is easy to use and fits our use case perfectly. If your (Java) code base relies on Maven, just add the following dependency:

<dependency>
    <groupId>de.flapdoodle.embed</groupId>
    <artifactId>de.flapdoodle.embed.mongo</artifactId>
    <version>1.27</version>
    <scope>test</scope>
</dependency>

Then, in your test class (here based on JUnit), use the provided classes to start an in-memory Mongo DB, preferably in a @Before method, and tear down the instance in an @After method, as in the example below:

public class TestInMemoryMongo {

    private static final String MONGO_HOST = "localhost";
    private static final int MONGO_PORT = 27777;
    private static final String IN_MEM_CONNECTION_URL = MONGO_HOST + ":" + MONGO_PORT;

    private MongodExecutable mongodExe;
    private MongodProcess mongod;
    private Mongo mongo;

    /**
     * Start in-memory Mongo DB process
     */
    @Before
    public void setup() throws Exception {
        MongodStarter runtime = MongodStarter.getDefaultInstance();
        mongodExe = runtime.prepare(new MongodConfig(Version.V2_0_5, MONGO_PORT, Network.localhostIsIPv6()));
        mongod = mongodExe.start();
        mongo = new Mongo(MONGO_HOST, MONGO_PORT);
    }

    /**
     * Shutdown in-memory Mongo DB process
     */
    @After
    public void teardown() throws Exception {
        if (mongod != null) {
            mongod.stop();
            mongodExe.stop();
        }
    }

    @Test
    public void shouldAssertSomeInteractionWithMongo() {
        // Create a connection to Mongo using the IN_MEM_CONNECTION_URL property
        // Execute some business logic towards Mongo
        // Assert the expected behaviour
    }

    protected MongoPersistenceContext getMongoPersistenceContext() {
        // Returns an instance of the class containing business logic towards Mongo
        return new MongoPersistenceContext();
    }
}

… then you have an in-memory Mongo DB available for your test cases. The private member named “mongo” in the example above is of type com.mongodb.Mongo, which you can use for interaction with the Mongo database. As you notice, all you basically need is JUnit and Mongo, nothing more.

But life is not always as easy as in the simplest examples. In our case, we leverage EJBs and other components which rely on running inside a container. As I’m a contributor to the JBoss Arquillian project, a Java framework for integration tests, I was obviously curious about making the in-memory Mongo DB available when executing tests inside the container. If you already use Arquillian like we do, the transition is smooth. Just make sure to put the in-memory Mongo and the Mongo Java driver on your classpath. The following Arquillian test extends the JUnit test class above, but with a bean injection for the business class under test:

@RunWith(Arquillian.class)
public class TestInMemoryMongoWithArquillian extends TestInMemoryMongo {

    @Deployment
    public static Archive getDeployment() {
        return ShrinkWrap.create(WebArchive.class, "mongo.war")
                .addClass(MongoPersistenceContext.class)
                .addAsManifestResource(EmptyAsset.INSTANCE, "beans.xml")
                .addAsWebInfResource(new StringAsset("<web-app></web-app>"), "web.xml")
                .addAsLibraries(DependencyResolvers.use(MavenDependencyResolver.class)
                        .artifact("de.flapdoodle.embed:de.flapdoodle.embed.mongo:jar:1.27")
                        .artifact("org.mongodb:mongo-java-driver:jar:2.9.1")
                        .resolveAs(JavaArchive.class));
    }

    @Inject MongoPersistenceContext mpc;

    @Override
    protected MongoPersistenceContext getMongoPersistenceContext() {
        return mpc;
    }
}

I’ve uploaded a full example project, based on Java 7, Maven, Mongo DB, Arquillian and Apache TomEE (embedded) on GitHub to demonstrate how to use an in-memory Mongo DB in a unit test as well as with Arquillian inside TomEE. This should serve as a good base when starting to write automated tests for your business logic towards Mongo.
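
To give an idea of what such a test case could look like, here is a minimal sketch of a test method interacting with the in-memory instance through the mongo member from the test class above. The database, collection and key names are just placeholders, and the assertions assume static imports from org.junit.Assert:

@Test
public void shouldReadBackAnInsertedDocument() {
    // "mongo" is the com.mongodb.Mongo member initialized in the @Before method
    DB db = mongo.getDB("testdb");
    DBCollection collection = db.getCollection("documents");

    // Insert a document and read it back through the in-memory instance
    collection.insert(new BasicDBObject("key", "value"));
    DBObject found = collection.findOne(new BasicDBObject("key", "value"));

    assertNotNull(found);
    assertEquals("value", found.get("key"));
}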


Tommy Tynjä
@tommysdk

Applying indexes in Mongo DB

For the past year I’ve been consulting at a client where we’ve been using the document oriented NoSQL database Mongo DB in production (currently v2.0.5). Primarily we store PDF documents together with some arbitrary metadata, but in some use cases we also store a lot of completely dynamic documents, where there might be no common fields shared between documents in the same collection.

For one of our collections, which holds PDF documents in a GridFS structure (~1 TB of data per node), we sometimes need to query for documents based on a couple of specific keys. If these keys are not indexed, queries can take a very long time to execute. How indexes are handled is very well explained in the Mongo documentation. By default, GridFS provides indexes on the _id and filename + uploadDate fields. To view the indexes on the current collection, execute the following command from the Mongo shell:

db.myCollection.getIndexes();

Ideally, you want your indexes to reside completely within RAM. The following command returns the size of the current index:

db.myCollection.totalIndexSize();

To apply a new index for keyX and keyY, execute the following command:

db.myCollection.ensureIndex({"keyX":1, "keyY":1});

Applying the index took roughly a minute per node in our environment. The new index should now be displayed when executing db.myCollection.getIndexes();

{
        "v" : 1,
        "key" : {
                "keyX" : 1,
                "keyY" : 1
        },
        "ns" : "myDatabase.myCollection",
        "name" : "keyX_1_keyY_1"
}

After applying the index, make sure that the total size of the index is still manageable (less than available memory). Now, executing a query based on the indexed fields should yield its result much faster:

db.myCollection.find({"keyX":9281,"keyY":3270});
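
To verify that the query actually makes use of the new index, you can call explain() on the query from the Mongo shell. For an indexed query, the output should report a BtreeCursor referencing the index name, instead of a BasicCursor (a sketch, output abbreviated):

db.myCollection.find({"keyX":9281,"keyY":3270}).explain();
{
        "cursor" : "BtreeCursor keyX_1_keyY_1",
        ...
}
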
Tommy Tynjä
@tommysdk

Use of generics in EJB 3.1

With EJB 3.1 it is possible to make use of generics in your bean implementations. There are some aspects that need to be considered though, to make sure you use them as intended. It requires a basic understanding of how Java generics work.

Imagine the following scenario. You might want to have a business interface such as:

@Local(Async.class)
public interface Async<TYPE> {
    public void method(TYPE t);
}

… and an implementation:

@Stateless
public class AsyncBean implements Async<String> {
    @Asynchronous
    public void method(final String s) {}
}

You would then like to dependency inject your EJB into another bean such as:

@EJB Async<String> asyncBean;

A problem here is that, due to type erasure, you cannot be certain that the injected EJB is of the expected type, which can produce a ClassCastException at runtime. Another problem is that the method annotated with @Asynchronous in the bean implementation will not be invoked asynchronously, as might be expected. This is due to the method not actually being part of the business interface. Because of type erasure, the method in the business interface is in fact public void method(final Object s), which in turn calls the typed method. Therefore the call to the typed method won’t be intercepted and made asynchronous. If the type definition in the AsyncBean class is removed (which infers the Object type), the call will be asynchronous.

One solution to ensure type safety would be to treat the Async interface as a plain interface without the @Local annotation, mark the AsyncBean with @LocalBean (to make sure the bean method is part of the exposed business methods for that bean), and inject an AsyncBean where used, as sketched further below. Another solution would be to insert a local interface between the plain interface and the bean implementation:

@Local(AsyncLocal.class)
public interface AsyncLocal extends Async<String> {
    @Override
    public void method(final String s);
}

… and then make the AsyncBean implement the AsyncLocal interface instead, and use the AsyncLocal interface at the injection points.
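
For completeness, here is a sketch of the first proposed solution, where the Async interface carries no @Local annotation and the bean is instead exposed through its no-interface view:

public interface Async<TYPE> {
    public void method(TYPE t);
}

@Stateless
@LocalBean
public class AsyncBean implements Async<String> {
    @Asynchronous
    public void method(final String s) {}
}

// ... and at the injection points, inject the bean class itself:
@EJB AsyncBean asyncBean;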

This behaviour might not be all that obvious, even though it makes perfect sense considering how generics are treated in Java. It is also unfortunate that the EJB specification does not clarify the expected behaviour for this use case. The following can be read in the specification, which feels a bit vague:
“3.4.3 Session Bean’s Business Interface
The session bean’s interface is an ordinary Java interface. It contains the business methods of the session bean.”

I’ve created a minimal sample project where the behaviours can be tried out by just running a simple test case written with Arquillian. The test case will be run inside an embedded TomEE container. Thanks to the blazing speed of TomEE, the test runs just as fast as a unit test. The code can be found here, and contains a failing test representing the scenario described above. Apply the proposed solution to make the test pass.

As demonstrated, there are some caveats which need to be considered when using generics together with EJBs. Hopefully this article can shed some light on what is happening under the hood.

Tommy Tynjä
@tommysdk