What is it?
The Ajira project started as a research project at
the Department of Computer
Science of the Vrije
Universiteit Amsterdam to explore a new computing
paradigm for processing large amounts of data efficiently.
It is inspired by MapReduce (and more specifically
by Hadoop). Its
ambition is to go beyond MapReduce by allowing much more complex
processing of the data, which can be performed not only on
large computing clusters but also on more modest
machines. We call it a Data Engine, because its goal
is to make your life easier if you need to process (lots of)
data. We hope you'll like it.
The Ajira project was supported by the Dutch national program
COMMIT.
Main features
- Automatic parallelization and distribution of
the computation. We put a lot of effort into
hiding from the user all the parallelization needed
to operate on large amounts of data. Within one
machine, the framework spawns multiple threads. If
more machines are used, it takes care of the
communication between nodes and ensures correct
termination.
- Ability to perform custom operations on the
data. In MapReduce, only a Map and
a Reduce can be executed in sequence. In
Ajira, we try to break this mold by allowing the user
to define an arbitrary number of operations to be
executed in whatever sequence is needed.
- Allows different computation flows at
runtime. Sometimes you just don't know what you
need to execute on your data. In Ajira, you can decide
at runtime which actions to take. Multiple actions can
be executed in parallel, on the same or different
parts of the input. The output of such actions can be
merged or split further. This allows the execution
of much more complex programs than a simple
Map/Reduce.
- Works on single machines, computing clusters,
or even on clouds! In Ajira we use
the Ibis IPL
library to communicate between machines. The IPL
library abstracts the physical location of machines into
a logical network. This means that one program can
run on machines spread across a WAN,
or on machines with heterogeneous
connectivity.
- Open source. Ajira is written entirely in
Java. All the code is released under the Apache
license and can therefore be used or extended
by anyone.
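To make the idea of custom operations and runtime-defined flows more concrete, here is a minimal sketch in plain Java. It does not use the Ajira API: the names ActionChainSketch, Action, run and buildChain are hypothetical and exist only for this illustration, and the parallelism comes from standard Java parallel streams rather than from Ajira's own runtime. It only shows the general pattern of chaining an arbitrary number of operations whose sequence is decided at runtime.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

public class ActionChainSketch {

    /** Hypothetical action (not an Ajira type): takes one tuple, emits zero or more tuples. */
    public interface Action {
        List<String> process(String tuple);
    }

    /** Applies the actions in order; each stage processes its input with a parallel stream. */
    public static List<String> run(List<String> input, List<Action> chain) {
        List<String> current = input;
        for (Action action : chain) {
            current = current.parallelStream()
                    .flatMap(tuple -> action.process(tuple).stream())
                    .collect(Collectors.toList()); // keeps the encounter order of the list
        }
        return current;
    }

    /** Builds the chain at runtime: the filtering step is only added when requested. */
    public static List<Action> buildChain(boolean dropShortWords) {
        List<Action> chain = new ArrayList<>();
        // Split each line into words (one input tuple, many output tuples).
        chain.add(line -> Arrays.asList(line.split("\\s+")));
        // Normalize each word to lower case.
        chain.add(word -> Collections.singletonList(word.toLowerCase()));
        if (dropShortWords) {
            // Optional step: drop words of three characters or fewer.
            chain.add(word -> word.length() > 3
                    ? Collections.singletonList(word)
                    : Collections.<String>emptyList());
        }
        // Annotate each remaining word with its length.
        chain.add(word -> Collections.singletonList(word + ":" + word.length()));
        return chain;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("Ajira is a Data Engine", "inspired by MapReduce");
        // prints [ajira:5, data:4, engine:6, inspired:8, mapreduce:9]
        System.out.println(run(input, buildChain(true)));
    }
}
```

In this sketch the chain has four steps instead of a fixed Map followed by a Reduce, and passing false to buildChain yields a different flow over the same input. A real Ajira program would express the same idea through the framework's own action classes, described in the documentation.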
Want to know more?
Please read the documentation that
we provide. You can start with the getting started
guide, or read our
tutorial to learn how to develop with Ajira.
Are you wondering where the name Ajira comes from?
You can read about it here.