Jena2 Database Interface - Options for Initialization and Access

The following options are available for use with the persistence subsystem. For each option Xyz, there are getXyz and setXyz methods in the associated interface. Some options must be set when initializing (formatting) the database and others may be set while accessing models.

Database Initialization Options

The following options may only be set before the database is initialized. To set these options, invoke the associated set method on the database driver. There is also a get method that may be called at any time to retrieve the option value. If the database has already been formatted, i.e., if IDBConnection.isFormatOK() returns true, then these set methods will throw an exception. These options are persisted in the database. When (a model in) a previously formatted database is opened, the option values in the database override (silently) any user-specified values.

IRDBDriver Option    Type     Default  Description
LongObjectLength     int      *        Maximum length of a literal or resource to be stored in a statement table.
LongObjectLengthMax  int      *        Maximum possible value for LongObjectLength.
IndexKeyLength       int      *        Maximum length of the key in a long object table.
IndexKeyLengthMax    int      *        Maximum possible value for IndexKeyLength.
IsTransactionDb      boolean  *        True if the database can support transactions.
DoCompressURI        boolean  false    If true, do prefix compression on long URIs.
CompressURILength    int      100      URIs longer than this length will be compressed (if DoCompressURI is true).
TableNamePrefix      String   jena_    The common prefix for all Jena table names in the database.

* These options are database-dependent. See the database-specific howto (HSQLDB, MySQL, Derby, Oracle, PostgreSQL, Microsoft SQL Server) for the default values.

LongObjectLength
This defines the maximum length of a value in a statement table where the value may be either a literal or a resource URI. Values longer than this length are stored in either the long literals or the long resources table. Smaller values of LongObjectLength reduce database space consumption at the cost of increased retrieval time. Each database engine has a maximum permissible value for LongObjectLength which may be retrieved by calling getLongObjectLengthMax(). An attempt to set a larger length will throw an exception.
    Note that LongObjectLength is an upper bound due to the database encoding used for values. For example, if LongObjectLength is ten, a literal string of ten or even nine characters would be stored as a long object because, when stored, the string value is encoded with type information which makes the actual stored value even longer. 
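The upper-bound behavior described above can be sketched as follows. This is a minimal illustration, not Jena's code: the class name LiteralPlacementSketch and the "Lv:<length>:" tag are hypothetical stand-ins for whatever type information the real encoding adds.

```java
// Illustrative sketch only. The "Lv:<length>:" tag is a made-up stand-in for the
// type information added on storage; it is NOT Jena's actual encoding.
final class LiteralPlacementSketch {
    private LiteralPlacementSketch() {}

    // Hypothetical encoding: prepend a type tag and the literal's length.
    static String encode(String literal) {
        return "Lv:" + literal.length() + ":" + literal;
    }

    // A value is treated as a long object when its ENCODED form exceeds the limit,
    // so the raw string may be shorter than longObjectLength and still not fit.
    static boolean isLongObject(String literal, int longObjectLength) {
        return encode(literal).length() > longObjectLength;
    }
}
```

With this assumed encoding and a limit of ten, both a ten-character and a nine-character literal count as long objects, while a three-character literal still fits in the statement table.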
IndexKeyLength
This defines the maximum length of an index key for long object values (literals or resource URIs). Long objects are stored in three parts, a head, a hash and a tail. The head is the prefix of the long value that can be indexed. The hash is a content-based hash value of the remainder (the tail). Exact matching is done by comparing the head and the hash value. In the future, we plan to do prefix matching on the head for inequality and range queries.
    Generally, there is no need to change IndexKeyLength. However, smaller values could reduce database space consumption at the expense of reducing the (future) effectiveness of inequality and range queries. Note that IndexKeyLength is an upper bound due to the database encoding (see comments in LongObjectLength). Each database engine has a maximum permissible value for IndexKeyLength which may be retrieved by calling getIndexKeyLengthMax(). An attempt to set a larger length will throw an exception.
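The head/hash/tail layout can be sketched as below. This is an assumption-laden illustration rather than Jena's implementation: the class LongObjectParts is hypothetical, and CRC32 is used only as a convenient content-based hash; the hash function Jena actually uses is not stated here.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Illustrative sketch only: not Jena's storage code, and CRC32 is an assumed hash.
final class LongObjectParts {
    final String head; // indexable prefix, at most indexKeyLength characters
    final long hash;   // content-based hash of the tail
    final String tail; // remainder of the value after the head

    private LongObjectParts(String head, long hash, String tail) {
        this.head = head;
        this.hash = hash;
        this.tail = tail;
    }

    // Split a long value into head, hash of the tail, and tail.
    static LongObjectParts split(String value, int indexKeyLength) {
        int cut = Math.min(indexKeyLength, value.length());
        String head = value.substring(0, cut);
        String tail = value.substring(cut);
        CRC32 crc = new CRC32();
        crc.update(tail.getBytes(StandardCharsets.UTF_8));
        return new LongObjectParts(head, crc.getValue(), tail);
    }

    // Exact matching compares only the head and the hash value.
    boolean matches(LongObjectParts other) {
        return head.equals(other.head) && hash == other.hash;
    }
}
```

A smaller IndexKeyLength shortens the head that is kept in the index, which is why it saves space but leaves less of the value available for any future prefix matching.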
IsTransactionDb
Some database engines support a non-transactional configuration in which begin-end transactions are not supported but the individual database operations are atomic. MySQL has both transactional and non-transactional configurations. This option can be used to set the transaction mode. Since it affects the physical database structure, it can only be set prior to database initialization. Applications must be careful when using non-transactional configurations because the database may be left in an inconsistent state if an application is interrupted in the middle of a database operation.
DoCompressURI
By default, resource URIs are stored fully expanded in the database. If DoCompressURI is true, URIs will be compressed by storing a prefix of the URI (typically a namespace) in a separate table. This can be used to reduce database space consumption. Ideally, it should not significantly increase retrieval time since it is expected that the number of prefixes will be relatively small and it should be possible to cache them in main memory for expansion.
    Note that there is an interaction between DoCompressURI and LongObjectLength. The prefix is compressed before the object length is checked. For example, if LongObjectLength is ten and DoCompressURI is true, the URI myNamespace.com/foo:123 would be stored as a compressed URI directly in a statement table. However, if DoCompressURI is false, then that URI would be stored in the long resources table and the statement table would have a reference to it.
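The order of operations (compress first, then check the length) can be sketched as follows. Everything here is hypothetical: the class UriCompressionSketch, the rule of splitting at the last '/' or '#', and the "~id~" encoding are assumptions made for illustration, and the sketch ignores both CompressURILength and any type information added on storage.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: the split rule and the "~id~" encoding are assumptions.
final class UriCompressionSketch {
    private final Map<String, Integer> prefixIds = new HashMap<>(); // stands in for the prefix table
    private final int longObjectLength;
    private final boolean doCompressUri;

    UriCompressionSketch(int longObjectLength, boolean doCompressUri) {
        this.longObjectLength = longObjectLength;
        this.doCompressUri = doCompressUri;
    }

    // Replace the prefix (everything up to the last '/' or '#') with a short id.
    String encode(String uri) {
        if (!doCompressUri) {
            return uri;
        }
        int cut = Math.max(uri.lastIndexOf('/'), uri.lastIndexOf('#')) + 1;
        if (cut <= 0) {
            return uri; // no recognizable prefix, store as-is
        }
        String prefix = uri.substring(0, cut);
        Integer id = prefixIds.get(prefix);
        if (id == null) {
            id = prefixIds.size() + 1;
            prefixIds.put(prefix, id);
        }
        return "~" + id + "~" + uri.substring(cut);
    }

    // The length check is applied to the encoded form, i.e. after compression.
    boolean fitsInStatementTable(String uri) {
        return encode(uri).length() <= longObjectLength;
    }
}
```

Under these assumptions, myNamespace.com/foo:123 encodes to a ten-character string when compression is on and so fits a LongObjectLength of ten, whereas the uncompressed 23-character URI does not.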
CompressURILength
If DoCompressURI is true, this specifies the minimum length URI that should be compressed. Resource URIs shorter than this value will be stored fully expanded.
TableNamePrefix
Every database table created by Jena has a common prefix. This option allows users to specify the prefix. It affects all Jena tables and indexes, including the Jena system table. Consequently, with this option it is possible to have multiple Jena persistent stores, each with different formatting options (e.g., LongObjectLength, DoCompressURI, etc.) in a single database instance, with each store having a distinct prefix.
    Note that this option differs from the previous options in that it must be set on every connection that accesses the store. Otherwise, the subsystem will assume the default prefix and will not be able to locate the Jena system table which contains the configuration.
    The maximum length of the prefix is database-dependent and an exception may be thrown if the prefix is too long. Otherwise, it is the user's responsibility to ensure that the prefix name conforms to the naming conventions for the underlying database engine (e.g., certain prohibited special characters). Also, if the database requires upper case table names (or lower case), the prefix will be automatically (silently) converted to that convention.
    This option has subtle semantics and should be used with care. Always use the following code sequence to ensure that the prefix is set correctly for the database connection.

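// Make a database connection; the four arguments are placeholders for your own settings.
IDBConnection conn = new DBConnection(DB_URL, DB_USER, DB_PASSWD, DB_TYPE);
// Set the table name prefix on this connection's driver.
conn.getDriver().setTableNamePrefix("myNewNamePrefix");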
 

Database Access Options

The following options may be set at any time and are not persistent. They exist only for the duration of a database connection.

IRDBDriver Option  Type    Default  Description
StoreWithModel     String  null     If not null or empty, subsequent models will share tables with the named model.
CompressCacheSize  int     50       The size of the URI prefix cache if DoCompressURI is true.

StoreWithModel
By default, models are stored in separate database tables. This option enables models to share tables. Once specified, all models subsequently created on the current connection are stored in the same tables as the specified model. A model name of "DEFAULT" references the default (unnamed) model.
    If the specified model does not exist, an exception is thrown when attempting to create a new model that references it. This is also true of the default model, i.e., it is not automatically created. If the specified model name is null or the empty string, then subsequently created models are stored in separate tables.
CompressCacheSize
If URI compression is enabled (DoCompressURI is true), an in-memory LRU cache of URI prefixes is maintained to reduce the need to access the database to expand compressed URIs. The cache size can be adjusted at any time after a connection to the database is established.
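An LRU cache with an adjustable size, as described above, can be sketched with the standard LinkedHashMap in access order. This is only an illustration of the technique; the class PrefixLruCache and its setCapacity method are hypothetical and do not reflect how Jena implements its cache.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: a generic LRU cache mapping prefix ids to prefix strings.
final class PrefixLruCache extends LinkedHashMap<Integer, String> {
    private static final long serialVersionUID = 1L;
    private int capacity;

    PrefixLruCache(int capacity) {
        super(16, 0.75f, true); // true = iterate in access order, eldest first
        this.capacity = capacity;
    }

    // Resize at any time; shrinking evicts the least recently used entries.
    void setCapacity(int newCapacity) {
        capacity = newCapacity;
        Iterator<Integer> it = keySet().iterator();
        while (size() > capacity && it.hasNext()) {
            it.next();
            it.remove();
        }
    }

    // Called by LinkedHashMap after each put; evict when over capacity.
    @Override
    protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
        return size() > capacity;
    }
}
```

A cache miss in such a design costs one lookup in the prefix table, so the cache size is best chosen to cover the number of distinct prefixes an application touches frequently.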
 

Model Access Options

The following options affect the behavior of query processing. These options are not persisted in the database. The options are set by calling the associated set method on the database model (an instance of ModelRDB). There is also a get method to retrieve the option value.

ModelRDB Option    Type     Default  Description
DoFastpath         boolean  true     If true, enable query Fastpath.
QueryOnlyAsserted  boolean  false    If true, query only asserted statement tables.
QueryOnlyReified   boolean  false    If true, query only reified statement tables.
QueryFullReified   boolean  false    If true, Fastpath ignores partially reified statements.
DoDuplicateCheck   boolean  true     If true, check if a statement is already in the database before adding it.

DoFastpath
This option enables and disables Fastpath query processing. Generally, it should be enabled but it may be useful to disable it for experiments or debugging. For details on Fastpath processing and explanations of the three query options in this table, see the Fastpath notes.
QueryOnlyAsserted
When true, querying will only be done on asserted statement tables; the reified tables are ignored. For applications that use only asserted statements this may provide a performance improvement for certain types of queries (specifically, those with unknown predicates), especially if the database is remote from the application.
QueryOnlyReified
When true, querying will only be done on reified statement tables; the asserted statement tables are ignored. For applications that use only reified statements this may provide a performance improvement for certain types of queries (specifically, those with unknown predicates), especially if the database is remote from the application.
QueryFullReified
See the Fastpath notes.
DoDuplicateCheck
When a statement is added to a persistent model, Jena first checks if the statement already exists in the model. This prevents the occurrence of duplicate rows in the statement tables. However, if a user knows that the rows to be inserted do not already exist, DoDuplicateCheck may be disabled to reduce overhead for adding statements. This can substantially reduce load times.
    Note that, once set, the value applies not just to the specified model but to any model subsequently created in the database during the user's session (the life of the database connection). The setting for existing models is not affected.
    When duplicate checking is disabled, if an application attempts to insert a duplicate statement in a model, the result depends on the database engine and configuration. In general, the insert will succeed, no indication will be provided to the application and the database will contain duplicate statements. If this is undesirable, one option is to create a unique index on the subject, predicate and object columns of the statement table. This can easily be done by modifying the templates for creating statement tables in the database-specific SQL template files, e.g., see CreateStatementTable and CreateReifStatementTable in the file 'etc/mysql.sql'. If this is done then the database engine will generate an error when a duplicate statement is added and an exception will be thrown to the application.
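The trade-off can be sketched with an in-memory stand-in for a statement table. The class StatementTableSketch is hypothetical and only mimics the behavior described above; it does not use a database or Jena's API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: a list of (subject, predicate, object) rows with no unique index.
final class StatementTableSketch {
    private final List<List<String>> rows = new ArrayList<>();
    private boolean doDuplicateCheck = true;

    void setDoDuplicateCheck(boolean doDuplicateCheck) {
        this.doDuplicateCheck = doDuplicateCheck;
    }

    // With the check on, each add pays for a lookup; with it off, duplicates slip in silently.
    void add(String subject, String predicate, String object) {
        List<String> row = Arrays.asList(subject, predicate, object);
        if (doDuplicateCheck && rows.contains(row)) {
            return; // statement already present, nothing to insert
        }
        rows.add(row);
    }

    int rowCount() {
        return rows.size();
    }
}
```

Adding the same statement twice leaves one row while the check is enabled, but a further add after disabling the check produces a second, duplicate row with no error reported.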