Release Notes
=============

==== Jena 2.3

ARQ - SPARQL for Jena
  o ARQ added to Jena. See doc/ARQ/index.html for details.

JAR changes
  jena.jar has been split into jena.jar and jenatest.jar (the test
  packages).

  Replaced and upgraded with name changes:
    antlr.jar             => antlr-2.7.5.jar
    jakarta-oro-2.0.5.jar => jakarta-oro-2.0.8.jar
    log4j-1.2.7.jar       => log4j-1.2.12.jar
    icu4j.jar             => icu4j_3_4.jar
  Xerces jars updated to Xerces 2.7.1.

  New jar files:
    arq.jar
    jenatest.jar
    stax-1.1.1-dev.jar
    stax-api-1.0.jar

Constraint rewriter
  o the .graph.query.Rewrite class recognises certain RDQL regex idioms
    and rewrites them. The rewritten expressions contained errors:
    (a) the case-insensitive classes stored the lowercased version of
    the test string, which broke the RDB generated code; (b) against all
    his principles, kers had used toString() on the results of
    expression evaluation, which broke the comparison on typed or
    languaged literals. These Have Been Fixed.

PrefixMappings
  o new boolean method samePrefixMappingAs(PrefixMapping) compares
    mappings for equality but has the opportunity to avoid creating
    intermediate maps.
  o NOTE: cannot overload .equals() because Model::equals() is already
    defined. Making Model implement PrefixMapping may have been a
    mistake ...

ARP
  o Total rewrite of internals. Now approximately four times faster.
    User code will see a significantly smaller speed-up, depending on
    the percentage of runtime taken up with parsing (as opposed to
    reading the data from the network or disk, and adding the triples to
    a Model).
  o The contract concerning behaviour after syntax errors has changed.
    (See the Javadoc for ARPOptions#setErrorMode for details of the new
    contract.)
  o Some changes in the error codes produced for ill-formed input.
  o The treatment of interrupts has changed. Instead of throwing an
    InterruptedIOException, an error, ERR_INTERRUPTED, is produced and
    reported through the error handler as a fatal error. This normally
    throws a ParseException.
  o A few public classes that were in the old ARP package, but labelled
    as not part of the API, have been removed.
  o DOM2Model public constructors have been deprecated and replaced with
    factories.
  o SAX2Model and SAX2RDF factory methods have been deprecated and
    replaced.
  o SAX2RDF and SAX2Model protected constructors have changed in a way
    that is not backward compatible.
  o NTriple command-line option documentation changed, in a manner that
    is theoretically not backward compatible. -r is now the default (in
    the documentation; this follows the previous implementation).
    -R option added to cancel -r.

Node/Node_Literal/Literal
  o Node has gained the method getIndexingValue(), which is the value to
    use when indexing this Node for GraphMem lookup (and other such
    things). Non-literal nodes return themselves. Literal nodes return
    an appropriate value; the current implementation defers to the
    getIndexingValue() method of the associated LiteralLabel.
  o Node has gained the method getLiteralValue(), which fails if the
    node is not a Node_Literal and otherwise returns the value of the
    associated literal. This method allows uses of
    getLiteral().getValue() to be replaced, so that external code need
    not know about getLiteral() as much.
  o Literal::getWellFormed() has been deprecated; it is replaced by
    Literal::isWellFormedXML(). There is a missing API method, e.g.
    isWellFormed(), which would apply to any typed literal; this will
    arrive in due course.
  o Support for indexing of typed literals added. NodeToTriplesMaps now
    use an indexing object to represent the Node rather than the Node
    itself. That object should implement the appropriate semantic
    equality:
        index(x) == index(y) <=> sameValueAs(x, y)
    If future datatyping extensions can't meet this contract it could be
    weakened to:
        index(x) != index(y) => ! sameValueAs(x, y)
    at the expense of post-processing find results with a sameValueAs
    test.
    Currently the index objects used are:
        plain literals, no lang and xsd:string literals
                                 -> the lexical form
        plain literals, lang tag -> the Node
        XMLLiterals              -> the Node
        known typed literals     -> the java getValue() object
        unknown typed literals   -> a cons of the lexical value and
                                    datatype URI

RDFNode/Resource/Literal
  o The method Resource::getNode() has been deprecated in favour of
    RDFNode::asNode().
  o The method Resource::isAnon() has been moved up into RDFNode.
  o The (pointless) method Literal::isLiteral() has been moved up into
    RDFNode (where it is pointful).
  o RDFNode has acquired `boolean isURIResource()` and
    `boolean isResource()`.
  o This allows all three what-kind-of-node tests to be applied to an
    RDFNode. Note that heavy use of these methods is a likely design
    smell - visitWith() may be a better solution to the classification
    problem.

GraphMem/GraphTripleStore/NodeToTriplesMap
  o Performance analysis suggests that a chunk of the time in find() was
    taken up with redundant comparisons when filtering the intermediate
    iterator with the triple pattern. [EG: the hashmap index field is
    always right; the ANY nodes are never relevant.] So, after a go or
    two around the houses, this was optimised to only test the non-wild
    remaining fields, using the new Triple.Field operations plus the new
    Filter operations.
  o The default memory-based graph is now GraphMemFaster, which does
    optimised query handling.

Triple
  o S/P/O fields made final (dunno why they weren't already).
  o Added Field class, which gives three constants (getSubject,
    getObject, getPredicate) which have a getField(Triple) method to
    extract that field. Fields also have filterOn() methods to create
    filters over nodes and fields of triples; filtering over ANY nodes
    delivers Filter.any, which composes cheaply.

Filters & Iterators
  o Removed a performance infelicity in the default andThen
    implementation (it kept calling the left-hand hasNext even when it
    had been exhausted).
    Later replaced the implementation with one that keeps a list of
    pending iterators and itself implements .andThen() by extending the
    pending list.
  o Added FilterDropIterator so that filterDrop doesn't need to create a
    new negated Filter and suffer an extra indirection layer; added
    FilterKeepIterator for symmetry.
  o Filter is now a class, not an interface, to make it easier to add
    new operations to it (otherwise the API changes would be grossly
    visible to all users).
  o Filter has new methods: .and(Filter), producing a Filter that passes
    only elements that pass both filters, and .filterKeep(Iterator),
    which filters the iterator in the same way as ExtendedIterator's
    filterKeep operation does.
  o Filter.any has fast implementations for these operations, allowing
    it to be used as a fairly cheap identity element.

ModelSpecs
  o Fixed a bug: the loadFile property did not work on inference models.
    The fix ensures that any descendant of ModelSpecImpl implements
    createModel() by calling a method doCreateModel() and then loading
    the specified files.

Schemagen
  o Schemagen now by default includes individuals whose class is in one
    of the target namespaces for the document, even if the individual
    itself is not. This behaviour can be turned off with the option
    strictIndividuals.

Typed literals:
  o Fixed bug in unparsing of xsd:time values.
  o Added normalization step so that creating a typed literal from an
    XSDDateTime will use narrow types (e.g. xsd:date) when appropriate.
  o Fixed bug in sameValueAs when comparing an integer to a
    float/double.

Reasoners:
  o Extended rule parser to support typed literals using N3 type syntax
    such as 'true'^^xsd:byte.
  o Fixed bug with rule sets which include a preprocessing hook, to
    ensure the hooks are rerun after new triple adds which should invoke
    the hook.
  o Fixed two bugs with derivation logging of backward rules.
  o Modified processing of non-monotonic rulesets (involving
    drop/remove) so that each entry in the conflict set is fired
    separately and all the consequences propagated before attempting to
    fire the next rule. To avoid performance hits, rulesets not
    involving such operators execute as before. User-defined Builtins
    which remove data should be marked as such using the isMonotonic
    method.
  o Fixed bug in TransitiveGraphCache which had resulted in some
    transitive reductions being incompletely reduced (i.e. some indirect
    property instances were being incorrectly reported as being direct).
  o Added "drop" operator as an alternative to "remove" when performing
    non-monotonic rewrites.
  o Fixed bug in rebind/reset of infgraphs which use TGC and failed to
    reset the transitive engine.
  o Optimized Resource.remove operations for the case where the parent
    model is an InfModel.
  o The default DIG reasoner used in the documentation examples has been
    changed from Racer to Pellet. Pellet is free and open-source, while
    Racer has switched to a commercial license model.

Ontology API
  o OntDocumentManager now delegates file resolution and model caching
    to FileManager, which means that FileManager's resolution strategies
    can be used to locate ontology files (e.g. from the classloader).
  o Prefix mapping and ontology language selection in the
    OntDocumentManager have been deprecated, and will be removed in a
    future version of Jena.

Command line utilities
  o New utility jena.rdfcat, which can merge any number of individual
    RDF documents together into one model, and perform syntax
    translation (e.g. RDF/XML to N3).
  o New query utilities for SPARQL.

ModelLock
  o Deprecated in favour of Lock (in shared).
  o Two implementations: LockMRSW and LockMutex.

==== End Jena 2.3

MySQL
-----
  There is a problem when using MySQL j-connector 3.1.* and MySQL 4.1.*.
  It manifests itself as truncation of long literals. Systems not using
  long literals should not see any problem.
  j-connector 3.0.* and the development versions of j-connector 3.2 do
  not exhibit the problem.

Post 2.2beta change list

ARP
  o Fixed XMLLiteral bug relevant to OWL-S.
  o Added workaround for ICU bug. The workaround may slow processing of
    Tamil and other languages which use Unicode composing characters. If
    you are processing large volumes of Tamil, using a patched version
    of icu4j may be faster. Ask on jena-dev for more information.
  o Improved character encoding support: all character encodings
    supported by Java are now supported by Jena.

FileGraph and ModelMakerImpl
  o ModelMakerImpl now implements [the obsolescent] createModelOver
    using maker.openModel( ... ) rather than .createModel( ... ). This
    fixes a problem with existing files in the directory not being
    consulted for their contents.
  o FileGraphs now (weakly) support (non-nested) transactions, using
    checkpoint files to record their state at a begin() and restoring
    that state at an abort(). A commit() writes the current state to the
    backing file and deletes the checkpoint.

InfModelBase (hence, any inference model)
  o Inference models now (weakly) support transactions; they delegate
    them to their base model. Additionally, an abort() will do a
    rebind().

JMS & ModelSpec
  o Added new modelName property to the vocabulary and schema, ready to
    properly support named models as well as model makers.
  o The specification for RDB models has changed: it is not the maker,
    but its /hasConnection/ value, that has the connection properties.
    (This allows the connection to be shared, or to be prespecified.)
  o The vocabulary class JMS has been renamed to JenaModelSpec. There is
    a (deprecated) JMS subclass to allow legacy use of the vocabulary.
  o OntModelSpec understands the `modelName` property; it gives the name
    of the model (in the baseModelMaker) which is to be used in an
    ontModelSpec.create() call.

OWL Syntax Checker
  o No longer in the Jena download.
  o A separate contribution.
  o Can be downloaded separately from the Jena project files page.

PrefixMappings
  o the requirement that adding `(prefix, uri)` to a prefix mapping
    remove any existing prefix for `uri` has been removed. Calls that
    run the mapping backward (eg qNameFor()) will get a correct answer,
    but not necessarily the same one each time; if possible, it will be
    the "most recently bound" prefix. (The "not possible" cases are
    those where a prefix has been removed and the inverse mapping has
    been regenerated.)

Ontology API
  o As part of a move to provide more consistent behaviour,
    listDeclaredProperties has been completely re-written. The new
    behaviour, which in some important respects differs from the old
    behaviour, is now documented in doc/how-to/rdf-frames.html. There
    has also been a non-backwards-compatible change in the meaning of
    the Boolean flag passed to listDeclaredProperties.
  o An OntModel can now list the root classes in the local class
    hierarchy; see OntModel.listHierarchyRoots().

DIG reasoner
  o Fixed a bug that meant that <top/> and <bottom/> were not being
    translated to owl:Thing and owl:Nothing, and hence not appearing in
    output.

RDF/XML Output
  o Improved character encoding support: all character encodings
    supported by Java are now supported by Jena.

Post Jena 2.1 change list [up to 2.2beta]

RDF API
  o Fixed bug in typed literals support which caused isValidLiteral
    tests to fail on user-defined types derived indirectly from simple
    types.
  o Fixed bugs in conversion of Calendar to XSDDateTime.
  o Fixed XSDDouble isValidValue test to check for Double.
  o Fixed XSDbase64Binary, XSDhexBinary returning strings instead of
    byte[].
  o GraphExtract operations made available through ModelExtract plus
    StatementBoundary[Base] & StatementTripleBoundary.
  o Model has gained read(String url, String baseURI, String lang ).
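
The post-2.2beta PrefixMappings entry above (several prefixes may now map to
the same URI, and backward lookups prefer the most recently bound one) can be
illustrated with a minimal sketch. PrefixMapSketch and its method bodies are
hypothetical stand-ins written for this note, not Jena's PrefixMapping
implementation; the sketch splits a URI at its last '#' or '/', does not check
that the local name is a legal XML name, and does not model the "prefix
removed, inverse mapping regenerated" case.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not Jena's PrefixMapping code.
class PrefixMapSketch {
    // prefix -> namespace URI, kept in binding order (oldest first).
    private final Map<String, String> prefixToUri = new LinkedHashMap<>();

    // Bind a prefix. Other prefixes already bound to the same URI are kept.
    void setNsPrefix(String prefix, String uri) {
        prefixToUri.remove(prefix);      // re-binding counts as "most recent"
        prefixToUri.put(prefix, uri);
    }

    // Forward lookup: the namespace bound to a prefix, or null.
    String getNsPrefixURI(String prefix) {
        return prefixToUri.get(prefix);
    }

    // Backward lookup: "prefix:local" using the most recently bound prefix
    // for the URI's namespace, or null if no prefix matches.
    String qnameFor(String uri) {
        int split = Math.max(uri.lastIndexOf('#'), uri.lastIndexOf('/'));
        if (split < 0 || split == uri.length() - 1) return null;
        String namespace = uri.substring(0, split + 1);
        String local = uri.substring(split + 1);
        String best = null;
        for (Map.Entry<String, String> e : prefixToUri.entrySet()) {
            if (e.getValue().equals(namespace)) best = e.getKey(); // later wins
        }
        return best == null ? null : best + ":" + local;
    }

    public static void main(String[] args) {
        PrefixMapSketch pm = new PrefixMapSketch();
        pm.setNsPrefix("a", "http://example.org/ns#");
        pm.setNsPrefix("b", "http://example.org/ns#");
        System.out.println(pm.getNsPrefixURI("a"));
        System.out.println(pm.qnameFor("http://example.org/ns#thing"));
    }
}
```

Both "a" and "b" stay bound after the second call, which is the behaviour the
entry describes; under the old rule the first binding would have been dropped.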

Database
  o Fixed user-reported problem where removeNsPrefix() wasn't persistent
  o handles some constraints internally
  o A minor change to ModelCom has improved the speed with which DB
    models with several prefixes are opened.

GraphBase and Reification SPI
  o Reifier::getHiddenTriples() and getReificationTriples() REMOVED and
    replaced by iterator-returning find(TripleMatch) [for all quadlets],
    findExposed(TripleMatch) [for exposed quadlets, ie Standard], and
    findEither(TripleMatch, boolean showHidden) [for exposed or hidden
    quadlets].
  o Reworking of GraphBase to clean up reification and fix bug with
    duplicated fully-reified triples. GraphBase now implements find() by
    appending triples from reifier with triples from local triple store.
    find() is not over-ridable; over-ride graphBaseFind() instead.
    Similarly size -> graphBaseSize, contains -> graphBaseContains.
  o Reworking of SimpleReifier to express it in terms of implementations
    of a store for fragments and a store for complete triples, with new
    interfaces. This should allow implementors of persistent Graphs an
    easier time of it. [Driven by GraphBDB work, so there's an example
    to hand.]
  o Reifiers must also implement size[OfExposedTriples]() and close()
    methods.

Model & Graph removeAll(), remove(S, P, O)
  o added new API operation Model::removeAll() which removes "all"
    statements from a model [currently there are issues about inference
    models]
  o added removeAll() to Graph BulkUpdateHandler
  o added Model::remove(S, P, O) which removes all statements matching
    (S, P, O) with nulls as wildcards
  o added BulkUpdateHandler.remove(S,P,O) removing triples matching
    (S, P, O), Node.ANY as wildcard
  o BulkUpdateHandler generates events for these
  o ModelFactory has gained createUnion(Model, Model) which creates a
    dynamic union of the two argument models.
  o The class GraphExtract and its related interface TripleBoundary have
    been created to allow the extraction of rooted subgraphs from a
    graph, terminating at triples satisfying some boundary condition.

GraphMem, SmallGraphMem
  o GraphMem has had the redundant Set(Triple) excised, and changes made
    to NodeToTriplesMap to push the triple-matching inwards and simplify
    GraphMem's find code. It will use a little less memory and should be
    a tad faster.
  o a new memory-based Graph, SmallGraphMem, has been introduced. This
    *only* holds a Set(Triple); no indexing is done. Hence it is
    unsuitable for graphs with more than "a few" statements, unless
    memory footprint is (much) more important than search speed. It is
    primarily an extreme to compare other Graphs against.

Graph Capabilities gains findContractSafe()
  o used to tell prettywriter that its use of find() works, otherwise it
    falls back to the ordinary writer.

Graph Query handling
  o Query now rewrites (some) RDQL pattern matches which are equivalent
    to startsWith, endsWith, or contains to use new Expression nodes
    with labels J_startsWith, J_endsWith, J_contains, to allow back ends
    to optimise those.

RDQL
  o Added langeq operator
  o Removed ResultBinding.getValue() (which was an internal use
    operation) as part of the move to more directly using the graph
    level query handling.
  o ResultBinding becomes an interface. See also ResultBindingImpl.

Event handling
  o added new graph.GraphEvents as holder of event constants with one
    such, .removeAll, issued by BulkUpdateHandler for removeAll(), and a
    static method for creating removed-by-pattern triple values.
  o the standard Jena readers generate startRead and finishRead events
  o all the Graph events now come with their source Graph attached as an
    argument to the event method.
  o added test case and fixed resulting issues to ensure that .remove()
    on the iterator from find() generated a notifyDeleteTriple event.
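
The Graph Query handling entry above says that some RDQL pattern matches are
rewritten to J_startsWith, J_endsWith, or J_contains. The notes do not say
which patterns qualify, so the following is only a sketch under an assumed
rule: a pattern counts as a plain string test when, apart from a leading '^'
or a trailing '$', it contains no regex metacharacters. PatternRewriteSketch
and its string output format are hypothetical; only the three labels come from
the notes.

```java
// Hypothetical illustration only; not the code of Jena's Rewrite class.
class PatternRewriteSketch {
    // Characters that make a pattern more than a literal string.
    private static final String META = "\\[](){}.*+?^$|";

    // True if s contains no regex metacharacters.
    static boolean isLiteral(String s) {
        for (char c : s.toCharArray()) {
            if (META.indexOf(c) >= 0) return false;
        }
        return true;
    }

    // Returns "label(text)" for a recognised idiom, or null when the
    // pattern must still be evaluated as a full regular expression.
    static String rewrite(String pattern) {
        if (pattern.startsWith("^")) {
            String rest = pattern.substring(1);
            return isLiteral(rest) ? "J_startsWith(" + rest + ")" : null;
        }
        if (pattern.endsWith("$")) {
            String rest = pattern.substring(0, pattern.length() - 1);
            return isLiteral(rest) ? "J_endsWith(" + rest + ")" : null;
        }
        return isLiteral(pattern) ? "J_contains(" + pattern + ")" : null;
    }

    public static void main(String[] args) {
        for (String p : new String[] { "^abc", "abc$", "abc", "a.*c" }) {
            System.out.println(p + " -> " + rewrite(p));
        }
    }
}
```

A back end that sees one of these labels can then translate it to a cheaper
native operation (for example a SQL LIKE) instead of running a regex engine.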

Reasoners
  o Changed processing of PROPruleSet on GenericRuleReasoners to accept
    multiple rulesets and merge them
  o Added support for @prefix and @include to the Rule parser utility
  o Suppressed internal properties (rb:xsdRange etc) from leaking out of
    infmodel
  o Fixed space leak in backward chainer
  o Added subProperty inheritance of domain/range values
  o Fixed validation of owl Functional properties to handle literals
  o Changed validation report of un-instantiatable classes to be
    warnings rather than errors; report name is "Inconsistent class"
  o During validation the culprit resource is now made available via the
    error report's getExtension method
  o Fixed bug in backward reasoner which caused it to occasionally miss
    some results in non-deterministic ways
  o Fixed performance problem with listing all triples in a transitive
    reasoner
  o Fixed bug 927644, mishandling of cycles in transitive reasoner
  o Fixed omission in handling of someValuesFrom applied to an
    rdfs:Datatype.
  o Fixed bug in hide() table not being carried across from prebuilt
    reasoner caches, which resulted in the prototypical instance of
    owl:Thing being visible in listIndividuals.
  o Fixed bug in TransitiveReasoner reported by wangxiao
  o Changed delete for RETE reasoner to be non-incremental to work
    around bug without demanding full reference counting
  o Added experimental OWLMini and OWLMicro reasoners
  o [kers] Added WrappedReasonerFactory class and RuleReasoner
    interface. Some tweaking to "implements" clauses. Refactored out
    some setRules code into FBRuleReasoner and BaseRuleReasonerFactory,
    to make it easy to share code for ModelSpec's reasoner specs.

ModelSpecs
  o The modelspec language has been extended. Reasoner specs can now
    specify multiple rulesets by URL or by literal strings. Schemas may
    also be specified by URL.
  o Internal refactoring has cleaned up the API somewhat, and there is,
    optionally, config information read from etc/modelspec-config.n3.
    At the moment, this only allows the Java class that implements a
    given jms:ModelSpec type to be specified.
  o The JMS (Java Model Spec) vocabulary class has had its schema
    extracted - it is now loaded from vocabularies/jena-model-spec.n3.
    That vocabulary element is now added to jena.jar (as are some other
    vocabulary elements used by the system).

PrefixMappings
  o Standard no longer contains vcard, jms, or rss.
  o Extended introduced = Standard + vcard, jms, rss.
  o usePrefix deprecated - use shortForm instead.
  o qnameFor added; result is legal qname or null.
  o w.withDefaultMapping(x) added, which adds mappings from x which
    don't clash with those already in w.
  o the restriction that namespaces must not end with name characters
    has been removed.

Exceptions
  o added WrappedException extends JenaException for wrapped exceptions
  o added WrappedIOException extends WrappedException

Node
  o Node cache hash table replaced by specialised implementation
  o new Node methods getNameSpace, getLocalName, hasURI(String)
  o minor adjustments to Node.create(String) to allow for specifying
    language or type for literals, use of default prefix, and
    elimination of the nn-NN hack for numeric literals.
  o default Node.toString() uses quoting and @/^^ for literals

Triple
  o Triple.create now uses a cache

Resource
  o new method hasURI(String)

Ontology API
  o Added a new method getOWLLanguageLevel() to OntModel, which returns
    the OWL sublanguage - lite, DL or full - and error messages
  o Fixed ont language profiles to allow .as() on e.g.
    owl:SymmetricProperty as OntProperty.class. Previously this relied
    on the reasoner; the change was needed to support the DIG interface.
  o OntModels created over existing base models don't include elements
    from their document manager's PrefixMapping if they would clash with
    those already existing in the base.
  o Fixed bug 985258 - schemagen can now create RDFS vocabularies
  o Fixed bug 940570 - solved a problem with listIndividuals when using
    the default OWL reasoner
  o Fixed bug 948995 - .as() on owl:InverseFunctionalProperty for
    datatype properties failing
  o Various fixes to prevent cycles in the graph confusing
    listSubClasses, listSuperClasses, etc
  o fixed profiles to allow owl:Thing .as(OntClass.class) even if no
    reasoner present
  o In response to bug 1065103, DAML models now by default use
    rdfs:subClassOf, rdfs:subPropertyOf, rdfs:range and rdfs:domain in
    preference to their daml: equivalents. The old (Jena 2.1) behaviour
    is available by switching to the DAML_OILLegacyProfile in the
    OntModelSpec. The new version more closely matches what typical DAML
    ontologies do.
  o Added createIndividual() to OntClass
  o Added listDataRanges() to OntModel

DIG reasoner interface
  o Various bug fixes, plus a significant performance fix for
    listIndividuals()

OWL Syntax Checker
  o Added support for OntDocumentManager
  o Improved command line, supports N3 etc, OntDocumentManager

File Utilities
  o ModelLoader retired by deprecation. Use FileManager instead,
    specifically, FileManager.get() for the global FileManager (or
    create your own).
      Create new model with FileManager.get().loadModel
      Read into an existing model with FileManager.get().readModel
  o In schemagen, inference is now *not* used by default on input
    models; a new option --inference has been added to allow inference
    to be turned on.

Creating sets/maps
  o util.CollectionFactory has static methods for creating hashed Maps
    and hashed Sets. These are used in the internals to allow the
    implementing classes to be changed (eg to use the trove library).
    Non-hashed-collection create methods may follow.
  o This was initially called HashUtils; that class remains, with the
    initial method names, but it is deprecated and will disappear post
    J2.2.

ARP
  o New support for non-Xerces SAX sources.
  o Support for DOM sources (Java 1.4.2 and later).
  o ARP setup rationalized, a few methods deprecated as a result.
  o Improved documentation, covering new features (see doc/ARP)
  o *Removed* StanfordImpl, assuming no one uses it. Please complain
    (jena-dev@yahoogroups.com) if the assumption was false.

Utilities
  o new class IteratorCollection with methods iteratorToSet and
    iteratorToList (heavily used in tests and useful in general)

N3
  o Resolve relative URIs if there is a base.

Jena 2.1 - Main changes from Jena 2.0

Installation
  The names of some jars have changed; be sure to update your classpath.
  The names of the Xerces jars are now:
    xercesImpl.jar and xml-apis.jar
  We also require Jakarta commons logging:
    commons-logging.jar
  See readme.html for the full list of jars.

OWL Syntax Checker
  Major update from alpha version (Jena 2.0) to production version
  (Jena 2.1).
  API created.
  Many-fold performance improvement (orders of magnitude).
  Now conformant with OWL Recommendation.
  Streaming mode added, suitable for lower memory environments or large
  input.
  Command-line jena.owlsyntax program added.
  Still to do: better error messages.

RDF/XML-ABBREV output
  Changes default configuration to not use property attributes, which
  seem unpopular.

ARP
  Extended API to show bnode scope and XML Namespaces.
  Discovered memory leak which has been present since the beginning.
  This is not fixed. Users of ARP and Jena in memory-limited or
  long-running applications, or reading lots of varied RDF/XML, should
  read the updated Javadoc for the package com.hp.hpl.jena.rdf.arp.

Reasoner
  Small bug fixes (see below).

Xerces
  Now requires Xerces 2.6.0 or better. The included jars are Xerces
  2.6.1.

Ontology API
  General bug fixes and improvements based on jena-dev feedback.
  The default document manager policy (etc/ont-policy.rdf) no longer
  re-directs imports of owl.owl and daml+oil.daml to a cached copy in
  the 'vocabulary' directory.
  This is because the vocabulary directory is not included in jena.jar,
  and this default re-direction was causing problems in some applet or
  web service environments. The Jena 2.0 behaviour can be restored by
  replacing ont-policy.rdf with ont-policy-test.rdf.
  Instance detection has been improved, with the side-effect that DAML
  ontologies using the DAML micro rule reasoner may now report that
  instances have rdf:type daml:Thing in addition to other types.

Jena 2.1 changes from Jena 2.1-dev-3
  Minor bug fixes in OWL Syntax Checker.
  Streaming mode in OWL Syntax Checker.
  Documented ARP memory leak.

Jena 2.1-dev-3
  Implements W3C RDF and OWL Proposed Recommendations
  OWL Syntax Checker - much faster, new API (error msgs still being
  worked on)
  RDF/XML-ABBREV output various bug fixes
  RDF/XML-ABBREV msg added requesting bug reports on
  rdf:parseType="Collection"
  ARP new extended handler for scope of blank nodes and namespace
  handler
  ARP improved syntax error messages
  OWL Syntax checker prolog source included in download (see tools dir)

Jena 2.1-dev-2:
  Developers' release to include recent bug fixes (notably handling of
  typed literals in validation rules). Do not use this version unless
  you need one or more bug fixes not in Jena 2.0.

Jena 2.1-dev-1:
  This is a developers' release, particularly intended for users of the
  OWL Syntax Checker. Most users should continue to use Jena 2.0. For
  changes, see directly below. Documentation may not be up to date.
  Do not use this version unless:
    - you need a conformant OWL Syntax Checker
    - you need one or more of the bug fixes not in Jena 2.0

RDF API:
  o Bug fixes:
      - fixed issue with typed literals sometimes treating lexically
        different, sameValueAs resources as equal.
      - fixed bug in Model::remove(StmtIterator) at the expense of
        manifesting the iterator into a List
      - fixed bug in .remove on StmtIterators returned from
        listStatements() on a memory model
  o ModelFactory improvements:
      - ModelSpecs can now be identified by URI as well as Model values
      - ModelRDBMaker.createModel will now return a ModelRDB rather than
        plain Model

Reasoner subsystem:
  o Fixed delete bug in FORWARD_RETE configurations; removing a
    statement should now remove the consequents of that statement.
  o Added a check for constructing one OWL reasoner instance layered on
    top of another because this can have a large performance impact to
    no benefit.
  o Added a "hide" primitive which marks nodes as hidden. When querying
    an inference model no triples containing hidden subject or object
    nodes will be included in the result iterator. Used this to hide
    class-prototype instances.
  o Extended the comparison builtins (equal, le etc) to support
    comparison of XSDDateTime instances. Many thanks to Bradley Schatz
    (Bradley@greystate.com) for supplying the patches for this.
  o Extended OWL rules to include more explicit representation of XSD
    knowledge.
  o Various bug fixes in OWL rules (maxCardinality bug, hasValue in
    intersection lists fixed, bug in someValuesFrom fixed, missing
    property subclass axioms).
  o Fixed bug in RETE engine which could loop when deleting
    non-deletable triples.
  o Fixed bug in LP engine which could lead to loss of variable bindings
    (manifested as "Internal error in LP reasoner: variable in triple
    result")
  o Extended is/notDTtype to check for ill-formed typed literals.

OWL Syntax Checker
  o Now conforms with OWL Proposed Rec of December 2003
      - Performance much improved. There is about a one second delay on
        start-up.
      - Error messages still somewhat cryptic (should be better in next
        release)

RDF/XML-ABBREV output
  o Failed to fix bug concerning rdf:parseType="Collection"
      - added new message trying to generate sufficient user feedback on
        jena-dev to track down bug.
  o Fixed other bugs on bug list

RDQL
  o Improved handling of character sets in qnames

N3
  o Improved handling of character sets in qnames

Graph query SPI [NB NOT visible at the model level]
  o replaced use of Graph for constraints by new Expression interface as
    part of ongoing query improvement.