Author: Hasan (hasan@apache.org)
Last update: November 11, 2017
In this tutorial we are going to import triples stored in a file into a graph.
Problem Definition
Given a file containing a set of triples in the Turtle serialization format (text/turtle), an RDF graph should be created and filled with these triples. Assuming the file has the following content, the program should log the corresponding triples:
@prefix ex: <http://clerezza.apache.org/2017/01/example#> .
_:a ex:hasFirstName "Hasan" .
_:a ex:isA ex:ClerezzaUser .
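To make the expected result more concrete, the following sketch shows how the prefixed names in the file above correspond to full IRIs. It is a toy illustration only: the class name PrefixExpansionSketch and the method expand are hypothetical, and it is neither a Turtle parser nor part of Apache Clerezza.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy illustration only: NOT a Turtle parser and not part of Clerezza.
class PrefixExpansionSketch {

    // Expands a prefixed name such as "ex:isA" into "<namespace + local name>".
    static String expand(String prefixedName, Map<String, String> prefixes) {
        int colon = prefixedName.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("Not a prefixed name: " + prefixedName);
        }
        String prefix = prefixedName.substring(0, colon);
        String namespace = prefixes.get(prefix);
        if (namespace == null) {
            throw new IllegalArgumentException("Undeclared prefix: " + prefix);
        }
        return "<" + namespace + prefixedName.substring(colon + 1) + ">";
    }

    public static void main(String[] args) {
        // The single prefix declared in the example file.
        Map<String, String> prefixes = new LinkedHashMap<>();
        prefixes.put("ex", "http://clerezza.apache.org/2017/01/example#");

        for (String name : new String[] {"ex:hasFirstName", "ex:isA", "ex:ClerezzaUser"}) {
            System.out.println(name + " -> " + expand(name, prefixes));
        }
    }
}
```

The subject _:a is a blank node, so it has no IRI and is not expanded this way.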
Solution
Apache Clerezza provides a Parser that can read files containing triples in various serialization formats. The Parser relies on ParsingProvider services, each of which implements the parsing of a specific data format. We are going to use a ParsingProvider based on the Jena parser.
The programme listed below reads the file example02.ttl, parses its content and stores the triples in a Graph. It then iterates over the newly created graph and logs each triple.
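The idea of a front-end that delegates to format-specific providers can be sketched with plain Java as follows. This is not Clerezza's implementation: the class FormatDispatchSketch, its methods, and the line-counting stand-in for a provider are all hypothetical, and IllegalArgumentException merely plays the role that UnsupportedFormatException plays in Clerezza.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical names throughout: a sketch of format-based delegation,
// not the Clerezza Parser or ParsingProvider API.
class FormatDispatchSketch {

    // Maps a media type such as "text/turtle" to the provider handling it.
    private final Map<String, Function<String, Integer>> providers = new HashMap<>();

    void register(String mediaType, Function<String, Integer> provider) {
        providers.put(mediaType, provider);
    }

    // Delegates to the provider registered for the media type, if any.
    int parse(String content, String mediaType) {
        Function<String, Integer> provider = providers.get(mediaType);
        if (provider == null) {
            throw new IllegalArgumentException("No provider registered for " + mediaType);
        }
        return provider.apply(content);
    }

    // Stand-in "provider": counts lines that look like statements.
    // A real provider would build triples instead of counting lines.
    static int countStatementLines(String text) {
        int count = 0;
        for (String line : text.split("\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty() && !trimmed.startsWith("@") && trimmed.endsWith(".")) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        FormatDispatchSketch dispatcher = new FormatDispatchSketch();
        dispatcher.register("text/turtle", FormatDispatchSketch::countStatementLines);

        String turtle = "@prefix ex: <http://clerezza.apache.org/2017/01/example#> .\n"
                + "_:a ex:hasFirstName \"Hasan\" .\n"
                + "_:a ex:isA ex:ClerezzaUser .\n";
        System.out.println("Statements found: " + dispatcher.parse(turtle, "text/turtle"));
    }
}
```

The point of the design is that the caller only names a format; which component does the actual parsing depends on what has been registered.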
 1  package org.apache.clerezza.tutorial;
 2
 3  import org.apache.clerezza.commons.rdf.Graph;
 4  import org.apache.clerezza.commons.rdf.Triple;
 5  import org.apache.clerezza.rdf.core.serializedform.Parser;
 6  import org.apache.clerezza.rdf.core.serializedform.SupportedFormat;
 7  import org.apache.clerezza.rdf.core.serializedform.UnsupportedFormatException;
 8  import org.slf4j.Logger;
 9  import org.slf4j.LoggerFactory;
10
11  import java.io.InputStream;
12  import java.util.Iterator;
13
14  public class Example02 {
15
16      private static final Logger logger = LoggerFactory.getLogger(Example02.class);
17
18      public static void main(String[] args) {
19          InputStream inputStream = Example02.class.getResourceAsStream("example02.ttl");
20          Parser parser = Parser.getInstance();
21
22          try {
23              Graph graph = parser.parse(inputStream, SupportedFormat.TURTLE);
24
25              Iterator<Triple> iterator = graph.filter(null, null, null);
26              Triple triple;
27
28              while (iterator.hasNext()) {
29                  triple = iterator.next();
30                  logger.info(String.format("%s %s %s",
31                          triple.getSubject().toString(),
32                          triple.getPredicate().toString(),
33                          triple.getObject().toString()
34                  ));
35              }
36          } catch (UnsupportedFormatException ex) {
37              logger.warn(String.format("%s is not supported by the used parser", SupportedFormat.TURTLE));
38          }
39      }
40  }
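One detail the listing leaves open is what happens when example02.ttl is not on the class path: Class.getResourceAsStream then returns null. A minimal sketch of a more defensive way to open the resource is shown below, assuming one wants a clear error and an automatically closed stream. The class ResourceOpenSketch and the helper openResource are hypothetical names, not part of the tutorial code.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: fails early instead of passing a null stream on.
class ResourceOpenSketch {

    static InputStream openResource(Class<?> owner, String name) throws FileNotFoundException {
        InputStream in = owner.getResourceAsStream(name);
        if (in == null) {
            throw new FileNotFoundException("Resource not found on class path: " + name);
        }
        return in;
    }

    public static void main(String[] args) {
        // try-with-resources closes the stream even if reading fails.
        try (InputStream in = openResource(ResourceOpenSketch.class, "example02.ttl")) {
            // In the tutorial programme, the call to parser.parse(...) would go here.
            System.out.println("Opened resource, first byte: " + in.read());
        } catch (IOException ex) {
            System.err.println(ex.getMessage());
        }
    }
}
```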
We will use Maven to build the program. The required POM file is as follows:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>org.apache.clerezza.tutorial</groupId>
    <artifactId>Example-02</artifactId>
    <packaging>jar</packaging>
    <version>1.0</version>
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.7.0</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>exec-maven-plugin</artifactId>
                <version>1.6.0</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>java</goal>
                        </goals>
                    </execution>
                </executions>
                <configuration>
                    <mainClass>org.apache.clerezza.tutorial.Example02</mainClass>
                </configuration>
            </plugin>
        </plugins>
    </build>
    <name>Example-02</name>
    <url>http://maven.apache.org</url>
    <dependencies>
        <dependency>
            <groupId>org.apache.clerezza</groupId>
            <artifactId>rdf.core</artifactId>
            <version>1.0.1</version>
        </dependency>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-simple</artifactId>
            <version>1.7.25</version>
        </dependency>
        <dependency>
            <groupId>org.apache.clerezza</groupId>
            <artifactId>rdf.jena.parser</artifactId>
            <version>1.1.1</version>
        </dependency>
    </dependencies>
</project>
The directory structure is simple, as shown below:
pom.xml
src/main/java/org/apache/clerezza/tutorial/Example02.java
src/main/resources/org/apache/clerezza/tutorial/example02.ttl
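The Turtle file sits in a resource directory that mirrors the Java package because Class.getResourceAsStream resolves a relative name against the package of the class, and Maven copies src/main/resources to the root of the class path. The sketch below imitates that documented name resolution; the class ResourceNameSketch and the method absoluteResourceName are hypothetical and only illustrate the rule.

```java
// Illustrates how a resource name passed to Class.getResourceAsStream
// is turned into a class path location. Hypothetical helper, for illustration.
class ResourceNameSketch {

    static String absoluteResourceName(Class<?> owner, String name) {
        // A leading '/' means the name is already relative to the class path root.
        if (name.startsWith("/")) {
            return name.substring(1);
        }
        // Otherwise the package of the class is prepended, with '.' replaced by '/'.
        String className = owner.getName();
        int lastDot = className.lastIndexOf('.');
        if (lastDot < 0) {
            return name; // class in the default package
        }
        return className.substring(0, lastDot).replace('.', '/') + "/" + name;
    }

    public static void main(String[] args) {
        // java.lang.String stands in for a class that lives in a named package.
        System.out.println(absoluteResourceName(String.class, "example02.ttl"));
        System.out.println(absoluteResourceName(String.class, "/data/example02.ttl"));
    }
}
```

For Example02 in the package org.apache.clerezza.tutorial, the same rule yields org/apache/clerezza/tutorial/example02.ttl, which matches the location of the file under src/main/resources.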
To build the jar, invoke:
mvn package
The programme can then be run by invoking:
mvn exec:java
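The output of the programme execution contains the expected log messages for the two triples. The SLF4J warning about multiple bindings indicates that a second binding (slf4j-log4j12) is also on the class path; as the output shows, slf4j-simple is the binding actually used.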
[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building Example-02 1.0
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- exec-maven-plugin:1.6.0:java (default-cli) @ Example-02 ---
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hasan/.m2/repository/org/slf4j/slf4j-simple/1.7.25/slf4j-simple-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hasan/.m2/repository/org/slf4j/slf4j-log4j12/1.7.6/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.SimpleLoggerFactory]
[org.apache.clerezza.tutorial.Example02.main()] INFO org.apache.clerezza.rdf.core.serializedform.Parser - constructing Parser
[org.apache.clerezza.tutorial.Example02.main()] INFO org.apache.clerezza.tutorial.Example02 - org.apache.clerezza.rdf.jena.commons.JenaBNodeWrapper@78d47560 <http://clerezza.apache.org/2017/01/example#hasFirstName> "Hasan"^^<http://www.w3.org/2001/XMLSchema#string>
[org.apache.clerezza.tutorial.Example02.main()] INFO org.apache.clerezza.tutorial.Example02 - org.apache.clerezza.rdf.jena.commons.JenaBNodeWrapper@78d47560 <http://clerezza.apache.org/2017/01/example#isA> <http://clerezza.apache.org/2017/01/example#ClerezzaUser>
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 0.928 s
[INFO] Finished at: 2017-11-11T15:53:32+01:00
[INFO] Final Memory: 9M/216M
[INFO] ------------------------------------------------------------------------
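In the log above, the blank node is printed as JenaBNodeWrapper@78d47560, which looks like the default class-name-plus-hash form rather than the label _:a from the file. Both triples show the same value, which is consistent with their sharing one subject. If more readable output is wanted, one option is to hand out local labels, as in the following sketch. The class BlankNodeLabelSketch and the method labelFor are hypothetical, and the sketch assumes that the objects representing blank nodes have consistent equals and hashCode behaviour.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: assigns a stable, readable label to each distinct node.
class BlankNodeLabelSketch {

    private final Map<Object, String> labels = new HashMap<>();

    // Returns the label already assigned to this node, or creates the next one.
    String labelFor(Object blankNode) {
        String label = labels.get(blankNode);
        if (label == null) {
            label = "_:b" + labels.size();
            labels.put(blankNode, label);
        }
        return label;
    }

    public static void main(String[] args) {
        BlankNodeLabelSketch labeller = new BlankNodeLabelSketch();
        Object first = new Object();  // stands in for a blank node
        Object second = new Object(); // stands in for a different blank node

        System.out.println(labeller.labelFor(first));
        System.out.println(labeller.labelFor(first));  // same node, same label
        System.out.println(labeller.labelFor(second)); // different node, new label
    }
}
```

The labels are only meaningful within one run; they are not the labels used in the source file.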
Discussion
The Maven POM file shows three libraries on which the programme directly depends:
- org.apache.clerezza.rdf.core: contains the implementation of the Apache Clerezza Parser
- org.apache.clerezza.rdf.jena.parser: contains a ParsingProvider service based on the Jena parser
- org.slf4j.slf4j-simple: contains the implementation of the logger
The core of the programme lies in line 20 (Parser instantiation via Parser.getInstance()) and line 23 (parsing a stream of triples into a graph via parser.parse(...)).
Note: Any comments and suggestions for improvement are welcome. Please send your feedback to dev@clerezza.apache.org.