BootOX
Introduction
The integrated ontology and mapping bootstrapper, named BootOX, lets the user extract ontologies and mappings from existing databases. Optique can use the mappings and ontologies as soon as they have been created, so BootOX is a convenient way to set up Optique with very little effort. BootOX is controlled through a set of settings displayed on the bootstrapper page; apart from that, it is fully automatic. The resulting ontology closely mirrors the database schema. In many cases this is desired, and it makes a good starting point for those who want to develop a custom ontology and mapping for their data.
Bootstrapping techniques
BootOX follows the W3C direct mapping directives. In practice, this means that it applies the following bootstrapping rules to construct the new ontology:
- Each (non-binary) table is translated into an OWL class.
- Each attribute not involved in a foreign key is translated into an OWL datatype property.
- Each foreign key between two tables is translated into an OWL object property.
BootOX also uses more advanced techniques to further improve the resulting ontology, but these are the most important rules to know about.
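The three rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not BootOX's actual implementation: the `Table` and `ForeignKey` classes, the `is_binary` test, the `bootstrap` function, and the cut-down schema are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ForeignKey:
    column: str        # referencing column in this table
    target_table: str  # table that the column points to


@dataclass
class Table:
    name: str
    columns: list[str]
    foreign_keys: list[ForeignKey] = field(default_factory=list)


def is_binary(table: Table) -> bool:
    """Assumed test for a binary (cross-reference) table:
    exactly two columns, and both are foreign keys."""
    fk_columns = {fk.column for fk in table.foreign_keys}
    return len(table.columns) == 2 and fk_columns == set(table.columns)


def bootstrap(tables: list[Table]) -> dict[str, list[str]]:
    """Apply the three basic rules to every non-binary table."""
    result = {"classes": [], "datatype_properties": [], "object_properties": []}
    for table in tables:
        if is_binary(table):
            continue  # binary tables do not become classes
        result["classes"].append(table.name)
        fk_columns = {fk.column for fk in table.foreign_keys}
        for column in table.columns:
            # Foreign-key columns become object properties, all others datatype properties.
            kind = "object_properties" if column in fk_columns else "datatype_properties"
            result[kind].append(f"{table.name}.{column}")
    return result


# Hypothetical, cut-down schema loosely modelled on Northwind.
schema = [
    Table("REGION", ["REGIONID", "REGIONDESCRIPTION"]),
    Table("TERRITORIES", ["TERRITORYID", "TERRITORYDESCRIPTION", "REGIONID"],
          [ForeignKey("REGIONID", "REGION")]),
    Table("EMPLOYEES", ["EMPLOYEEID", "LASTNAME"]),
    Table("EMPLOYEETERRITORIES", ["EMPLOYEEID", "TERRITORYID"],
          [ForeignKey("EMPLOYEEID", "EMPLOYEES"), ForeignKey("TERRITORYID", "TERRITORIES")]),
]

ontology = bootstrap(schema)
print(ontology["classes"])
print(ontology["object_properties"])
```

In this sketch the cross-reference table is simply skipped; how such tables contribute object properties is a separate rule.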
How to use BootOX
To demonstrate how BootOX works, we will run it on the Northwind database using the default settings. This requires that the Northwind database has already been added as a data source in Optique.
- Click the Bootstrapper icon in the main menu. This will direct you to the bootstrapper page.
- Use the settings given in the following table.
Setting name | Value/Action |
---|---|
Available schemata | Northwind data source |
Bootstrapping Level | Schema driven |
Expressiveness | OWL 2 QL |
Constraint Schema | Local constraint schema |
Attribute Naming Schema | Creation of unique attribute names |
Provenance mappings | Do not select any options |
Additional bootstrapping options | Select “Extract annotations for query formulation interface” |
Import domain ontology | Do not select any ontology |
OWL 2 QL approximation | Do not select this |
Ontology and Mapping Storage | Store both ontology and mapping |
- Use the following IRIs for the ontology and mapping:
- Ontology IRI: https://www.optique-project.eu/resource/h2-northwind/bootstrapped_ontology/default_settings
- Mapping IRI: https://www.optique-project.eu/resource/h2-northwind/bootstrapped_mappings/default_settings
- Click “Start Bootstrapping”.
- Wait for the message at the bottom of the page confirming that the bootstrapper succeeded.
Inspect the bootstrapped ontology
The bootstrapped ontology can be inspected in two different ways: either through the ontology viewer in the Optique Platform, or by exporting it and opening it in Protege.
Option 1) Open ontology inside the Optique Platform
- Click the Ontologies icon in the main menu. This will direct you to the ontologies page.
- Locate the bootstrapped ontology by looking at the Ontology IRI and Date of Upload.
- Click the ontology to get more information about its classes and properties. Each class and property can be clicked for additional information about it.
Option 2) Open in Protege
- Click the Ontologies icon in the main menu. This will direct you to the ontologies page.
- Locate the bootstrapped ontology by looking at the Ontology IRI and Date of Upload.
- Export the ontology by clicking the export button (blue arrow) to the far right.
- Import the ontology into Protege.
Inspect generated classes
The default action taken by BootOX when it encounters a table is to translate it into an ontology class. To check whether this has actually happened, we compare the database tables with the ontology classes in the table below.
Database tables | Classes in ontology |
---|---|
CATEGORIES | CATEGORIES |
CUSTOMERCUSTOMERDEMO | |
CUSTOMERDEMOGRAPHICS | CUSTOMERDEMOGRAPHICS |
CUSTOMERS | CUSTOMERS |
EMPLOYEES | EMPLOYEES |
EMPLOYEETERRITORIES | |
ORDERDETAILS | ORDERDETAILS |
ORDERS | ORDERS |
PRODUCTS | PRODUCTS |
REGION | REGION |
SHIPPERS | SHIPPERS |
SUPPLIERS | SUPPLIERS |
TERRITORIES | TERRITORIES |
| | owl:Ontology |
Most of the tables have been translated into ontology classes, which indicates that BootOX did not encounter any problems with them. The exceptions are the tables named CUSTOMERCUSTOMERDEMO and EMPLOYEETERRITORIES; neither has a corresponding class. This is because they are cross-reference tables. Cross-reference tables are not translated into classes, and this rule overrides the default rule for tables.
The class owl:Ontology is a default class needed to express that the bootstrapped ontology is itself an ontology. The only instance of this class is the bootstrapped ontology. This class can in most cases be ignored, since it does not relate to the relevant domain.
Inspect generated object properties
The bootstrapped ontology contains 13 object properties, all listed in the table below.
For each foreign key between two tables, neither of which is a cross-reference table, an object property is generated. One example is the foreign key from PRODUCTS.SUPPLIERID to SUPPLIERS.SUPPLIERID, which leads to an object property with the URI nw:PRODUCTS/SUPPLIERID and the label PRODUCTS.SUPPLIERID.
For each cross-reference table with foreign keys pointing to two other tables, two object properties are generated. The object properties relate the two referenced tables, but in opposite directions, so they are in fact inverses of each other. One example is the pair of object properties labeled TERRITORIES_has_EMPLOYEES and EMPLOYEES_has_TERRITORIES, which are generated from the cross-reference table EMPLOYEETERRITORIES, which refers to both the EMPLOYEES table and the TERRITORIES table.
Object property | Foreign key/Comment |
---|---|
CUSTOMERDEMOGRAPHICS_has_CUSTOMERS | From cross-reference table: CUSTOMERCUSTOMERDEMO |
CUSTOMERS_has_CUSTOMERDEMOGRAPHICS | From cross-reference table: CUSTOMERCUSTOMERDEMO |
EMPLOYEES.REPORTSTO | EMPLOYEES.REPORTSTO -> EMPLOYEES.EMPLOYEEID |
EMPLOYEES_has_TERRITORIES | From cross-reference table: EMPLOYEETERRITORIES |
TERRITORIES_has_EMPLOYEES | From cross-reference table: EMPLOYEETERRITORIES |
ORDERDETAILS.ORDERID | ORDERDETAILS.ORDERID -> ORDERS.ORDERID |
ORDERDETAILS.PRODUCTID | ORDERDETAILS.PRODUCTID -> PRODUCTS.PRODUCTID |
ORDERS.CUSTOMERID | ORDERS.CUSTOMERID -> CUSTOMERS.CUSTOMERID |
ORDERS.EMPLOYEEID | ORDERS.EMPLOYEEID -> EMPLOYEES.EMPLOYEEID |
ORDERS.SHIPVIA | ORDERS.SHIPVIA -> SHIPPERS.SHIPPERID |
PRODUCTS.CATEGORYID | PRODUCTS.CATEGORYID -> CATEGORIES.CATEGORYID |
PRODUCTS.SUPPLIERID | PRODUCTS.SUPPLIERID -> SUPPLIERS.SUPPLIERID |
TERRITORIES.REGIONID | TERRITORIES.REGIONID -> REGION.REGIONID |
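The cross-reference rule can be illustrated with a short Python sketch. It assumes the `X_has_Y` label pattern seen in the table above; the function name `cross_reference_properties` and the dictionary keys are hypothetical and do not reflect BootOX's internal data model.

```python
def cross_reference_properties(first_table: str, second_table: str) -> list[dict[str, str]]:
    """Return two object properties for a cross-reference table that links
    first_table and second_table: one per direction, each naming the other
    as its inverse."""
    forward = f"{first_table}_has_{second_table}"
    backward = f"{second_table}_has_{first_table}"
    return [
        {"label": forward, "from_class": first_table,
         "to_class": second_table, "inverse_of": backward},
        {"label": backward, "from_class": second_table,
         "to_class": first_table, "inverse_of": forward},
    ]


# EMPLOYEETERRITORIES refers to both EMPLOYEES and TERRITORIES.
props = cross_reference_properties("EMPLOYEES", "TERRITORIES")
for prop in props:
    print(prop["label"], "is the inverse of", prop["inverse_of"])
```

The cross-reference table itself never appears in the output, which matches the observation that it has no corresponding class.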
Inspect generated data properties
In total, the bootstrapped ontology contains 86 data properties. Most of these are generated from the columns of each table. There are too many columns in Northwind to consider them all, so we will only look at the table PRODUCTS, which has 10 columns in total. Eight of these columns resulted in a data property, while the other two generated the object properties PRODUCTS.CATEGORYID and PRODUCTS.SUPPLIERID, which we have already covered.
Column name | Property name | Property type |
---|---|---|
CATEGORYID | PRODUCTS.CATEGORYID | Object property |
DISCONTINUED | PRODUCTS.DISCONTINUED | Data property |
PRODUCTID | PRODUCTS.PRODUCTID | Data property |
PRODUCTNAME | PRODUCTS.PRODUCTNAME | Data property |
QUANTITYPERUNIT | PRODUCTS.QUANTITYPERUNIT | Data property |
REORDERLEVEL | PRODUCTS.REORDERLEVEL | Data property |
SUPPLIERID | PRODUCTS.SUPPLIERID | Object property |
UNITPRICE | PRODUCTS.UNITPRICE | Data property |
UNITSINSTOCK | PRODUCTS.UNITSINSTOCK | Data property |
UNITSONORDER | PRODUCTS.UNITSONORDER | Data property |
In addition to the data properties generated directly from columns, there are 11 additional properties: ADDRESS, CITY, COMPANYNAME, CONTACTNAME, CONTACTTITLE, COUNTRY, FAX, PHONE, POSTALCODE, REGION, UNITPRICE. These are all generated because more than one column (in different tables) has that name. Take PHONE as an example. This data property is generated because three different tables have a column named PHONE, namely SUPPLIERS.PHONE, CUSTOMERS.PHONE, and SHIPPERS.PHONE. Each of these columns leads to one data property, and PHONE is a super property of all of them.
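The grouping behind these shared-name properties can be sketched as follows. The sketch assumes that only columns which become data properties are passed in; the function name `shared_name_super_properties` is hypothetical, and the column list is a small excerpt rather than the full Northwind schema.

```python
from collections import defaultdict


def shared_name_super_properties(columns: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Group (table, column) pairs by column name and keep only the names
    that occur in more than one table. Each kept name stands for a super
    property; its value lists the per-table sub properties."""
    by_name: dict[str, list[str]] = defaultdict(list)
    for table, column in columns:
        by_name[column].append(f"{table}.{column}")
    return {name: subs for name, subs in by_name.items() if len(subs) > 1}


# Excerpt of data-property columns mentioned in the text.
columns = [
    ("SUPPLIERS", "PHONE"),
    ("CUSTOMERS", "PHONE"),
    ("SHIPPERS", "PHONE"),
    ("PRODUCTS", "PRODUCTNAME"),  # occurs in one table only, so no super property
]

super_properties = shared_name_super_properties(columns)
print(super_properties)
```

PRODUCTNAME is dropped because it occurs in a single table, while PHONE collects one sub property per table.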
BootOX also looks up the type of each column and reuses the corresponding RDF data type if one exists. To see this in action, the user must choose to let BootOX generate domain and range axioms, which is not the default setting. This will be covered later.
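As a rough illustration of such a type lookup, the sketch below maps SQL column types to XSD datatypes. The table follows the natural mapping of SQL datatypes described in the W3C R2RML recommendation; whether BootOX uses exactly this mapping is an assumption, and the names `SQL_TO_XSD` and `xsd_datatype` are hypothetical.

```python
from typing import Optional

# SQL type -> XSD datatype, following the natural mapping in W3C R2RML.
# Assumption: BootOX's own lookup table may differ.
SQL_TO_XSD = {
    "SMALLINT": "xsd:integer",
    "INTEGER": "xsd:integer",
    "BIGINT": "xsd:integer",
    "DECIMAL": "xsd:decimal",
    "NUMERIC": "xsd:decimal",
    "FLOAT": "xsd:double",
    "REAL": "xsd:double",
    "DOUBLE PRECISION": "xsd:double",
    "BOOLEAN": "xsd:boolean",
    "DATE": "xsd:date",
    "TIME": "xsd:time",
    "TIMESTAMP": "xsd:dateTime",
}


def xsd_datatype(sql_type: str) -> Optional[str]:
    """Return the XSD datatype for a SQL type such as 'DECIMAL(10,2)',
    or None when this sketch knows no corresponding datatype."""
    base = sql_type.split("(")[0].strip().upper()  # drop length/precision arguments
    return SQL_TO_XSD.get(base)


print(xsd_datatype("decimal(10,2)"))
print(xsd_datatype("INTERVAL"))
```

Returning `None` mirrors the "if one exists" condition: a column whose type has no counterpart simply gets no datatype in this sketch.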