Predictive Model Markup Language (PMML)

PMML describes data mining (DM) models in the Extensible Markup Language (XML), a universal format for structured documents and data on the Web developed by the W3C (http://www.w3c.org). Why is it important to define an XML specification for data mining models? Complex tasks require several tools, and those tools have to exchange their results, which calls for an open data format. For that purpose, the members of the Data Mining Group (DMG) defined the Predictive Model Markup Language (PMML). PMML provides an XML specification for several kinds of data mining models, and it can be extended so that it is no longer restricted to predictive models.
The open PMML format has many advantages for researchers and commercial users: it makes it possible to carry out different data mining tasks (e.g. train, test, apply, visualize) with different tools. Furthermore, because a model is an XML document, the user can edit it with a simple text editor if necessary. Further details about PMML can be found at http://www.dmg.org.
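To illustrate how a tool other than the one that trained a model might apply it, here is a minimal sketch in Python that reads a small decision tree from PMML text and scores a record. The PMML fragment is hand-written for this illustration and is not one of the files listed on this page. Its element names (TreeModel, Node, SimplePredicate, True) follow the DMG tree model vocabulary, but the fragment is simplified, omits the XML namespace, and has not been validated against any particular PMML version. The field names, the version value, and the function names matches, score and predict are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified PMML fragment (hypothetical; not validated against a schema).
PMML_DOC = """<PMML version="2.0">
  <Header description="toy decision tree"/>
  <DataDictionary numberOfFields="2">
    <DataField name="odor" optype="categorical"/>
    <DataField name="class" optype="categorical"/>
  </DataDictionary>
  <TreeModel modelName="toy_tree" functionName="classification">
    <MiningSchema>
      <MiningField name="odor"/>
      <MiningField name="class" usageType="predicted"/>
    </MiningSchema>
    <Node score="edible">
      <True/>
      <Node score="edible">
        <SimplePredicate field="odor" operator="equal" value="none"/>
      </Node>
      <Node score="poisonous">
        <SimplePredicate field="odor" operator="equal" value="foul"/>
      </Node>
    </Node>
  </TreeModel>
</PMML>"""


def matches(node, record):
    """Return True if the node's predicate holds for the record (a dict)."""
    if node.find("True") is not None:
        return True
    pred = node.find("SimplePredicate")
    if pred is None:
        return False
    if pred.get("operator") != "equal":
        # This sketch only handles equality tests.
        raise NotImplementedError(pred.get("operator"))
    return record.get(pred.get("field")) == pred.get("value")


def score(node, record):
    """Descend into the first matching child; return the score of the last node reached."""
    for child in node.findall("Node"):
        if matches(child, record):
            return score(child, record)
    return node.get("score")


def predict(pmml_text, record):
    """Parse the PMML text and score one record with its tree model."""
    root = ET.fromstring(pmml_text)
    top = root.find("TreeModel/Node")
    return score(top, record)
```

Calling predict(PMML_DOC, {"odor": "foul"}) returns "poisonous". A record whose odor matches no child node falls back to the score of the root node. A real PMML consumer would also have to handle namespaces, the other predicate operators, compound predicates and missing values.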

The data mining tool Kepler (also known as D-Miner, by the former project partner Dialogis) is able to import and export data mining models in the PMML format. The following two documents describe how the PMML capability is integrated into Kepler (PDF, PowerPoint).

You may also want to look at some example visualizations of models that were exported from IBM's Intelligent Miner and from Kepler in the PMML format. Please be aware that the visualization applet uses Java 1.3, so you may need to download and install the browser plug-in first.

| Data Set | Model Type | PMML Source File | Visualization |
|---|---|---|---|
| Mushroom (courtesy G. Meyer, IBM) | Decision Tree | mushroom.xml | Visualization |
| Thyroid (courtesy G. Meyer, IBM) | Decision Tree | thyroid.xml | Visualization |
| Optdigits (courtesy G. Meyer, IBM) | Decision Tree | optdigits.xml | Visualization |
| Insurance (courtesy S. Müller, Dialogis) | Decision Tree | dminer_dtree.xml | Visualization |
| Health (courtesy D. Wettschereck, University of Applied Sciences) | Classification Rules (C4.5). Note: this is not a standard PMML type, but an adaptation by S. Müller, Dialogis | rules_example.xml | Visualization |
| Shopping (courtesy Data Mining Group, significantly modified by D. Wettschereck, University of Applied Sciences, Bonn) | Association Rules | assoc_example.xml | Visualization |