SPSS Clementine
SPSS Modeler is a data mining software tool by SPSS Inc., an IBM company. It was originally named SPSS Clementine and was renamed PASW Modeler by SPSS in 2009. IBM subsequently obtained the product through its acquisition of SPSS Inc.

Overview

SPSS Modeler uses a three-tier design. Users manipulate icons and options in the front-end application on Windows operating systems. This front-end client application then communicates with the Clementine Server software, or directly with a database or dataset. The most common configuration in large corporations is to house the Clementine Server software on a powerful analytical server (Windows, UNIX, or Linux), which then connects to the corporate data warehouse. Data processing commands are automatically converted from the icon-based user interface into command code (which is not visible to the user) and sent to the Clementine Server for processing. Where possible, this command code is further compiled into SQL and processed in the data warehouse.
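
The idea of turning a chain of icons into a single SQL statement can be illustrated with a minimal sketch. This is not the product's actual command code or compiler, which the article does not describe. The node classes (Source, Select, Aggregate), the function compile_stream, and the table and column names are all hypothetical, chosen only to show how a linear sequence of processing steps could be folded into one nested query that a data warehouse can run.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Source:
    table: str        # hypothetical warehouse table name


@dataclass
class Select:
    condition: str    # row filter, written as a SQL predicate


@dataclass
class Aggregate:
    key: str          # grouping column
    expression: str   # aggregate expression, e.g. "SUM(amount) AS total"


Node = Union[Source, Select, Aggregate]


def compile_stream(nodes: List[Node]) -> str:
    """Fold a linear chain of nodes into one nested SQL query (illustrative only)."""
    if not nodes or not isinstance(nodes[0], Source):
        raise ValueError("a stream must start with a Source node")
    sql = f"SELECT * FROM {nodes[0].table}"
    for i, node in enumerate(nodes[1:], start=1):
        # Each downstream node wraps everything before it as a subquery.
        sub = f"({sql}) AS t{i}"
        if isinstance(node, Select):
            sql = f"SELECT * FROM {sub} WHERE {node.condition}"
        elif isinstance(node, Aggregate):
            sql = f"SELECT {node.key}, {node.expression} FROM {sub} GROUP BY {node.key}"
        else:
            raise ValueError(f"unsupported node: {node!r}")
    return sql


# A three-icon stream: read a table, filter rows, then aggregate per customer.
stream = [
    Source("sales"),
    Select("region = 'EU'"),
    Aggregate("customer_id", "SUM(amount) AS total"),
]
query = compile_stream(stream)
print(query)
```

In this sketch every step is expressible in SQL, so the whole stream is handed to the database as one query. A real tool would need a fallback for steps with no SQL equivalent, which matches the "where possible" qualification above.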

According to Rexer's Annual Data Miner Survey in 2010:
  • SPSS (along with STATISTICA Data Miner and R) received the strongest satisfaction ratings as a data mining tool in both 2010 and 2009.
  • SPSS Modeler and SPSS Text Analytics (now called SPSS Modeler Premium) were rated the second (17%) and fourth (7%) most used text mining software, respectively.

Versions and history

Early versions of the software were Unix-based and designed as a consulting tool, not for sale to customers. Originally developed by a UK company named Integral Solutions Limited (ISL), the tool quickly garnered the attention of the data mining community (at that time in its infancy). Original in many respects, it was the first data mining tool to use an icon-based graphical user interface rather than requiring users to write in a programming language.

In 1999 ISL was acquired by SPSS Inc., which saw the potential for extended development as a commercial data mining tool. In early 2000 the software was developed into a client/server architecture, and shortly afterward the client front-end interface component was completely rewritten and replaced with a new Java front-end.

SPSS Clementine version 12.0
The client front-end runs under Windows. The server back-end runs on Unix variants (Sun, HP-UX, AIX), Linux, and Windows. The graphical user interface is written in Java.

Release history

  • Clementine 5.1 - Jan 2000
  • Clementine 12.0 - Jan 2008
  • PASW Modeler 13 (formerly Clementine) - April 2009
  • IBM SPSS Modeler

Competitors

  • SAS Enterprise Miner - data mining software provided by the SAS Institute.
  • STATISTICA Data Miner - data mining software provided by StatSoft.

Further reading

  • Chapman, P., Clinton, J., Kerber, R., Khabaza, T., Reinartz, T., Shearer, C., et al. (2000). CRISP-DM 1.0, Chicago, IL: SPSS.
  • Nisbet, R., Elder, J., and Miner, G. (2009). Handbook of Statistical Analysis and Data Mining Applications. Burlington, MA: Academic Press (Elsevier).

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 