STATISTICA
STATISTICA is a statistics and analytics software package developed by StatSoft. STATISTICA provides data analysis, data management, data mining, and data visualization procedures. STATISTICA product categories include Enterprise (for use across a site or organization), Web-Based (for use with a server and web browser), Concurrent Network Desktop, and Single-User Desktop.

History

STATISTICA derives from a set of software packages and add-ons that StatSoft initially developed during the mid-1980s. Following the 1986 release of CSS (Complete Statistical System) and the 1988 release of MacSS (Macintosh Statistical System), the first DOS version of STATISTICA (trademarked in capitals) was released in 1991. The Macintosh version followed in 1992.

STATISTICA 5.0 was released in 1995. It automatically configured itself for the new 32-bit Windows 95/NT or the older Windows 3.1, and featured a large number of new statistics and graphics procedures, a word-processor-style output editor of unlimited capacity (combining tables and graphs), and a built-in development environment that let users design new procedures (e.g., via the included STATISTICA Basic language) and integrate them with the STATISTICA system.

STATISTICA 5.1 was released in 1996, followed by the STATISTICA '97 and STATISTICA '98 editions.

STATISTICA 6, released in 2001, was based on the COM architecture and added technologies such as multithreading and support for distributed computing.

STATISTICA 9 was released in 2009, supporting both 32-bit and 64-bit computing.

The most recent release of STATISTICA is STATISTICA 10, announced in November 2010. This release features further performance optimizations for the 64-bit CPU architecture, as well as advanced multithreading technologies; integration with Microsoft SharePoint, Microsoft Office 2010, and other applications; the ability to generate Java and C# code; and other GUI and kernel improvements.

Localized versions of STATISTICA (including the entire STATISTICA family of products) are available in Chinese (both Traditional and Simplified), Czech, English, French, German, Italian, Japanese, Polish, Russian, and Spanish. STATISTICA documentation is available in Arabic, Chinese, Czech, English, French, German, Hungarian, Italian, Japanese, Korean, Polish, Portuguese, Russian, Spanish, and other languages.

Release history

List of releases:
  • PsychoStat - 1984
  • Statistical Supplement for Lotus 1-2-3 - 1985
  • StatFast/Mac - 1985
  • CSS 1 - 1987
  • CSS 2 - 1988
  • MacSS - 1988
  • STATISTICA/DOS - 1991
  • STATISTICA/Mac - 1992
  • STATISTICA 4.0 - 1993
  • STATISTICA 4.5 - 1994
  • STATISTICA 5.0 - 1995
  • STATISTICA 5.1 - 1996
  • STATISTICA 5.5 - 1999
  • STATISTICA 6.0 - 2001
  • STATISTICA 7.0 - 2004
  • STATISTICA 7.1 - 2005
  • STATISTICA 8.0 - 2007
  • STATISTICA 9.0 - 2009
  • STATISTICA 9.1 - 2009
  • STATISTICA 10.0 - 2010

Overview

STATISTICA is a suite of analytics software products and solutions provided by StatSoft. The software includes an array of data analysis, data management, data visualization, and data mining procedures, as well as a variety of predictive modeling, clustering, classification, and exploratory techniques. Additional techniques are available through integration with the free, open-source R programming environment.

Different packages of analytical techniques are available in six product lines: (1) Desktop, (2) Data Mining, (3) Enterprise, (4) Web-Based, (5) Connectivity and Data Integration Solutions, and (6) Power Solutions.

According to Rexer's Annual Data Miner Survey in 2010:
  • STATISTICA Data Miner (along with IBM SPSS Modeler and R) received the strongest satisfaction ratings as a data mining tool in both 2010 and 2009; moreover, it was rated as the primary data mining tool chosen most often (18%).
  • STATISTICA Text Miner was rated as the most used text mining software (19%).

Graphics

STATISTICA includes analytic and exploratory graphs in addition to standard 2- and 3-dimensional graphs. Brushing actions (interactive labeling, marking, and data exclusion) allow for investigation of outliers and exploratory data analysis.
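
The idea behind marking and excluding points can be sketched in a few lines of Python. This is only an illustration of the concept, not STATISTICA's own interface or method: brushing in STATISTICA is interactive, whereas the hypothetical brush function below marks points automatically using Tukey's fences (1.5 times the interquartile range), which is an assumption made for this example. The function names and the sample measurements are likewise invented.

```python
from statistics import quantiles


def brush(values, k=1.5):
    """Return the indices of values outside Tukey's fences.

    Assumption: a point counts as "marked" when it lies more than
    k interquartile ranges below Q1 or above Q3.
    """
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [i for i, v in enumerate(values) if v < low or v > high]


def exclude(values, marked):
    """Return the values with the marked indices left out."""
    skip = set(marked)
    return [v for i, v in enumerate(values) if i not in skip]


# Hypothetical sample data with one obvious outlier.
measurements = [2.1, 2.4, 2.2, 2.3, 9.8, 2.5, 2.0]
marked = brush(measurements)
kept = exclude(measurements, marked)
print("marked indices:", marked)
print("remaining values:", kept)
```

Keeping the marking step separate from the exclusion step mirrors the workflow described above: points are first identified and inspected, and only then removed from the analysis.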

User interface

Operation of the software typically involves loading a table of data and applying statistical functions from pull-down menus or (in versions starting from 9.0) from the ribbon bar. The menus then prompt for the variables to be included and the type of analysis required. It is not necessary to type commands. Each analysis may include graphical or tabular output and is stored in a separate workbook.

Further reading

  • Hill, T., and Lewicki, P. (2007). Statistics: Methods and Applications. Tulsa, OK: StatSoft. Web: http://www.statsoft.com/textbook/
  • Nisbet, R., Elder, J., and Miner, G. (2009). Handbook of Statistical Analysis and Data Mining Applications. Burlington, MA: Academic Press (Elsevier).

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.