WinFS
WinFS is the code name for a cancelled data storage and management system project based on relational databases, developed by Microsoft and first demonstrated in 2003 as an advanced storage subsystem for the Microsoft Windows operating system, designed for persistence and management of structured, semi-structured as well as unstructured data.

WinFS includes a relational database for storage of information, and allows any type of information to be stored in it, provided there is a well defined schema for the type. Individual data items could then be related together by relationships, which are either inferred by the system based on certain attributes or explicitly stated by the user. As the data has a well defined schema, any application can reuse the data; and using the relationships, related data can be effectively organized as well as retrieved. Because the system knows the structure and intent of the information, it can be used to make complex queries that enable advanced searching through the data and aggregating various data items by exploiting the relationships between them.

While WinFS and its shared type schema make it possible for an application to recognize the different data types, the application still has to be coded to render the different data types. Consequently, it would not allow development of a single application that can view or edit all data types; rather, WinFS enables applications to understand the structure of all data and extract the information that they can use further. When WinFS was introduced at the 2003 Professional Developers Conference, Microsoft also released a video presentation, named IWish, with mockup interfaces that showed how applications would expose interfaces that take advantage of a unified type system. The concepts shown in the video ranged from applications using the relationships of items to dynamically offer filtering options to applications grouping multiple related data types and rendering them in a unified presentation.

WinFS was billed as one of the pillars of the "Longhorn" wave of technologies, and would ship as part of the next version of Windows. It was subsequently decided that WinFS would ship after the release of Windows Vista, but those plans were shelved in June 2006, with some of its component technologies being integrated into upcoming releases of ADO.NET and Microsoft SQL Server. While it was then assumed by observers that WinFS was finished as a project, in November 2006 Steve Ballmer announced that WinFS was still in development, though it was not clear how the technology was to be delivered. Several components of the last Integrated Storage Initiative project, Microsoft Semantic Engine, presented at Microsoft PDC 2009, have been integrated back into SQL Server "Denali". At the 2010 SQL Server PASS Community Summit, the forthcoming version of SQL Server ("Denali") was shown, which seems to incorporate many of the WinFS ideas.

Motivation

Many filesystems found on common operating systems, including the NTFS filesystem which is used in modern versions of Microsoft Windows, store files and other objects only as a stream of bytes, and have little or no information about the data stored in the files. Such file systems also provide only a single way of organizing the files, namely via directories and file names.

Because a file system has no knowledge about the data it stores, applications tend to use their own, often proprietary, file formats. This hampers sharing of data between multiple applications. It becomes difficult to create an application which processes information from multiple file types, because the programmers have to understand the structure and semantics of all the files. Using common file formats is a workaround to this problem but not a universal solution; there is no guarantee that all applications will use the format. Data with standardized schema, such as XML documents and relational data, fare better, as they have a standardized structure and run-time requirements.

Also, a traditional file system can retrieve and search data based only on the filename, because the only knowledge it has about the data is the name of the files that store the data. A better solution is to tag files with attributes that describe them. Attributes are metadata about the files, such as the type of file (document, picture, music, etc.) or its creator. This allows files to be searched for by their attributes, in ways not possible using a folder hierarchy, such as finding "pictures which have person X". The attributes can be recognizable by either the file system natively, or via some extension. Desktop search applications take this concept a step further. They extract data, including attributes, from files and index it. To extract the data, they use a filter for each file format. This allows for searching based on both the file's attributes and the data in it.
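The filter-and-index approach described above can be sketched roughly as follows. This is only an illustration in Python, not the code of any actual desktop search product: the filter functions, the `FILTERS` table, and the sample files are all hypothetical.

```python
from collections import defaultdict

# Hypothetical per-format filters: each turns raw file bytes into
# (attributes, text). Real desktop search filters are far more involved.
def text_filter(raw: bytes):
    return {"type": "document"}, raw.decode("utf-8")

def csv_filter(raw: bytes):
    text = raw.decode("utf-8").replace(",", " ")
    return {"type": "table"}, text

FILTERS = {".txt": text_filter, ".csv": csv_filter}

def build_index(files):
    """Map each lower-cased word and each attribute pair to file names."""
    index = defaultdict(set)
    for name, raw in files.items():
        ext = name[name.rfind("."):]
        file_filter = FILTERS.get(ext)
        if file_filter is None:
            continue  # no filter registered: the content stays opaque
        attributes, text = file_filter(raw)
        for key, value in attributes.items():
            index[(key, value)].add(name)
        for word in text.lower().split():
            index[word].add(name)
    return index

# Sample data, invented for this sketch.
files = {
    "notes.txt": b"Call Alice about Acapulco",
    "trip.csv": b"city,nights\nAcapulco,3",
    "photo.raw": b"\x00\x01",
}
index = build_index(files)
```

A file whose format has no registered filter (here `photo.raw`) never enters the index, which is why such systems need one filter per file format.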

However, this still does not help in managing related data, as disparate items do not have any relationships defined. For example, it is impossible to search for "the phone numbers of all persons who live in Acapulco and each have more than 100 appearances in my photo collection and with whom I have had e-mail within last month". Such a search could not be done unless it is based on a data model which has both the semantics as well as relationships of data defined. WinFS aims to provide such a data model and the runtime infrastructure that can be used to store the data, as well as the relationships between data items according to the data model, doing so at a satisfactory level of performance.

Overview

WinFS natively recognizes different types of data, such as picture, e-mail, document, audio, video, calendar, and contact, among others, rather than just leaving them as raw unanalyzed bytestreams (as most file systems do). Data stored and managed by the system are instances of the data types recognized by the WinFS runtime. The data are structured by means of properties. For example, an instance of a résumé type will surface the data by exposing certain properties like Name, Educational Qualification, and Experience, among others. Each of the properties may be of simple types like strings, integers, or dates, or of complex types like contacts. Different data types expose different properties. Besides that, WinFS also allows different data instances to be related together; for example, a document and a contact can be related by an Authored By relationship. Relationships are also exposed as properties; for example, if a document is related to a contact by a Created By relationship, then the document will have a Created By property. When it is accessed, the relationship is traversed and the related data returned. By following the relations, all related data can be reached.
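The idea of relationships surfacing as properties can be illustrated with a small Python sketch. It is not the WinFS API (which was exposed through .NET); the `Item` class and its `relate` and `get` methods are hypothetical names chosen for this example.

```python
class Item:
    """Hypothetical stand-in for a typed data instance."""

    def __init__(self, item_type, **properties):
        self.item_type = item_type
        self.properties = dict(properties)
        self.relationships = {}  # relationship name -> list of related items

    def relate(self, name, other):
        """Record a named relationship from this item to another item."""
        self.relationships.setdefault(name, []).append(other)

    def get(self, name):
        """Read a simple property, or traverse a relationship of that name."""
        if name in self.properties:
            return self.properties[name]
        return self.relationships.get(name, [])

# Sample items, invented for this sketch.
alice = Item("Contact", Name="Alice", City="Acapulco")
resume = Item("Document", Title="Résumé")
resume.relate("Created By", alice)

# Accessing the relationship like a property returns the related items.
authors = resume.get("Created By")
author_names = [contact.get("Name") for contact in authors]
```

Reading `Created By` goes through the same accessor as reading `Title`, which mirrors the description of relationships being exposed as properties.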

WinFS promotes sharing of data between applications by making the data types accessible to all applications, along with their schemas. Any application that wants to use a WinFS type can use the schema to find out the structure of the data and utilize the information. An application thus has access to all data on the system, even though its developer did not have to write parsers to recognize the different data formats. It can also use the relationships and related data to create dynamic filters to present the information the application deals with in different ways. The WinFS API further abstracts the task of accessing data. All WinFS types are exposed as .NET objects, with the properties of the object directly mapping to the properties of the data type. Also, by letting different applications which deal with the same data share the same WinFS data instance rather than storing the same data in different files, the hassles of synchronizing the different stores when the data change are removed. Thus WinFS can reduce redundancies.

Access to all the data in the system allows complex searches for data to be performed across all the data items managed by WinFS. In the example used above ("the phone numbers of all persons who live in Acapulco and each have more than 100 appearances in my photo collection and with whom I have had e-mail within last month"), WinFS can traverse the subject relationship of all the photos to find the contact items. Similarly, it can filter all e-mails from the last month and follow the communicated with relationship to reach the contacts. The common contacts can then be determined from the two sets of results, and their phone numbers retrieved by accessing the suitable property of the contact items.
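The steps of this example query can be sketched in Python over in-memory data. Everything here is illustrative: the record layout, the field names `subject` and `communicated_with`, the fixed "today" date, and the sample data are assumptions, not WinFS structures.

```python
from collections import Counter
from datetime import date, timedelta

# Sample data, invented for this sketch.
contacts = {
    "alice": {"city": "Acapulco", "phone": "555-0100"},
    "bob": {"city": "Acapulco", "phone": "555-0101"},
    "carol": {"city": "Lima", "phone": "555-0102"},
}
# Each photo lists the contacts it depicts (the "subject" relationship).
photos = [{"subject": ["alice"]}] * 101 + [{"subject": ["bob", "carol"]}] * 150
# Each e-mail names the contact it was exchanged with and when.
today = date(2006, 6, 23)
emails = [
    {"communicated_with": "alice", "sent": today - timedelta(days=3)},
    {"communicated_with": "carol", "sent": today - timedelta(days=5)},
    {"communicated_with": "bob", "sent": today - timedelta(days=90)},
]

def phone_numbers(contacts, photos, emails, today, city="Acapulco",
                  min_photos=100, days=30):
    # Step 1: traverse the subject relationship and count appearances.
    appearances = Counter(s for p in photos for s in p["subject"])
    in_photos = {c for c, n in appearances.items() if n > min_photos}
    # Step 2: filter recent e-mails and follow them to the contacts.
    cutoff = today - timedelta(days=days)
    mailed = {e["communicated_with"] for e in emails if e["sent"] >= cutoff}
    # Step 3: intersect both sets, filter by city, read the phone property.
    common = in_photos & mailed
    return sorted(contacts[c]["phone"] for c in common
                  if contacts[c]["city"] == city)

result = phone_numbers(contacts, photos, emails, today)
```

In this sample only one contact satisfies all three conditions; the others fail either the recency check or the city check.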

WinFS, in addition to fully schematized data (like XML and relational data), supports semi-structured data (like images, which have an unstructured bitstream plus structured metadata) as well as unstructured data (like files). It stores the unstructured components directly as files while storing the structured metadata in the structured store. Internally, WinFS uses a relational database to manage data. However, it does not limit the data to belonging to any particular data model, like relational or hierarchical; the data can be of any well-defined schema. The WinFS runtime maps the schema to a relational modality, by defining the tables it will store the types in and the primary keys and foreign keys that would be required to represent the relationships. WinFS includes mappings for object and XML schemas by default; mappings for other schemas need to be specified. Object schemas are specified using XML; WinFS generates code to surface the schemas as .NET classes. ADO.NET can be used to directly specify the relational schema, though a mapping to the object schema needs to be provided to surface it as classes. All relationship traversals are performed as joins on these tables. WinFS also automatically creates indexes on these tables, to facilitate fast access to the information. Indexes significantly speed up joins, and thus traversing relationships to retrieve related data is performed very fast. Indexes are also used during searching of information; searching and querying use the indexes so that the operations complete quickly, much like desktop search systems.
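How types and relationships might map onto tables, keys, and indexes can be sketched with Python's built-in `sqlite3` module. SQLite is used here only as a convenient stand-in for the relational engine, and the table, column, and index names are hypothetical; the actual tables WinFS generated are not described in this article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One table per item type, each with a primary key.
    CREATE TABLE contact (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE document (id INTEGER PRIMARY KEY, title TEXT);
    -- A relationship becomes a table of foreign keys.
    CREATE TABLE created_by (
        document_id INTEGER REFERENCES document(id),
        contact_id  INTEGER REFERENCES contact(id)
    );
    -- An index on the relationship keeps the join cheap.
    CREATE INDEX idx_created_by_contact ON created_by (contact_id);
""")
conn.execute("INSERT INTO contact VALUES (1, 'Alice')")
conn.executemany("INSERT INTO document VALUES (?, ?)",
                 [(1, "Resume"), (2, "Budget")])
conn.executemany("INSERT INTO created_by VALUES (?, ?)", [(1, 1), (2, 1)])

def documents_created_by(conn, contact_name):
    """Traverse the relationship from a contact to documents via joins."""
    rows = conn.execute("""
        SELECT d.title
        FROM contact AS c
        JOIN created_by AS r ON r.contact_id = c.id
        JOIN document AS d ON d.id = r.document_id
        WHERE c.name = ?
        ORDER BY d.title
    """, (contact_name,))
    return [title for (title,) in rows]

titles = documents_created_by(conn, "Alice")
```

Traversing the Created By relationship is just a pair of joins here, which is why indexes on the key columns matter for traversal speed.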

Development

The development of WinFS is an extension to a feature which was initially planned in the early 1990s. Dubbed Object File System (OFS), it was supposed to be included as part of Cairo. OFS was supposed to have powerful data aggregation features, but the Cairo project was shelved, and with it OFS. However, later during the development of COM, a storage system, called Storage+, based on then-upcoming SQL Server 8.0, was planned, which was slated to offer similar aggregation features. This, too, never materialized, and a similar technology, Relational File System (RFS), was conceived to be launched with SQL Server 2000. However, SQL Server 2000 ended up being a minor upgrade to SQL Server 7.0 and RFS was not implemented.

But the concept was not scrapped. It just morphed into WinFS. WinFS was initially planned for inclusion in Windows Vista, and build 4051 of Windows Vista, then called by its codename "Longhorn" and given to developers at the Microsoft Professional Developers Conference in 2003, included WinFS, but it suffered from significant performance issues. In August 2004, Microsoft announced that WinFS would not ship with Windows Vista; it would instead be available as a downloadable update after Vista's release.

On August 29, 2005, Microsoft quietly made Beta 1 of WinFS available to MSDN subscribers. It worked on Windows XP, and required the .NET Framework to run. The WinFS API was included in the System.Storage namespace. The beta was refreshed on December 1, 2005 to be compatible with version 2.0 of the .NET Framework. WinFS Beta 2 was planned for some time later in 2006, and was supposed to include integration with Windows Desktop Search, so that search results would include results from both regular files and WinFS stores, as well as allow access to WinFS data using ADO.NET.

However, on June 23, 2006, the WinFS team at Microsoft announced that WinFS would no longer be delivered as a separate product, and that some components would be brought under the umbrella of other technologies: the object-relational mapping components into ADO.NET Entity Framework; support for unstructured data, adminless mode of operation, support for file system objects via the FILESTREAM data type, and hierarchical data into SQL Server 2008, then codenamed Katmai; integration with Win32 APIs and Windows Shell and support for traversal of hierarchies by traversing relationships into later releases of Microsoft SQL Server; and the synchronization components into Microsoft Sync Framework. However, having a shared-schema storage system built into a future iteration of Microsoft Windows has not yet been ruled out.

With that announcement, most analysts assumed that the WinFS project was being killed off. But in November 2006, Steve Ballmer said in an interview that WinFS was being actively developed, but that integration into the Windows codebase would come only after the technology had fully incubated. It was subsequently confirmed in an interview with Bill Gates that Microsoft planned to migrate applications like Windows Media Player, Windows Photo Gallery, Microsoft Office Outlook, etc. to use WinFS as the data storage back-end.

Architecture

WinFS uses a relational engine, which is derived from SQL Server 2005, to provide the data relations mechanism. WinFS stores are simply SQL Server database (.MDF) files with the FILESTREAM attribute set. These files are stored in the access-restricted folder named "System Volume Information", placed in the volume root, in subfolders of the folder "WinFS" that are named after the GUIDs of these stores.

At the bottom of the WinFS stack lies WinFS Core, which interacts with the filesystem and provides file access and addressing capabilities. The relational engine leverages the WinFS Core services to present a structured store and other services such as locking, which the WinFS runtime uses to implement the functionality. The WinFS runtime exposes services such as Synchronization and Rules, which can be used to synchronize WinFS stores or perform certain actions on the occurrence of certain events.

WinFS runs as a service which runs three processes: WinFS.exe, which hosts the relational datastore; WinFSSearch.exe, which hosts the indexing and querying engine; and WinFPM.exe (WinFS File Promotion Manager), which interfaces with the underlying file system. It allows programmatic access to its features via a set of .NET Framework APIs that enable applications to define custom-made data types, define relationships among data, store and retrieve information, and perform advanced searches. The applications can then aggregate the data and present the aggregated data to the user.

Data store

WinFS stores data in relational stores, which are exposed as virtual locations called stores. A WinFS store is a common repository where any application can store data along with its metadata, relationships and schema. The WinFS runtime can apply certain relationships itself; for example, if the values of the subject property of a picture and the name property of a contact are the same, then WinFS can relate the contact with the picture. Relations can also be specified by other applications or the user.
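The inferred relationship in this example, matching a picture's subject property against a contact's name property, can be sketched in Python as follows. The function name, the dictionary layout, and the identifiers are hypothetical and serve only to illustrate the matching rule.

```python
def infer_subject_links(pictures, contacts):
    """Relate each picture to every contact whose name equals its subject."""
    by_name = {}
    for contact in contacts:
        by_name.setdefault(contact["name"], []).append(contact["id"])
    links = []
    for picture in pictures:
        # Pictures without a subject, or with an unknown one, get no link.
        for contact_id in by_name.get(picture.get("subject"), []):
            links.append((picture["id"], contact_id))
    return links

# Sample data, invented for this sketch.
contacts = [{"id": "c1", "name": "Alice"}, {"id": "c2", "name": "Bob"}]
pictures = [
    {"id": "p1", "subject": "Alice"},
    {"id": "p2", "subject": "Dana"},
    {"id": "p3"},
]
links = infer_subject_links(pictures, contacts)
```

Relationships stated explicitly by an application or the user would simply be added to the same list of links without going through the matching step.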

WinFS provides unified storage, but stops short of defining the format in which data is to be stored in the data stores. Instead, it supports writing data in application-specific formats. Applications must, however, provide a schema that defines how the file format should be interpreted. For example, a schema could be added to allow WinFS to understand how to read, and thus be able to search and analyze, say, a PDF file. By using the schema, any application can read data from any other application, and different applications can write in each other's format by sharing the schema.

Multiple WinFS stores can be created on a single machine. This allows different classes of data to be kept segregated; for example, official documents and personal documents can be kept in different stores. WinFS, by default, provides only one store, named "DefaultStore". WinFS stores are exposed as shell objects, akin to virtual folders, which dynamically generate a list of all items present in the store and present them in a folder view. The shell object also allows searching for information in the datastore.

A data unit that has to be stored in a WinFS store is called a WinFS Item. A WinFS item, along with the core data item, also contains information on how the data item is related to other data. These relationships are stored as logical links, which specify which other data items the current item is related to. Links are physically stored using a link identifier, which specifies the name and intent of the relationship, such as "type of" or "consists of". The link identifier is stored as an attribute of the data item. All the objects which have the same link identifier are considered to be related. An XML schema defining the structure of the data items that will be stored in WinFS must be supplied to the WinFS runtime beforehand. In Beta 1 of WinFS, the schema assembly had to be added to the Global Assembly Cache (GAC) before it could be used.

Data model

WinFS models data using data items, along with their relationships, extensions and the rules governing their usage. WinFS needs to understand the type and structure of the data items, so that the information stored in a data item can be made available to any application that requests it. This is done through schemas. For every type of data item that is to be stored in WinFS, a corresponding schema needs to be provided to define the type, structure and associations of the data. These schemas are defined using XML.

Predefined WinFS schemas include schemas for documents, e-mail, appointments, tasks, media, audio and video, as well as system schemas that cover configuration, programs, and other system-related data. Custom schemas can be defined on a per-application basis, for situations where an application wants to store its data in WinFS but not share the structure of that data with other applications, or they can be made available across the system.

Type system

The most important difference between a file system and WinFS is that WinFS knows the type of each data item that it stores, and the type specifies the properties of the data item. The WinFS type system is closely associated with the .NET Framework's concept of classes and inheritance. A new type can be created by extending and nesting any predefined types.

WinFS provides four predefined base types – Items, Relationships, ScalarTypes and NestedTypes. An Item is the fundamental data object which can be stored, and a Relationship is the relation or link between two data items. Since all WinFS items must have a type, the type of the item stored defines its properties. A property of an Item may be a ScalarType, which defines the smallest unit of information a property can have, or a NestedType, which is a collection of more than one ScalarType and/or NestedType. All WinFS types are made available as .NET CLR classes.

Any object represented as a data unit, such as contact, image, video, document etc., can be stored in a WinFS store as a specialization of the Item type. By default, WinFS provides Item types for Files, Contact, Documents, Pictures, Audio, Video, Calendar, and Messages. The File Item can store any generic data, which is stored in file systems as files. But unless an advanced schema is provided for the file, by defining it to be a specialized Item, WinFS will not be able to access its data. Such a file Item can only support being related to other Items.

A developer can extend any of these types, or the base type Item, to provide a type for custom data. The data contained in an Item is defined in terms of properties, or fields, which hold the actual data. For example, an Item Contact may have a field Name, which is a ScalarType, and a field Address, a NestedType, which is further composed of two ScalarTypes. To define this type, the base class Item is extended and the necessary fields are added to the class. A NestedType field can be defined as another class which contains the two ScalarType fields. Once the type is defined, a schema has to be defined which denotes the primitive type of each field; for example, the Name field is a String, and the Address field is a custom-defined Address class, both fields of which are Strings. Other primitive types that WinFS supports are Integer, Byte, Decimal, Float, Double, Boolean and DateTime, among others. The schema also defines which fields are mandatory and which are optional. The Contact Item defined in this way is used to store information regarding the contact, by populating its property fields and storing it. Only those fields marked as mandatory need to be filled in during the initial save. Other fields may be populated later by the user, or not populated at all. If more property fields, such as a last conversed date, need to be added, this type can be extended to accommodate them. Item types for other data can be defined similarly.
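The Contact example above can be modelled with ordinary classes. The following is a standalone Python sketch, not the WinFS schema language or API: Item, Address, Contact and ContactWithHistory are hypothetical stand-ins, mandatory fields are modelled as dataclass fields without a default, and the two Address fields (street and city) are invented for illustration.

```python
from dataclasses import MISSING, dataclass, fields
from typing import Optional


@dataclass
class Item:
    """Stand-in for the base type Item (hypothetical)."""


@dataclass
class Address:
    """Plays the role of a NestedType built from two scalar fields."""
    street: str
    city: str


@dataclass
class Contact(Item):
    name: str                          # mandatory scalar field
    address: Optional[Address] = None  # optional nested field


@dataclass
class ContactWithHistory(Contact):
    """Extends Contact with one more optional property."""
    last_conversed: Optional[str] = None


def mandatory_fields(item_type):
    """Return the names of fields that must be supplied at creation."""
    return [
        f.name
        for f in fields(item_type)
        if f.default is MISSING and f.default_factory is MISSING
    ]


contact = ContactWithHistory("John Doe")               # only the mandatory field
contact.address = Address("1 Main Street", "Redmond")  # populated later
```

Extending Contact leaves the set of mandatory fields unchanged, which mirrors the idea that additional properties can be added without invalidating existing data.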

WinFS creates tables for all defined Items. All the fields defined for an Item form the columns of its table, and all instances of the Item are stored as rows in that table. Whenever a field in one table refers to data in another table, it is considered a relationship. The schema of the relationship specifies which tables are involved and what the kind and name of the relationship are. The WinFS runtime manages the relationship schemas. All Items are exposed as .NET CLR objects, with a uniform interface providing access to the data stored in the fields. Thus any application can retrieve an object of any Item type and use the data in the object without being aware of the physical structure the data was stored in.
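The mapping from Items to tables can be illustrated with a generic relational database. The sketch below uses Python's built-in sqlite3 module as a stand-in; the table and column names are hypothetical and do not reflect the actual WinFS storage layout.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# One table per Item type; each field becomes a column.
db.executescript("""
CREATE TABLE contact (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE picture (
    id                 INTEGER PRIMARY KEY,
    occasion           TEXT,
    subject_contact_id INTEGER REFERENCES contact(id)  -- the relationship
);
""")

# Each Item instance is stored as a row.
db.execute("INSERT INTO contact (id, name) VALUES (1, 'John Doe')")
db.execute(
    "INSERT INTO picture (id, occasion, subject_contact_id) "
    "VALUES (10, 'Graduation', 1)"
)

# Following the relationship is a join between the two tables.
row = db.execute("""
SELECT picture.occasion, contact.name
FROM picture
JOIN contact ON picture.subject_contact_id = contact.id
""").fetchone()
```

An application working with objects would never see this join; it would only read a property of the picture object.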

WinFS types are exposed as .NET classes, which can be instantiated as .NET objects. Data are stored in these type instances by setting their properties. Once done, they are persisted into the WinFS store. A WinFS store is accessed using an ItemContext class (see the Data retrieval section for details). ItemContext allows transactional access to the WinFS store; i.e., all the operations performed between binding an ItemContext object to a store and closing it either all succeed or are all rolled back. As changes are made to the data, they are not written to the disk; rather, they are written to an in-memory log. Only when the connection is closed are the changes written to the disk in a batch. This helps to optimize disk I/O. The following code snippet, written in C#, creates a contact and stores it in a WinFS store.


//Connect to the default WinFS store
using (ItemContext ic = ItemContext.Open())
{
    //Create the contact and set the data in appropriate properties
    ContactEAddress contact = new ContactEAddress();

    //Name is a ComplexType
    contact.Name = new PersonName();
    contact.Name.Displayname = "Doe, John";
    contact.Name.FirstName = "John";
    contact.Name.LastName = "Doe";

    //Telephone number is a ComplexType
    contact.TelephoneNumber = new TelephoneNumber();
    contact.TelephoneNumber.Country = CountryCode.Antarctica;
    contact.TelephoneNumber.Areacode = 4567;
    contact.TelephoneNumber.Number = 9876543210;

    //Age is a SimpleType
    contact.Age = 111;

    //Add the object to the user's personal folder.
    //This relates the item with the Folder pseudo-type, for backward
    //compatibility, as this lets the item be accessed in a folder
    //hierarchy by apps which are not WinFS native.
    Folder containingFolder = UserDataFolder.FindMyPersonalFolder();
    containingFolder.OutFolderMemberRelationship.AddItem(ic, contact);

    //Find a document and relate the contact with it. Searching begins by creating an
    //ItemSearcher object. Each WinFS type object contains a GetSearcher() method
    //that generates an ItemSearcher object which searches documents of that type.
    using (ItemSearcher searcher = Document.GetSearcher(ic))
    {
        Document d = searcher.Find(@"Title = 'Some Particular Document'");
        d.OutAuthoringRelationship.AddItem(ic, contact);
    }
    //Since only one document is to be found, the ItemContext.FindOne() method
    //could be used as well.

    //Find a picture and relate the contact with it
    using (ItemSearcher searcher = Picture.GetSearcher(ic))
    {
        Picture p = searcher.Find(@"Occasion = 'Graduation' and Sequence = '3'");
        p.OutSubjectRelationship.AddItem(ic, contact);
    }

    //Persist to the store and close the reference to the store
    ic.Update();
    ic.Close();
}
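
The transactional behaviour described for ItemContext (changes logged in memory, written in one batch when the context is closed, and discarded if anything fails) can be modelled in a few lines. This is a standalone Python sketch rather than WinFS code; BufferedStore and its put method are hypothetical names, and the committed dictionary merely stands in for data on disk.

```python
class BufferedStore:
    """Toy store: changes are logged in memory and applied in one batch."""

    def __init__(self):
        self.items = {}  # committed state ("on disk" in this model)
        self._log = []   # pending (key, value) changes

    def __enter__(self):
        self._log = []
        return self

    def put(self, key, value):
        # Record the change only; self.items is untouched until close.
        self._log.append((key, value))

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            self.items.update(self._log)  # apply every change at once
        self._log = []                    # on failure nothing was applied
        return False                      # never swallow the exception


store = BufferedStore()

with store as ctx:
    ctx.put("contact:1", "John Doe")
    ctx.put("contact:2", "Jane Roe")

try:
    with store as ctx:
        ctx.put("contact:3", "Never stored")
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass  # the second batch is rolled back as a whole
```

After both blocks have run, the store holds the two contacts from the first batch and nothing from the second.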

Relationships

A datum can be related to one other item, giving rise to a one-to-one relationship, or to more than one item, resulting in a one-to-many relationship. The related items, in turn, may be related to other data items as well, resulting in a network of relationships, which is called a many-to-many relationship. Creating a relationship between two Items creates another field in the data of the Items concerned, which refers to the row in the other Item's table where the related object is stored.
In WinFS, a Relationship is an instance of the base type Relationship, which is extended to signify a specialization of a relation. A Relationship is a mapping between two items, a Source and a Target. The source has an Outgoing Relationship, whereas the target gets an Incoming Relationship. WinFS provides three types of primitive relationships – Holding Relationship, Reference Relationship and Embedding Relationship. Any custom relationship between two data types is an instance of one of these relationship types.
  • Holding Relationships specify ownership and lifetime (which defines how long the relationship is valid) of the Target Item. For example, the Relationship between a folder and a file, and between an Employee and his Salary record, is a Holding Relationship – the latter is to be removed when the former is removed. A Target Item can be a part of more than one Holding Relationship. In such a case, it is to be removed when all the Source Items are removed.
  • Reference Relationships provide linkage between two Items, but do not have any lifetime associated, i.e., each Item will continue to be stored even without the other.
  • Embedding Relationships give order to the two Items which are linked by the Relationship, such as the Relationship between a Parent Item and a Child Item.
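
The lifetime rule for Holding Relationships (a target is removed once all of its sources have been removed) can be sketched as follows. This is a standalone Python model with hypothetical names (HoldingGraph, hold, remove), not the WinFS implementation.

```python
class HoldingGraph:
    """Tracks which source items hold which target items."""

    def __init__(self):
        self.items = set()
        self.holders = {}  # target -> set of sources that hold it

    def add_item(self, item):
        self.items.add(item)

    def hold(self, source, target):
        self.holders.setdefault(target, set()).add(source)

    def remove(self, item):
        self.items.discard(item)
        # Work on a snapshot because cascading removals edit self.holders.
        for target, sources in list(self.holders.items()):
            if item in sources:
                sources.discard(item)
                if not sources and target in self.items:
                    self.remove(target)  # last holder gone: cascade
        self.holders.pop(item, None)


graph = HoldingGraph()
for name in ("folder_a", "folder_b", "report"):
    graph.add_item(name)
graph.hold("folder_a", "report")
graph.hold("folder_b", "report")

graph.remove("folder_a")
still_present = "report" in graph.items  # folder_b still holds it

graph.remove("folder_b")
gone = "report" not in graph.items       # no holders left
```

A Reference Relationship would, by contrast, simply not participate in this bookkeeping, since it carries no lifetime.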


Relationships between two Items can either be set programmatically by the application creating the data, or the user can use the WinFS Item Browser to manually relate the Items. A WinFS item browser can also graphically display the items and how they are related, to enable the user to know how their data are organized.

Rules

WinFS includes Rules, which are executed when a certain condition is met. WinFS rules work on data and data relationships. For example, a rule can be created which states that whenever an Item is created which contains a field "Name", and the value of that field is some particular name, a relationship should be created which relates the Item with some other Item. WinFS rules can also access any external application. For example, a rule can be built which launches a Notify application whenever a mail is received from a particular contact. WinFS rules can also be used to add new property fields to existing data Items.

WinFS rules are also exposed as .NET CLR objects. As such any rule can be used for any purpose. A rule can even be extended by inheriting from it to form a new rule which consists of the condition and action of the parent rule plus something more.
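
A rule made of a condition and an action, and a derived rule that adds to its parent, might look like the following. This is a standalone Python sketch with hypothetical names (Rule, NotifyingRule, and the action tuples appended to a log); it does not reproduce the WinFS rules API.

```python
class Rule:
    """Relates an item to a target when its Name field matches."""

    def __init__(self, name, target):
        self.name = name
        self.target = target

    def condition(self, item):
        return item.get("Name") == self.name

    def action(self, item, log):
        log.append(("relate", item["id"], self.target))

    def apply(self, item, log):
        if self.condition(item):
            self.action(item, log)


class NotifyingRule(Rule):
    """Inherits the parent's condition and action, then does more."""

    def action(self, item, log):
        super().action(item, log)
        log.append(("notify", item["id"]))


log = []
rule = NotifyingRule("John Doe", "important-contacts")
rule.apply({"id": 1, "Name": "John Doe"}, log)  # condition met
rule.apply({"id": 2, "Name": "Jane Roe"}, log)  # condition not met
```

Only the first item triggers the rule, and it produces both the inherited "relate" action and the added "notify" action.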

RAV

WinFS supports creating Rich Application Views (RAV) by aggregating different data in a virtual table format. Unlike a database view, where each individual element can only be a scalar value, RAVs can have complex Items or even collections of Items. The actual data can span multiple data types or instances and can even be retrieved by traversing relationships. RAVs are intrinsically paged (dividing the entire set of data into smaller pages containing disconnected subsets of the data) by the WinFS runtime. The page size is defined during creation of the view, and the WinFS API exposes methods to iterate over the pages. RAVs also support modification of the view according to different grouping parameters. Views can also be queried against.
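
Paging a view with a fixed page size can be sketched with a generator. This is a standalone Python illustration; the pages function is a hypothetical helper, not one of the iteration methods exposed by the WinFS API.

```python
def pages(rows, page_size):
    """Yield consecutive, non-overlapping slices of at most page_size rows."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    for start in range(0, len(rows), page_size):
        yield rows[start:start + page_size]


view = ["item1", "item2", "item3", "item4", "item5"]
paged = list(pages(view, 2))
```

The last page is shorter than the others when the number of rows is not a multiple of the page size.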

Access control

Even though all data are shared, not everything is equally accessible. WinFS uses the Windows authentication system to provide two data protection mechanisms. First, there is share-level security, which controls access to a WinFS share. Second, there is item-level security, which supports NT-compatible security descriptors. The process accessing the item must have enough privileges to access it. In addition, Vista has the concept of an "integrity level" for an application: higher-integrity data cannot be accessed by a lower-integrity process.

Data retrieval

The primary mode of data retrieval from a WinFS store is querying the store according to some criteria, which returns an enumerable set of items matching the criteria. The criteria for the query are specified using the OPath query language. The returned data are made available as instances of the type schemas, conforming to the .NET object model. The data in them can be accessed through the properties of the individual objects.

Relations are also exposed as properties. Each WinFS Item has two properties, named IncomingRelationships and OutgoingRelationships, which provide access to the set of relationship instances the item participates in. The other item which participates in a relationship instance can be reached through that relationship instance.

The fact that data can be accessed using its description, rather than its location, can be used to provide end-user organizational capabilities that are not limited to the hierarchical organization used in file systems. In a file system, each file or folder is contained in only one folder. WinFS Items, however, can participate in any number of holding relationships, and with any other items. As such, end users are not limited to file/folder organization. Rather, a contact can become a container for documents, a picture a container for contacts, and so on. For legacy compatibility, WinFS includes a pseudo-type called Folder, which is present only to participate in holding relationships and emulate file/folder organization. Since any WinFS Item can be related with more than one Folder item, from an end-user perspective an item can reside in multiple folders without duplicating the actual data. Applications can also analyze the relationship graphs to present various filters. For example, an email application can analyze the related contacts and the relationships of the contacts with restaurant bills and dynamically generate filters like "Emails sent to people I had lunch with".

Searches

The WinFS API provides a class called ItemContext, which is bound to a WinFS store. The ItemContext object can be used to scope the search to the entire store or a subset of it. It also provides transactional access to the store. An object of this class can then spawn an ItemSearcher object, which takes the type (an object representing the type) of the item or relationship to be retrieved and the OPath query string representing the criteria for the search. The set of all matches is returned, which can then be bound to a UI widget for displaying en masse or enumerated individually. The properties of the items can also be modified and then stored back to the data store to update the data. The ItemContext object is closed (which marks the end of the association of the object with the store) once the queries have been made or the changes merged into the store.

Related items can also be accessed through the items. The IncomingRelationships and OutgoingRelationships properties give access to the set of relationship instances, typed to the name of the relationship. These relationship objects expose the other item via a property. So, for example, if a picture is related to a contact, the contact can be accessed by traversing the relationship as:


ContactsCollection contacts = picture.OutgoingRelationships.Cast(typeof(Contact)).Value;
//This retrieves the collection of all outgoing relationships from a picture object
//and filters down the contacts reachable from them and retrieves its value.

//Or the relationship can be statically specified as
ContactsCollection contacts = picture.OutgoingRelationships.OutContactRelationship.Contact;

An OPath query string allows the parameters to be queried for to be specified using Item properties, embedded Items, and Relationships. It can specify a single search condition, such as "title = 'Something'", or a compound condition such as "title = 'Title 1' || title = 'Title 2' && author = 'Someone'". These boolean and relational operations can be specified using C#-like operators such as &&, ||, = and !=, as well as their English-like equivalents such as EQUAL and NOT EQUAL. SQL-like operators such as LIKE, GROUP BY and ORDER BY are also supported, as are wildcard conditions. So, "title LIKE 'any*'" is a valid query string. These operators can be used to execute complex searches such as

using (ItemContext ic = ItemContext.Open())
{
    //Searching begins by creating an ItemSearcher object. The searcher is created from a
    //relationship instance because the contacts being searched for are in relation. The
    //first parameter defines the scope of the search. An ItemContext as the scope means
    //the entire store is to be searched. Scope can be limited to a set of Items which may
    //be in a holding relationship with the contacts. In that case, the set is passed as
    //the scope of the search.
    ItemSearcher searcher = OutContactRelationship.GetTargetSearcher(ic, typeof(Contact));
    ContactCollection contacts = searcher.FindAll("OutContactRelationship.Contact.Name LIKE 'A*'");
    ic.Close();
}

The above code snippet creates an ItemSearcher object that searches on the OutContactRelationship instance that relates pictures and contacts, in effect searching all pictures related with a contact. It then runs the query "Name LIKE 'A*'" on all contacts reachable through OutContactRelationship, returning the list of "contacts whose names start with A and whose pictures I have". Similarly, more relationships could be taken into account to further narrow down the results. Further, a natural language query processor, which parses a query in natural language and creates a well-formed OPath query string to search via the proper relationships, can allow users to make searches such as "find the name of the wine I had with person X last month", provided financial management applications are using WinFS to store bills.
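
The effect of the query above, collecting the contacts reachable from pictures and keeping those whose names match a wildcard, can be imitated outside WinFS. The following standalone Python sketch uses plain dictionaries as hypothetical pictures and the standard fnmatch module as a stand-in for the LIKE 'A*' wildcard; it does not parse OPath, and fnmatch patterns also give special meaning to ? and [ ], which is an assumption of this sketch rather than a statement about OPath.

```python
from fnmatch import fnmatchcase

# Each picture lists the names of the contacts it is related to.
pictures = [
    {"title": "graduation", "contacts": ["Alice", "Bob"]},
    {"title": "picnic", "contacts": ["Anna", "Alice"]},
    {"title": "landscape", "contacts": []},
]


def contacts_like(pictures, pattern):
    """Return contacts related to any picture whose name matches pattern."""
    found = []
    for picture in pictures:
        for name in picture["contacts"]:
            if fnmatchcase(name, pattern) and name not in found:
                found.append(name)
    return found


matches = contacts_like(pictures, "A*")
```

Contacts without any related picture never appear in the result, because the search starts from the relationship rather than from the full set of contacts.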

Different relations specify a different set of data. So when a search is made which encompasses multiple relations, the different sets of data are retrieved individually and a union of the different sets is computed. The resulting set contains only those data items which correspond to all the relations.

Notifications

WinFS also includes better support for handling data that changes frequently. Using WinFS Notifications, applications can choose to be notified of changes to selected data Items. WinFS raises an ItemChangedEvent, using the .NET Event model, when a subscribed-to Item changes, and the event is published to the subscribing applications.
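
The subscribe-and-notify pattern described above can be sketched with callbacks. This is a standalone Python model; ItemStore, subscribe and update are hypothetical names, and plain callbacks stand in for the .NET event mechanism.

```python
class ItemStore:
    """Toy store that notifies subscribers when a watched item changes."""

    def __init__(self):
        self._items = {}
        self._subscribers = {}  # item id -> list of callbacks

    def subscribe(self, item_id, callback):
        self._subscribers.setdefault(item_id, []).append(callback)

    def update(self, item_id, value):
        self._items[item_id] = value
        # Only callbacks registered for this item are invoked.
        for callback in self._subscribers.get(item_id, []):
            callback(item_id, value)


events = []
store = ItemStore()
store.subscribe("contact:1", lambda item_id, value: events.append((item_id, value)))

store.update("contact:1", "new phone number")  # subscribed: event delivered
store.update("contact:2", "new address")       # not subscribed: no event
```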

Data sharing

WinFS allows easy sharing of data between applications, and among multiple WinFS stores, which may reside on different computers, by copying to and from them. A WinFS item can also be copied to a non-WinFS file system, but unless that data item is put back into the WinFS store, it will not support the advanced services provided by WinFS.

The WinFS API also provides some support for sharing with non-WinFS applications. WinFS exposes a shell object to access WinFS stores. This object maps WinFS items to a virtual folder hierarchy, and can be accessed by any application. WinFS data can also be manually shared using network shares, by sharing the legacy shell object. Non-WinFS file formats can be stored in WinFS stores, using the File Item, provided by WinFS. Importers can be written, to convert specific file formats to WinFS Item types.

In addition, WinFS provides services to automatically synchronize items in two or more WinFS stores, subject to some predefined condition, such as "share only photos" or "share photos which have an associated contact X". The stores may be on different computers. Synchronization is done in a peer-to-peer fashion; there is no central authority. A synchronization can be manual, automatic or scheduled. During synchronization, WinFS finds the new and modified Items and updates the stores accordingly. If two or more changes conflict, WinFS can either resort to automatic resolution based on predefined rules, or defer the synchronization for manual resolution. WinFS also updates the schemas, if required.
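
Finding changes since the last synchronization and detecting conflicts can be sketched as a three-way comparison. This is a standalone Python illustration, not the WinFS synchronization service: stores are modelled as dictionaries from item names to versions, base is the state at the last synchronization, a missing key means the item is absent, and conflicting items are only reported, which corresponds to deferring them for manual resolution.

```python
def synchronize(base, a, b):
    """Merge stores a and b against their last synchronized state base."""
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(a) | set(b)):
        old, va, vb = base.get(key), a.get(key), b.get(key)
        if va == vb:
            value = va             # same on both sides
        elif va == old:
            value = vb             # only b changed
        elif vb == old:
            value = va             # only a changed
        else:
            conflicts.append(key)  # both changed, in different ways
            continue
        if value is not None:      # None means deleted or never present
            merged[key] = value
    return merged, conflicts


base = {"photo1": 1, "photo2": 1, "note": 1}
store_a = {"photo1": 2, "photo2": 1, "note": 2, "photo3": 1}
store_b = {"photo1": 1, "photo2": 3, "note": 5}

merged, conflicts = synchronize(base, store_a, store_b)
```

Here photo1 and photo3 are taken from the first store, photo2 from the second, and note is reported as a conflict because both stores changed it.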

Shell namespace

WinFS Beta 1 includes a shell namespace extension, which surfaces WinFS stores as top-level objects in the My Computer view. Files can be copied into and out of the stores, and applications can save directly to them. Even folders such as My Documents can be redirected to the stores. WinFS uses Importer plug-ins to analyze files as they are imported into the store and create the proper WinFS schemas and objects, and to re-pack the objects into files when they are taken out. If importers for certain files are not installed, they are stored as generic File types.

Microsoft Rave

Microsoft Rave is an application that shipped with WinFS Beta 1. It allows synchronization of two or more WinFS stores, and supports synchronization in full mesh mode as well as the central hub topology. While synchronizing, Microsoft Rave will determine the changes made to each store since the last sync, and update accordingly. When applying the changes, it also detects if there is any conflict, i.e., the same data has been changed on both stores since the last synchronization. It will either log the conflicting data for later resolution or have it resolved immediately. Microsoft Rave uses peer-to-peer technology to communicate and transfer data.

StoreSpy

With WinFS Beta 1, Microsoft included an unsupported application called StoreSpy, which allowed one to browse WinFS stores by presenting a hierarchical view of WinFS Items. It automatically generated virtual folders based on access permissions, date and other metadata, and presented them in a hierarchical tree view, akin to how traditional folders are presented. The application generated tabs for different Item types. StoreSpy allowed viewing Items, Relationships, MultiSet, Nested Elements, Extensions and other types in the store along with their full metadata. It also presented a search interface to perform manual searches and save them as virtual folders. The application also presented a graphical view of WinFS Rules. However, it did not allow editing of Items or their properties, though this was slated for inclusion in a future release; the WinFS project was cut back before it could materialize.

Type Browser

WinFS also includes another application, named WinFS Type Browser, which can be used to browse the WinFS types, as well as visualize the hierarchical relationship between WinFS types. A WinFS type, both built-in types as well as custom schemas, can be visualized along with all the properties and methods that it supports. It also shows the types that it derives from as well as other types that extend the type schema. However, while it was included with WinFS, it was released as an unsupported tool.

OPather

WinFS Beta 1 also includes an unsupported application named OPather. It presents a graphical interface for writing OPath queries. It is used by selecting the target object type and specifying the parameters of the query. It also includes an IntelliSense-like parameter completion feature. It can then be used to perform visualization tasks such as binding the results of a query to a DataGrid control, to create views of the data in WinFS itself, or just to extract the query string.

Project "Orange"

Microsoft launched a project to build a data visualization application for WinFS. It was codenamed "Project Orange" and was supposedly built using Windows Presentation Foundation. It was intended to provide exploration of Items stored in WinFS stores, with data relationships as a prominent part of the navigation model. It was also supposed to let people organize WinFS stores graphically, productizing many of the concepts shown in the IWish Concept Video. However, since the WinFS project went dark, the status of this project is unknown.

See also

  • Desktop organizer
  • Relational Database Management System
  • GNOME Storage, a storage management system for the GNOME desktop
  • NEPOMUK-KDE

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 