Libwww
libwww is a highly-modular client-side web API for Unix and Windows, and is also the name of the reference implementation of this API.

It can be used for both large and small applications, including web browsers and editors, robots, and batch tools. The pluggable modules provided with libwww include a complete HTTP/1.1 implementation with caching, pipelining, POST, Digest Authentication and deflate, among others.
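
Of the HTTP features named above, Digest Authentication is compact enough to sketch. The following Python fragment is an illustration of the calculation defined in RFC 2617 for `qop=auth`, not code taken from libwww; the function names `md5_hex` and `digest_response` are hypothetical, and only the Python standard library is assumed.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the hexadecimal MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri,
                    nonce, nc, cnonce, qop="auth"):
    """Compute the Digest 'response' value for qop=auth (RFC 2617)."""
    # HA1 covers the credentials, so the password itself is never sent.
    ha1 = md5_hex(f"{username}:{realm}:{password}")
    # HA2 covers the request method and the requested URI.
    ha2 = md5_hex(f"{method}:{uri}")
    # The server nonce, request counter and client nonce bind the
    # response to a single request.
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")


# Sample values taken from the worked example in RFC 2617.
response = digest_response(
    "Mufasa", "testrealm@host.com", "Circle Of Life",
    "GET", "/dir/index.html",
    "dcd98b7102dd2f0e8b11d0f600bfb0c093", "00000001", "0a4f113b",
)
```

The client sends `response` in the `Authorization` header; the server repeats the same calculation with its stored credentials and compares the two values.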

The purpose of libwww is to serve as a testbed for protocol experiments, so that software developers do not need to "reinvent the wheel".

libcurl is considered to be a modern replacement for libwww.

History

In 1991 and 1992, Tim Berners-Lee and a student at CERN named Jean-Francois Groff rewrote various components of the original WorldWideWeb browser for the NeXTstep operating system in portable C code, in order to demonstrate the potential of the World Wide Web. In the beginning, libwww was referred to as the Common Library and was not available as a separate product. Before becoming generally available, libwww was integrated into the CERN program library (CERNLIB). In July 1992 the library was ported to DECnet. In the May 1993 World Wide Web Newsletter, Berners-Lee announced that the Common Library was now called libwww and had been released into the public domain to encourage the development of web browsers. He initially considered releasing the software under the GNU General Public License rather than into the public domain, but decided against it due to concerns that large corporations such as IBM would be deterred from using it by the restrictions of the GPL. The rapid early development of the library caused Robert Cailliau problems when integrating it into his MacWWW browser.

From 25 November 1994 (version 2.17), Henrik Frystyk Nielsen was responsible for libwww. On 21 March 1995, with the release of version 3.0, CERN handed full responsibility for libwww to the World Wide Web Consortium (W3C). From 1995 onwards, the Line Mode Browser was no longer released separately, but was distributed as part of the libwww package.

The W3C created the Arena web browser as a testbed and testing tool for HTML3, CSS, PNG and other features such as libwww, but after beta 3, Arena was replaced by Amaya. On 2 September 2003 the W3C stopped development of the library due to a lack of resources, with the expectation that any further development would come from the open source community.

Features

Libwww supports the following protocols:
  • file
  • FTP
  • Gopher
  • HTTP/1.1 with a Persistent Cache Manager and pipelining
  • NNTP
  • Telnet
  • WAIS
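
The pipelining mentioned in the HTTP entry can be illustrated by looking at what a client writes to the socket. The Python sketch below is not libwww code; `build_pipelined_requests` is a hypothetical helper and `example.org` is a placeholder host. It only builds the bytes of several GET requests that would be sent back to back on one connection before any response is read.

```python
def build_pipelined_requests(host: str, paths: list) -> bytes:
    """Concatenate several HTTP/1.1 GET requests for one connection."""
    requests = []
    for index, path in enumerate(paths):
        is_last = index == len(paths) - 1
        lines = [
            f"GET {path} HTTP/1.1",
            f"Host: {host}",
            # Ask the server to close the connection after the final response.
            "Connection: close" if is_last else "Connection: keep-alive",
            "",  # the two empty entries produce the blank line
            "",  # that terminates the header block
        ]
        requests.append("\r\n".join(lines))
    return "".join(requests).encode("ascii")


# Two requests that would be written in one go, without waiting in between.
payload = build_pipelined_requests("example.org", ["/", "/about"])
```

A pipelining client then reads the responses in the same order in which the requests were sent, which is what lets it avoid a network round trip between requests.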



Other features include:
  • TLS and SSL support through OpenSSL
  • gzip compression and decompression through zlib
  • an HTML, RDF, SGML and XML parser and a style sheet manager
  • integration of an SQL database (using MySQL), e.g. for web crawlers
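
As a rough illustration of the gzip support listed above, the following Python sketch performs a gzip round trip through zlib. It uses Python's standard `zlib` binding rather than libwww's own stream interface, and the helper names `gzip_compress` and `gzip_decompress` are hypothetical.

```python
import zlib

# Adding 16 to the window size tells zlib to use the gzip container
# (header and trailer) instead of the default zlib wrapper.
GZIP_WBITS = zlib.MAX_WBITS | 16


def gzip_compress(data: bytes) -> bytes:
    """Compress data into a gzip stream using zlib."""
    compressor = zlib.compressobj(wbits=GZIP_WBITS)
    return compressor.compress(data) + compressor.flush()


def gzip_decompress(data: bytes) -> bytes:
    """Decompress a gzip stream using zlib."""
    return zlib.decompress(data, GZIP_WBITS)


body = b"<p>hello, web</p>" * 100
packed = gzip_compress(body)
restored = gzip_decompress(packed)
```

An HTTP client would typically apply the decompression step to a response body whose `Content-Encoding` header is `gzip`.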


Libwww supports plug-ins.

Applications using libwww

More than 19 applications have used libwww, including:
  • Agora
  • Arena
  • Amaya
  • Cello
  • CERN httpd server
  • Cygwin
  • Distributed Oceanographic Data Systems with OPeNDAP
  • GRIF Symposia, an HTML editor
  • Lynx
  • MacWWW
  • Mosaic and Mosaic-based browsers
  • ROS (Robot Operating System)
  • TkWeb
  • tkWWW
  • WorldWideWeb (later Nexus)


The following applications are included with libwww:
  • Command Line Tool, an application which shows how to use libwww for building simple batch mode tools for accessing the Web.
  • Line Mode Browser, a Spartan web browser.
  • Webbot, a simple application showing how to use libwww for building robots.
  • Mini Server, a small application showing how to implement a server or a proxy using libwww.

Criticism

The developers of libcurl have criticised libwww as being less portable than libcurl, not thread-safe, and lacking several HTTP authentication types.
Neither libcurl nor libwww is lightweight enough for some projects.

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 