Squid cache
Encyclopedia
Squid is a proxy server
Proxy server
In computer networks, a proxy server is a server that acts as an intermediary for requests from clients seeking resources from other servers. A client connects to the proxy server, requesting some service, such as a file, connection, web page, or other resource available from a different server...

 and web cache
Web cache
A web cache is a mechanism for the temporary storage of web documents, such as HTML pages and images, to reduce bandwidth usage, server load, and perceived lag...

 daemon
Daemon (computer software)
In Unix and other multitasking computer operating systems, a daemon is a computer program that runs as a background process, rather than being under the direct control of an interactive user...

. It has a wide variety of uses, from speeding up a web server
Web server
Web server can refer to either the hardware or the software that helps to deliver content that can be accessed through the Internet....

 by caching repeated requests; to caching web
World Wide Web
The World Wide Web is a system of interlinked hypertext documents accessed via the Internet...

, DNS
Domain name system
The Domain Name System is a hierarchical distributed naming system for computers, services, or any resource connected to the Internet or a private network. It associates various information with domain names assigned to each of the participating entities...

 and other computer network
Computer network
A computer network, often simply referred to as a network, is a collection of hardware components and computers interconnected by communication channels that allow sharing of resources and information....

 lookups for a group of people sharing network resources; to aiding security by filtering traffic. Although primarily used for HTTP and FTP
File Transfer Protocol
File Transfer Protocol is a standard network protocol used to transfer files from one host to another host over a TCP-based network, such as the Internet. FTP is built on a client-server architecture and utilizes separate control and data connections between the client and server...

, Squid includes limited support for several other protocols including TLS
Transport Layer Security
Transport Layer Security and its predecessor, Secure Sockets Layer , are cryptographic protocols that provide communication security over the Internet...

, SSL, Internet Gopher and HTTPS
Https
Hypertext Transfer Protocol Secure is a combination of the Hypertext Transfer Protocol with SSL/TLS protocol to provide encrypted communication and secure identification of a network web server...

.

Squid was originally designed to run on Unix-like
Unix-like
A Unix-like operating system is one that behaves in a manner similar to a Unix system, while not necessarily conforming to or being certified to any version of the Single UNIX Specification....

 systems, but also runs well on Windows
Microsoft Windows
Microsoft Windows is a series of operating systems produced by Microsoft.Microsoft introduced an operating environment named Windows on November 20, 1985 as an add-on to MS-DOS in response to the growing interest in graphical user interfaces . Microsoft Windows came to dominate the world's personal...

-based systems. Released under the GNU General Public License
GNU General Public License
The GNU General Public License is the most widely used free software license, originally written by Richard Stallman for the GNU Project....

, Squid is free software
Free software
Free software, software libre or libre software is software that can be used, studied, and modified without restriction, and which can be copied and redistributed in modified or unmodified form either without restriction, or with restrictions that only ensure that further recipients can also do...

.

It is used by Wikimedia Foundation
Wikimedia Foundation
Wikimedia Foundation, Inc. is an American non-profit charitable organization headquartered in San Francisco, California, United States, and organized under the laws of the state of Florida, where it was initially based...

 on Wikipedia
Wikipedia
Wikipedia is a free, web-based, collaborative, multilingual encyclopedia project supported by the non-profit Wikimedia Foundation. Its 20 million articles have been written collaboratively by volunteers around the world. Almost all of its articles can be edited by anyone with access to the site,...

.

History

Squid was originally developed by Duane Wessels as the Harvest object cache, part of the Harvest project
Harvest project
Harvest was a DARPA funded research project by the Internet Research Task Force Research Group on Resource Discovery and hosted at the University of Colorado at Boulder which provided a web cache, developed standards such as the Internet Cache Protocol and Summary Object Interchange Format, and...

 at the University of Colorado at Boulder
University of Colorado at Boulder
The University of Colorado Boulder is a public research university located in Boulder, Colorado...

. Further work on the program was completed at the University of California, San Diego
University of California, San Diego
The University of California, San Diego, commonly known as UCSD or UC San Diego, is a public research university located in the La Jolla neighborhood of San Diego, California, United States...

 and funded via two grants from the National Science Foundation
National Science Foundation
The National Science Foundation is a United States government agency that supports fundamental research and education in all the non-medical fields of science and engineering. Its medical counterpart is the National Institutes of Health...

. Duane Wessels forked the "last pre-commercial version of Harvest" and renamed it to Squid to avoid confusion with the commercial fork called Cached 2.0, which became NetCache
NetCache
NetCache is a former web cache software product which was owned and developed by NetApp between 1997 and 2006, and a hardware product family incorporating the NetCache software.-History:...

. Squid version 1.0.0 was released in July 1996.

Squid is now developed almost exclusively through volunteer efforts.

Web proxy Caching is a way to store requested Internet objects (e.g. data like web pages) available via the HTTP, FTP, and Gopher protocols on a system closer to the requesting site. Web browser
Web browser
A web browser is a software application for retrieving, presenting, and traversing information resources on the World Wide Web. An information resource is identified by a Uniform Resource Identifier and may be a web page, image, video, or other piece of content...

s can then use the local Squid cache as a proxy
Proxy server
In computer networks, a proxy server is a server that acts as an intermediary for requests from clients seeking resources from other servers. A client connects to the proxy server, requesting some service, such as a file, connection, web page, or other resource available from a different server...

 HTTP server, reducing access time as well as bandwidth
Bandwidth (computing)
In computer networking and computer science, bandwidth, network bandwidth, data bandwidth, or digital bandwidth is a measure of available or consumed data communication resources expressed in bits/second or multiples of it .Note that in textbooks on wireless communications, modem data transmission,...

 consumption. This is often useful for Internet service provider
Internet service provider
An Internet service provider is a company that provides access to the Internet. Access ISPs directly connect customers to the Internet using copper wires, wireless or fiber-optic connections. Hosting ISPs lease server space for smaller businesses and host other people servers...

s to increase speed to their customers, and LANs
Local area network
A local area network is a computer network that interconnects computers in a limited area such as a home, school, computer laboratory, or office building...

 that share an Internet
Internet
The Internet is a global system of interconnected computer networks that use the standard Internet protocol suite to serve billions of users worldwide...

 connection. Because it is also a proxy (i.e. it behaves like a client on behalf of the real client), it can provide some anonymity
Internet privacy
Internet privacy involves the right or mandate of personal privacy concerning the storing, repurposing, providing to third-parties, and displaying of information pertaining to oneself via the Internet. Privacy can entail both Personally Identifying Information or non-PII information such as a...

 and security
Security
Security is the degree of protection against danger, damage, loss, and crime. Security as a form of protection are structures and processes that provide or improve security as a condition. The Institute for Security and Open Methodologies in the OSSTMM 3 defines security as "a form of protection...

. However, it also can introduce significant privacy concerns as it can log a lot of data including URLs
Uniform Resource Locator
In computing, a uniform resource locator or universal resource locator is a specific character string that constitutes a reference to an Internet resource....

 requested, the exact date and time, the name and version of the requester's web browser and operating system, and the referrer.

A client program (e.g. browser) either has to specify explicitly the proxy server it wants to use (typical for ISP customers), or it could be using a proxy without any extra configuration: “transparent caching”, in which case all outgoing HTTP requests are intercepted by Squid and all responses are cached. The latter is typically a corporate set-up (all clients are on the same LAN) and often introduces the privacy concerns mentioned above.

Squid has some features that can help anonymize
Anonymity
Anonymity is derived from the Greek word ἀνωνυμία, anonymia, meaning "without a name" or "namelessness". In colloquial use, anonymity typically refers to the state of an individual's personal identity, or personally identifiable information, being publicly unknown.There are many reasons why a...

 connections, such as disabling or changing specific header fields in a client's
Client (computing)
A client is an application or system that accesses a service made available by a server. The server is often on another computer system, in which case the client accesses the service by way of a network....

 HTTP requests. Whether these are set, and what they are set to do, is up to the person who controls the computer running Squid. People requesting pages through a network which transparently uses Squid may not know whether this information is being logged. Within UK organisations at least, users should be informed if computers or internet connections are being monitored.

Reverse proxy

The above setup—caching the contents of an unlimited number of webservers for a limited number of clients—is the classical one. Another setup is “reverse proxy
Reverse proxy
In computer networks, a reverse proxy is a type of proxy server that retrieves resources on behalf of a client from one or more servers. These resources are then returned to the client as though it originated from the reverse proxy itself...

” or “webserver acceleration” (using http_port 80 accel vhost). In this mode, the cache serves an unlimited number of clients for a limited number of—or just one—web servers.

As an example, if slow.example.com is a “real” web server, and www.example.com is the Squid cache server that “accelerates” it, the first time any page is requested from www.example.com, the cache server would get the actual page from slow.example.com, but later requests would get the stored copy directly from the accelerator (for a configurable period, after which the stored copy would be discarded). The end result, without any action by the clients, is less traffic to the source server, meaning less CPU and memory usage, and less need for bandwidth. This does, however, mean that the source server cannot accurately report on its traffic numbers without additional configuration, as all requests would seem to have come from the reverse proxy. A way to adapt the reporting on the source server is to use the X-Forwarded-For
X-Forwarded-For
The X-Forwarded-For HTTP header field is a de facto standard for identifying the originating IP address of a client connecting to a web server through an HTTP proxy or load balancer. This is an HTTP request header which was introduced by the Squid caching proxy server's developers...

 HTTP header reported by the reverse proxy, to get the real client's IP address.

It is possible for a single Squid server to serve both as a normal and a reverse proxy simultaneously. For example, a business might host its own website on a web server, with a Squid server acting as a reverse proxy between clients (customers accessing the website from outside the business) and the web server. The same Squid server could act as a classical web cache, caching HTTP requests from clients within the business (i.e. employees accessing the internet from their workstations), so accelerating web access and reducing bandwidth demands.

Media-range limitations

This feature is used extensively by video streaming websites such as YouTube
YouTube
YouTube is a video-sharing website, created by three former PayPal employees in February 2005, on which users can upload, view and share videos....

, so that if a user clicks to the middle of the video progress bar, the server can begin to send data from the middle of the file, rather than sending the entire file from the beginning and the user waiting for the preceding data to finish loading.

Partial downloads are also extensively used by Microsoft Windows Update
Windows Update
Windows Update is a service provided by Microsoft that provides updates for the Microsoft Windows operating system and its installed components, including Internet Explorer...

 so that extremely large update packages can download in the background and pause halfway through the download, if the user turns off their computer or disconnects from the Internet.

The Metalink
Metalink
Metalink is a cross-platform and cross-application Internet standard/framework/file format for programs that download, including download managers, BitTorrent clients, Web browsers, FTP clients, and P2P programs...

 download format enables clients to do segmented downloads
Segmented downloading
Segmented downloading can be a more efficient way of downloading files from many peers at once. The one single file is downloaded, in parallel, from several distinct sources or uploaders of the file...

 by issuing partial requests and spreading these over a number of mirrors.

Squid can relay partial requests to the origin web server. In order for a partial request to be satisfied at a fast speed from cache, Squid requires a full copy of the same object to already exist in its storage.

If a proxy video user is watching a video stream and browses to a different page before the video completely downloads, Squid can not keep the partial download for reuse and simply discards the data. Special configuration is required to force such downloads to continue and be cached.

Supported platforms

Squid can run on the following operating system
Operating system
An operating system is a set of programs that manage computer hardware resources and provide common services for application software. The operating system is the most important type of system software in a computer system...

s:
  • AIX
    AIX operating system
    AIX AIX AIX (Advanced Interactive eXecutive, pronounced "a i ex" is a series of proprietary Unix operating systems developed and sold by IBM for several of its computer platforms...

  • BSDI
    BSD/OS
    BSD/OS was a proprietary version of the BSD operating system developed by Berkeley Software Design, Inc. ....

  • Digital Unix
  • FreeBSD
    FreeBSD
    FreeBSD is a free Unix-like operating system descended from AT&T UNIX via BSD UNIX. Although for legal reasons FreeBSD cannot be called “UNIX”, as the direct descendant of BSD UNIX , FreeBSD’s internals and system APIs are UNIX-compliant...

  • HP-UX
    HP-UX
    HP-UX is Hewlett-Packard's proprietary implementation of the Unix operating system, based on UNIX System V and first released in 1984...

  • IRIX
    IRIX
    IRIX is a computer operating system developed by Silicon Graphics, Inc. to run natively on their 32- and 64-bit MIPS architecture workstations and servers. It was based on UNIX System V with BSD extensions. IRIX was the first operating system to include the XFS file system.The last major version...

  • Linux
    Linux
    Linux is a Unix-like computer operating system assembled under the model of free and open source software development and distribution. The defining component of any Linux system is the Linux kernel, an operating system kernel first released October 5, 1991 by Linus Torvalds...

  • Mac OS X
    Mac OS X
    Mac OS X is a series of Unix-based operating systems and graphical user interfaces developed, marketed, and sold by Apple Inc. Since 2002, has been included with all new Macintosh computer systems...

  • NetBSD
    NetBSD
    NetBSD is a freely available open source version of the Berkeley Software Distribution Unix operating system. It was the second open source BSD descendant to be formally released, after 386BSD, and continues to be actively developed. The NetBSD project is primarily focused on high quality design,...

  • NeXTStep
    NEXTSTEP
    NeXTSTEP was the object-oriented, multitasking operating system developed by NeXT Computer to run on its range of proprietary workstation computers, such as the NeXTcube...

  • OpenBSD
    OpenBSD
    OpenBSD is a Unix-like computer operating system descended from Berkeley Software Distribution , a Unix derivative developed at the University of California, Berkeley. It was forked from NetBSD by project leader Theo de Raadt in late 1995...

  • OS/2
    OS/2
    OS/2 is a computer operating system, initially created by Microsoft and IBM, then later developed by IBM exclusively. The name stands for "Operating System/2," because it was introduced as part of the same generation change release as IBM's "Personal System/2 " line of second-generation personal...

     and eComStation
    EComStation
    eComStation or eCS is a PC operating system based on OS/2, published by Serenity Systems. It includes several additions and accompanying software not present in the IBM version of the system.-Differences between eComStation and OS/2:...

  • SCO OpenServer
    SCO OpenServer
    SCO OpenServer, previously SCO UNIX and SCO Open Desktop , is, misleadingly, a closed source version of the Unix computer operating system developed by Santa Cruz Operation and now maintained by the SCO Group....

  • Solaris
  • UnixWare
    UnixWare
    UnixWare is a Unix operating system maintained by The SCO Group . UnixWare is typically deployed as a server rather than desktop. Binary distributions of UnixWare are available for x86 architecture computers. It was originally released by Univel, a jointly owned venture of AT&T's Unix System...

  • Windows
    Microsoft Windows
    Microsoft Windows is a series of operating systems produced by Microsoft.Microsoft introduced an operating environment named Windows on November 20, 1985 as an add-on to MS-DOS in response to the growing interest in graphical user interfaces . Microsoft Windows came to dominate the world's personal...


Performance

The Squid web site claims that if working in front of the server application, it can improve performance by up to four times. Squid is especially efficient in case of (probably unexpected) high traffic to one or several particular pages, as in this case near 100% of caching can be achieved.

Recent development

By 2006 work was under way to include ICAP
Internet Content Adaptation Protocol
The Internet Content Adaptation Protocol is a lightweight HTTP-like protocol specified in RFC 3507 which is used to extend transparent proxy servers, thereby freeing up resources and standardizing the way in which new features are implemented. ICAP is generally used to implement virus scanning,...

 support

In 2007, work got seriously under way to complete and include IPv6
IPv6
Internet Protocol version 6 is a version of the Internet Protocol . It is designed to succeed the Internet Protocol version 4...

 support. As of version 3.1, this work is complete.

See also

  • Web accelerator
    Web accelerator
    A web accelerator is a proxy server that reduces web site access times. They can be a self-contained hardware appliance or installable software....

     which discusses host-based HTTP acceleration
  • Proxy server
    Proxy server
    In computer networks, a proxy server is a server that acts as an intermediary for requests from clients seeking resources from other servers. A client connects to the proxy server, requesting some service, such as a file, connection, web page, or other resource available from a different server...

     which discusses client-side proxies
  • Reverse proxy
    Reverse proxy
    In computer networks, a reverse proxy is a type of proxy server that retrieves resources on behalf of a client from one or more servers. These resources are then returned to the client as though it originated from the reverse proxy itself...

     which discusses origin-side proxies
  • Comparison of web servers
    Comparison of web servers
    -Overview:-Features:- Operating system support :...

  • Comparison of lightweight web servers
    Comparison of lightweight web servers
    Lightweight web servers are web servers which have been designed to run with very small resource overhead because of hardware, environment, or simply for the challenge of it....


External links


ategory:Free caching software]
Category:Free cross-platform software]
Category:Reverse proxy
Category:Gopher clients
Category:Unix network-related software
af:Squid
ar:سكويد
ca:Squid
da:Squid (software)
de:Squid
es:Squid (programa)
fa:اسکوئید کارساز
fr:Squid
ko:스퀴드 서버
id:Squid
it:Squid
lt:Squid podėlis (cache)
ms:Squid
nl:Squid (web-proxyserver)
ja:Squid cache
pl:Squid (oprogramowanie)
pt:Squid
ru:Squid
simple:Squid cache
sv:Squid
ta:இசுகுவிட்
th:สควิด (ซอฟต์แวร์)
uk:Squid
zh:Squid (软件)
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK