COMPRESSION OF SECURE CONTENT
FIELD OF THE INVENTION
This invention relates to secure content and, more specifically, to compression of secure content.
BACKGROUND OF THE INVENTION
Typically, compression products and related functionality do not allow for content to be secured (encrypted) during the entire time that the content is transported through the network. This is because most compression products are unable to interpret the encrypted content. In order to determine if the content can be compressed, the compression product needs to interpret the encrypted content. Most compression products are able to interpret content only when the content is unencrypted, i.e., not secure.
There is a need to have content be secure (encrypted) throughout its data flow (from backend server to client), and for the secure content to be compressed at the same time. Based on the foregoing, there is a need for a mechanism to compress secure content.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:
FIG. 1 is a block diagram that illustrates a high-level network diagram showing aspects of a computerized environment in which the compression of secure content can be performed, according to certain embodiments.
FIG. 2 is a block diagram that illustrates some of the components typically incorporated in at least some of the computer systems and other devices on which the facility executes.
FIG. 3 is a flow chart that illustrates some of the steps that the facility performs, according to certain embodiments.
FIG. 4 is a block diagram of a CSE graphical user interface (GUI), according to certain embodiments of the invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
A facility for dynamically compressing secure information, i.e., encrypted information or content, before the secure information is transported to the client that is requesting the encrypted information, is described. For purposes of explanation, a software implementation of the facility is described. However, the facility may be a software implementation, or a hardware implementation, or a combination thereof and may vary from implementation to implementation. The current embodiments are not restricted to any particular implementation.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the present invention. U.S. Provisional Patent Application No. 60/418,878 (Atty. Docket No. 36321-8030.US00), filed October 15, 2002, by Brian Metzger, is herein incorporated by reference.
In certain embodiments, the facility includes a proxy server, a cryptographic engine, and a compression service engine. The compression service engine, in collaboration with the cryptographic engine, is configurable to interpret a request for encrypted information as well as the response to the request. The purpose of such an interpretation includes determining: 1) whether the client that sent the request is capable of accepting compressed encrypted information, and 2) the type and level of compression to apply to the encrypted information, if the client is capable of accepting compressed encrypted information. If a copy of the encrypted information is already stored in a proxy cache, then the encrypted information is retrieved from the proxy cache for serving to the client, rather than requesting the encrypted information from a back-end server. Further, the encrypted information is compressed based on the determined type and level of compression before sending the encrypted information to the client in response to the client's request.
FIG. 1 is a high-level block diagram that illustrates aspects of a computerized environment 100 in which the compression of secure content can be performed, according to certain embodiments. FIG. 1 shows a plurality of clients 102a-102n, a network 104, a proxy server 106 and a back-end server 108. There may be more than one back-end server.
In certain embodiments, compression of secure content is performed with the aid of one or more other computer systems, such as proxy server 106. Components of the facility may reside on and/or execute on any combination of these computer systems, and intermediate results from the compression may similarly reside on any combination of these computer systems. According to some embodiments, the facility includes a proxy server, an encryption/decryption service engine (cryptographic engine) and a compression service engine (CSE). The facility may be embodied in a single device or distributed among various devices. For embodiments that include hardware implementations, suitable hardware interfaces are used for the CSE.
The computer systems 100 shown in FIG. 1 are connected via network 104, which may use a variety of different networking technologies, including wired, guided or line-of-sight optical, and radio frequency networking. In some embodiments, the network includes the public switched telephone network. Network connections established via the network may be fully-persistent, session-based, or intermittent, such as packet-based. While the facility typically operates in an environment such as is shown in FIG. 1 and described above, those skilled in the art will appreciate the facility may also operate in a wide variety of other environments.
In some embodiments, communication between any of clients 102a-102n and back-end server 108 is through secure communication links using a secure protocol.
FIG. 2 is a block diagram showing some of the components typically incorporated in at least some of the computer systems and other devices on which the facility executes, including some or all of the server and client computer systems shown in FIG. 1. These computer systems and devices 200 may include one or more central processing units ("CPUs") 201 for executing computer programs; a computer memory 202 for storing programs and data, including data structures, while they are being used; a persistent storage device 203, such as a hard drive, for persistently storing programs and data; a computer-readable media drive 204, such as a CD-ROM drive, for reading programs and data stored on a computer-readable medium; and a network connection 205 for connecting the computer system to other computer systems, such as via the Internet, to exchange programs and/or data, including data structures. While computer systems configured as described above are typically used to support the operation of the facility, those skilled in the art will appreciate that the facility may be implemented using devices of various types and configurations, and having various components.
Clients and servers exchange sensitive data (secure content) by encrypting the data before transmission through the network. As with any data transmission through a network, bandwidth and latency constraints are of concern. In digital systems, bandwidth is expressed as the number of bits of data per second (bps). If the bandwidth is not wide enough to support the amount of data that is being relayed at the speed the data is being processed, then a bottleneck occurs. Bottlenecks have adverse effects on latency because bottlenecks increase the amount of time it takes for a data packet to travel from the packet's source to the packet's destination.
However, not all bits of data are equal. Data compression techniques make it possible for some data to contain more information than others. Thus, short of increasing the size of the bandwidth, bandwidth and latency problems may be mitigated by maximizing the amount of information that can be transmitted by compressing the data before transmission through the network. In non-secure network systems, the data requested by clients can be easily compressed before sending such data to the clients through the network.
However, in the case of secure network systems, the usual mechanisms for compressing data are unable to interpret the encrypted data and, as a consequence, are unable to compress the data. Such mechanisms are unable to interpret encrypted data because encrypted data does not permit deep inspection of the data. For example, such mechanisms are unable to look at the payload of an encrypted data packet. Insight into the payload is needed for decisions on the type of compression techniques to apply. Further, the determination of the type of compression techniques to apply is affected by the characteristics of the client that is requesting the encrypted data, as explained in greater detail below.
For purposes of explanation, assume that a client is requesting encrypted content from the back-end server of a server system. The data requested by the client from the back-end server is herein referred to as "requested data" or "response." The client that is making the request is herein referred to as a "requesting client." FIG. 3 is a flow chart that illustrates some of the steps of a procedure 300 that the facility performs, according to certain embodiments. At block 302, it is determined whether the requesting client can accept compressed data. To make such a determination, the request is first decrypted, if the request is encrypted. The requesting client can accept compressed data if the request contains an "Accept-Encoding: gzip" header, for example. According to certain embodiments, a proxy server that can employ an encryption/decryption service engine to decrypt both the request and the response is used.
If it is determined that the requesting client is unable to receive compressed data, then procedure 300 arrives at block 304 where the requested data is retrieved and sent to the requesting client in uncompressed form. Some older versions of client browsers are unable to accept compressed data.
If the requesting client is able to receive compressed data, then procedure 300 arrives at block 306 where it is determined whether the requested data is stored in the cache. If the requested data is not already in the cache, then at block 308, the proxy server makes a request for the data from the back-end server. If the requested data is already stored in the cache, then at block 310, the proxy server retrieves the requested data from the cache.
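By way of illustration only, and not by way of limitation, blocks 306 through 310 can be sketched as follows. The function name, the dictionary used as the cache, and the fetch_from_backend callable are hypothetical and introduced only for this sketch. Storing the fetched data in the cache after block 308 is likewise an assumption.

```python
def retrieve_requested_data(url, cache, fetch_from_backend):
    """Sketch of blocks 306-310: return (data, served_from_cache).

    cache is a plain dict keyed by URL, and fetch_from_backend is any
    callable that returns the requested data for a URL. Both are
    hypothetical stand-ins for the proxy cache and the back-end server.
    """
    if url in cache:                    # block 306: is the data cached?
        return cache[url], True         # block 310: retrieve from the cache
    data = fetch_from_backend(url)      # block 308: request from the back end
    cache[url] = data                   # assumption: keep a copy for later requests
    return data, False
```

In this sketch the second request for the same URL is satisfied from the cache without contacting the back-end server.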
At block 312, the requested data is decrypted and examined to determine the desired type and level of compression. The desired type of compression may be either gzip compression or GIF compression, for example. The level of compression is the percentage by which images can be compressed. With gzip compression enabled, the CSE will compress the response if: 1) the request contains an "Accept-Encoding: gzip" header, and 2) the response does not contain a "Content-Encoding" header. With GIF compression enabled, the CSE will scale down GIF images according to a user specified level of compression or quality factor. Each image can be decoded, and a reduction algorithm can be applied. Next, the image can be re-encoded. The type of image reduction algorithm may vary from implementation to implementation. Since there is no HTTP header indicating what GIF quality is acceptable to clients, it can be assumed that all clients, on a forwarding rule with the CSE attached, accept images with the configured quality factor. The type and level of compression can be configured using a compression profile. Compression profiles are described in greater detail herein with reference to FIG. 4. Sometimes, the requested data is already in compressed form as indicated by the presence of a "Content-Encoding" header in the requested data. In such cases, the compression service engine will know that there is no need to compress the data.
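The two gzip conditions stated above can be expressed, purely as an illustrative sketch, in a few lines. The function name should_gzip and the use of plain dictionaries for the decrypted headers are assumptions made for this example. Quality values such as "gzip;q=0" are ignored here for brevity.

```python
def should_gzip(request_headers, response_headers):
    """Return True when the CSE would gzip the response.

    Condition 1: the request carries an Accept-Encoding header listing gzip.
    Condition 2: the response carries no Content-Encoding header.
    Header names are matched case-insensitively.
    """
    request = {name.lower(): value for name, value in request_headers.items()}
    response_names = {name.lower() for name in response_headers}
    # Split "gzip, deflate;q=0.5" into bare coding names.
    codings = [
        item.split(";")[0].strip().lower()
        for item in request.get("accept-encoding", "").split(",")
    ]
    return "gzip" in codings and "content-encoding" not in response_names
```

A response that already carries a Content-Encoding header is left untouched, which corresponds to the case in which the requested data is already in compressed form.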
At block 314, it is determined whether the desired compressed form of the requested data is stored in the cache. The compressed form of the requested data is also referred to herein as a "compressed data object." If the desired compressed data object is already in the cache, then the proxy server will simply serve the desired compressed data object to the requesting client. If the desired compressed data object is not in the cache, then at block 318 the CSE is called upon to apply the appropriate compression technique to compress the requested data at the desired level of compression. It is further assumed that the proxy server and associated CSE can interface with third party libraries to perform the actual content compression and image reduction. Such libraries may or may not be free and may require license fees. The performance of the proxy server and associated CSE is related to the performance of such libraries. For example, gzip compression can be performed by zlib, and GIF compression by giflib. For GIF compression, a modified version of GIFSICLE in library form can be used.
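As one hedged example of delegating gzip compression to zlib, the following sketch uses the zlib binding in the Python standard library. The function name gzip_with_zlib and the sample body are assumptions made for this illustration, and the sketch does not represent the CSE's actual interface to the library.

```python
import gzip
import zlib


def gzip_with_zlib(data: bytes, level: int = 6) -> bytes:
    """Compress data into the gzip format using zlib at the given level (1-9)."""
    # A wbits value of 16 + MAX_WBITS asks zlib to emit a gzip header and
    # trailer instead of the default zlib wrapper.
    compressor = zlib.compressobj(level, zlib.DEFLATED, 16 + zlib.MAX_WBITS)
    return compressor.compress(data) + compressor.flush()


body = b"<html>" + b"secure content " * 200 + b"</html>"
compressed = gzip_with_zlib(body, level=9)
# The gzip module can read the result back, confirming the container format.
restored = gzip.decompress(compressed)
print(len(body), "bytes before,", len(compressed), "bytes after")
```

The level argument plays the role of the configured compression level: lower values favor speed, and higher values favor a smaller compressed data object.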
A secure tunnel is maintained for the transport of the compressed data object to the requesting client. As a consequence, the compression of the requested secure
content results in improved response time and in a decreased amount of bandwidth that is needed to transport the compressed secure content.
Compression is a processor-intensive activity. A user configurable compression level controls the trade-off between the size of the compressed data object and the performance impact. As a consequence, at block 318 of FIG. 3, the proxy server may make an intelligent choice not to compress the data if the processor is heavily loaded at the time of satisfying the request. If the requested data is suitable for caching, the proxy server will cause a copy of the requested data to be compressed at a later time when the processor is less busy. The resulting compressed data object is then stored in the cache for purposes of satisfying future requests for such secure content.
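The load-dependent choice described above can be sketched as follows, for illustration only. The function names, the 0.8 load threshold, the representation of processor load as a number between 0 and 1, and the use of a queue for deferred work are all assumptions of this sketch and are not taken from any particular embodiment.

```python
import gzip
from collections import deque


def serve_under_load(key, data, cpu_load, cache, deferred,
                     cacheable=True, threshold=0.8, level=6):
    """Return (body, encoding) for one response.

    When the processor is heavily loaded, the data is served uncompressed
    and, if cacheable, queued so that it can be compressed later.
    """
    if cpu_load < threshold:
        compressed = gzip.compress(data, compresslevel=level)
        if cacheable:
            cache[(key, "gzip")] = compressed
        return compressed, "gzip"
    if cacheable:
        deferred.append((key, data, level))   # compress when less busy
    return data, "identity"


def drain_deferred(cache, deferred):
    """Run when the processor is less busy: compress queued objects into the cache."""
    while deferred:
        key, data, level = deferred.popleft()
        cache[(key, "gzip")] = gzip.compress(data, compresslevel=level)
```

Under these assumptions, a request that arrives while the processor is busy is answered immediately with the uncompressed data, and a later call to drain_deferred places the compressed data object in the cache for future requests.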
According to certain embodiments, the cache is capable of distinguishing data objects by Content-Encoding. This will prevent gzipped objects from being served to clients that did not send an "Accept-Encoding: gzip" header. As for images, the quality factor will be part of the cache lookup for a GIF to prevent other forwarding rules from accessing compressed (i.e., reduced-quality) GIF images. However, a particular forwarding rule may be such that it allows reduced-quality images to be sent as a response if it is known that the client browser, such as a PDA browser for example, has little ability to appreciate high quality images.
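One way to picture a cache that distinguishes data objects by encoding and by GIF quality factor is a composite lookup key. The following is a minimal sketch under stated assumptions: the CacheKey type, its field names, and the sample entries are hypothetical and serve only to illustrate the idea.

```python
from typing import NamedTuple, Optional


class CacheKey(NamedTuple):
    """Hypothetical cache key: same URL, different encodings, different entries."""
    url: str
    content_encoding: str               # "identity" (plain) or "gzip"
    gif_quality: Optional[int] = None   # quality factor, for GIF images only


cache = {
    CacheKey("/index.html", "identity"): b"plain bytes",
    CacheKey("/index.html", "gzip"): b"gzipped bytes",
    CacheKey("/logo.gif", "identity", 50): b"reduced-quality image",
}

# A client that did not send "Accept-Encoding: gzip" looks up the plain entry only.
plain = cache.get(CacheKey("/index.html", "identity"))
# A forwarding rule without a configured quality factor does not see the reduced GIF.
full_quality = cache.get(CacheKey("/logo.gif", "identity"))
```

Because the quality factor is part of the key, a lookup made without it misses the reduced-quality entry rather than returning it by accident.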
In some embodiments, because of the way certain image reduction algorithms work, the first client to request a compressed image may receive the original image. This will start the reduction process in the background, eventually placing the compressed image in the cache for future use.
Typically, clients accept plain encoded objects. Thus, in the case of caching multiple copies of the same data object with different encodings, a given client may be served a plain encoded object even though such a client can accept compressed content. Consider the following scenario. An older browser makes a request for "/index.html" and does not send an "Accept-Encoding: gzip" header. In such a case, the plain object is stored in the cache. A modern browser then requests the same object but does send the "Accept-Encoding: gzip" header. Since the modern browser implicitly accepts plain encodings, the plain copy from the cache is served to the modern browser, rather than compressing the requested object and sending the compressed object to the modern browser. In certain embodiments, the module informs the cache that the CSE is attached and the above scenario is treated as a cache miss. In the case where the CSE is not attached, but the back-end server itself is performing the compression, a cache setting is added wherein the cache setting allows the administrator to specify that the back-end server is performing the compression, so that plain encoding hits can be treated as misses on those forwarding rules as well.
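The scenario above can be sketched, by way of example only, as a lookup that declines to return a plain copy when a compressed copy is expected. The function name, the (url, encoding) cache keys, and the compression_expected flag are assumptions of this sketch. The flag stands for either "the CSE is attached" or "the administrator has indicated that the back-end server performs compression."

```python
def cache_lookup(cache, url, accepts_gzip, compression_expected):
    """Return (body, encoding) on a hit, or None on a miss.

    cache maps (url, encoding) pairs to bodies, where encoding is
    "identity" for plain objects or "gzip" for compressed objects.
    """
    if accepts_gzip:
        body = cache.get((url, "gzip"))
        if body is not None:
            return body, "gzip"
        if compression_expected:
            # Treat a plain-encoding hit as a miss so that a compressed
            # copy is produced instead of serving the plain copy.
            return None
    body = cache.get((url, "identity"))
    return (body, "identity") if body is not None else None
```

With only the plain copy of "/index.html" cached, the older browser receives a hit, while the modern browser receives a miss whenever compression is expected on that forwarding rule.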
The compression service engine can be configured to selectively compress secure content. As previously explained, some clients may not accept compressed content while other clients may accept compressed content. The compression service engine can be configured to select for compression, only the secure content that is destined for clients that will accept compressed content. Furthermore, different compression algorithms may be selected based on the characteristics of the content being served.
The proxy server and the associated compression service engine can be configured by administrators who have knowledge of: 1) the type of content served through such a proxy server, and 2) the web server environment. The compression service engine is configurable using a plurality of profiles. Each profile defines a different configuration. Profiles are described in greater detail herein with reference to FIG. 4.
The proxy server uses a set of forwarding rules to determine how incoming requests from client browsers are to be treated. Service engine filters are filters that allow a user to specify conditions that need to be satisfied for a CSE to process a request or response. According to certain embodiments, a CSE and a filter are attached to each forwarding rule.
If an administrator wishes to enable or disable the CSE based on the content type or User-Agent headers (or any other HTTP headers), the administrator can create an appropriate service engine filter. For example, if a particular user agent has a bug which causes it to send an "Accept-Encoding: gzip" header when it does not in fact support gzip, a service engine filter can be used to disable the module for this agent.
The same method can be used to restrict compression to objects that have Content-Type: text/*, etc. The following steps may be followed: 1) create a new profile with the desired compression levels; 2) create a response filter to control what content will be compressed; and 3) attach the profile and filter to the forwarding rule.
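A service engine filter of the kind described above can be pictured, for purposes of illustration, as a predicate over the request and response headers. In the following sketch the function name, the pattern "BrokenAgent/*", and the default patterns are hypothetical, and the headers are assumed to be plain dictionaries with canonical header names.

```python
from fnmatch import fnmatchcase


def cse_enabled(request_headers, response_headers,
                buggy_agents=("BrokenAgent/*",),
                content_types=("text/*",)):
    """Return True if the CSE should process this request/response pair."""
    user_agent = request_headers.get("User-Agent", "")
    # Disable the CSE for agents known to send a misleading Accept-Encoding header.
    if any(fnmatchcase(user_agent, pattern) for pattern in buggy_agents):
        return False
    # Drop parameters such as "; charset=utf-8" before matching the media type.
    media_type = response_headers.get("Content-Type", "").split(";")[0].strip().lower()
    return any(fnmatchcase(media_type, pattern) for pattern in content_types)
```

Attaching such a predicate to a forwarding rule, together with a profile, corresponds to the steps listed above.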
According to certain embodiments, the CSE will have a User Interface similar to other service engines. FIG. 4 is a block diagram of a CSE graphical user interface (GUI), according to certain embodiments of the invention. In FIG. 4, CSE user interface 400 comprises a profile list 402. Profile list 402 contains a list of existing profiles (compression profiles) as indicated by profile name 404. CSE GUI 400 also comprises a "create profile" section such as create profile 408. Each profile will have a properties page where the attributes of the profile can be viewed or modified. The properties page can be accessed by selecting the properties button 406.
Each profile of the CSE may include the following configurable attributes:
Profile Name 410 - name by which this profile is referred to by forwarding rules.
Log Level 412 - how much information should be logged.
Enable gzip Compression 414 - true if this profile performs gzip compression.
gzip Compression Level 416 - integer level of speed versus size, from 1 (fastest, largest) to 9 (slowest, smallest).
Enable GIF Compression 418 - true if this profile performs GIF compression.
GIF Compression Level 420 - user defined quality factor. The user will be able to set a specific number of colors to reduce to or a percentage to reduce colors by. Also, the user will be able to select from different compression algorithms, and will be able to select different algorithms for color and grayscale images.
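The profile attributes listed above can be modeled, as a non-limiting sketch, by a small record type. The class name, field names, default values, and the choice to express the GIF quality factor as a target number of colors are assumptions of this example. The algorithm selection attributes are omitted for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CompressionProfile:
    """Hypothetical in-memory form of one CSE compression profile."""
    name: str                          # Profile Name 410
    log_level: str = "info"            # Log Level 412
    enable_gzip: bool = True           # Enable gzip Compression 414
    gzip_level: int = 6                # gzip Compression Level 416 (1-9)
    enable_gif: bool = False           # Enable GIF Compression 418
    gif_colors: Optional[int] = None   # GIF Compression Level 420, as a color count

    def __post_init__(self):
        # 1 is fastest with the largest output; 9 is slowest with the smallest.
        if not 1 <= self.gzip_level <= 9:
            raise ValueError("gzip_level must be between 1 and 9")
        # A GIF palette holds at most 256 colors.
        if self.gif_colors is not None and not 2 <= self.gif_colors <= 256:
            raise ValueError("gif_colors must be between 2 and 256")


text_profile = CompressionProfile(name="text-heavy", gzip_level=9)
```

Validating the levels when the profile is created keeps an out-of-range value from reaching the compression libraries.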
In addition, there may be a check box (per-forwarding rule, for example) indicating if the back-end servers use compression. There may be one new configuration file containing all the configuration information for all CSE profiles. The name and format of this file are automatically generated, and are not of general interest. "forward.conf" contains a new per-forwarding rule setting for back-end server compression. "Backup" and "Restore" will work as in other service engines. When the CSE configuration changes, the proxy may need to be restarted. If image compression is performed in a separate process, such a process may also need to be restarted.
In the foregoing specification, embodiments of the invention have been described with reference to numerous specific details that may vary from implementation to implementation. Thus, the sole and exclusive indicator of what is the invention, and is intended by the applicants to be the invention, is the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction. Any express definitions set forth herein for terms contained in such claims shall govern the meaning of such terms as used in the claims. Hence, no limitation, element, property, feature, advantage or attribute that is not expressly recited in a claim should limit the scope of such claim in any way. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.