7 CONTENT="Modular DocBook HTML Stylesheet Version 1.76b+
10 TITLE="Privoxy 3.0.6 User Manual"
11 HREF="index.html"><LINK
13 TITLE="The Main Configuration File"
14 HREF="config.html"><LINK
17 HREF="filter-file.html"><LINK
21 <LINK REL="STYLESHEET" TYPE="text/css" HREF="p_doc.css">
33 SUMMARY="Header navigation table"
42 >Privoxy 3.0.6 User Manual</TH
64 HREF="filter-file.html"
82 > The actions files are used to define what <SPAN
92 > takes for which URLs, and thus determines
93 how ad images, cookies and various other aspects of HTTP content and
94 transactions are handled, and on which sites (or even parts thereof).
95 There are a number of such actions, with a wide range of functionality.
96 Each action does something a little different.
97 These actions give us a veritable arsenal of tools with which to exert
98 our control, preferences and independence. Actions can be combined so that
99 their effects are aggregated when applied against a given set of URLs.</P
102 are three action files included with <SPAN
117 > - is the primary action file
118 that sets the initial values for all actions. It is intended to
119 provide a base level of functionality for
123 > array of features. So it is
124 a set of broad rules that should work reasonably well as-is for most users.
125 This is the file that the developers are keeping updated, and <A
126 HREF="installation.html#INSTALLATION-KEEPUPDATED"
127 >making available to users</A
129 The user's preferences as set in <TT
152 > - is intended to be for local site
153 preferences and exceptions. As an example, if your ISP or your bank
154 has specific requirements that need special handling, this kind of
155 thing should go here. This file will not be upgraded.
163 > - is used only by the web based editor
165 HREF="http://config.privoxy.org/edit-actions-list?f=default"
167 > http://config.privoxy.org/edit-actions-list?f=default</A
169 to set various pre-defined sets of rules for the default actions section
181 >Set to Cautious</SPAN
187 >Set to Advanced</SPAN
191 > These have increasing levels of aggressiveness <SPAN
196 influence on your browsing unless you select them explicitly in the
199 >. A default installation should be pre-set to
203 > (versions prior to 3.0.5 were set to
207 >). New users should try this for a while before
208 adjusting the settings to more aggressive levels. The more aggressive
209 the settings, the greater the likelihood of problems such as sites
210 not working as they should.
216 > button allows you to turn each
217 action on/off individually for fine-tuning. The <SPAN
221 button changes the actions list to low/safe settings which will activate
222 ad blocking and a minimal set of <SPAN
225 >'s features, and subsequently
226 there will be less of a chance for accidental problems. The
230 > button sets the list to a medium level of
231 other features and a low level set of privacy features. The
235 > button sets the list to a high level of
236 ad blocking and medium level of privacy. See the chart below. The latter
237 three buttons over-ride any changes made with the
241 > button. More fine-tuning can be done in the
242 lower sections of this internal page.
245 > It is not recommended to edit the <TT
252 > The default profiles, and their associated actions, as pre-defined in
266 >Table 1. Default Configurations</B
301 >Ad-blocking Aggressiveness</TD
323 >Ad-filtering by size</TD
345 >Ad-filtering by link</TD
389 >Privacy Features</TD
455 >GIF de-animation</TD
521 >JavaScript taming</TD
565 >Image tag reordering</TD
592 > The list of actions files to be used is defined in the main configuration
593 file, and the files are processed in the order they are defined (e.g.
597 > is typically processed before
601 >). The content of these can all be viewed and
603 HREF="http://config.privoxy.org/show-status"
605 >http://config.privoxy.org/show-status</A
607 The overriding principle when applying actions is that the last action that
608 matches a given URL wins. The broadest, most general rules go first
613 followed by any exceptions (typically also in
617 >), which are then followed lastly by any
618 local preferences (typically in <SPAN
634 > An actions file typically has multiple sections. If you want to use
638 > in an actions file, you have to place the (optional)
640 HREF="actions-file.html#ALIASES"
642 > at the top of that file.
643 Then comes the default set of rules which will apply universally to all
644 sites and pages (be <SPAN
654 > or any other actions file after
658 >, because it will override the result
659 from consulting any previous file). And then below that,
660 exceptions to the defined universal policies. You can regard
664 > as an appendix to <TT
668 with the advantage that it is a separate file, which makes preserving your
669 personal settings across <SPAN
672 > upgrades easier.</P
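<P>The layout described above can be sketched as follows. This is only an illustration, not a recommended configuration: the host names are hypothetical, the alias name is an arbitrary choice, and the universal section matching <TT>/</TT> is assumed to live in the first actions file that is processed, so that it does not override results from earlier files.</P>

```text
# 1. Optional alias section, at the top of the file
{{alias}}
+block-as-image = +block +handle-as-image

# 2. Defaults that apply universally to all sites and pages
{ +deanimate-gifs{last} }
/

# 3. Exceptions and refinements, placed last so that they win
{ +block-as-image }
.example.com/images/ads/

{ -deanimate-gifs }
.animations.example.org
```

<P>Because the last matching section wins, a request to the hypothetical <TT>.animations.example.org</TT> keeps its GIF animations even though the universal section enabled <TT>deanimate-gifs</TT> earlier.</P>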
675 Actions can be used to block anything you want, including ads, banners, or
676 just some obnoxious URL that you would rather not see. Cookies can be accepted
677 or rejected, or accepted only during the current browser session (i.e. not
678 written to disk), content can be modified, JavaScripts tamed, user-tracking
679 fooled, and much more. See below for a <A
680 HREF="actions-file.html#ACTIONS"
691 >8.1. Finding the Right Mix</H2
694 HREF="actions-file.html#ACTIONS"
696 >, like cookie suppression
697 or script disabling, may render some sites unusable that rely on these
698 techniques to work properly. Finding the right mix of actions is not always easy and
699 certainly a matter of personal taste. And, things can always change, requiring
700 refinements in the configuration. In general, it can be said that the more
704 > your default settings (in the top section of the
705 actions file) are, the more exceptions for <SPAN
709 will have to make later. If, for example, you want to crunch all cookies per
710 default, you'll have to make exceptions from that rule for sites that you
711 regularly use and that require cookies for actually useful purposes, like maybe
712 your bank, favorite shop, or newspaper. </P
714 > We have tried to provide you with reasonable rules to start from in the
715 distribution actions files. But there is no general rule of thumb on these
716 things. There just are too many variables, and sites are constantly changing.
717 Sooner or later you will want to change the rules (and read this chapter again :).</P
726 >8.2. How to Edit</H2
728 > The easiest way to edit the actions files is with a browser by
729 using our browser-based editor, which can be reached from <A
730 HREF="http://config.privoxy.org/show-status"
732 >http://config.privoxy.org/show-status</A
734 The editor allows both fine-grained control over every single feature on a
735 per-URL basis, and easy choosing from wholesale sets of defaults like
749 > setting is more aggressive, and
750 will be more likely to cause problems for some sites. Experienced users only!</P
752 > If you prefer plain text editing to GUIs, you can of course also directly edit the
753 actions files with your favorite text editor. Look at
757 > which is richly commented with many
767 >8.3. How Actions are Applied to URLs</H2
769 > Actions files are divided into sections. There are special sections,
773 HREF="actions-file.html#ALIASES"
776 > sections which will
777 be discussed later. For now let's concentrate on regular sections: They have a
778 heading line (often split up to multiple lines for readability) which consists
779 of a list of actions, separated by whitespace and enclosed in curly braces.
780 Below that, there is a list of URL patterns, each on a separate line.</P
782 > To determine which actions apply to a request, the URL of the request is
783 compared to all patterns in each <SPAN
786 > file. Every time it matches, the list of
787 applicable actions for the URL is incrementally updated, using the heading
788 of the section in which the pattern is located. If multiple matches for
789 the same URL set the same action differently, the last match wins. If not,
790 the effects are aggregated. E.g. a URL might match a regular section with
791 a heading line of <TT
795 HREF="actions-file.html#HANDLE-AS-IMAGE"
799 then later another one with just <TT
803 HREF="actions-file.html#BLOCK"
813 > actions to apply. And there may well be
814 cases where you will want to combine actions together. Such a section then
832 # Block these as if they were images. Send no block page.
834 media.example.com/.*banners
835 .example.com/images/ads/</PRE
842 > You can trace this process for any given URL by visiting <A
843 HREF="http://config.privoxy.org/show-url-info"
845 >http://config.privoxy.org/show-url-info</A
848 > Examples and more detail on this are provided in the Appendix, <A
849 HREF="appendix.html#ACTIONSANAT"
850 > Troubleshooting: Anatomy of an Action</A
870 to determine what <SPAN
876 > might apply to which sites and
877 pages your browser attempts to access. These <SPAN
887 > matching to achieve a high degree of
888 flexibility. This allows one expression to be expanded and potentially match
889 against many similar patterns.</P
894 > pattern has the form
897 ><domain>/<path></TT
901 ><domain></TT
906 optional. (This is why the special <TT
909 > pattern matches all
910 URLs). Note that the protocol portion of the URL pattern (e.g.
921 the pattern. This is assumed already!</P
923 > The pattern matching syntax is different for the domain and path parts of
924 the URL. The domain part uses a simple globbing type matching technique,
925 while the path part uses a more flexible
927 HREF="http://en.wikipedia.org/wiki/Regular_expressions"
932 Expressions (PCRE)"</SPAN
943 >www.example.com/</TT
947 > is a domain-only pattern and will match any request to <TT
951 regardless of which document on that server is requested. So ALL pages in
952 this domain would be covered by the scope of this action. Note that a
956 > is different and would NOT match.
966 > means exactly the same. For domain-only patterns, the trailing <TT
976 >www.example.com/index.html</TT
980 > matches only the single document <TT
997 > matches the document <TT
1000 >, regardless of the domain,
1007 > web server anywhere.
1017 > matches nothing, since it would be interpreted as a domain name and
1018 there is no top-level domain called <TT
1034 >8.4.1. The Domain Pattern</H3
1036 > The matching of the domain part offers some flexible options: if the
1037 domain starts or ends with a dot, it becomes unanchored at that end.
1042 CLASS="VARIABLELIST"
1051 > matches any domain that <SPAN
1071 > matches any domain that <SPAN
1091 > matches any domain that <SPAN
1101 And, by the way, also included would be any files or documents that exist
1102 within that domain since no path limitations are specified. (Correctly
1103 speaking: It matches any FQDN that contains <TT
1107 a domain.) This might be <TT
1109 >www.example.com</TT
1113 >news.example.de</TT
1117 >www.example.net/cgi/testing.pl</TT
1118 > for instance. All these
1125 > Additionally, there are wild-cards that you can use in the domain names
1126 themselves. These work similarly to shell globbing type wild-cards:
1130 > represents zero or more arbitrary characters (this is
1133 HREF="http://en.wikipedia.org/wiki/Regular_expressions"
1140 > based syntax of <SPAN
1147 > represents any single character (this is equivalent to the
1148 regular expression syntax of a simple <SPAN
1151 >), and you can define
1154 >"character classes"</SPAN
1155 > in square brackets which is similar to
1156 the same regular expression technique. All of this can be freely mixed:</P
1160 CLASS="VARIABLELIST"
1165 >ad*.example.com</TT
1171 >"adserver.example.com"</SPAN
1175 >"ads.example.com"</SPAN
1176 >, etc but not <SPAN
1178 >"sfads.example.com"</SPAN
1185 >*ad*.example.com</TT
1189 > matches all of the above, and then some.
1205 >pictures.epix.com</TT
1208 >a.b.c.d.e.upix.com</TT
1215 >www[1-9a-ez].example.c*</TT
1221 >www1.example.com</TT
1225 >www4.example.cc</TT
1228 >wwwd.example.cy</TT
1232 >wwwz.example.com</TT
1242 >wwww.example.com</TT
1249 > While flexible, this does not offer the sophistication of full regular expression based syntax.</P
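<P>As a rough sketch of how the domain patterns above might appear in a section, assuming hypothetical hosts and an arbitrarily chosen action:</P>

```text
{ +crunch-incoming-cookies }
# Leading dot: any domain that ends in .example.com
.example.com
# Trailing dot: any domain that starts with www.
www.
# Wildcard: adserver.example.com and ads.example.com, but not sfads.example.com
ad*.example.com
```

<P>None of these patterns contains a path, so each one covers every document within the domains it matches.</P>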
1258 >8.4.2. The Path Pattern</H3
1263 > uses Perl compatible (PCRE)
1265 HREF="http://en.wikipedia.org/wiki/Regular_expressions"
1274 HREF="http://www.pcre.org/"
1278 matching the path portion (after the slash), and is thus more flexible.</P
1281 HREF="appendix.html#REGEX"
1283 > with a brief quick-start into regular
1284 expressions, and full (very technical) documentation on PCRE regex syntax is available on-line
1286 HREF="http://www.pcre.org/man.txt"
1288 >http://www.pcre.org/man.txt</A
1290 You might also find the Perl man page on regular expressions (<TT
1294 useful, which is available on-line at <A
1295 HREF="http://perldoc.perl.org/perlre.html"
1297 >http://perldoc.perl.org/perlre.html</A
1300 > Note that the path pattern is automatically left-anchored at the <SPAN
1304 i.e. it matches as if it started with a <SPAN
1307 > (regular expression speak
1308 for the beginning of a line).</P
1310 > Please also note that matching in the path is <SPAN
1314 >CASE INSENSITIVE</I
1317 by default, but you can switch to case sensitive at any point in the pattern by using the
1323 >www.example.com/(?-i)PaTtErN.*</TT
1325 only documents whose path starts with <TT
1335 > this capitalization.</P
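<P>A small sketch of the difference between the default case-insensitive matching and the case-sensitive switch. The host and paths are hypothetical and the action is chosen only for illustration:</P>

```text
{ +block }
# Case insensitive by default: also covers /ADS/ and /Ads/
www.example.com/ads/
# From (?-i) onwards the match is case sensitive:
# only paths starting with "/Banners" in exactly this capitalization
www.example.com/(?-i)Banners
```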
1339 CLASS="VARIABLELIST"
1344 >.example.com/.*</TT
1348 > Is equivalent to just <SPAN
1350 >".example.com"</SPAN
1351 >, since any documents
1352 within that domain are matched with or without the <SPAN
1356 regular expression. This is redundant
1362 >.example.com/.*/index.html</TT
1366 > Will match any page in the domain of <SPAN
1368 >"example.com"</SPAN
1373 >, and that is part of some path. For
1374 example, it matches <SPAN
1376 >"www.example.com/testing/index.html"</SPAN
1380 >"www.example.com/index.html"</SPAN
1381 > because the regular
1382 expression called for at least two <SPAN
1386 requirement. It also would match
1389 >"www.example.com/testing/index_html"</SPAN
1391 special meta-character <SPAN
1400 >.example.com/(.*/)?index\.html</TT
1404 > This regular expression is conditional so it will match any page
1408 > regardless of path which in this case can
1409 have one or more <SPAN
1412 >. And this one must contain exactly
1416 > (but does not have to end with that!).
1422 >.example.com/(.*/)(ads|banners?|junk)</TT
1426 > This regular expression will match any path of <SPAN
1428 >"example.com"</SPAN
1430 that contains any of the words <SPAN
1440 > (because of the <SPAN
1447 The path does not have to end in these words, just contain them.
1453 >.example.com/(.*/)(ads|banners?|junk)/.*\.(jpe?g|gif|png)$</TT
1457 > This is very much the same as above, except now it must end in either
1471 one is limited to common image formats.
1477 > There are many, many good examples to be found in <TT
1481 and more tutorials below in <A
1482 HREF="appendix.html#REGEX"
1483 >Appendix on regular expressions</A
1496 > All actions are disabled by default, until they are explicitly enabled
1497 somewhere in an actions file. Actions are turned on if preceded with a
1501 >, and turned off if preceded with a <SPAN
1510 >"do that action"</SPAN
1517 >"please block URLs that match the
1518 following patterns"</SPAN
1525 block URLs that match the following patterns, even if <TT
1529 previously applied."</SPAN
1533 Again, actions are invoked by placing them on a line, enclosed in curly braces and
1534 separated by whitespace, like in
1537 >{+some-action -some-other-action{some-parameter}}</TT
1539 followed by a list of URL patterns, one per line, to which they apply.
1540 Together, the actions line and the following pattern lines make up a section
1541 of the actions file. </P
1544 Actions fall into three categories:</P
1552 Boolean, i.e. the action can only be <SPAN
1575 > # enable action <TT
1586 > # disable action <TT
1608 Parameterized, where some value is required in order to enable this type of action.
1630 >} # enable action and set parameter to <TT
1636 # overwriting parameter from previous match if necessary
1642 > # disable action. The parameter can be omitted</PRE
1649 > Note that if the URL matches multiple positive forms of a parameterized action,
1650 the last match wins, i.e. the params from earlier matches are simply ignored.
1656 >+hide-user-agent{ Mozilla 1.0 }</TT
1663 Multi-value. These look exactly like parameterized actions,
1664 but they behave differently: If the action applies multiple times to the
1665 same URL, but with different parameters, <SPAN
1678 > matches are remembered. This is used for actions
1679 that can be executed for the same request repeatedly, like adding multiple
1680 headers, or filtering through multiple filters. Syntax:
1701 >} # enable action and add <TT
1706 > to the list of parameters
1717 >} # remove the parameter <TT
1722 > from the list of parameters
1723 # If it was the last one left, disable the action.
1729 > # disable this action completely and remove all parameters from the list</PRE
1739 >+add-header{X-Fun-Header: Some text}</TT
1743 >+filter{html-annoyances}</TT
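<P>The difference between parameterized and multi-value actions can be illustrated with two overlapping sections. This is a sketch only: the hosts, the second user agent string and the <TT>X-Other-Header</TT> header are made up for the example.</P>

```text
{ +hide-user-agent{Mozilla 1.0} +add-header{X-Fun-Header: Some text} }
.example.com

{ +hide-user-agent{Mozilla 2.0} +add-header{X-Other-Header: More text} }
www.example.com
```

<P>A request to <TT>www.example.com</TT> matches both sections. For the parameterized action <TT>hide-user-agent</TT> the last match wins, so only the second parameter is used. For the multi-value action <TT>add-header</TT> both parameters are remembered, so both headers are added.</P>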
1750 > If nothing is specified in any actions file, no <SPAN
1754 taken. So in this case <SPAN
1758 normal, non-blocking, non-anonymizing proxy. You must specifically enable the
1759 privacy and blocking features you need (although the provided default actions
1760 files will give a good starting point).</P
1762 > Later defined actions always over-ride earlier ones. So exceptions
1763 to any rules you make should come in the latter part of the file (or
1764 in a file that is processed later when using multiple actions files such
1768 >). For multi-valued actions, the actions
1769 are applied in the order they are specified. Actions files are processed in
1770 the order they are defined in <TT
1774 installation has three actions files). It is also quite possible for any given
1775 URL to match more than one <SPAN
1778 > (because of wildcards and
1779 regular expressions), and thus to trigger more than one set of actions! Last
1782 > The list of valid <SPAN
1793 >8.5.1. add-header</H4
1797 CLASS="VARIABLELIST"
1803 >Confuse log analysis, custom applications</P
1809 > Sends a user defined HTTP header to the web server.
1822 > Any string value is possible. Validity of the defined HTTP headers is not checked.
1823 It is recommended that you use the <SPAN
1837 > This action may be specified multiple times, in order to define multiple
1838 headers. This is rarely needed for the typical user. If you don't know what
1841 >"HTTP headers"</SPAN
1842 > are, you definitely don't need to worry about this
1858 >+add-header{X-User-Tracking: sucks}</PRE
1879 CLASS="VARIABLELIST"
1885 >Block ads or other unwanted content</P
1891 > Requests for URLs to which this action applies are blocked, i.e. the
1892 requests are trapped by <SPAN
1895 > and the requested URL is never retrieved,
1896 but is answered locally with a substitute page or image, as determined by
1900 HREF="actions-file.html#HANDLE-AS-IMAGE"
1907 HREF="actions-file.html#SET-IMAGE-BLOCKER"
1908 >set-image-blocker</A
1914 HREF="actions-file.html#HANDLE-AS-EMPTY-DOCUMENT"
1915 >handle-as-empty-document</A
1940 > sends a special <SPAN
1944 for requests to blocked pages. This page contains links to find out why the request
1945 was blocked, and a click-through to the blocked content (the latter only if compiled with the
1946 force feature enabled). The <SPAN
1949 > page adapts to the available
1950 screen space -- it displays full-blown if space allows, or miniaturized and text-only
1951 if loaded into a small frame or window. If you are using <SPAN
1955 right now, you can take a look at the
1957 HREF="http://ads.bannerserver.example.com/nasty-ads/sponsor.html"
1968 A very important exception occurs if <SPAN
1981 HREF="actions-file.html#HANDLE-AS-IMAGE"
1985 apply to the same request: it will then be replaced by an image. If
1989 HREF="actions-file.html#SET-IMAGE-BLOCKER"
1990 >set-image-blocker</A
1993 (see below) also applies, the type of image will be determined by its parameter,
1994 if not, the standard checkerboard pattern is sent.
1997 > It is important to understand this process, in order
1998 to understand how <SPAN
2002 ads and other unwanted content. Blocking is a core feature, and one
2003 upon which various other features depend.
2009 HREF="actions-file.html#FILTER"
2013 action can perform a very similar task, by <SPAN
2017 banner images and other content through rewriting the relevant URLs in the
2018 document's HTML source, so they don't get requested in the first place.
2019 Note that this is a totally different technique, and it's easy to confuse the two.
2023 >Example usage (section):</DT
2035 # Block and replace with "blocked" page
2036 .nasty-stuff.example.com
2038 {+block +handle-as-image}
2039 # Block and replace with image
2043 {+block +handle-as-empty-document}
2044 # Block and then ignore
2045 adserver.exampleclick.net/.*\.js$</PRE
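<P>The combination of blocking with an image replacement, as described above, might look like the following sketch. The host is hypothetical, and <TT>blank</TT> is assumed here to be one of the parameter values accepted by <TT>set-image-blocker</TT>; see that action's own section for the values it supports.</P>

```text
# Block image requests and answer them with a replacement image
# chosen by set-image-blocker instead of the default checkerboard pattern
{ +block +handle-as-image +set-image-blocker{blank} }
.ads.example.net/.*\.(gif|jpe?g|png)$
```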
2060 NAME="CONTENT-TYPE-OVERWRITE"
2062 >8.5.3. content-type-overwrite</H4
2066 CLASS="VARIABLELIST"
2072 >Stop useless download menus from popping up, or change the browser's rendering mode</P
2078 > Replaces the <SPAN
2080 >"Content-Type:"</SPAN
2081 > HTTP server header.
2103 >"Content-Type:"</SPAN
2104 > HTTP server header is used by the
2105 browser to decide what to do with the document. The value of this
2106 header can cause the browser to open a download menu instead of
2107 displaying the document by itself, even if the document's format is
2108 supported by the browser.
2111 > The declared content type can also affect which rendering mode
2112 the browser chooses. If XHTML is delivered as <SPAN
2116 many browsers treat it as yet another broken HTML document.
2117 If it is sent as <SPAN
2119 >"application/xml"</SPAN
2121 XHTML support will only display it if the syntax is correct.
2124 > If you see a web site that proudly uses XHTML buttons, but sets
2127 >"Content-Type: text/html"</SPAN
2128 >, you can use <SPAN
2132 to overwrite it with <SPAN
2134 >"application/xml"</SPAN
2136 the web master's claim inside your XHTML-supporting browser.
2137 If the syntax is incorrect, the browser will complain loudly.
2140 > You can also go the opposite direction: if your browser prints
2141 error messages instead of rendering a document falsely declared
2142 as XHTML, you can overwrite the content type with
2146 > and have it rendered as a broken HTML document.
2151 >content-type-overwrite</TT
2155 >"Content-Type:"</SPAN
2156 > headers that look like some kind of text.
2157 If you want to overwrite it unconditionally, you have to combine it with
2161 HREF="actions-file.html#FORCE-TEXT-MODE"
2165 This limitation exists for a reason; think twice before circumventing it.
2168 > Most of the time it's easier to enable
2172 HREF="actions-file.html#FILTER-SERVER-HEADERS"
2173 >filter-server-headers</A
2176 and replace this action with a custom regular expression. It allows you
2177 to activate it for every document of a certain site and it will still
2178 only replace the content types you aimed at.
2181 > Of course you can apply <TT
2183 >content-type-overwrite</TT
2185 to a whole site and then make URL based exceptions, but it's a lot
2186 more work to get the same precision.
2190 >Example usage (sections):</DT
2201 ># Check if www.example.net/ really uses valid XHTML
2202 { +content-type-overwrite{application/xml} }
2205 # but leave the content type unmodified if the URL looks like a style sheet
2206 {-content-type-overwrite}
2207 www.example.net/.*\.css$
2208 www.example.net/.*style</PRE
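<P>The opposite direction mentioned above, forcing a document that is falsely declared as XHTML to be rendered as ordinary HTML, could be sketched like this. The host name is hypothetical:</P>

```text
# Render documents from this site as (possibly broken) HTML
# instead of letting the browser reject them as invalid XHTML
{ +content-type-overwrite{text/html} }
broken-xhtml.example.org/
```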
2223 NAME="CRUNCH-CLIENT-HEADER"
2225 >8.5.4. crunch-client-header</H4
2229 CLASS="VARIABLELIST"
2235 >Remove a client header <SPAN
2238 > has no dedicated action for.</P
2244 > Deletes every header sent by the client that contains the string the user supplied as parameter.
2264 > This action allows you to block client headers for which no dedicated
2272 > will remove every client header that
2273 contains the string you supplied as parameter.
2276 > Regular expressions are <SPAN
2283 use this action to block different headers in the same request, unless
2284 they contain the same string.
2289 >crunch-client-header</TT
2290 > is only meant for quick tests.
2291 If you have to block several different headers, or only want to modify
2292 parts of them, you should enable
2296 HREF="actions-file.html#FILTER-CLIENT-HEADERS"
2297 >filter-client-headers</A
2300 and create your own filter.
2321 > Don't block any header without understanding the consequences.
2329 >Example usage (section):</DT
2340 ># Block the non-existent "Privacy-Violation:" client header
2341 { +crunch-client-header{Privacy-Violation:} }
2358 NAME="CRUNCH-IF-NONE-MATCH"
2360 >8.5.5. crunch-if-none-match</H4
2364 CLASS="VARIABLELIST"
2370 >Prevent yet another way to track the user's steps between sessions.</P
2378 >"If-None-Match:"</SPAN
2379 > HTTP client header.
2399 > Removing the <SPAN
2401 >"If-None-Match:"</SPAN
2402 > HTTP client header
2403 is useful for filter testing, where you want to force a real
2404 reload instead of getting status code <SPAN
2408 would cause the browser to use a cached copy of the page.
2411 > It is also useful to make sure the header isn't used as a cookie
2415 > Blocking the <SPAN
2417 >"If-None-Match:"</SPAN
2418 > header shouldn't cause any
2419 caching problems, as long as the <SPAN
2421 >"If-Modified-Since:"</SPAN
2423 isn't blocked as well.
2426 > It is recommended to use this action together with
2430 HREF="actions-file.html#HIDE-IF-MODIFIED-SINCE"
2431 >hide-if-modified-since</A
2438 HREF="actions-file.html#OVERWRITE-LAST-MODIFIED"
2439 >overwrite-last-modified</A
2445 >Example usage (section):</DT
2456 ># Let the browser revalidate cached documents without being tracked across sessions
2457 { +hide-if-modified-since{-60} \
2458 +overwrite-last-modified{randomize} \
2459 +crunch-if-none-match}
2475 NAME="CRUNCH-INCOMING-COOKIES"
2477 >8.5.6. crunch-incoming-cookies</H4
2481 CLASS="VARIABLELIST"
2487 > Prevent the web server from setting any cookies on your system
2496 >"Set-Cookie:"</SPAN
2497 > HTTP headers from server replies.
2517 > This action is only concerned with <SPAN
2534 HREF="actions-file.html#CRUNCH-OUTGOING-COOKIES"
2535 >crunch-outgoing-cookies</A
2544 > to disable cookies completely.
2553 > to use this action in conjunction
2557 HREF="actions-file.html#SESSION-COOKIES-ONLY"
2558 >session-cookies-only</A
2561 since it would prevent the session cookies from being set. See also
2565 HREF="actions-file.html#FILTER-CONTENT-COOKIES"
2566 >filter-content-cookies</A
2583 >+crunch-incoming-cookies</PRE
2598 NAME="CRUNCH-SERVER-HEADER"
2600 >8.5.7. crunch-server-header</H4
2604 CLASS="VARIABLELIST"
2610 >Remove a server header <SPAN
2613 > has no dedicated action for.</P
2619 > Deletes every header sent by the server that contains the string the user supplied as parameter.
2639 > This action allows you to block server headers for which no dedicated
2643 > action exists. <SPAN
2647 will remove every server header that contains the string you supplied as parameter.
2650 > Regular expressions are <SPAN
2657 use this action to block different headers in the same request, unless
2658 they contain the same string.
2663 >crunch-server-header</TT
2664 > is only meant for quick tests.
2665 If you have to block several different headers, or only want to modify
2666 parts of them, you should enable
2670 HREF="actions-file.html#FILTER-SERVER-HEADERS"
2671 >filter-server-headers</A
2674 and create your own filter.
2695 > Don't block any header without understanding the consequences.
2703 >Example usage (section):</DT
2714 ># Crunch server headers that try to prevent caching
2715 { +crunch-server-header{no-cache} }
2731 NAME="CRUNCH-OUTGOING-COOKIES"
2733 >8.5.8. crunch-outgoing-cookies</H4
2737 CLASS="VARIABLELIST"
2743 > Prevent the web server from reading any cookies from your system
2753 > HTTP headers from client requests.
2773 > This action is only concerned with <SPAN
2790 HREF="actions-file.html#CRUNCH-INCOMING-COOKIES"
2791 >crunch-incoming-cookies</A
2800 > to disable cookies completely.
2809 > to use this action in conjunction
2813 HREF="actions-file.html#SESSION-COOKIES-ONLY"
2814 >session-cookies-only</A
2817 since it would prevent the session cookies from being read.
2832 >+crunch-outgoing-cookies</PRE
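<P>Since each of the two cookie actions covers only one direction, disabling cookies completely takes both of them. The sketch below shows this next to a section that uses <TT>session-cookies-only</TT> instead; the host names are hypothetical.</P>

```text
# Disable cookies completely for a tracking domain
{ +crunch-incoming-cookies +crunch-outgoing-cookies }
.tracker.example.net

# Allow cookies for the current browser session only. The crunch actions
# are switched off here because they would keep session cookies from
# being set or read.
{ -crunch-incoming-cookies -crunch-outgoing-cookies +session-cookies-only }
.webmail.example.org
```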
2847 NAME="DEANIMATE-GIFS"
2849 >8.5.9. deanimate-gifs</H4
2853 CLASS="VARIABLELIST"
2859 >Stop those annoying, distracting animated GIF images.</P
2865 > De-animate GIF animations, i.e. reduce them to their first or last image.
2891 > This will also shrink the images considerably (in bytes, not pixels!). If
2895 > is given, the first frame of the animation
2896 is used as the replacement. If <SPAN
2899 > is given, the last
2900 frame of the animation is used instead, which probably makes more sense for
2901 most banner animations, but also has the risk of not showing the entire
2902 last frame (if it is only a delta to an earlier frame).
2905 > You can safely use this action with patterns that will also match non-GIF
2906 objects, because no attempt will be made at anything that doesn't look like
2922 >+deanimate-gifs{last}</PRE
2937 NAME="DOWNGRADE-HTTP-VERSION"
2939 >8.5.10. downgrade-http-version</H4
2943 CLASS="VARIABLELIST"
2949 >Work around (very rare) problems with HTTP/1.1</P
2955 > Downgrades HTTP/1.1 client requests and server replies to HTTP/1.0.
2975 > This is a left-over from the time when <SPAN
2979 didn't support important HTTP/1.1 features well. It is left here for the
2980 unlikely case that you experience HTTP/1.1 related problems with some server
2981 out there. Not all (optional) HTTP/1.1 features are supported yet, so there
2982 is a chance you might need this action.
2986 >Example usage (section):</DT
2997 >{+downgrade-http-version}
2998 problem-host.example.com</PRE
3013 NAME="FAST-REDIRECTS"
3015 >8.5.11. fast-redirects</H4
3019 CLASS="VARIABLELIST"
3025 >Fool some click-tracking scripts and speed up indirect links.</P
3031 > Detects redirection URLs and redirects the browser without contacting
3032 the redirection server first.
3051 >"simple-check"</SPAN
3052 > to just search for the string <SPAN
3056 to detect redirection URLs.
3063 >"check-decoded-url"</SPAN
3064 > to decode URLs (if necessary) before searching
3065 for redirection URLs.
3075 Many sites, like yahoo.com, don't just link to other sites. Instead, they
3076 will link to some script on their own servers, giving the destination as a
3077 parameter, which will then redirect you to the final target. URLs
3078 resulting from this scheme typically look like:
3081 >"http://www.example.org/click-tracker.cgi?target=http%3a//www.example.net/"</SPAN
3085 > Sometimes, there are even multiple consecutive redirects encoded in the
3086 URL. These redirections via scripts make your web browsing more traceable,
3087 since the server from which you follow such a link can see where you go
3088 to. Apart from that, valuable bandwidth and time are wasted while your
3089 browser asks the server for one redirect after the other. Plus, it feeds
3093 > This feature is currently not very smart and is scheduled for improvement.
3094 If it is enabled by default, you will have to create some exceptions to
3095 this action. It can lead to failures in several ways:
3098 > Not every URL with other URLs as parameters is evil.
3099 Some sites offer a real service that requires this information to work.
3100 For example, a validation service needs to know which document to validate.
3104 > assumes that every URL parameter that
3105 looks like another URL is a redirection target, and will always redirect to
3106 the last one. Most of the time the assumption is correct, but if it isn't,
3107 the user gets redirected anyway.
3110 > Another failure occurs if the URL contains other parameters after the URL parameter.
3114 >"http://www.example.org/?redirect=http%3a//www.example.net/&foo=bar"</SPAN
3116 contains the redirection URL <SPAN
3118 >"http://www.example.net/"</SPAN
3120 followed by another parameter. <TT
3124 and will cause a redirect to <SPAN
3126 >"http://www.example.net/&foo=bar"</SPAN
3128 Depending on the target server configuration, the parameter will be silently ignored
3131 >"page not found"</SPAN
3132 > error. It is possible to fix these redirected
3136 HREF="actions-file.html#FILTER-CLIENT-HEADERS"
3137 >filter-client-headers</A
3140 but it requires a little effort.
3143 > To detect a redirection URL, <TT
3147 looks for the string <SPAN
3150 >, either in plain text
3151 (invalid but often used) or encoded as <SPAN
3155 Some sites use their own URL encoding scheme, encrypt the address
3156 of the target server or replace it with a database id. In these cases
3160 > is fooled and the request reaches the
3161 redirection server where it probably gets logged.
3176 > { +fast-redirects{simple-check} }
3179 { +fast-redirects{check-decoded-url} }
3180 another.example.com/testing</PRE
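><P
> Because this action needs exceptions, a section like the following could
 switch it off again for a site that legitimately takes URLs as parameters.
 This is only a sketch: the host name is made up to stand in for such a
 service, and the section assumes that fast-redirects was enabled earlier
 in the actions file.</P
><PRE
CLASS="SCREEN"
># Hypothetical exception: this (made-up) validation service needs the
# URL parameter delivered intact, so don't short-circuit the request.
{ -fast-redirects }
validator.example.org</PRE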
3201 CLASS="VARIABLELIST"
3207 >Get rid of HTML and JavaScript annoyances, banner advertisements (by size),
3208 do fun text replacements, add personalized effects, etc.</P
3214 > All files of text-based type, most notably HTML and
3215 JavaScript, to which this action applies, can be filtered on-the-fly
3216 through the specified regular expression based substitutions. (Note: as of
3217 version 3.0.3 plain text documents are exempted from filtering, because
3218 web servers often use the <TT
3222 files whose type they don't know.) By default, filtering works only on the
3223 raw document content itself (that which can be seen with <TT
3241 > The name of a filter, as defined in the <A
3242 HREF="filter-file.html"
3245 Filters can be defined in one or more files as defined by the
3249 HREF="config.html#FILTERFILE"
3260 > is the collection of filters
3261 supplied by the developers. Locally defined filters should go
3262 in their own file, such as <TT
3268 > When used in its negative form,
3269 and without parameters, <SPAN
3275 > filtering is completely disabled.
3282 > For your convenience, there are a number of pre-defined filters available
3283 in the distribution filter file that you can use. See the examples below for
3287 > Filtering requires buffering the page content, which may appear to
3288 slow down page rendering since nothing is displayed until all content has
3289 passed the filters. (It does not really take longer, but seems that way
3290 since the page is not incrementally displayed.) This effect will be more
3291 noticeable on slower connections.
3296 >"Rolling your own"</SPAN
3298 filters requires a knowledge of
3300 HREF="http://en.wikipedia.org/wiki/Regular_expressions"
3309 HREF="http://en.wikipedia.org/wiki/Html"
3316 This is a very powerful feature, and potentially very intrusive.
3317 Filters should be used with caution, and where an equivalent
3324 > The amount of data that can be filtered is limited to the
3328 HREF="config.html#BUFFER-LIMIT"
3332 option in the main <A
3336 default is 4096 KB (4 Megs). Once this limit is exceeded, the buffered
3337 data, and all pending data, is passed through unfiltered.
3340 > Inappropriate MIME types, such as zipped files, are not filtered at all.
3341 (Again, only text-based types except plain text). Encrypted SSL data
3342 (from HTTPS servers) cannot be filtered either, since this would violate
3343 the integrity of the secure transaction. In some situations it might
3344 be necessary to protect certain text, like source code, from filtering
3345 by defining appropriate <TT
3351 > At this time, <SPAN
3354 > cannot uncompress compressed
3355 documents. If you want filtering to work on all documents, even those that
3356 would normally be sent compressed, you must use the
3360 HREF="actions-file.html#PREVENT-COMPRESSION"
3361 >prevent-compression</A
3364 action in conjunction with <TT
3370 > Filtering can achieve some of the same effects as the
3374 HREF="actions-file.html#BLOCK"
3378 action, i.e. it can be used to block ads and banners. But the mechanism
3379 works quite differently. One effective use is to block ad banners
3380 based on their size (see below), since many of these seem to be somewhat
3387 > with suggestions for new or
3388 improved filters is particularly welcome!
3391 > The below list has only the names and a one-line description of each
3392 predefined filter. There are <A
3393 HREF="filter-file.html#PREDEFINED-FILTERS"
3395 verbose explanations</A
3396 > of what these filters do in the <A
3397 HREF="filter-file.html"
3398 >filter file chapter</A
3403 >Example usage (with filters from the distribution <TT
3408 HREF="filter-file.html#PREDEFINED-FILTERS"
3409 >the Predefined Filters section</A
3411 more explanation on each:</DT
3415 NAME="FILTER-JS-ANNOYANCES"
3426 >+filter{js-annoyances} # Get rid of particularly annoying JavaScript abuse</PRE
3434 NAME="FILTER-JS-EVENTS"
3445 >+filter{js-events} # Kill all JS event bindings (Radically destructive! Only for extra nasty sites)</PRE
3453 NAME="FILTER-HTML-ANNOYANCES"
3464 >+filter{html-annoyances} # Get rid of particularly annoying HTML abuse</PRE
3472 NAME="FILTER-CONTENT-COOKIES"
3483 >+filter{content-cookies} # Kill cookies that come in the HTML or JS content</PRE
3491 NAME="FILTER-REFRESH-TAGS"
3502 >+filter{refresh-tags} # Kill automatic refresh tags (for dial-on-demand setups)</PRE
3510 NAME="FILTER-UNSOLICITED-POPUPS"
3521 >+filter{unsolicited-popups} # Disable only unsolicited pop-up windows. Useful if your browser lacks this ability.</PRE
3529 NAME="FILTER-ALL-POPUPS"
3540 >+filter{all-popups} # Kill all popups in JavaScript and HTML. Useful if your browser lacks this ability.</PRE
3548 NAME="FILTER-IMG-REORDER"
3559 >+filter{img-reorder} # Reorder attributes in <img> tags to make the banners-by-* filters more effective</PRE
3567 NAME="FILTER-BANNERS-BY-SIZE"
3578 >+filter{banners-by-size} # Kill banners by size</PRE
3586 NAME="FILTER-BANNERS-BY-LINK"
3597 >+filter{banners-by-link} # Kill banners by their links to known clicktrackers</PRE
3605 NAME="FILTER-WEBBUGS"
3616 >+filter{webbugs} # Squish WebBugs (1x1 invisible GIFs used for user tracking)</PRE
3624 NAME="FILTER-TINY-TEXTFORMS"
3635 >+filter{tiny-textforms} # Extend those tiny textareas up to 40x80 and kill the hard wrap</PRE
3643 NAME="FILTER-JUMPING-WINDOWS"
3654 >+filter{jumping-windows} # Prevent windows from resizing and moving themselves</PRE
3662 NAME="FILTER-FRAMESET-BORDERS"
3673 >+filter{frameset-borders} # Give frames a border and make them resizeable</PRE
3681 NAME="FILTER-DEMORONIZER"
3692 >+filter{demoronizer} # Fix MS's non-standard use of standard charsets</PRE
3700 NAME="FILTER-SHOCKWAVE-FLASH"
3711 >+filter{shockwave-flash} # Kill embedded Shockwave Flash objects</PRE
3719 NAME="FILTER-QUICKTIME-KIOSKMODE"
3730 >+filter{quicktime-kioskmode} # Make Quicktime movies savable</PRE
3749 >+filter{fun} # Text replacements for subversive browsing fun!</PRE
3757 NAME="FILTER-CRUDE-PARENTAL"
3768 >+filter{crude-parental} # Crude parental filtering (demo only)</PRE
3776 NAME="FILTER-IE-EXPLOITS"
3787 >+filter{ie-exploits} # Disable some known Internet Explorer bug exploits</PRE
3795 NAME="FILTER-SITE-SPECIFICS"
3806 >+filter{site-specifics} # Custom filters for specific site related problems</PRE
3814 NAME="FILTER-GOOGLE"
3825 >+filter{google} # Removes text ads and other Google specific improvements</PRE
3844 >+filter{yahoo} # Removes text ads and other Yahoo specific improvements</PRE
3863 >+filter{msn} # Removes text ads and other MSN specific improvements</PRE
3871 NAME="FILTER-BLOGSPOT"
3882 >+filter{blogspot} # Cleans up Blogspot blogs</PRE
3890 NAME="FILTER-HTML-TO-XML"
3901 >+filter{html-to-xml} # Header filter to change the Content-Type from html to xml</PRE
3909 NAME="FILTER-XML-TO-HTML"
3920 >+filter{xml-to-html} # Header filter to change the Content-Type from xml to html</PRE
3928 NAME="FILTER-NO-PING"
3939 >+filter{no-ping} # Removes non-standard ping attributes from anchor and area tags</PRE
3947 NAME="FILTER-HIDE-TOR-EXIT-NOTATION"
3958 >+filter{hide-tor-exit-notation} # Header filter to remove the Tor exit node notation in Host and Referer headers</PRE
3973 NAME="FILTER-CLIENT-HEADERS"
3975 >8.5.13. filter-client-headers</H4
3979 CLASS="VARIABLELIST"
3985 > To apply filtering to the client's (browser's) headers
3995 > filters only apply
3996 to the document content itself. This will extend those filters to
3997 include the client's headers as well.
4017 > Regular expressions can be used to filter headers as well. Check your
4018 filters closely before activating this action, as it can easily lead to broken
4023 These filters are applied to each header on its own, not to them
4024 all at once. This makes it easier to diagnose problems, but on the downside
4025 you can't write filters that only change header x if header y's value is
4029 > The filters are used after the other header actions have finished and can
4030 use their output as input.
4033 > Whenever possible one should specify <TT
4040 >, the whole header name and the colon, to make sure
4041 the filter doesn't cause havoc to other headers or the
4042 page itself. For example if you want to transform
4061 >s@Galeon/\d\.\d\.\d @@</PRE
4077 >s@^(User-Agent:.*) Galeon/\d\.\d\.\d (Firefox/\d\.\d\.\d\.\d)$@$1 $2@</PRE
4084 >Example usage (section):</DT
4095 >{+filter-client-headers +filter{test_filter}}
4096 problem-host.example.com
4112 NAME="FILTER-SERVER-HEADERS"
4114 >8.5.14. filter-server-headers</H4
4118 CLASS="VARIABLELIST"
4124 > To apply filtering to the server's headers
4134 > filters only apply
4135 to the document content itself. This will extend those filters to
4136 include the server's headers as well.
4158 >filter-client-headers</TT
4160 the server instead. To filter both server and client, use both.
4165 >filter-client-headers</TT
4167 filters before activating this action, as it can easily lead to broken
4172 These filters are applied to each header on its own, not to them
4173 all at once. This makes it easier to diagnose problems, but on the downside
4174 you can't write filters that only change header x if header y's value is
4178 > The filters are used after the other header actions have finished and can
4179 use their output as input.
4182 > Remember too, whenever possible one should specify <TT
4189 >, the whole header name and the colon, to make sure
4190 the filter doesn't cause havoc to other headers or the
4191 page itself. See above for example.
4195 >Example usage (section):</DT
4206 >{+filter-server-headers +filter{test_filter}}
4207 problem-host.example.com
4223 NAME="FORCE-TEXT-MODE"
4225 >8.5.15. force-text-mode</H4
4229 CLASS="VARIABLELIST"
4238 > to treat a document as if it was in some kind of <SPAN
4250 > Declares a document as text, even if the <SPAN
4252 >"Content-Type:"</SPAN
4253 > isn't detected as such.
4276 HREF="actions-file.html#FILTER"
4283 > tries to only filter files that are
4284 in some kind of text format. The same restrictions apply to
4288 HREF="actions-file.html#CONTENT-TYPE-OVERWRITE"
4289 >content-type-overwrite</A
4294 >force-text-mode</TT
4295 > declares a document as text,
4296 without looking at the <SPAN
4298 >"Content-Type:"</SPAN
4320 > Think twice before activating this action. Filtering binary data
4321 with regular expressions can cause file damage.
4356 NAME="HANDLE-AS-EMPTY-DOCUMENT"
4358 >8.5.16. handle-as-empty-document</H4
4362 CLASS="VARIABLELIST"
4368 >Mark URLs that should be replaced by empty documents <SPAN
4372 >if they get blocked</I
4380 > This action alone doesn't do anything noticeable. It just marks URLs.
4384 HREF="actions-file.html#BLOCK"
4394 the presence or absence of this mark decides whether an HTML <SPAN
4398 page, or an empty document will be sent to the client as a substitute for the blocked content.
4405 > document isn't literally empty, but actually contains a single space.
4425 > Some browsers complain about syntax errors if JavaScript documents
4426 are blocked with <SPAN
4430 default HTML page; this option can be used to silence them.
4431 And of course this action can also be used to eliminate the <SPAN
4435 BLOCKED message in frames.
4438 > The content type for the empty document can be specified with
4442 HREF="actions-file.html#CONTENT-TYPE-OVERWRITE"
4443 >content-type-overwrite{}</A
4446 but usually this isn't necessary.
4461 ># Block all documents on example.org that end with ".js",
4462 # but send an empty document instead of the usual HTML message.
4463 {+block +handle-as-empty-document}
4480 NAME="HANDLE-AS-IMAGE"
4482 >8.5.17. handle-as-image</H4
4486 CLASS="VARIABLELIST"
4492 >Mark URLs as belonging to images (so they'll be replaced by images <SPAN
4496 >if they do get blocked</I
4498 >, rather than HTML pages)</P
4504 > This action alone doesn't do anything noticeable. It just marks URLs as images.
4508 HREF="actions-file.html#BLOCK"
4518 the presence or absence of this mark decides whether an HTML <SPAN
4522 page, or a replacement image (as determined by the <TT
4525 HREF="actions-file.html#SET-IMAGE-BLOCKER"
4526 >set-image-blocker</A
4528 > action) will be sent to the
4529 client as a substitute for the blocked content.
4549 > The below generic example section is actually part of <TT
4553 It marks all URLs with well-known image file name extensions as images and should
4557 > Users will probably only want to use the handle-as-image action in conjunction with
4561 HREF="actions-file.html#BLOCK"
4564 >, to block sources of banners, whose URLs don't
4565 reflect the file type, like in the second example section.
4568 > Note that you cannot treat HTML pages as images in most cases. For instance, (in-line) ad
4569 frames require an HTML page to be sent, or they won't display properly.
4572 >handle-as-image</TT
4573 > in this situation will not replace the
4574 ad frame with an image, but lead to error messages.
4578 >Example usage (sections):</DT
4589 ># Generic image extensions:
4592 /.*\.(gif|jpg|jpeg|png|bmp|ico)$
4594 # These don't look like images, but they're banners and should be
4595 # blocked as images:
4597 {+block +handle-as-image}
4598 some.nasty-banner-server.com/junk.cgi?output=trash
4600 # Banner source! Who cares if they also have non-image content?
4601 ad.doubleclick.net </PRE
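><P
> As a rough sketch of how the image mark and set-image-blocker can work
 together in one section: the host name below is hypothetical, and
 "blank" is assumed here to be the set-image-blocker parameter that
 selects the built-in transparent image.</P
><PRE
CLASS="SCREEN"
># Hypothetical banner host: block it, mark the result as an image,
# and send a transparent replacement instead of the HTML BLOCKED page.
{ +block +handle-as-image +set-image-blocker{blank} }
banners.example.net</PRE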
4616 NAME="HIDE-ACCEPT-LANGUAGE"
4618 >8.5.18. hide-accept-language</H4
4622 CLASS="VARIABLELIST"
4628 >Pretend to use different language settings.</P
4634 > Deletes or replaces the <SPAN
4636 >"Accept-Language:"</SPAN
4637 > HTTP header in client requests.
4653 >, or any user defined value.
4660 > Faking the browser's language settings can be useful to make a
4661 foreign User-Agent set with
4665 HREF="actions-file.html#HIDE-USER-AGENT"
4672 > However some sites with content in different languages check the
4675 >"Accept-Language:"</SPAN
4676 > to decide which one to take by default.
4677 Sometimes it isn't possible to later switch to another language without
4680 >"Accept-Language:"</SPAN
4684 > Therefore it's a good idea to either only change the
4687 >"Accept-Language:"</SPAN
4688 > header to languages you understand,
4689 or to languages that aren't widespread.
4692 > Before setting the <SPAN
4694 >"Accept-Language:"</SPAN
4696 to a rare language, you should consider that it helps to
4697 make your requests unique and thus easier to trace.
4698 If you don't plan to change this header frequently,
4699 you should stick to a common language.
4703 >Example usage (section):</DT
4714 ># Pretend to use Canadian language settings.
4715 {+hide-accept-language{en-ca} \
4716 +hide-user-agent{Mozilla/5.0 (X11; U; OpenBSD i386; en-CA; rv:1.8.0.4) Gecko/20060628 Firefox/1.5.0.4} \
4733 NAME="HIDE-CONTENT-DISPOSITION"
4735 >8.5.19. hide-content-disposition</H4
4739 CLASS="VARIABLELIST"
4745 >Prevent download menus for content you prefer to view inside the browser.</P
4751 > Deletes or replaces the <SPAN
4753 >"Content-Disposition:"</SPAN
4754 > HTTP header set by some servers.
4770 >, or any user defined value.
4777 > Some servers set the <SPAN
4779 >"Content-Disposition:"</SPAN
4781 documents they assume you want to save locally before viewing them.
4784 >"Content-Disposition:"</SPAN
4785 > header contains the file name
4786 the browser is supposed to use by default.
4789 > In most browsers that understand this header, it makes it impossible to
4796 > the document, without downloading it first,
4797 even if it's just a simple text file or an image.
4800 > Removing the <SPAN
4802 >"Content-Disposition:"</SPAN
4804 to prevent this annoyance, but some browsers additionally check the
4807 >"Content-Type:"</SPAN
4808 > header, before they decide if they can
4809 display a document without saving it first. In these cases, you have
4810 to change this header as well, before the browser stops displaying
4814 > It is also possible to change the server's file name suggestion
4815 to another one, but in most cases it isn't worth the time to set
4831 ># Disarm the download link in Sourceforge's patch tracker
4833 +content-type-overwrite{text/plain}\
4834 +hide-content-disposition{block} }
4835 .sourceforge.net/tracker/download.php</PRE
4850 NAME="HIDE-IF-MODIFIED-SINCE"
4852 >8.5.20. hide-if-modified-since</H4
4856 CLASS="VARIABLELIST"
4862 >Prevent yet another way to track the user's steps between sessions.</P
4870 >"If-Modified-Since:"</SPAN
4871 > HTTP client header or modifies its value.
4887 >, or a user defined value that specifies a range of minutes.
4894 > Removing this header is useful for filter testing, where you want to force a real
4895 reload instead of getting status code <SPAN
4898 >, which would cause the
4899 browser to use a cached copy of the page.
4902 > Instead of removing the header, <TT
4904 >hide-if-modified-since</TT
4906 also add or subtract a random amount of time to/from the header's value.
4907 You specify a range of minutes where the random factor should be chosen from and
4911 > does the rest. A negative value means
4912 subtracting, a positive value adding.
4915 > Randomizing the value of the <SPAN
4917 >"If-Modified-Since:"</SPAN
4919 sure it isn't used as a cookie replacement, but you will run into
4920 caching problems if the random range is too high.
4923 > It is a good idea to only use a small negative value and let
4927 HREF="actions-file.html#OVERWRITE-LAST-MODIFIED"
4928 >overwrite-last-modified</A
4931 handle the greater changes.
4934 > It is also recommended to use this action together with
4938 HREF="actions-file.html#CRUNCH-IF-NONE-MATCH"
4939 >crunch-if-none-match</A
4945 >Example usage (section):</DT
4956 ># Let the browser revalidate without being tracked across sessions
4957 { +hide-if-modified-since{-60} \
4958 +overwrite-last-modified{randomize} \
4959 +crunch-if-none-match}
4975 NAME="HIDE-FORWARDED-FOR-HEADERS"
4977 >8.5.21. hide-forwarded-for-headers</H4
4981 CLASS="VARIABLELIST"
4987 >Improve privacy by hiding the true source of the request</P
4993 > Deletes any existing <SPAN
4995 >"X-Forwarded-for:"</SPAN
4996 > HTTP header from client requests,
4997 and prevents adding a new one.
5017 > It is fairly safe to leave this on.
5020 > This action is scheduled for improvement: It should be able to generate forged
5023 >"X-Forwarded-for:"</SPAN
5024 > headers using random IP addresses from a specified network,
5025 to make successive requests from the same client look like requests from a pool of different
5026 users sharing the same proxy.
5041 >+hide-forwarded-for-headers</PRE
5056 NAME="HIDE-FROM-HEADER"
5058 >8.5.22. hide-from-header</H4
5062 CLASS="VARIABLELIST"
5068 >Keep your (old and ill) browser from telling web servers your email address</P
5074 > Deletes any existing <SPAN
5077 > HTTP header, or replaces it with the
5094 >, or any user defined value.
5104 > will completely remove the header
5105 (not to be confused with the <TT
5108 HREF="actions-file.html#BLOCK"
5115 > Alternately, you can specify any value you prefer to be sent to the web
5116 server. If you do, it is a matter of fairness not to use any address that
5117 is actually used by a real person.
5120 > This action is rarely needed, as modern web browsers don't send
5139 >+hide-from-header{block}</PRE
5152 >+hide-from-header{spam-me-senseless@sittingduck.example.com}</PRE
5167 NAME="HIDE-REFERRER"
5169 >8.5.23. hide-referrer</H4
5176 CLASS="VARIABLELIST"
5182 >Conceal which link you followed to get to a particular site</P
5191 > (sic) HTTP header from the client request,
5192 or replaces it with a forged one.
5211 >"conditional-block"</SPAN
5212 > to delete the header completely if the host has changed.</P
5219 > to delete the header unconditionally.</P
5226 > to pretend to be coming from the homepage of the server we are talking to.</P
5230 >Any other string to set a user defined referrer.</P
5240 >conditional-block</TT
5241 > is the only parameter
5242 that isn't easily detected in the server's log file. If it blocks the
5243 referrer, the request will look like the visitor used a bookmark or
5244 typed in the address directly.
5247 > Leaving the referrer unmodified for requests on the same host
5248 allows the server owner to see the visitor's <SPAN
5252 but in most cases she could also get that information by comparing
5253 other parts of the log file: for example the User-Agent if it isn't
5254 a very common one, or the user's IP address if it doesn't change between
5258 > Always blocking the referrer, or using a custom one, can lead to
5259 failures on servers that check the referrer before they answer any
5260 requests, in an attempt to prevent their valuable content from being
5261 embedded or linked to elsewhere.
5266 >conditional-block</TT
5271 will work with referrer checks, as long as content and valid referring page
5272 are on the same host. Most of the time that's the case.
5279 > is an alternate spelling of
5283 > and the two can be freely
5284 substituted with each other. (<SPAN
5288 correct English spelling, however the HTTP specification has a bug - it
5289 requires it to be spelled as <SPAN
5307 >+hide-referrer{forge}</PRE
5320 >+hide-referrer{http://www.yahoo.com/}</PRE
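><P
> The conditional-block parameter described above can be written in the
 same bare form. This is a minimal sketch, not a recommended default:</P
><PRE
CLASS="SCREEN"
># Drop the referrer only when the request goes to a different host;
# same-host requests keep it, so simple referrer checks still pass.
+hide-referrer{conditional-block}</PRE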
5335 NAME="HIDE-USER-AGENT"
5337 >8.5.24. hide-user-agent</H4
5341 CLASS="VARIABLELIST"
5347 >Conceal your type of browser and client operating system</P
5353 > Replaces the value of the <SPAN
5355 >"User-Agent:"</SPAN
5357 in client requests with the specified value.
5370 > Any user-defined string.
5395 > This can lead to problems on web sites that depend on looking at this header in
5396 order to customize their content for different browsers (which, by the
5403 > the right thing to do: good web sites
5404 work browser-independently).
5412 > Using this action in multi-user setups or wherever different types of
5413 browsers will access the same <SPAN
5423 >. In single-user, single-browser
5424 setups, you might use it to delete your OS version information from
5425 the headers, because it is an invitation to exploit known bugs for your
5426 OS. It is also occasionally useful to forge this in order to access
5427 sites that won't let you in otherwise (though there may be a good
5428 reason in some cases). Example of this: some MSN sites will not
5432 > enter, yet forging to a
5436 > user-agent works just fine.
5437 (Must be just a silly MS goof, I'm sure :-).
5440 > This action is scheduled for improvement.
5455 >+hide-user-agent{Netscape 6.1 (X11; I; Linux 2.4.18 i686)}</PRE
5470 NAME="INSPECT-JPEGS"
5472 >8.5.25. inspect-jpegs</H4
5476 CLASS="VARIABLELIST"
5482 >To protect against the MS buffer over-run in JPEG processing</P
5488 > Protect against a known exploit
5508 > See Microsoft Security Bulletin MS04-028. JPEG images are one of the most
5509 common image types found across the Internet. The exploit as described can
5510 allow execution of code on the target system, giving an attacker access
5511 to the system in question by merely planting an altered JPEG image, which
5512 would have no obvious indications of what lurks inside. This action
5513 prevents unwanted intrusion.
5528 >+inspect-jpegs</PRE
5544 >8.5.26. kill-popups<A
5551 CLASS="VARIABLELIST"
5557 >Eliminate those annoying pop-up windows (deprecated)</P
5563 > While loading the document, replace JavaScript code that opens
5564 pop-up windows with (syntactically neutral) dummy code on the fly.
5584 > This action is basically a built-in, hardwired special-purpose filter
5585 action, but there are important differences: For <TT
5589 the document need not be buffered, so it can be incrementally rendered while
5590 downloading. But <TT
5593 > doesn't catch as many pop-ups as
5597 HREF="actions-file.html#FILTER-ALL-POPUPS"
5606 does and is not as smart as <TT
5609 HREF="actions-file.html#FILTER-UNSOLICITED-POPUPS"
5613 >unsolicited-popups</I
5621 > Think of it as a fast and efficient replacement for a filter that you
5622 can use if you don't want any filtering at all. Note that it doesn't make
5623 sense to combine it with any <TT
5626 HREF="actions-file.html#FILTER"
5630 since as soon as one <TT
5633 HREF="actions-file.html#FILTER"
5637 the whole document needs to be buffered anyway, which destroys the advantage of
5641 > action over its filter equivalent.
5644 > Killing all pop-ups unconditionally is problematic. Many shops and banks rely on
5645 pop-ups to display forms, shopping carts etc, and the <TT
5648 HREF="actions-file.html#FILTER-UNSOLICITED-POPUPS"
5652 >unsolicited-popups</I
5657 > does a better job of catching only the unwanted ones.
5660 > If the only kind of pop-ups that you want to kill are exit consoles (those
5667 > windows that appear when you close another
5668 one), you might want to use
5672 HREF="actions-file.html#FILTER"
5684 > This action is most appropriate for browsers that don't have any controls
5685 for unwanted pop-ups. Not recommended for general usage.
5714 NAME="LIMIT-CONNECT"
5716 >8.5.27. limit-connect</H4
5720 CLASS="VARIABLELIST"
5726 >Prevent abuse of <SPAN
5729 > as a TCP proxy relay or disable SSL for untrusted sites</P
5735 > Specifies to which ports HTTP CONNECT requests are allowable.
5748 > A comma-separated list of ports or port ranges (the latter using dashes, with the minimum
5749 defaulting to 0 and the maximum to 65K).
5756 > By default, i.e. if no <TT
5763 > only allows HTTP CONNECT
5764 requests to port 443 (the standard, secure HTTPS port). Use
5768 > if more fine-grained control is desired
5769 for some or all destinations.
5772 > The CONNECT method exists in HTTP to allow access to secure websites
5776 > URLs) through proxies. It works very simply:
5777 the proxy connects to the server on the specified port, and then
5778 short-circuits its connections to the client and to the remote server.
5779 This can be a big security hole, since CONNECT-enabled proxies can be
5780 abused as TCP relays very easily.
5786 > relays HTTPS traffic without seeing
5787 the decoded content. Websites can leverage this limitation to circumvent <SPAN
5791 filters. By specifying an invalid port range you can disable HTTPS entirely.
5792 If you plan to disable SSL by default, consider enabling
5796 HREF="actions-file.html#TREAT-FORBIDDEN-CONNECTS-LIKE-BLOCKS"
5797 >treat-forbidden-connects-like-blocks</A
5800 as well, to be able to quickly create exceptions.
5804 >Example usages:</DT
5815 >+limit-connect{443} # This is the default and need not be specified.
5816 +limit-connect{80,443} # Ports 80 and 443 are OK.
5817 +limit-connect{-3, 7, 20-100, 500-} # Ports less than 3, 7, 20 to 100 and above 500 are OK.
5818 +limit-connect{-} # All ports are OK
5819 +limit-connect{,} # No HTTPS/SSL traffic is allowed</PRE
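><P
> Put together as sections, disabling SSL by default and then making an
 exception might look roughly like this. It is a sketch under assumptions:
 the host name is hypothetical, and "/" is used as the pattern that matches
 all URLs.</P
><PRE
CLASS="SCREEN"
># Forbid all CONNECT requests by default, and present the refusals
# like ordinary blocks so that exceptions are quick to create.
{ +limit-connect{,} +treat-forbidden-connects-like-blocks }
/

# Hypothetical trusted site: allow CONNECT to the standard HTTPS port again.
{ +limit-connect{443} }
secure.example.com</PRE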
5834 NAME="PREVENT-COMPRESSION"
5836 >8.5.28. prevent-compression</H4
5840 CLASS="VARIABLELIST"
5846 > Ensure that servers send the content uncompressed, so it can be
5850 HREF="actions-file.html#FILTER"
5860 > Removes the Accept-Encoding header which can be used to ask for compressed transfer.
5880 > More and more websites send their content compressed by default, which
5881 is generally a good idea and saves bandwidth. But for the <TT
5884 HREF="actions-file.html#FILTER"
5890 HREF="actions-file.html#DEANIMATE-GIFS"
5897 HREF="actions-file.html#KILL-POPUPS"
5904 > needs access to the uncompressed data.
5905 Unfortunately, <SPAN
5908 > can't yet(!) uncompress, filter, and
5909 re-compress the content on the fly. So if you want to ensure that all websites, including
5910 those that normally compress, can be filtered, you need to use this action.
5913 > This will slow down transfers from those websites, though. If you use any of the above-mentioned
5914 actions, you will typically want to use <TT
5916 >prevent-compression</TT
5921 > Note that some (rare) ill-configured sites don't handle requests for uncompressed
5922 documents correctly (they send an empty document body). If you use <TT
5924 >prevent-compression</TT
5926 by default, you'll have to add exceptions for those sites. See the example for how to do that.
5930 >Example usage (sections):</DT
5941 ># Selectively turn off compression, and enable a filter
5943 { +filter{tiny-textforms} +prevent-compression }
5944 # Match only these sites
5949 # Or instead, we could set a universal default:
5951 { +prevent-compression }
5954 # Then maybe make exceptions for ill-behaved sites:
5956 { -prevent-compression }
5958 www.pclinuxonline.com</PRE
5973 NAME="OVERWRITE-LAST-MODIFIED"
5975 >8.5.29. overwrite-last-modified</H4
5979 CLASS="VARIABLELIST"
5985 >Prevent yet another way to track the user's steps between sessions.</P
5993 >"Last-Modified:"</SPAN
5994 > HTTP server header or modifies its value.
6007 > One of the keywords: <SPAN
6012 >"reset-to-request-time"</SPAN
6024 > Removing the <SPAN
6026 >"Last-Modified:"</SPAN
6027 > header is useful for filter
6028 testing, where you want to force a real reload instead of getting status
6032 >, which would cause the browser to reuse the old
6033 version of the page.
6039 > option overwrites the value of the
6042 >"Last-Modified:"</SPAN
6043 > header with a randomly chosen time
6044 between the original value and the current time. In theory the server
6045 could send each document with a different <SPAN
6047 >"Last-Modified:"</SPAN
6049 header to track visits without using cookies. <SPAN
6053 makes it impossible and the browser can still revalidate cached documents.
6058 >"reset-to-request-time"</SPAN
6059 > overwrites the value of the
6062 >"Last-Modified:"</SPAN
6063 > header with the current time. You could use
6064 this option together with
6068 HREF="actions-file.html#HIDE-IF-MODIFIED-SINCE"
6069 >hide-if-modified-since</A
6072 to further customize your random range.
6075 > The preferred parameter here is <SPAN
6079 to use, as long as the time settings are more or less correct.
6080 If the server sets the <SPAN
6082 >"Last-Modified:"</SPAN
6083 > header to the time
6084 of the request, the random range becomes zero and the value stays the same.
6085 Therefore you should later randomize it a second time with
6089 HREF="actions-file.html#HIDE-IF-MODIFIED-SINCE"
6090 >hide-if-modified-since</A
6096 > It is also recommended to use this action together with
6100 HREF="actions-file.html#CRUNCH-IF-NONE-MATCH"
6101 >crunch-if-none-match</A
6118 ># Let the browser revalidate without being tracked across sessions
6119 { +hide-if-modified-since{-60} \
6120 +overwrite-last-modified{randomize} \
6121 +crunch-if-none-match}
6139 >8.5.30. redirect</H4
6143 CLASS="VARIABLELIST"
6149 > Redirect requests to other sites.
6156 > Convinces the browser that the requested document has been moved
6157 to another location and the browser should get it from there.
6177 > This action is useful to replace whole documents with ones of your
6178 choosing. This can be used to enforce safe surfing, or just as a simple
6182 > You can do the same by combining the actions
6186 HREF="actions-file.html#BLOCK"
6193 HREF="actions-file.html#HANDLE-AS-IMAGE"
6200 HREF="actions-file.html#SET-IMAGE-BLOCKER"
6201 >set-image-blocker{URL}</A
6204 It doesn't sound right for non-image documents, and that's why this action
6208 > This action will be ignored if you use it together with
6212 HREF="actions-file.html#BLOCK"
6219 >Example usages:</DT
6230 ># Replace example.com's style sheet with another one
6231 { +redirect{http://localhost/css-replacements/example.com.css} }
6232 example.com/stylesheet.css
6234 # Create a short, easy to remember nickname for a favorite site
6235 { +redirect{http://www.privoxy.org/user-manual/actions-file.html} }
6251 NAME="SEND-VANILLA-WAFER"
6253 >8.5.31. send-vanilla-wafer</H4
6257 CLASS="VARIABLELIST"
6263 > Feed log analysis scripts with useless data.
6270 > Sends a cookie with each request stating that you do not accept any copyright
6271 on cookies sent to you, and asking the site operator not to track you.
6291 > The vanilla wafer is a (relatively) unique header and could conceivably be used to track you.
6294 > This action is rarely used and not enabled in the default configuration.
6309 >+send-vanilla-wafer</PRE
6326 >8.5.32. send-wafer</H4
6330 CLASS="VARIABLELIST"
6336 > Send custom cookies or feed log analysis scripts with even more useless data.
6343 > Sends a custom, user-defined cookie with each request.
6356 > A string of the form <SPAN
6376 > Being multi-valued, multiple instances of this action can apply to the same request,
6377 resulting in multiple cookies being sent.
6380 > This action is rarely used and not enabled in the default configuration.
6384 >Example usage (section):</DT
6395 >{+send-wafer{UsingPrivoxy=true}}
6396 my-internal-testing-server.void</PRE
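><P
> As a hedged illustration of the multi-valued behaviour described above, the
 sketch below applies two wafers to one host. The second cookie name and value
 are hypothetical, and the host is the placeholder from the example above:</P
><PRE
CLASS="SCREEN"
># Two send-wafer instances apply to the same pattern,
# so both cookies should be sent with each matching request.
{+send-wafer{UsingPrivoxy=true} +send-wafer{TestRun=1}}
my-internal-testing-server.void</PRE
>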
6411 NAME="SESSION-COOKIES-ONLY"
6413 >8.5.33. session-cookies-only</H4
6417 CLASS="VARIABLELIST"
6423 > Allow only temporary <SPAN
6426 > cookies (for the current
6427 browser session <SPAN
6445 >"Set-Cookie:"</SPAN
6447 server headers. Most browsers will not store such cookies permanently and
6448 forget them in between sessions.
6468 > This is less strict than <TT
6471 HREF="actions-file.html#CRUNCH-INCOMING-COOKIES"
6472 >crunch-incoming-cookies</A
6478 HREF="actions-file.html#CRUNCH-OUTGOING-COOKIES"
6479 >crunch-outgoing-cookies</A
6481 > and allows you to browse
6482 websites that insist or rely on setting cookies, without compromising your privacy too badly.
6485 > Most browsers will not permanently store cookies that have been processed by
6488 >session-cookies-only</TT
6489 > and will forget about them between sessions.
6490 This makes profiling cookies useless, but won't break sites which require cookies so
6491 that you can log in for transactions. This is generally turned on for all
6492 sites, and is the recommended setting.
6503 >session-cookies-only</TT
6508 HREF="actions-file.html#CRUNCH-INCOMING-COOKIES"
6509 >crunch-incoming-cookies</A
6515 HREF="actions-file.html#CRUNCH-OUTGOING-COOKIES"
6516 >crunch-outgoing-cookies</A
6518 >. If you do, cookies
6519 will be plainly killed.
6522 > Note that it is up to the browser how it handles such cookies without an <SPAN
6526 field. If you use an exotic browser, you might want to try it out to be sure.
6529 > This setting also has no effect on cookies that may have been stored
6530 previously by the browser before starting <SPAN
6534 These would have to be removed manually.
6542 HREF="actions-file.html#FILTER-CONTENT-COOKIES"
6543 >content-cookies filter</A
6545 to block some types of cookies. Content cookies are not affected by
6548 >session-cookies-only</TT
6564 >+session-cookies-only</PRE
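><P
> A minimal sketch of how this action might be paired with a per-site exception.
 The host name is hypothetical, and the cookie crunchers are switched off
 explicitly because they would otherwise remove the cookies altogether:</P
><PRE
CLASS="SCREEN"
># Default: let cookies through, but only for the current session.
{ +session-cookies-only -crunch-incoming-cookies -crunch-outgoing-cookies }
/

# Exception: allow persistent cookies where you want to stay logged in.
{ -session-cookies-only }
.webmail.example.com</PRE
>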
6579 NAME="SET-IMAGE-BLOCKER"
6581 >8.5.34. set-image-blocker</H4
6585 CLASS="VARIABLELIST"
6591 >Choose the replacement for blocked images</P
6597 > This action alone doesn't do anything noticeable. If <SPAN
6607 HREF="actions-file.html#BLOCK"
6619 HREF="actions-file.html#HANDLE-AS-IMAGE"
6629 apply, i.e. if the request is to be blocked as an image,
6636 > the parameter of this action decides what will be
6637 sent as a replacement.
6657 > to send a built-in checkerboard pattern image. The image is visually
6658 decent, scales very well, and makes it obvious where banners were busted.
6666 > to send a built-in transparent image. This makes banners disappear
6667 completely, but makes it hard to detect where <SPAN
6671 images on a given page and complicates troubleshooting if <SPAN
6675 has blocked innocent images, like navigation icons.
6689 send a redirect to <TT
6695 to any image anywhere, even in your local filesystem via <SPAN
6699 (But note that not all browsers support redirecting to a local file system).
6702 > A good application of redirects is to use special <SPAN
6706 URLs, which send the built-in images, as <TT
6712 This has the same visual effect as specifying <SPAN
6719 the first place, but enables your browser to cache the replacement image, instead of requesting
6720 it over and over again.
6729 > The URLs for the built-in images are <SPAN
6731 >"http://config.privoxy.org/send-banner?type=<TT
6752 > There is a third (advanced) type, called <SPAN
6764 >set-image-blocker</TT
6765 >, but meant for use from <A
6766 HREF="filter-file.html"
6769 Auto will select the type of image that would have applied to the referring page, had it been an image.
6787 >+set-image-blocker{pattern}</PRE
6794 > Redirect to the BSD devil:
6805 >+set-image-blocker{http://www.freebsd.org/gifs/dae_up3.gif}</PRE
6812 > Redirect to the built-in pattern for better caching:
6823 >+set-image-blocker{http://config.privoxy.org/send-banner?type=pattern}</PRE
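><P
> Since the parameter only matters for requests that are both blocked and
 treated as images, a complete section might look like the sketch below.
 The host and path are hypothetical:</P
><PRE
CLASS="SCREEN"
># The replacement image is only sent because +block and
# +handle-as-image both apply to the matching requests.
{ +block +handle-as-image +set-image-blocker{blank} }
ads.example.com/banners/</PRE
>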
6838 NAME="TREAT-FORBIDDEN-CONNECTS-LIKE-BLOCKS"
6840 >8.5.35. treat-forbidden-connects-like-blocks</H4
6844 CLASS="VARIABLELIST"
6850 >Block forbidden connects with an easy to find error message.</P
6856 > If this action is enabled, <SPAN
6860 makes a difference between forbidden connects and ordinary blocks.
6884 HREF="actions-file.html#LIMIT-CONNECT"
6890 with a short error message inside the headers. If the browser doesn't display
6891 headers (most don't), you just see an empty page.
6894 > With this action enabled, <SPAN
6898 the message that is used for ordinary blocks instead. If you decide
6899 to make an exception for the page in question, you can do so by
6909 > requests the clients tell
6913 > which host they are interested
6914 in, but not which document they plan to get later. As a result, the
6917 >"Go there anyway"</SPAN
6918 > link becomes rather useless:
6919 it lets the client request the home page of the forbidden host
6920 through unencrypted HTTP, still using the port of the last request.
6923 > If you previously configured <SPAN
6927 request through an SSL tunnel, everything will work. Most likely you haven't
6928 and the server will respond with an error message because it is expecting
6944 >+treat-forbidden-connects-like-blocks</PRE
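><P
> As a sketch only, assuming you want to restrict tunnelled connections to
 port 443 on all sites, the action could be enabled together with
 limit-connect in one section:</P
><PRE
CLASS="SCREEN"
># Connects to other ports are forbidden and then reported
# with the ordinary block message instead of a bare header.
{ +limit-connect{443} +treat-forbidden-connects-like-blocks }
/</PRE
>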
6961 >8.5.36. Summary</H3
6963 > Note that many of these actions have the potential to cause a page to
6964 misbehave, possibly even not to display at all. There are many ways
6965 a site designer may choose to design a site, and what HTTP header
6966 content and other criteria the site may depend on. There is no way to have hard
6967 and fast rules for all sites. See the <A
6968 HREF="appendix.html#ACTIONSANAT"
6970 > for a brief example on troubleshooting
6993 >, can be defined by combining other actions.
6994 These can in turn be invoked just like the built-in actions.
6995 Currently, an alias name can contain any character except space, tab,
7013 > that you only use <SPAN
7033 Alias names are not case sensitive, and are not required to start with a
7040 > sign, since they are merely textually
7043 > Aliases can be used throughout the actions file, but they <SPAN
7048 defined in a special section at the top of the file!</I
7051 And there can only be one such section per actions file. Each actions file may
7052 have its own alias section, and the aliases defined in it are only visible
7053 within that file.</P
7055 > There are two main reasons to use aliases: One is to save typing for frequently
7056 used combinations of actions, the other one is a gain in flexibility: If you
7057 decide once how you want to handle shops by defining an alias called
7061 >, you can later change your policy on shops in
7068 > place, and your changes will take effect everywhere
7069 in the actions file where the <SPAN
7072 > alias is used. Calling aliases
7073 by their purpose also makes your actions files more readable.</P
7075 > Currently, there is one big drawback to using aliases, though:
7079 >'s built-in web-based action file
7080 editor honors aliases when reading the actions files, but it expands
7081 them before writing. So the effects of your aliases are of course preserved,
7082 but the aliases themselves are lost when you edit sections that use aliases
7085 > Now let's define some aliases...</P
7095 > # Useful custom aliases we can use later.
7097 # Note the (required!) section header line and that this section
7098 # must be at the top of the actions file!
7102 # These aliases just save typing later:
7103 # (Note that some already use other aliases!)
7105 +crunch-all-cookies = +<A
7106 HREF="actions-file.html#CRUNCH-INCOMING-COOKIES"
7107 >crunch-incoming-cookies</A
7109 HREF="actions-file.html#CRUNCH-OUTGOING-COOKIES"
7110 >crunch-outgoing-cookies</A
7112 -crunch-all-cookies = -<A
7113 HREF="actions-file.html#CRUNCH-INCOMING-COOKIES"
7114 >crunch-incoming-cookies</A
7116 HREF="actions-file.html#CRUNCH-OUTGOING-COOKIES"
7117 >crunch-outgoing-cookies</A
7119 +block-as-image = +block +handle-as-image
7120 allow-all-cookies = -crunch-all-cookies -<A
7121 HREF="actions-file.html#SESSION-COOKIES-ONLY"
7122 >session-cookies-only</A
7124 HREF="actions-file.html#FILTER-CONTENT-COOKIES"
7125 >filter{content-cookies}</A
7128 # These aliases define combinations of actions
7129 # that are useful for certain types of sites:
7132 HREF="actions-file.html#BLOCK"
7135 HREF="actions-file.html#FILTER"
7137 > -crunch-all-cookies -<A
7138 HREF="actions-file.html#FAST-REDIRECTS"
7141 HREF="actions-file.html#HIDE-REFERER"
7144 HREF="actions-file.html#KILL-POPUPS"
7147 HREF="actions-file.html#PREVENT-COMPRESSION"
7148 >prevent-compression</A
7151 shop = -crunch-all-cookies -<A
7152 HREF="actions-file.html#FILTER-ALL-POPUPS"
7153 >filter{all-popups}</A
7155 HREF="actions-file.html#KILL-POPUPS"
7159 # Short names for other aliases, for really lazy people ;-)
7161 c0 = +crunch-all-cookies
7162 c1 = -crunch-all-cookies</PRE
7168 > ...and put them to use. These sections would appear in the lower part of an
7169 actions file and define exceptions to the default actions (as specified further
7183 > # These sites are either very complex or very keen on
7184 # user data and require minimal interference to work:
7187 .office.microsoft.com
7188 .windowsupdate.microsoft.com
7189 # Gmail is really mail.google.com, not gmail.com
7193 # Allow cookies (for setting and retrieving your customer data)
7197 .worldpay.com # for quietpc.com
7200 # These shops require pop-ups:
7202 {-kill-popups -filter{all-popups} -filter{unsolicited-popups}}
7204 .overclockers.co.uk</PRE
7210 > Aliases like <SPAN
7216 > are typically used for
7220 > sites that require more than one action to be disabled
7221 in order to function properly.</P
7230 >8.7. Actions Files Tutorial</H2
7232 > The above chapters have shown <A
7233 HREF="actions-file.html"
7234 >which actions files
7235 there are and how they are organized</A
7236 >, how actions are <A
7237 HREF="actions-file.html#ACTIONS"
7240 HREF="actions-file.html#ACTIONS-APPLY"
7244 HREF="actions-file.html#AF-PATTERNS"
7248 HREF="actions-file.html#ALIASES"
7250 >. Now, let's look at an
7258 file and see how all these pieces come together:</P
7266 >8.7.1. default.action</H3
7268 >Every config file should start with a short comment stating its purpose:</P
7278 ># Sample default.action file &lt;ijbswa-developers@lists.sourceforge.net&gt;</PRE
7284 >Then, since this is the <TT
7288 first section is a special section for internal use that you needn't
7289 change or worry about:</P
7299 >##########################################################################
7300 # Settings -- Don't change! For internal Privoxy use ONLY.
7301 ##########################################################################
7304 for-privoxy-version=3.0</PRE
7310 >After that comes the (optional) alias section. We'll use the example
7311 section from the above <A
7312 HREF="actions-file.html#ALIASES"
7313 >chapter on aliases</A
7315 that also explains why and how aliases are used:</P
7325 >##########################################################################
7327 ##########################################################################
7330 # These aliases just save typing later:
7331 # (Note that some already use other aliases!)
7333 +crunch-all-cookies = +<A
7334 HREF="actions-file.html#CRUNCH-INCOMING-COOKIES"
7335 >crunch-incoming-cookies</A
7337 HREF="actions-file.html#CRUNCH-OUTGOING-COOKIES"
7338 >crunch-outgoing-cookies</A
7340 -crunch-all-cookies = -<A
7341 HREF="actions-file.html#CRUNCH-INCOMING-COOKIES"
7342 >crunch-incoming-cookies</A
7344 HREF="actions-file.html#CRUNCH-OUTGOING-COOKIES"
7345 >crunch-outgoing-cookies</A
7347 +block-as-image = +block +handle-as-image
7348 mercy-for-cookies = -crunch-all-cookies -<A
7349 HREF="actions-file.html#SESSION-COOKIES-ONLY"
7350 >session-cookies-only</A
7352 HREF="actions-file.html#FILTER-CONTENT-COOKIES"
7353 >filter{content-cookies}</A
7356 # These aliases define combinations of actions
7357 # that are useful for certain types of sites:
7360 HREF="actions-file.html#BLOCK"
7363 HREF="actions-file.html#FILTER"
7365 > -crunch-all-cookies -<A
7366 HREF="actions-file.html#FAST-REDIRECTS"
7369 HREF="actions-file.html#HIDE-REFERER"
7372 HREF="actions-file.html#KILL-POPUPS"
7375 shop = -crunch-all-cookies -<A
7376 HREF="actions-file.html#FILTER-ALL-POPUPS"
7377 >filter{all-popups}</A
7379 HREF="actions-file.html#KILL-POPUPS"
7387 > Now come the regular sections, i.e. sets of actions, accompanied
7388 by URL patterns to which they apply. Remember <SPAN
7393 are disabled when matching starts</I
7395 >, so we have to explicitly
7396 enable the ones we want.</P
7398 > The first regular section is probably the most important. It has only
7407 HREF="actions-file.html#AF-PATTERNS"
7408 >matches all URLs</A
7410 set of actions used in this <SPAN
7418 be applied to all requests as a start</I
7420 >. It can be partly or
7421 wholly overridden by later matches further down this file, or in user.action,
7422 but it will still be largely responsible for your overall browsing
7425 > Again, at the start of matching, all actions are disabled, so there is
7426 no real need to disable any actions here, but we will do that nonetheless,
7427 to have a complete listing for your reference. (Remember: a <SPAN
7431 preceding the action name enables the action, a <SPAN
7435 Also note how this long line has been made more readable by splitting it into
7436 multiple lines with line continuation.</P
7446 >##########################################################################
7447 # "Defaults" section:
7448 ##########################################################################
7451 HREF="actions-file.html#ADD-HEADER"
7455 HREF="actions-file.html#BLOCK"
7459 HREF="actions-file.html#CONTENT-TYPE-OVERWRITE"
7460 >content-type-overwrite</A
7463 HREF="actions-file.html#CRUNCH-CLIENT-HEADER"
7464 >crunch-client-header</A
7467 HREF="actions-file.html#CRUNCH-IF-NONE-MATCH"
7468 >crunch-if-none-match</A
7471 HREF="actions-file.html#CRUNCH-INCOMING-COOKIES"
7472 >crunch-incoming-cookies</A
7475 HREF="actions-file.html#CRUNCH-SERVER-HEADER"
7476 >crunch-server-header</A
7479 HREF="actions-file.html#CRUNCH-OUTGOING-COOKIES"
7480 >crunch-outgoing-cookies</A
7483 HREF="actions-file.html#DEANIMATE-GIFS"
7487 HREF="actions-file.html#DOWNGRADE-HTTP-VERSION"
7488 >downgrade-http-version</A
7491 HREF="actions-file.html#FAST-REDIRECTS"
7492 >fast-redirects{check-decoded-url}</A
7495 HREF="actions-file.html#FILTER-JS-ANNOYANCES"
7496 >filter{js-annoyances}</A
7499 HREF="actions-file.html#FILTER-JS-EVENTS"
7500 >filter{js-events}</A
7503 HREF="actions-file.html#FILTER-HTML-ANNOYANCES"
7504 >filter{html-annoyances}</A
7507 HREF="actions-file.html#FILTER-CONTENT-COOKIES"
7508 >filter{content-cookies}</A
7511 HREF="actions-file.html#FILTER-REFRESH-TAGS"
7512 >filter{refresh-tags}</A
7515 HREF="actions-file.html#FILTER-UNSOLICITED-POPUPS"
7516 >filter{unsolicited-popups}</A
7519 HREF="actions-file.html#FILTER-ALL-POPUPS"
7520 >filter{all-popups}</A
7523 HREF="actions-file.html#FILTER-IMG-REORDER"
7524 >filter{img-reorder}</A
7527 HREF="actions-file.html#FILTER-BANNERS-BY-SIZE"
7528 >filter{banners-by-size}</A
7531 HREF="actions-file.html#FILTER-BANNERS-BY-LINK"
7532 >filter{banners-by-link}</A
7535 HREF="actions-file.html#FILTER-WEBBUGS"
7539 HREF="actions-file.html#FILTER-TINY-TEXTFORMS"
7540 >filter{tiny-textforms}</A
7543 HREF="actions-file.html#FILTER-JUMPING-WINDOWS"
7544 >filter{jumping-windows}</A
7547 HREF="actions-file.html#FILTER-FRAMESET-BORDERS"
7548 >filter{frameset-borders}</A
7551 HREF="actions-file.html#FILTER-DEMORONIZER"
7552 >filter{demoronizer}</A
7555 HREF="actions-file.html#FILTER-SHOCKWAVE-FLASH"
7556 >filter{shockwave-flash}</A
7559 HREF="actions-file.html#FILTER-QUICKTIME-KIOSKMODE"
7560 >filter{quicktime-kioskmode}</A
7563 HREF="actions-file.html#FILTER-FUN"
7567 HREF="actions-file.html#FILTER-CRUDE-PARENTAL"
7568 >filter{crude-parental}</A
7571 HREF="actions-file.html#FILTER-IE-EXPLOITS"
7572 >filter{ie-exploits}</A
7575 HREF="actions-file.html#FILTER-CLIENT-HEADERS"
7576 >filter-client-headers</A
7579 HREF="actions-file.html#FILTER-SERVER-HEADERS"
7580 >filter-server-headers</A
7583 HREF="actions-file.html#FILTER-GOOGLE"
7587 HREF="actions-file.html#FILTER-YAHOO"
7591 HREF="actions-file.html#FILTER-MSN"
7595 HREF="actions-file.html#FILTER-BLOGSPOT"
7599 HREF="actions-file.html#FILTER-XML-TO-HTML"
7600 >filter-xml-to-html</A
7603 HREF="actions-file.html#FILTER-HTML-TO-XML"
7604 >filter-html-to-xml</A
7607 HREF="actions-file.html#FILTER-NO-PING"
7611 HREF="actions-file.html#FILTER-HIDE-TOR-EXIT-NOTATION"
7612 >filter-hide-tor-exit-notation</A
7615 HREF="actions-file.html#FORCE-TEXT-MODE"
7619 HREF="actions-file.html#HANDLE-AS-EMPTY-DOCUMENT"
7620 >handle-as-empty-document</A
7623 HREF="actions-file.html#HANDLE-AS-IMAGE"
7627 HREF="actions-file.html#HIDE-ACCEPT-LANGUAGE"
7628 >hide-accept-language</A
7631 HREF="actions-file.html#HIDE-CONTENT-DISPOSITION"
7632 >hide-content-disposition</A
7635 HREF="actions-file.html#HIDE-IF-MODIFIED-SINCE"
7636 >hide-if-modified-since</A
7639 HREF="actions-file.html#HIDE-FORWARDED-FOR-HEADERS"
7640 >hide-forwarded-for-headers</A
7643 HREF="actions-file.html#HIDE-FROM-HEADER"
7644 >hide-from-header{block}</A
7647 HREF="actions-file.html#HIDE-REFERER"
7648 >hide-referrer{forge}</A
7651 HREF="actions-file.html#HIDE-USER-AGENT"
7655 HREF="actions-file.html#INSPECT-JPEGS"
7659 HREF="actions-file.html#KILL-POPUPS"
7663 HREF="actions-file.html#LIMIT-CONNECT"
7667 HREF="actions-file.html#PREVENT-COMPRESSION"
7668 >prevent-compression</A
7671 HREF="actions-file.html#OVERWRITE-LAST-MODIFIED"
7672 >overwrite-last-modified</A
7675 HREF="actions-file.html#REDIRECT"
7679 HREF="actions-file.html#SEND-VANILLA-WAFER"
7680 >send-vanilla-wafer</A
7683 HREF="actions-file.html#SEND-WAFER"
7687 HREF="actions-file.html#SESSION-COOKIES-ONLY"
7688 >session-cookies-only</A
7691 HREF="actions-file.html#SET-IMAGE-BLOCKER"
7692 >set-image-blocker{pattern}</A
7695 HREF="actions-file.html#TREAT-FORBIDDEN-CONNECTS-LIKE-BLOCKS"
7696 >treat-forbidden-connects-like-blocks</A
7699 / # forward slash will match *all* potential URL patterns.</PRE
7705 > The default behavior is now set. Note that some actions, like not hiding
7706 the user agent, are part of a <SPAN
7708 >"general policy"</SPAN
7710 universally and won't get any exceptions defined later. Other choices,
7711 like not blocking (which is <SPAN
7718 default!) need exceptions, i.e. we need to specify explicitly what we
7719 want to block in later sections.</P
7721 > The first of our specialized sections is concerned with <SPAN
7725 sites, i.e. sites that require minimum interference, because they are either
7726 very complex or very keen on tracking you (and have mechanisms in place that
7727 make them unusable for people who avoid being tracked). We will simply use
7731 > alias instead of stating the list
7732 of actions explicitly:</P
7742 >##########################################################################
7743 # Exceptions for sites that'll break under the default action set:
7744 ##########################################################################
7746 # "Fragile" Use a minimum set of actions for these sites (see alias above):
7749 .office.microsoft.com # surprise, surprise!
7750 .windowsupdate.microsoft.com
7751 mail.google.com</PRE
7757 > Shopping sites are not as fragile, but they typically
7758 require cookies to log in, and pop-up windows for shopping
7759 carts or item details. Again, we'll use a pre-defined alias:</P
7773 .worldpay.com # for quietpc.com
7784 HREF="actions-file.html#FAST-REDIRECTS"
7788 action, which we enabled by default above, breaks some sites. So disable
7789 it for popular sites where we know it misbehaves:</P
7800 HREF="actions-file.html#FAST-REDIRECTS"
7806 .altavista.com/.*(like|url|link):http
7807 .altavista.com/trans.*urltext=http
7814 > It is important that <SPAN
7818 URLs belong to images, so that <SPAN
7825 be blocked, a substitute image can be sent, rather than an HTML page.
7826 Contacting the remote site to find out is not an option, since it
7827 would destroy the loading time advantage of banner blocking, and it
7828 would feed the advertisers (in terms of money <SPAN
7835 information). We can mark any URL as an image with the <TT
7838 HREF="actions-file.html#HANDLE-AS-IMAGE"
7842 and marking all URLs that end in a known image file extension is a
7853 >##########################################################################
7855 ##########################################################################
7857 # Define which file types will be treated as images, in case they get
7858 # blocked further down this file:
7861 HREF="actions-file.html#HANDLE-AS-IMAGE"
7864 /.*\.(gif|jpe?g|png|bmp|ico)$</PRE
7870 > And then there are known banner sources. They often use scripts to
7871 generate the banners, so it won't be visible from the URL that the
7872 request is for an image. Hence we block them <SPAN
7879 mark them as images in one go, with the help of our
7882 >+block-as-image</TT
7883 > alias defined above. (We could of
7884 course just as well use <TT
7887 HREF="actions-file.html#BLOCK"
7891 HREF="actions-file.html#HANDLE-AS-IMAGE"
7895 Remember that the type of the replacement image is chosen by the
7899 HREF="actions-file.html#SET-IMAGE-BLOCKER"
7900 >set-image-blocker</A
7903 action. Since all URLs have matched the default section with its
7907 HREF="actions-file.html#SET-IMAGE-BLOCKER"
7908 >set-image-blocker</A
7911 action before, it still applies and needn't be repeated:</P
7921 ># Known ad generators:
7926 .ad.*.doubleclick.net
7927 .a.yimg.com/(?:(?!/i/).)*$
7928 .a[0-9].yimg.com/(?:(?!/i/).)*$
7936 > One of the most important jobs of <SPAN
7940 is to block banners. Many of these can be <SPAN
7947 HREF="actions-file.html#FILTER"
7949 >{banners-by-size}</TT
7951 action, which we enabled above, and which deletes the references to banner
7952 images from the pages while they are loaded, so the browser doesn't request
7953 them anymore, and hence they don't need to be blocked here. But this naturally
7954 doesn't catch all banners, and some people choose not to use filters, so we
7955 need a comprehensive list of patterns for banner URLs here, and apply the
7959 HREF="actions-file.html#BLOCK"
7962 > action to them.</P
7964 > First come many generic patterns, which do most of the work, by
7965 matching typical domain and path name components of banners. Then comes
7966 a list of individual patterns for specific sites, which is omitted here
7967 to keep the example short:</P
7977 >##########################################################################
7978 # Block these fine banners:
7979 ##########################################################################
7981 HREF="actions-file.html#BLOCK"
7991 /.*count(er)?\.(pl|cgi|exe|dll|asp|php[34]?)
7992 /(?:.*/)?(publicite|werbung|rekla(ma|me|am)|annonse|maino(kset|nta|s)?)/
7994 # Site-specific patterns (abbreviated):
8002 > It's quite remarkable how many advertisers actually call their banner
8008 >.com, or call the directory
8009 in which the banners are stored simply <SPAN
8013 generic patterns are surprisingly effective.</P
8015 > But being very generic, they necessarily also catch URLs that we don't want
8016 to block. The pattern <TT
8028 >.nasty-corp.com"</SPAN
8038 >.sourcefroge.net"</SPAN
8048 >l.some-provider.net."</SPAN
8050 well-known exceptions to the <TT
8053 HREF="actions-file.html#BLOCK"
8059 > Note that these are exceptions to exceptions from the default! Consider the URL
8062 >"downloads.sourcefroge.net"</SPAN
8063 >: Initially, all actions are deactivated,
8064 so it wouldn't get blocked. Then comes the defaults section, which matches the
8065 URL, but just deactivates the <TT
8068 HREF="actions-file.html#BLOCK"
8072 action once again. Then it matches <TT
8075 >, an exception to the
8076 general non-blocking policy, and suddenly
8080 HREF="actions-file.html#BLOCK"
8083 > applies. And now, it'll match
8090 HREF="actions-file.html#BLOCK"
8094 applies, so (unless it matches <SPAN
8100 > further down) it ends up
8104 HREF="actions-file.html#BLOCK"
8107 > action applying.</P
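><P
> The walk-through above can be summarized as a trace. This is only a sketch:
 it assumes that the exception pattern which finally matches is the
 .*loads. entry from the list shown below, and it does not name the generic
 block pattern:</P
><PRE
CLASS="SCREEN"
># URL: downloads.sourcefroge.net
#
# 1. start of matching           all actions off   -> not blocked
# 2. defaults section ("/")      -block            -> not blocked
# 3. generic banner pattern      +block            -> blocked
# 4. exception pattern .*loads.  -block            -> not blocked (final)</PRE
>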
8117 >##########################################################################
8118 # Save some innocent victims of the above generic block patterns:
8119 ##########################################################################
8124 HREF="actions-file.html#BLOCK"
8127 adv[io]*. # (for advogato.org and advice.*)
8128 adsl. # (has nothing to do with ads)
8129 adobe. # (has nothing to do with ads either)
8130 ad[ud]*. # (adult.* and add.*)
8131 .edu # (universities don't host banners (yet!))
8132 .*loads. # (downloads, uploads etc)
8140 www.globalintersec.com/adv # (adv = advanced)
8141 www.ugu.com/sui/ugu/adv</PRE
8147 > Filtering source code can have nasty side effects,
8148 so make an exception for our friends at sourceforge.net,
8149 and all paths with <SPAN
8152 > in them. Note that
8156 HREF="actions-file.html#FILTER"
8166 > filters in one fell swoop!</P
8176 ># Don't filter code!
8179 HREF="actions-file.html#FILTER"
8186 .sourceforge.net</PRE
8195 > is of course much more
8196 comprehensive, but we hope this example made clear how it works.</P
8205 >8.7.2. user.action</H3
8207 > So far we have been painting with a broad brush by setting general policies,
8208 which would be a reasonable starting point for many people. Now,
8209 you might want to be more specific and have customized rules that
8210 are more suitable to your personal habits and preferences. These would
8211 be for narrowly defined situations like your ISP or your bank, and should
8215 >, which is parsed after all other
8216 actions files and hence has the last word, overriding any previously
8217 defined actions. <TT
8227 > place for your personal settings, since
8231 > is actively maintained by the
8235 > developers and you'll probably want
8236 to install updated versions from time to time.</P
8238 > So let's look at a few examples of things that one might typically do in
8252 ># My user.action file. &lt;fred@foobar.com&gt;</PRE
8259 HREF="actions-file.html#ALIASES"
8261 > are local to the actions
8262 file that they are defined in, you can't use the ones from
8266 >, unless you repeat them here:</P
8276 ># Aliases are local to the file they are defined in.
8277 # (Re-)define aliases for this file:
8281 # These aliases just save typing later, and the alias names should
8282 # be self explanatory.
8284 +crunch-all-cookies = +crunch-incoming-cookies +crunch-outgoing-cookies
8285 -crunch-all-cookies = -crunch-incoming-cookies -crunch-outgoing-cookies
8286 allow-all-cookies = -crunch-all-cookies -session-cookies-only
8287 allow-popups = -filter{all-popups} -kill-popups
8288 +block-as-image = +block +handle-as-image
8289 -block-as-image = -block
8291 # These aliases define combinations of actions that are useful for
8292 # certain types of sites:
8294 fragile = -block -crunch-all-cookies -filter -fast-redirects -hide-referrer -kill-popups
8295 shop = -crunch-all-cookies allow-popups
8297 # Allow ads for selected useful free sites:
8299 allow-ads = -block -filter{banners-by-size} -filter{banners-by-link}
8301 # Alias for specific file types that are text, but might have conflicting
8302 # MIME types. We want the browser to force these to be text documents.
8303 handle-as-text = -<A
8304 HREF="actions-file.html#FILTER"
8307 HREF="actions-file.html#CONTENT-TYPE-OVERWRITE"
8308 >content-type-overwrite{text/plain}</A
8310 HREF="actions-file.html#FORCE-TEXT-MODE"
8313 HREF="actions-file.html#HIDE-CONTENT-DISPOSITION"
8314 >hide-content-disposition</A
8321 > Say you have accounts on some sites that you visit regularly, and
8322 you don't want to have to log in manually each time. So you'd like
8323 to allow persistent cookies for these sites. The
8326 >allow-all-cookies</TT
8327 > alias defined above does exactly
8328 that, i.e. it disables crunching of cookies in any direction, and the
8329 processing of cookies to make them only temporary.</P
8339 >{ allow-all-cookies }
8349 > Your bank is allergic to some filter, but you don't know which, so you disable them all:</P
8360 HREF="actions-file.html#FILTER"
8363 .your-home-banking-site.com</PRE
8369 > Some file types you may not want to filter for various reasons:</P
8379 ># Technical documentation is likely to contain strings that might
8380 # erroneously get altered by the JavaScript-oriented filters:
8385 # And this stupid host sends streaming video with a wrong MIME type,
8386 # so that Privoxy thinks it is getting HTML and starts filtering:
8388 stupid-server.example.com/</PRE
8394 > Example of a simple <A
8395 HREF="actions-file.html#BLOCK"
8397 > action. Say you've
8398 seen an ad on your favourite page on example.com that you want to get rid of.
8399 You have right-clicked the image, selected <SPAN
8401 >"copy image location"</SPAN
8403 and pasted the URL below while removing the leading http://, into a
8407 > section. Note that <TT
8411 > need not be specified, since all URLs ending in
8415 > will be tagged as images by the general rules as set
8416 in default.action anyway:</P
8427 HREF="actions-file.html#BLOCK"
8430 www.example.com/nasty-ads/sponsor.gif
8431 another.popular.site.net/more/junk/here/</PRE
8437 > The URLs of dynamically generated banners, especially from large banner
8438 farms, often don't use the well-known image file name extensions, which
8439 makes it impossible for <SPAN
8443 the file type just by looking at the URL.
8446 >+block-as-image</TT
8447 > alias defined above for
8449 Note that objects which match this rule but then turn out NOT to be an
8450 image are typically rendered as a <SPAN
8452 >"broken image"</SPAN
8454 browser. Use cautiously.</P
8464 >{ +block-as-image }
8474 > Now you noticed that the default configuration breaks Forbes Magazine,
8475 but you were too lazy to find out which action is the culprit, and you
8476 were again too lazy to give <A
8480 you just used the <TT
8483 > alias on the site, and
8490 > -- it worked. The <TT
8494 alias disables those actions that are most likely to break a site. It is also
8495 good for testing purposes to see if it is <SPAN
8499 that is causing the problem or not. We later find other regular sites
8500 that misbehave, and add those to our personalized list of troublemakers:</P
8519 > You like the <SPAN
8522 > text replacements in <TT
8526 but it is disabled in the distributed actions file. (My colleagues on the team just
8527 don't have a sense of humour, that's why! ;-). So you'd like to turn it on in your private,
8528 update-safe config, once and for all:</P
8539 HREF="actions-file.html#FILTER-FUN"
8542 / # For ALL sites!</PRE
8548 > Note that the above is not really a good idea: There are exceptions
8549 to the filters in <TT
8553 really shouldn't be filtered, like code on CVS->Web interfaces. Since
8557 > has the last word, these exceptions
8558 won't be valid for the <SPAN
8561 > filtering specified here.</P
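><P
> One possible workaround, sketched under the assumption that the only
 exception you care about is the sourceforge.net one from the default.action
 example, is to repeat that exception in the same file after the catch-all
 section, so that it is matched last:</P
><PRE
CLASS="SCREEN"
>{ +filter{fun} }
/

# Repeated here because later sections override earlier ones.
{ -filter }
.sourceforge.net</PRE
>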
8563 > You might also worry about how your favourite free websites are
8564 funded, and find that they rely on displaying banner advertisements
8565 to survive. So you might want to specifically allow banners for those
8566 sites that you feel provide value to you:</P
8588 > has been aliased to
8592 HREF="actions-file.html#BLOCK"
8599 HREF="actions-file.html#FILTER-BANNERS-BY-SIZE"
8600 >filter{banners-by-size}</A
8606 HREF="actions-file.html#FILTER-BANNERS-BY-LINK"
8607 >filter{banners-by-link}</A
8611 > Invoke another alias here to force an override of the MIME type <TT
8613 > application/x-sh</TT
8614 > which typically would open a download type
8615 dialog. In my case, I want to look at the shell script, and then I can save
8616 it should I choose to.</P
8636 > is generally the best place to define
8637 exceptions and additions to the default policies of
8641 >. Some actions are safe to have their
8642 default policies set here though. So let's set a default policy to have a
8646 > image as opposed to the checkerboard pattern for
8656 > of course matches all URL
8657 paths and patterns:</P
8668 HREF="actions-file.html#SET-IMAGE-BLOCKER"
8669 >set-image-blocker{blank}</A
8684 SUMMARY="Footer navigation table"
8713 HREF="filter-file.html"
8723 >The Main Configuration File</TD