1 <!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook V3.1//EN" [
2 <!entity % dummy "IGNORE">
3 <!entity supported SYSTEM "supported.sgml">
4 <!entity newfeatures SYSTEM "newfeatures.sgml">
5 <!entity p-intro SYSTEM "privoxy.sgml">
6 <!entity seealso SYSTEM "seealso.sgml">
7 <!entity buildsource SYSTEM "buildsource.sgml">
8 <!entity contacting SYSTEM "contacting.sgml">
9 <!entity history SYSTEM "history.sgml">
10 <!entity copyright SYSTEM "copyright.sgml">
11 <!entity license SYSTEM "license.sgml">
12 <!entity p-authors SYSTEM "p-authors.sgml">
13 <!entity config SYSTEM "p-config.sgml">
14 <!entity p-version "3.0.4">
15 <!entity p-status "BETA">
16 <!entity % p-authors-formal "INCLUDE"> <!-- include additional text, etc -->
17 <!entity % p-not-stable "INCLUDE">
18 <!entity % p-stable "IGNORE">
19 <!entity % p-text "IGNORE"> <!-- define we are not a text only doc -->
20 <!entity % p-doc "INCLUDE"> <!-- and we are a formal doc -->
21 <!entity % p-readme "IGNORE">
22 <!entity % user-man "IGNORE">
23 <!entity % config-file "IGNORE">
24 <!entity % p-supp-userman "IGNORE"> <!-- Omit some from supported.sgml -->
25 <!entity my-copy "©"> <!-- kludge for docbook2man -->
26 <!entity % draft "IGNORE"> <!-- WIP stuff -->
29 File : $Source: /cvsroot/ijbswa/current/doc/source/user-manual.sgml,v $
32 This file belongs into
33 ijbswa.sourceforge.net:/home/groups/i/ij/ijbswa/htdocs/
35 $Id: user-manual.sgml,v 2.13 2006/08/22 11:04:59 hal9 Exp $
37 Copyright (C) 2001- 2006 Privoxy Developers <developers@privoxy.org>
40 ========================================================================
41 NOTE: Please read developer-manual/documentation.html before touching
42 anything in this, or other Privoxy documentation.
43 ========================================================================
50 <title>Privoxy &p-version; User Manual</title>
54 <!-- Completely the wrong markup, but very little is allowed -->
55 <!-- in this part of an article. FIXME -->
56 <link linkend="copyright">Copyright</link> &my-copy; 2001 - 2004 by
57 <ulink url="http://www.privoxy.org/">Privoxy Developers</ulink>
61 <pubdate>$Id: user-manual.sgml,v 2.13 2006/08/22 11:04:59 hal9 Exp $</pubdate>
65 Note: the following should generate a separate page, and a live link to it,
66 all nicely done. But it doesn't for some mysterious reason. Please leave
67 commented unless it can be fixed proper. For the time being, the
68 copyright/license declarations will be in their own sgml.
81 This is here to keep vim syntax file from breaking :/
82 If I knew enough to fix it, I would.
83 PLEASE DO NOT REMOVE! HB: hal@foobox.net
89 The <citetitle>User Manual</citetitle> gives users information on how to
90 install, configure and use <ulink
91 url="http://www.privoxy.org/"><application>Privoxy</application></ulink>.
94 <!-- Include privoxy.sgml boilerplate: -->
96 <!-- end privoxy.sgml -->
99 You can find the latest version of the <citetitle>User Manual</citetitle> at <ulink
100 url="http://www.privoxy.org/user-manual/">http://www.privoxy.org/user-manual/</ulink>.
101 Please see the <link linkend="contact">Contact section</link> on how to
102 contact the developers.
106 <!-- Feel free to send a note to the developers at <email>ijbswa-developers@lists.sourceforge.net</email>. -->
112 <!-- ~~~~~ New section ~~~~~ -->
113 <sect1 label="1" id="introduction"><title>Introduction</title>
115 This documentation is included with the current &p-status; version of
116 <application>Privoxy</application>, v.&p-version;<![%p-not-stable;[,
117 and is mostly complete at this point. The most up to date reference for the
118 time being is still the comments in the source files and in the individual
119 configuration files. Development of version 3.0 is currently nearing
120 completion, and includes many significant changes and enhancements over
121 earlier versions. The target release date for
122 stable v3.0 is <quote>soon</quote> ;-)]]>.
125 <!-- include only in non-stable versions -->
128 Since this is a &p-status; version, not all new features are well tested. This
129 documentation may be slightly out of sync as a result (especially with
130 CVS sources). And there <emphasis>may be</emphasis> bugs, though hopefully
135 <!-- ~~~~~ New section ~~~~~ -->
136 <sect2 id="features"><title>Features</title>
138 In addition to <application>Internet Junkbuster's</application> traditional
139 features of ad and banner blocking and cookie management,
140 <application>Privoxy</application> provides new features<![%p-not-stable;[,
141 some of them currently under development]]>:
143 <!-- Include newfeatures.sgml boilerplate here: -->
145 <!-- end boilerplate -->
150 <!-- ~ End section ~ -->
153 <!-- ~~~~~ New section ~~~~~ -->
154 <sect1 id="installation"><title>Installation</title>
157 <application>Privoxy</application> is available both in convenient pre-compiled
158 packages for a wide range of operating systems, and as raw source code.
159 For most users, we recommend using the packages, which can be downloaded from our
160 <ulink url="http://sourceforge.net/projects/ijbswa/">Privoxy Project
165 Note: If you have a previous <application>Junkbuster</application> or
166 <application>Privoxy</application> installation on your system, you
167 will need to remove it. On some platforms, this may be done for you as part
168 of their installation procedure. (See below for your platform). In any case
<emphasis>be sure to back up your old configuration if it is valuable to
170 you.</emphasis> See the <link linkend="upgradersnote">note to
171 upgraders</link> section below.
174 <!-- ~~~~~ New section ~~~~~ -->
175 <sect2 id="installation-packages"><title>Binary Packages</title>
177 How to install the binary packages depends on your operating system:
180 <!-- ~~~~~ New section ~~~~~ -->
181 <sect3 id="installation-pack-rpm"><title>Red Hat, SuSE and Conectiva RPMs</title>
184 RPMs can be installed with <literal>rpm -Uvh privoxy-&p-version;-1.rpm</literal>,
185 and will use <filename>/etc/privoxy</filename> for the location
186 of configuration files.
190 Note that on Red Hat, <application>Privoxy</application> will
191 <emphasis>not</emphasis> be automatically started on system boot. You will
192 need to enable that using <command>chkconfig</command>,
193 <command>ntsysv</command>, or similar methods. Note that SuSE will
194 automatically start Privoxy in the boot process.
198 If you have problems with failed dependencies, try rebuilding the SRC RPM:
199 <literal>rpm --rebuild privoxy-&p-version;-1.src.rpm</literal>. This
200 will use your locally installed libraries and RPM version.
204 Also note that if you have a <application>Junkbuster</application> RPM installed
205 on your system, you need to remove it first, because the packages conflict.
206 Otherwise, RPM will try to remove <application>Junkbuster</application>
207 automatically, before installing <application>Privoxy</application>.
211 <!-- ~~~~~ New section ~~~~~ -->
212 <sect3 id="installation-deb"><title>Debian</title>
214 DEBs can be installed with <literal>apt-get install privoxy</literal>,
215 and will use <filename>/etc/privoxy</filename> for the location of
220 <!-- ~~~~~ New section ~~~~~ -->
221 <sect3 id="installation-pack-win"><title>Windows</title>
224 Just double-click the installer, which will guide you through
225 the installation process. You will find the configuration files
in the directory where you installed Privoxy.
230 <!-- ~~~~~ New section ~~~~~ -->
231 <sect3 id="installation-pack-bintgz"><title>Solaris, NetBSD, FreeBSD, HP-UX</title>
234 Create a new directory, <literal>cd</literal> to it, then unzip and
235 untar the archive. For the most part, you'll have to figure out where
236 things go. <!-- FIXME, more info needed? -->
240 <!-- ~~~~~ New section ~~~~~ -->
241 <sect3 id="installation-os2"><title>OS/2</title>
244 First, make sure that no previous installations of
245 <application>Junkbuster</application> and / or
246 <application>Privoxy</application> are left on your
247 system. Check that no <application>Junkbuster</application>
248 or <application>Privoxy</application> objects are in
254 Then, just double-click the WarpIN self-installing archive, which will
255 guide you through the installation process. A shadow of the
256 <application>Privoxy</application> executable will be placed in your
257 startup folder so it will start automatically whenever OS/2 starts.
261 The directory you choose to install <application>Privoxy</application>
262 into will contain all of the configuration files.
266 <!-- ~~~~~ New section ~~~~~ -->
267 <sect3 id="installation-mac"><title>Mac OSX</title>
269 Unzip the downloaded file (you can either double-click on the file
270 from the finder, or from the desktop if you downloaded it there).
271 Then, double-click on the package installer icon named
272 <literal>Privoxy.pkg</literal>
273 and follow the installation process.
274 <application>Privoxy</application> will be installed in the folder
275 <literal>/Library/Privoxy</literal>.
276 It will start automatically whenever you start up. To prevent it from
277 starting automatically, remove or rename the folder
278 <literal>/Library/StartupItems/Privoxy</literal>.
281 To start Privoxy by hand, double-click on
282 <literal>StartPrivoxy.command</literal> in the
283 <literal>/Library/Privoxy</literal> folder.
284 Or, type this command in the Terminal:
288 /Library/Privoxy/StartPrivoxy.command
292 You will be prompted for the administrator password.
296 <!-- ~~~~~ New section ~~~~~ -->
297 <sect3 id="installation-amiga"><title>AmigaOS</title>
299 Copy and then unpack the <filename>lha</filename> archive to a suitable location.
All necessary files will be installed into the <application>Privoxy</application>
301 directory, including all configuration and log files. To uninstall, just
302 remove this directory.
306 <!-- ~~~~~ New section ~~~~~ -->
307 <sect3 id="installattion-gentoo"><title>Gentoo</title>
309 Gentoo source packages (Ebuilds) for <application>Privoxy</application> are
310 contained in the Gentoo Portage Tree (they are not on the download page,
311 but there is a Gentoo section, where you can see when a new
312 <application>Privoxy</application> Version is added to the Portage Tree).
Before installing <application>Privoxy</application> under Gentoo, first
run <literal>emerge rsync</literal> to get the latest changes from the
Portage tree. Then <literal>emerge privoxy</literal> installs the latest
321 Configuration files are in <filename>/etc/privoxy</filename>, the
322 documentation is in <filename>/usr/share/doc/privoxy-&p-version;</filename>
323 and the Log directory is in <filename>/var/log/privoxy</filename>.
329 <!-- ~~~~~ New section ~~~~~ -->
330 <sect2 id="installation-source"><title>Building from Source</title>
333 The most convenient way to obtain the <application>Privoxy</application> sources
334 is to download the source tarball from our <ulink url="http://sf.net/projects/ijbswa/">project
339 If you like to live on the bleeding edge and are not afraid of using
340 possibly unstable development versions, you can check out the up-to-the-minute
341 version directly from <ulink url="http://sourceforge.net/cvs/?group_id=11118">the
342 CVS repository</ulink>.
344 deprecated...out of business.
345 or simply download <ulink
346 url="http://cvs.sourceforge.net/cvstarballs/ijbswa-cvsroot.tar.bz2">the nightly CVS
351 <!-- include buildsource.sgml boilerplate: -->
353 <!-- end boilerplate -->
356 <!-- ~~~~~ New section ~~~~~ -->
357 <sect2 id="installation-keepupdated"><title>Keeping your Installation Up-to-Date</title>
359 As user feedback comes in and development continues, we will make updated versions
360 of both the main <link linkend="actions-file">actions file</link> (as a <ulink
url="http://sourceforge.net/project/showfiles.php?group_id=11118&amp;release_id=103670">separate
362 package</ulink>) and the software itself (including the actions file) available for
367 If you wish to receive an email notification whenever we release updates of
368 <application>Privoxy</application> or the actions file, <ulink
369 url="http://lists.sourceforge.net/lists/listinfo/ijbswa-announce/">subscribe
370 to our announce mailing list</ulink>, ijbswa-announce@lists.sourceforge.net.
374 In order not to lose your personal changes and adjustments when updating
375 to the latest <literal>default.action</literal> file we <emphasis>strongly
376 recommend</emphasis> that you use <literal>user.action</literal> for your
377 customization of <application>Privoxy</application>. See the <link
378 linkend="actions-file">Chapter on actions files</link> for details.
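<para>
 As a rough illustration of keeping local changes separate, an entry in
 <filename>user.action</filename> might look like the sketch below. The host
 name is a placeholder chosen for this example and is not taken from the
 distributed files:
</para>
<para>
 <screen>
# user.action: local overrides, kept apart from default.action
# (hypothetical site that should never be blocked)
{ -block }
.example.org
 </screen>
</para>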
386 <!-- ~ End section ~ -->
388 <!-- ~~~~~ New section ~~~~~ -->
389 <sect1 id="whatsnew">
390 <title>What's New in this Release</title>
392 There are many new features in <application>Privoxy</application> &p-version;
Multiple <link linkend="filter-file">filter files</link> can now be specified in <filename>config</filename>.
406 There are a number of new <link linkend="actions-file">actions</link>:
414 <literal><link linkend="content-type-overwrite">content-type-overwrite</link></literal>
419 <literal><link linkend="crunch-client-header">crunch-client-header</link></literal>
424 <literal><link linkend="crunch-if-none-match">crunch-if-none-match</link></literal>
429 <literal><link linkend="crunch-server-header">crunch-server-header</link></literal>
434 <literal><link linkend="fast-redirects">fast-redirects</link></literal>
439 <literal><link linkend="force-text-mode">force-text-mode</link></literal>
444 <literal><link linkend="handle-as-empty-document">handle-as-empty-document</link></literal>
449 <literal><link linkend="hide-accept-language">hide-accept-language</link></literal>
454 <literal><link linkend="hide-content-disposition">hide-content-disposition</link></literal>
459 <literal><link linkend="hide-if-modified-since">hide-if-modified-since</link></literal>
464 <literal><link linkend="hide-referrer">hide-referrer</link></literal>
469 <literal><link linkend="inspect-jpegs">inspect-jpegs</link></literal>
474 <literal><link linkend="overwrite-last-modified">overwrite-last-modified</link></literal>
479 <literal><link linkend="redirect">redirect</link></literal>
484 <literal><link linkend="treat-forbidden-connects-like-blocks">treat-forbidden-connects-like-blocks</link></literal>
495 <application>MS-Windows</application> versions can now be installed and
496 started as a <emphasis>service</emphasis>.
In addition, there are various bug fixes and enhancements, including
error pages that are no longer cached, better DNS error handling, and logging
512 <!-- ~~~~~ New section ~~~~~ -->
514 <sect2 id="upgradersnote">
515 <title>Note to Upgraders</title>
518 A quick list of things to be aware of before upgrading from earlier
519 versions of <application>Privoxy</application>:
527 Some installers may remove earlier versions completely, including
528 configuration files. Save any important configuration files!
533 What constitutes a <quote>default</quote> configuration has changed,
534 and you may want to review which actions are <quote>on</quote> by
535 default. This is primarily a matter of emphasis, but some features
you may have been used to may now be <quote>off</quote> by default.
542 <!-- I think it is best to keep this somewhat vague, in case -->
543 <!-- the situation changes under our feet. -->
544 Some installers may not automatically start
545 <application>Privoxy</application> after installation.
554 <!-- ~~~~~ New section ~~~~~ -->
555 <sect1 id="quickstart"><title>Quickstart to Using <application>Privoxy</application></title>
561 Install <application>Privoxy</application>. See the <link
linkend="installation">Installation Section</link> above for platform-specific
569 Advanced users and those who want to offer <application>Privoxy</application>
570 service to more than just their local machine should check the <link
571 linkend="config">main config file</link>, especially the <link
572 linkend="access-control">security-relevant</link> options. These are
579 Start <application>Privoxy</application>, if the installation program has
580 not done this already (may vary according to platform). See the section
581 <link linkend="startup">Starting <application>Privoxy</application></link>.
Set your browser to use <application>Privoxy</application> as its HTTP and
HTTPS (SSL) proxy by setting the proxy address to
<literal>127.0.0.1</literal> and the port to <literal>8118</literal>.
590 (<application>Junkbuster</application> and earlier versions of
591 <application>Privoxy</application> used port 8000.) See the section <link
592 linkend="startup">Starting <application>Privoxy</application></link> below
593 for more details on this.
599 Flush your browser's disk and memory caches, to remove any cached ad images.
600 If using <application>Privoxy</application> to manage cookies, you should
601 remove any currently stored cookies too.
607 A default installation should provide a reasonable starting point for
608 most. There will undoubtedly be occasions where you will want to adjust the
609 configuration, but that can be dealt with as the need arises. Little
610 to no initial configuration is required in most cases.
613 See the <link linkend="configuration">Configuration section</link> for more
614 configuration options, and how to customize your installation.
615 <![%draft;[ You might also want to look at the <link
616 linkend="quickstart-ad-blocking">next section</link> for a quick
617 introduction to how <application>Privoxy</application> blocks ads and
624 If you experience ads that slipped through, innocent images that are
625 blocked, or otherwise feel the need to fine-tune
626 <application>Privoxy's</application> behaviour, take a look at the <link
627 linkend="actions-file">actions files</link>. As a quick start, you might
628 find the <link linkend="act-examples">richly commented examples</link>
629 helpful. You can also view and edit the actions files through the <ulink
630 url="http://config.privoxy.org">web-based user interface</ulink>. The
631 Appendix <quote><link linkend="actionsanat">Anatomy of an
Action</link></quote> has hints on how to debug actions that
633 <quote>misbehave</quote>.
639 For easy access to Privoxy's most important controls, drag the provided
640 <link linkend="bookmarklets">Bookmarklets</link> into your browser's
647 Please see the section <link linkend="contact">Contacting the
648 Developers</link> on how to report bugs or problems with websites or to get
655 Now enjoy surfing with enhanced control, comfort and privacy!
663 <!-- ~~~~~ New section ~~~~~ -->
665 <sect2 id="quickstart-ad-blocking">
666 <title>Quickstart to Ad Blocking</title>
668 NOTE: This section is deliberately redundant for those that don't
669 want to read the whole thing (which is getting lengthy).
672 Ad blocking is but one of <application>Privoxy's</application>
673 array of features. Many of these features are for the technically minded advanced
user. But ad and banner blocking is surely common ground for everybody.
This section provides a quick summary of ad blocking so
you can get up to speed quickly without having to read the more extensive
information provided below, though reading it is highly recommended.
First, a bit of a warning: blocking ads is much like blocking spam. The
more aggressive you are about it, the more likely you are to block
things that were not intended. So there is a trade-off here. If you want
extreme ad-free browsing, be prepared to deal with more
686 <quote>problem</quote> sites, and to spend more time adjusting the
687 configuration to solve these unintended consequences. In short, there is
688 not an easy way to eliminate <emphasis>all</emphasis> ads. Either take
689 the easy way and settle for <emphasis>most</emphasis> ads blocked with the
690 default configuration, or jump in and tweak it for your personal surfing
691 habits and preferences.
Secondly, a brief explanation of <application>Privoxy's</application>
<quote>actions</quote>. <quote>Actions</quote>, in this context, are
696 the directives we use to tell <application>Privoxy</application> to perform
697 some task relating to HTTP transactions (i.e. web browsing). We tell
698 <application>Privoxy</application> to take some <quote>action</quote>. Each
699 action has a unique name and function. While there are many potential
700 <application>actions</application> in <application>Privoxy's</application>
701 arsenal, only a few are used for ad blocking. <link
702 linkend="actions">Actions</link>, and <link linkend="actions-file">action
703 configuration files</link>, are explained in depth below.
706 Actions are specified in <application>Privoxy's</application> configuration,
707 followed by one or more URLs to which the action should apply. URLs
can actually be URL-type <link linkend="af-patterns">patterns</link> that use
wildcards so they can potentially apply to a range of similar URLs. The
actions, together with the URL patterns, are called a section.
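<para>
 A section might look roughly like the following sketch. The host names are
 placeholders invented for this example. The first pattern uses a wildcard in
 the domain part, and the second limits the match to one path on any host in
 the domain:
</para>
<para>
 <screen>
# One section: the actions in braces, then the URL patterns they apply to
{ +block }
ad*.example.com
.example.net/banners/
 </screen>
</para>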
713 When you connect to a website, the full URL will either match one or more
714 of the sections as defined in <application>Privoxy's</application> configuration,
715 or not. If so, then <application>Privoxy</application> will perform the
716 respective actions. If not, then nothing special happens. Furthermore, web
717 pages may contain embedded, secondary URLs that your web browser will
718 use to load additional components of the page, as it parses the
original page's HTML content. An ad image, for instance, is just a URL
720 embedded in the page somewhere. The image itself may be on the same server,
721 or a server somewhere else on the Internet. Complex web pages will have many
726 The actions we need to know about for ad blocking are: <literal><link
727 linkend="block">block</link></literal>, <literal><link
728 linkend="handle-as-image">handle-as-image</link></literal>, and
729 <literal><link linkend="set-image-blocker">set-image-blocker</link></literal>:
737 <literal><link linkend="block">block</link></literal> - this action stops
738 any contact between your browser and any URL patterns that match this
739 action's configuration. It can be used for blocking ads, but also anything
740 that is determined to be unwanted. By itself, it simply stops any
741 communication with the remote server and sends <application>Privoxy</application>'s
own built-in BLOCKED page instead to let you know what has happened.
748 <literal><link linkend="handle-as-image">handle-as-image</link></literal> -
749 tells <application>Privoxy</application> to treat this URL as an image.
750 <application>Privoxy</application>'s default configuration already does this
751 for all common image types (e.g. GIF), but there are many situations where this
752 is not so easy to determine. So we'll force it in these cases. This is particularly
753 important for ad blocking, since only if we know that it's an image of
754 some kind, can we replace it with an image of our choosing, instead of the
755 <application>Privoxy</application> BLOCKED page (which would only result in
756 a <quote>broken image</quote> icon). There are some limitations to this
757 though. For instance, you can't just brute-force an image substitution for
758 an entire HTML page in most situations.
765 linkend="set-image-blocker">set-image-blocker</link></literal> - tells
766 <application>Privoxy</application> what to display in place of an ad image that
767 has hit a block rule. For this to come into play, the URL must match a
768 <literal><link linkend="block">block</link></literal> action somewhere in the
769 configuration, <emphasis>and</emphasis>, it must also match an
770 <literal><link linkend="handle-as-image">handle-as-image</link></literal> action.
773 The configuration options on what to display instead of the ad are:
777 <emphasis>pattern</emphasis> - a checkerboard pattern, so that an ad
778 replacement is obvious. This is the default.
783 <emphasis>blank</emphasis> - A very small empty GIF image is displayed.
784 This is the so-called <quote>invisible</quote> configuration option.
<emphasis>http://&lt;URL&gt;</emphasis> - A redirect to any image anywhere
790 of the user's choosing (advanced usage).
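<para>
 Taken together, the three actions could be combined in one section as in
 the sketch below. The host name is hypothetical, and <literal>blank</literal>
 is just one of the options listed above:
</para>
<para>
 <screen>
# Block everything under this path, treat it as an image,
# and show an invisible GIF in its place
{ +block +handle-as-image +set-image-blocker{blank} }
ads.example.com/images/
 </screen>
</para>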
799 The quickest way to adjust any of these settings is with your browser through
800 the special <application>Privoxy</application> editor at <ulink
801 url="http://config.privoxy.org/show-status">http://config.privoxy.org/show-status</ulink>
(shortcut: <ulink url="http://p.p/show-status">http://p.p/show-status</ulink>). This
803 is an internal page, and does not require Internet access. Select the
804 appropriate <quote>actions</quote> file, and click
805 <quote><guibutton>Edit</guibutton></quote>. It is best to put personal or
806 local preferences in <filename>user.action</filename> since this is not
meant to be overwritten during upgrades, and will override the settings in
808 other files. Here you can insert new <quote>actions</quote>, and URLs for ad
809 blocking or other purposes, and make other adjustments to the configuration.
810 <application>Privoxy</application> will detect these changes automatically.
814 A quick and simple step by step example:
822 Right click on the ad image to be blocked, then select
823 <quote><guimenuitem>Copy Link Location</guimenuitem></quote> from the
831 url="http://config.privoxy.org/show-status">http://config.privoxy.org/show-status</ulink>
836 Find <filename>user.action</filename> in the top section, and click
837 on <quote><guibutton>Edit</guibutton></quote>:
840 <!-- image of editor and actions files selections -->
842 <figure pgwide="0" float="0"><title>Actions Files in Use</title>
845 <imagedata fileref="../images/files-in-use.jpg" format="jpg">
848 <phrase>[ Screenshot of Actions Files in Use ]</phrase>
857 You should have a section with only
858 <literal><link linkend="block">block</link></literal> listed under
859 <quote>Actions:</quote>.
If not, click an <quote><guibutton>Insert new section below</guibutton></quote>
861 button, and in the new section that just appeared, click the
862 <guibutton>Edit</guibutton> button right under the word <quote>Actions:</quote>.
863 This will bring up a list of all actions. Find
864 <literal><link linkend="block">block</link></literal> near the top, and click
865 in the <quote>Enabled</quote> column, then <quote><guibutton>Submit</guibutton></quote>
871 Now, in the <literal><link linkend="block">block</link></literal> actions section,
872 click the <quote><guibutton>Add</guibutton></quote> button, and paste the URL the
873 browser got from <quote><guimenuitem>Copy Link Location</guimenuitem></quote>.
874 Remove the <literal>http://</literal> at the beginning of the URL. Then, click
875 <quote><guibutton>Submit</guibutton></quote> (or
876 <quote><guibutton>OK</guibutton></quote> if in a pop-up window).
881 Now go back to the original page, and press <keycap>SHIFT-Reload</keycap>
882 (or flush all browser caches). The image should be gone now.
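<para>
 Assuming the copied link was the hypothetical
 <literal>http://ads.example.com/banners/top.gif</literal>, the steps above
 would probably leave a section along these lines in
 <filename>user.action</filename>. This is a sketch, not output captured from
 the editor:
</para>
<para>
 <screen>
{ +block }
ads.example.com/banners/top.gif
 </screen>
</para>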
890 This is a very crude and simple example. There might be good reasons to use a
891 wildcard pattern match to include potentially similar images from the same
892 site. For a more extensive explanation of <quote>patterns</quote>, and
893 the entire actions concept, see <link linkend="actions-file">the Actions
Advanced users who want to hand-edit their configuration files may now want
to go to the <link linkend="act-examples">Actions Files Tutorial</link>.
900 The ideas explained therein also apply to the web-based editor.
907 <!-- ~ End section ~ -->
910 <!-- ~~~~~ New section ~~~~~ -->
912 <title>Starting <application>Privoxy</application></title>
914 Before launching <application>Privoxy</application> for the first time, you
915 will want to configure your browser(s) to use
<application>Privoxy</application> as an HTTP and HTTPS proxy. The default is
917 127.0.0.1 (or localhost) for the proxy address, and port 8118 (earlier versions
918 used port 8000). This is the one configuration step that must be done!
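<para>
 On the <application>Privoxy</application> side, the matching setting is the
 <literal>listen-address</literal> directive in the main configuration file.
 A minimal sketch, assuming the default address and port mentioned above are
 kept:
</para>
<para>
 <screen>
# main configuration file: where Privoxy accepts browser connections
listen-address  127.0.0.1:8118
 </screen>
</para>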
921 Please note that <application>Privoxy</application> can only proxy HTTP and
922 HTTPS traffic. It will not work with FTP or other protocols.
925 <!-- image of Mozilla Proxy configuration -->
927 <figure pgwide="0" float="0"><title>Proxy Configuration (Mozilla)</title>
930 <imagedata fileref="../images/proxy_setup.jpg" format="jpg">
933 <phrase>[ Screenshot of Mozilla Proxy Configuration ]</phrase>
941 With <application>Firefox</application>, this can be set under:
945 <!-- Mix ascii and gui art, something for everybody -->
946 <!-- spacing on this is tricky -->
947 <guibutton>Tools</guibutton>
949 <guibutton>Options</guibutton>
951 <guibutton>General</guibutton>
953 <guibutton>Connection Settings</guibutton>
955 <guibutton>Manual Proxy Configuration</guibutton>
960 With <application>Netscape</application> (and
961 <application>Mozilla</application>), this can be set under:
966 <!-- Mix ascii and gui art, something for everybody -->
967 <!-- spacing on this is tricky -->
968 <guibutton>Edit</guibutton>
970 <guibutton>Preferences</guibutton>
972 <guibutton>Advanced</guibutton>
974 <guibutton>Proxies</guibutton>
976 <guibutton>HTTP Proxy</guibutton>
980 For <application>Internet Explorer</application>:
984 <!-- Mix ascii and gui art, something for everybody -->
985 <!-- spacing on this is tricky -->
986 <guibutton>Tools</guibutton>
988 <guibutton>Internet Properties</guibutton>
990 <guibutton>Connections</guibutton>
992 <guibutton>LAN Settings</guibutton>
996 Then, check <quote>Use Proxy</quote> and fill in the appropriate info
997 (Address: 127.0.0.1, Port: 8118). Include HTTPS (SSL), if you want HTTPS
1002 After doing this, flush your browser's disk and memory caches to force a
1003 re-reading of all pages and to get rid of any ads that may be cached. You
1004 are now ready to start enjoying the benefits of using
1005 <application>Privoxy</application>!
1009 <application>Privoxy</application> itself is typically started by specifying the
1010 main configuration file to be used on the command line. If no configuration
1011 file is specified on the command line, <application>Privoxy</application>
will look for a file named <filename>config</filename> in the current
directory, except on Win32, where it will try <filename>config.txt</filename>.
1016 <sect2 id="start-redhat">
1017 <title>Red Hat and Conectiva</title>
We use a script. Note that Red Hat does not start Privoxy upon booting by
1020 default. It will use the file <filename>/etc/privoxy/config</filename> as
1021 its main configuration file.
1025 # /etc/rc.d/init.d/privoxy start
1030 <sect2 id="start-debian">
1031 <title>Debian</title>
We use a script. Note that Debian starts Privoxy upon booting by
1034 default. It will use the file
1035 <filename>/etc/privoxy/config</filename> as its main configuration
1040 # /etc/init.d/privoxy start
1045 <sect2 id="start-suse">
1048 We use a script. It will use the file <filename>/etc/privoxy/config</filename>
1049 as its main configuration file. Note that SuSE starts Privoxy upon booting
1059 <sect2 id="start-windows">
1060 <title>Windows</title>
1062 Click on the Privoxy Icon to start Privoxy. If no configuration file is
1063 specified on the command line, <application>Privoxy</application> will look
1064 for a file named <filename>config.txt</filename>. Note that Windows will
automatically start Privoxy upon booting your PC.
1069 <sect2 id="start-unices">
1070 <title>Solaris, NetBSD, FreeBSD, HP-UX and others</title>
1072 Example Unix startup command:
1076 # /usr/sbin/privoxy /etc/privoxy/config
1081 <sect2 id="start-os2">
1084 During installation, <application>Privoxy</application> is configured to
1085 start automatically when the system restarts. You can start it manually by
1086 double-clicking on the <application>Privoxy</application> icon in the
1087 <application>Privoxy</application> folder.
1091 <sect2 id="start-macosx">
1092 <title>Mac OSX</title>
1094 During installation, <application>Privoxy</application> is configured to
1095 start automatically when the system restarts. To start Privoxy by hand,
1096 double-click on the <literal>StartPrivoxy.command</literal> icon in the
1097 <literal>/Library/Privoxy</literal> folder. Or, type this command
1102 /Library/Privoxy/StartPrivoxy.command
1106 You will be prompted for the administrator password.
1111 <sect2 id="start-amigaos">
1112 <title>AmigaOS</title>
1114 Start <application>Privoxy</application> (with RUN <>NIL:) in your
1115 <filename>startnet</filename> script (AmiTCP), in
1116 <filename>s:user-startup</filename> (RoadShow), as startup program in your
1117 startup script (Genesis), or as startup action (Miami and MiamiDx).
1118 <application>Privoxy</application> will automatically quit when you quit your
1119 TCP/IP stack (just ignore the harmless warning your TCP/IP stack may display that
1120 <application>Privoxy</application> is still running).
1124 <sect2 id="start-gentoo">
1125 <title>Gentoo</title>
1127 A script is again used. It will use the file <filename>/etc/privoxy/config
1128 </filename> as its main configuration file.
1132 /etc/init.d/privoxy start
1136 Note that <application>Privoxy</application> is not automatically started at
1137 boot time by default. You can change this with the <literal>rc-update</literal> command:
1142 rc-update add privoxy default
1150 See the section <link linkend="cmdoptions">Command line options</link> for
1157 The included default configuration files should give a reasonable starting
1158 point. Most of the per site configuration is done in the
1159 <ulink url="actions-file.html"><quote>actions</quote></ulink> files. These are
1160 where various cookie actions are defined, ad and banner blocking, and other
1161 aspects of <application>Privoxy</application> configuration. There are several
1162 such files included, with varying levels of aggressiveness.
1166 You will probably want to keep an eye out for sites for which you may prefer
1167 persistent cookies, and add these to your actions configuration as needed. By
1168 default, most of these will be accepted only during the current browser
1169 session (aka <quote>session cookies</quote>), unless you add them to the
1170 configuration. If you want the browser to handle this instead, you will need
1171 to edit <filename>user.action</filename> (directly or through the web-based interface)
1172 and disable this feature. If you use more than one browser, it would make
1173 more sense to let <application>Privoxy</application> handle this, in which
1174 case the browser(s) should be set to accept all cookies.
1178 Another feature where you will probably want to define exceptions for trusted
1179 sites is the popup-killing (through the <ulink
1180 url="actions-file.html#KILL-POPUPS"><quote>+kill-popups</quote></ulink> and
1182 url="actions-file.html#FILTER-POPUPS"><quote>+filter{popups}</quote></ulink>
1183 actions), because your favorite shopping, banking, or leisure site may need
1184 popups (explained below).
1188 <application>Privoxy</application> is HTTP/1.1 compliant, but not all of
1189 the optional 1.1 features are as yet supported. In the unlikely event that
1190 you experience inexplicable problems with browsers that use HTTP/1.1 by default
1191 (like <application>Mozilla</application> or recent versions of I.E.), you might
1192 try to force HTTP/1.0 compatibility. For Mozilla, look under <literal>Edit ->
1193 Preferences -> Debug -> Networking</literal>.
1194 Alternatively, set the <quote>+downgrade-http-version</quote> action in
1195 <filename>default.action</filename> which will downgrade your browser's HTTP
1196 requests from HTTP/1.1 to HTTP/1.0 before processing them.
1200 After running <application>Privoxy</application> for a while, you can
1201 start to fine tune the configuration to suit your personal, or site,
1202 preferences and requirements. There are many, many aspects that can
1203 be customized. <quote>Actions</quote>
1204 can be adjusted by pointing your browser to
1205 <ulink url="http://config.privoxy.org/">http://config.privoxy.org/</ulink>
1206 (shortcut: <ulink url="http://p.p/">http://p.p/</ulink>),
1207 and then following the link to <quote>View & Change the Current Configuration</quote>.
1208 (This is an internal page and does not require Internet access.)
1212 In fact, various aspects of <application>Privoxy</application>
1213 configuration can be viewed from this page, including
1214 current configuration parameters, source code version numbers,
1215 the browser's request headers, and <quote>actions</quote> that apply
1216 to a given URL. In addition to the actions file
1217 editor mentioned above, <application>Privoxy</application> can also
1218 be turned <quote>on</quote> and <quote>off</quote> (toggled) from this page.
1222 If you encounter problems, try loading the page without
1223 <application>Privoxy</application>. If that helps, enter the URL where
1224 you have the problems into <ulink url="http://p.p/show-url-info">the browser
1225 based rule tracing utility</ulink>. See which rules apply and why, and
1226 then try turning them off for that site one after the other, until the problem
1227 is gone. When you have found the culprit, you might want to turn the rest on
1232 If the above paragraph sounds like gibberish to you, you might want to <link
1233 linkend="actions-file">read more about the actions concept</link>
1234 or even dive deep into the <link linkend="actionsanat">Appendix
1239 If you can't get rid of the problem at all, think you've found a bug in
1240 Privoxy, or want to propose a new feature or smarter rules, please see the
1241 section <link linkend="contact"><quote>Contacting the
1242 Developers</quote></link> below.
1247 <!-- ~~~~~ New section ~~~~~ -->
1248 <sect2 id="cmdoptions">
1249 <title>Command Line Options</title>
1251 <application>Privoxy</application> may be invoked with the following
1252 command-line options:
1260 <emphasis>--version</emphasis>
1263 Print version info and exit. Unix only.
1268 <emphasis>--help</emphasis>
1271 Print short usage info and exit. Unix only.
1276 <emphasis>--no-daemon</emphasis>
1279 Don't become a daemon, i.e. don't fork and become process group
1280 leader, and don't detach from controlling tty. Unix only.
1285 <emphasis>--pidfile FILE</emphasis>
1289 On startup, write the process ID to <emphasis>FILE</emphasis>. Delete the
1290 <emphasis>FILE</emphasis> on exit. Failure to create or delete the
1291 <emphasis>FILE</emphasis> is non-fatal. If no <emphasis>FILE</emphasis>
1292 option is given, no PID file will be used. Unix only.
1297 <emphasis>--user USER[.GROUP]</emphasis>
1301 After (optionally) writing the PID file, assume the user ID of
1302 <emphasis>USER</emphasis>, and if included the GID of GROUP. Exit if the
1303 privileges are not sufficient to do so. Unix only.
1308 <emphasis>--chroot</emphasis>
1312 Before changing to the user ID given in the <emphasis>--user</emphasis> option,
1313 chroot to that user's home directory, i.e. make the kernel pretend to the Privoxy
1314 process that the directory tree starts there. If set up carefully, this can limit
1315 the impact of possible vulnerabilities in Privoxy to the files contained in that hierarchy.
1321 <emphasis>configfile</emphasis>
1324 If no <emphasis>configfile</emphasis> is included on the command line,
1325 <application>Privoxy</application> will look for a file named
1326 <quote>config</quote> in the current directory (except on Win32
1327 where it will look for <quote>config.txt</quote> instead). Specify the
1328 full path to avoid confusion. If no config file is found,
1329 <application>Privoxy</application> will fail to start.
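As a rough sketch of how the options above combine, the following Python helper assembles a Unix command line. The helper name, the PID file path and the user name in the usage note are illustrative assumptions; only the option names and the <filename>/usr/sbin/privoxy</filename> path come from this manual. That <emphasis>--chroot</emphasis> is refused without <emphasis>--user</emphasis> is also an assumption, based on the description of <emphasis>--chroot</emphasis> above.

```python
def privoxy_argv(configfile, binary="/usr/sbin/privoxy", no_daemon=False,
                 pidfile=None, user=None, group=None, chroot=False):
    """Build an argument list for starting Privoxy on Unix (sketch only)."""
    argv = [binary]
    if no_daemon:
        argv.append("--no-daemon")
    if pidfile:
        argv += ["--pidfile", pidfile]
    if user:
        # The group, if any, is appended as USER.GROUP.
        argv += ["--user", f"{user}.{group}" if group else user]
    if chroot:
        # Assumption: --chroot needs --user, since it uses that user's home.
        if not user:
            raise ValueError("--chroot relies on --user")
        argv.append("--chroot")
    # The configuration file comes last; a full path avoids confusion.
    argv.append(configfile)
    return argv
```

Calling <literal>privoxy_argv("/etc/privoxy/config", pidfile="/var/run/privoxy.pid", user="privoxy")</literal> returns a list that could be handed to <literal>subprocess.run()</literal>; the PID file location and account name here are placeholders, not defaults.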
1340 <!-- ~ End section ~ -->
1343 <!-- ~~~~~ New section ~~~~~ -->
1344 <sect1 id="configuration"><title><application>Privoxy</application> Configuration</title>
1346 All <application>Privoxy</application> configuration is stored
1347 in text files. These files can be edited with a text editor.
1348 Many important aspects of <application>Privoxy</application> can
1349 also be controlled easily with a web browser.
1353 <!-- ~~~~~ New section ~~~~~ -->
1356 <title>Controlling <application>Privoxy</application> with Your Web Browser</title>
1358 <application>Privoxy</application>'s user interface can be reached through the special
1359 URL <ulink url="http://config.privoxy.org/">http://config.privoxy.org/</ulink>
1360 (shortcut: <ulink url="http://p.p/">http://p.p/</ulink>),
1361 which is a built-in page and works without Internet access.
1362 You will see the following section:
1366 <!-- Needs to be put in a table and colorized -->
1369 <bridgehead renderas="sect2"> Privoxy Menu</bridgehead>
1373 ▪ <ulink url="http://config.privoxy.org/show-status">View & change the current configuration</ulink>
1376 ▪ <ulink url="http://config.privoxy.org/show-version">View the source code version numbers</ulink>
1379 ▪ <ulink url="http://config.privoxy.org/show-request">View the request headers.</ulink>
1382 ▪ <ulink url="http://config.privoxy.org/show-url-info">Look up which actions apply to a URL and why</ulink>
1385 ▪ <ulink url="http://config.privoxy.org/toggle">Toggle Privoxy on or off</ulink>
1388 ▪ <ulink url="http://www.privoxy.org/
1389 &p-version;/user-manual/">Documentation</ulink>
1397 This should be self-explanatory. Note the first item leads to an editor for the
1398 <link linkend="actions-file">actions files</link>, which is where the ad, banner,
1399 cookie, and URL blocking magic is configured as well as other advanced features of
1400 <application>Privoxy</application>. This is an easy way to adjust various
1401 aspects of <application>Privoxy</application> configuration. The actions
1402 file, and other configuration files, are explained in detail below.
1406 <quote>Toggle Privoxy On or Off</quote> is handy for sites that might
1407 have problems with your current actions and filters. You can in fact use
1408 it as a test to see whether it is <application>Privoxy</application>
1409 causing the problem or not. <application>Privoxy</application> continues
1410 to run as a proxy in this case, but all manipulation is disabled, i.e.
1411 <application>Privoxy</application> acts like a normal forwarding proxy. There
1412 is even a toggle <link linkend="bookmarklets">Bookmarklet</link> offered, so
1413 that you can toggle <application>Privoxy</application> with one click from
1419 <!-- ~ End section ~ -->
1424 <!-- ~~~~~ New section ~~~~~ -->
1426 <sect2 id="confoverview">
1427 <title>Configuration Files Overview</title>
1429 For Unix, *BSD and Linux, all configuration files are located in
1430 <filename>/etc/privoxy/</filename> by default. For MS Windows, OS/2, and
1431 AmigaOS these are all in the same directory as the
1432 <application>Privoxy</application> executable. <![%p-not-stable;[ The name
1433 and number of configuration files has changed from previous versions, and is
1434 subject to change as development progresses.]]>
1438 The installed defaults provide a reasonable starting point, though
1439 some settings may be aggressive by some standards. For the time being, the
1440 principal configuration files are:
1448 The <link linkend="config">main configuration file</link> is named <filename>config</filename>
1449 on Linux, Unix, BSD, OS/2, and AmigaOS and <filename>config.txt</filename>
1450 on Windows. This is a required file.
1456 <filename>default.action</filename> (the main <link linkend="actions-file">actions file</link>)
1457 is used to define which <quote>actions</quote> relating to banner-blocking, images, pop-ups,
1458 content modification, cookie handling etc. should be applied by default. It also defines many
1459 exceptions (both positive and negative) from this default set of actions that enable
1460 <application>Privoxy</application> to selectively eliminate the junk, and only the junk, on
1461 as many websites as possible.
1464 Multiple actions files may be defined in <filename>config</filename>. These
1465 are processed in the order they are defined. Local customizations and locally
1466 preferred exceptions to the default policies as defined in
1467 <filename>default.action</filename> (which you will most probably want
1468 to define sooner or later) are probably best applied in
1469 <filename>user.action</filename>, where you can preserve them across
1470 upgrades. <filename>standard.action</filename> is for
1471 <application>Privoxy's</application> internal use.
1474 There is also a web based editor that can be accessed from
1476 url="http://config.privoxy.org/show-status">http://config.privoxy.org/show-status</ulink>
1478 url="http://p.p/show-status">http://p.p/show-status</ulink>) for the
1479 various actions files.
1485 <quote>Filter files</quote> (the <link linkend="filter-file">filter
1486 file</link>) can be used to re-write the raw page content, including
1487 viewable text as well as embedded HTML and JavaScript, and whatever else
1488 lurks on any given web page. The filtering jobs are only pre-defined here;
1489 whether to apply them or not is up to the actions files.
1490 <filename>default.filter</filename> includes various filters made
1491 available for use by the developers. Some are much more intrusive than
1492 others, and all should be used with caution. You may define additional
1493 filter files in <filename>config</filename> as you can with
1494 actions files. We suggest <filename>user.filter</filename> for any
1495 locally defined filters or customizations.
1503 All files use the <quote><literal>#</literal></quote> character to denote a
1504 comment (the rest of the line will be ignored) and understand line continuation
1505 through placing a backslash ("<literal>\</literal>") as the very last character
1506 in a line. If the <literal>#</literal> is preceded by a backslash, it loses
1507 its special function. Placing a <literal>#</literal> in front of an otherwise
1508 valid configuration line to prevent it from being interpreted is called "commenting out" that line.
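The comment and continuation rules can be modelled with a short Python sketch. It is a simplified reading of the paragraph above, not Privoxy's parser: in particular, the order chosen here (join continued lines first, then strip comments) is an assumption, and the function name <literal>logical_lines</literal> is invented for the example.

```python
import re

def logical_lines(text):
    """Return config lines with continuations joined and comments removed."""
    result = []
    pending = ""
    # The extra "" flushes a continuation left dangling at end of input.
    for raw in text.splitlines() + [""]:
        if raw.endswith("\\"):
            # A backslash as the very last character continues the line.
            pending += raw[:-1]
            continue
        line = pending + raw
        pending = ""
        # Cut at the first '#' that is not preceded by a backslash.
        line = re.sub(r"(?<!\\)#.*", "", line)
        # An escaped '\#' stands for a literal '#'.
        line = line.replace("\\#", "#").strip()
        if line:
            result.append(line)
    return result

SAMPLE = r"""{+block \
+handle-as-image}  # ad images
ads.example.com/a\#b
# just a comment
"""
```

With this input, <literal>logical_lines(SAMPLE)</literal> keeps two lines: the joined heading <literal>{+block +handle-as-image}</literal> and the pattern <literal>ads.example.com/a#b</literal>.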
1513 The actions files and filter files
1514 can use Perl style <link linkend="regex">regular expressions</link> for
1515 maximum flexibility.
1519 After making any changes, there is no need to restart
1520 <application>Privoxy</application> in order for the changes to take
1521 effect. <application>Privoxy</application> detects such changes
1522 automatically. Note, however, that it may take one or two additional
1523 requests for the change to take effect. When changing the listening address
1524 of <application>Privoxy</application>, these <quote>wake up</quote> requests
1525 must obviously be sent to the <emphasis>old</emphasis> listening address.
1530 While under development, the configuration content is subject to change.
1531 The below documentation may not be accurate by the time you read this.
1532 Also, what constitutes a <quote>default</quote> setting may change, so
1533 please check all your configuration files on important issues.
1539 <!-- ~ End section ~ -->
1542 <!-- ~~~~~~~~ New section Header ~~~~~~~~~ -->
1544 <!-- **************************************************** -->
1545 <!-- Include config.sgml here -->
1546 <!-- This is where the entire config file is detailed. -->
1548 <!-- end include -->
1551 <!-- ~ End section ~ -->
1555 <!-- ~~~~~~~~ New section Header ~~~~~~~~~ -->
1557 <sect1 id="actions-file"><title>Actions Files</title>
1560 The actions files are used to define what actions
1561 <application>Privoxy</application> takes for which URLs, and thus determine
1562 how ad images, cookies and various other aspects of HTTP content and
1563 transactions are handled, and on which sites (or even parts thereof). There
1564 are three such files included with <application>Privoxy</application>
1565 with differing purposes:
1572 <filename>default.action</filename> - is the primary action file
1573 that sets the initial values for all actions. It is intended to
1574 provide a base level of functionality for
1575 <application>Privoxy's</application> array of features. So it is
1576 a set of broad rules that should work reasonably well for users everywhere.
1577 This is the file that the developers are keeping updated, and <link
1578 linkend="installation-keepupdated">making available to users</link>.
1583 <filename>user.action</filename> - is intended to be for local site
1584 preferences and exceptions. As an example, if your ISP or your bank
1585 has specific requirements and needs special handling, this kind of
1586 thing should go here. This file will not be upgraded.
1591 <filename>standard.action</filename> - is used by the web based editor,
1592 to set various pre-defined sets of rules for the default actions section
1593 in <filename>default.action</filename>. These have increasing levels of
1594 aggressiveness <emphasis>and have no influence on your browsing unless
1595 you select them explicitly in the editor</emphasis>. It is not recommended to edit this file.
1599 The default profiles, and their associated actions, as pre-defined in
1600 <filename>standard.action</filename> are:
1603 <table frame=all><title>Default Configurations</title>
1604 <tgroup cols=4 align=left colsep=1 rowsep=1>
1605 <colspec colname=c1>
1606 <colspec colname=c2>
1607 <colspec colname=c3>
1608 <colspec colname=c4>
1611 <entry>Feature</entry>
1612 <entry>Cautious</entry>
1613 <entry>Medium</entry>
1614 <entry>Adventuresome</entry>
1619 <!-- <entry>f1</entry> -->
1620 <!-- <entry>f2</entry> -->
1621 <!-- <entry>f3</entry> -->
1622 <!-- <entry>f4</entry> -->
1628 <entry>Ad-blocking by URL</entry>
1635 <entry>Ad-filtering by size</entry>
1642 <entry>GIF de-animation</entry>
1649 <entry>Referer forging</entry>
1656 <entry>Cookie handling</entry>
1658 <entry>session-only</entry>
1663 <entry>Pop-up killing</entry>
1664 <entry>unsolicited</entry>
1665 <entry>unsolicited</entry>
1670 <entry>Fast redirects</entry>
1677 <entry>HTML taming</entry>
1684 <entry>JavaScript taming</entry>
1691 <entry>Web-bug killing</entry>
1698 <entry>Fun text replacements</entry>
1705 <entry>Image tag reordering</entry>
1712 <entry>Ad-filtering by link</entry>
1719 <entry>Demoronizer</entry>
1736 The list of actions files to be used are defined in the main configuration
1737 file, and are processed in the order they are defined (e.g.
1738 <filename>default.action</filename> is typically processed before
1739 <filename>user.action</filename>). The content of these can all be viewed and
1741 url="http://config.privoxy.org/show-status">http://config.privoxy.org/show-status</ulink>.
1745 An actions file typically has multiple sections. If you want to use
1746 <quote>aliases</quote> in an actions file, you have to place the (optional)
1747 <link linkend="aliases">alias section</link> at the top of that file.
1748 Then comes the default set of rules which will apply universally to all
1749 sites and pages (be <emphasis>very careful</emphasis> with using such a
1750 universal set in <filename>user.action</filename> or any other actions file after
1751 <filename>default.action</filename>, because it will override the result
1752 from consulting any previous file). And then below that,
1753 exceptions to the defined universal policies. You can regard
1754 <filename>user.action</filename> as an appendix to <filename>default.action</filename>,
1755 with the advantage that it is a separate file, which makes preserving your
1756 personal settings across <application>Privoxy</application> upgrades easier.
1760 Actions can be used to block anything you want, including ads, banners, or
1761 just some obnoxious URL that you would rather not see. Cookies can be accepted
1762 or rejected, or accepted only during the current browser session (i.e. not
1763 written to disk), content can be modified, JavaScripts tamed, user-tracking
1764 fooled, and much more. See below for a <link linkend="actions">complete list
1768 <!-- ~~~~~ New section ~~~~~ -->
1770 <title>Finding the Right Mix</title>
1772 Note that some <link linkend="actions">actions</link>, like cookie suppression
1773 or script disabling, may render some sites unusable that rely on these
1774 techniques to work properly. Finding the right mix of actions is not always easy and
1775 certainly a matter of personal taste. In general, it can be said that the more
1776 <quote>aggressive</quote> your default settings (in the top section of the
1777 actions file) are, the more exceptions for <quote>trusted</quote> sites you
1778 will have to make later. If, for example, you want to crunch all cookies per
1779 default, you'll have to make exceptions from that rule for sites that you
1780 regularly use and that require cookies for actually useful purposes, like maybe
1781 your bank, favorite shop, or newspaper.
1785 We have tried to provide you with reasonable rules to start from in the
1786 distribution actions files. But there is no general rule of thumb on these
1787 things. There just are too many variables, and sites are constantly changing.
1788 Sooner or later you will want to change the rules (and read this chapter again :).
1792 <!-- ~~~~~ New section ~~~~~ -->
1794 <title>How to Edit</title>
1796 The easiest way to edit the actions files is with a browser by
1797 using our browser-based editor, which can be reached from <ulink
1798 url="http://config.privoxy.org/show-status">http://config.privoxy.org/show-status</ulink>.
1799 The editor allows both fine-grained control over every single feature on a
1800 per-URL basis, and easy choosing from wholesale sets of defaults like
1801 <quote>Cautious</quote>, <quote>Medium</quote> or <quote>Adventuresome</quote>.
1802 Warning: the <quote>Adventuresome</quote> setting is not only more aggressive,
1803 but includes settings that are fun and subversive, and which some may find of
1808 If you prefer plain text editing to GUIs, you can of course also directly edit the
1809 actions files. Look at <filename>default.action</filename> which is richly
1815 <sect2 id="actions-apply">
1816 <title>How Actions are Applied to URLs</title>
1818 Actions files are divided into sections. There are special sections,
1819 like the <quote><link linkend="aliases">alias</link></quote> sections which will
1820 be discussed later. For now let's concentrate on regular sections: They have a
1821 heading line (often split across multiple lines for readability) which consists
1822 of a list of actions, separated by whitespace and enclosed in curly braces.
1823 Below that, there is a list of URL patterns, each on a separate line.
1827 To determine which actions apply to a request, the URL of the request is
1828 compared to all patterns in each <quote>actions file</quote>. Every time it matches, the list of
1829 applicable actions for the URL is incrementally updated, using the heading
1830 of the section in which the pattern is located. If multiple matches for
1831 the same URL set the same action differently, the last match wins. If not,
1832 the effects are aggregated. E.g. a URL might match a regular section with
1833 a heading line of <literal>{
1834 +<link linkend="handle-as-image">handle-as-image</link> }</literal>,
1835 then later another one with just <literal>{
1836 +<link linkend="block">block</link> }</literal>, with the result
1837 that <emphasis>both</emphasis> actions apply.
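The matching process described above can be approximated in Python as follows. This is a toy model for boolean actions only: the data layout, the function name <literal>actions_for</literal> and especially the substring test standing in for real pattern matching are assumptions made for the illustration.

```python
def actions_for(url, sections, matches):
    """Return the set of actions left enabled for url (toy model).

    sections -- list of (heading_actions, patterns) in file order
    matches  -- callable(pattern, url) deciding whether a pattern applies
    """
    active = {}
    for heading, patterns in sections:
        if any(matches(p, url) for p in patterns):
            for token in heading:
                # '+name' switches an action on, '-name' switches it off;
                # a later match simply overwrites an earlier one.
                active[token[1:]] = token.startswith("+")
    return {name for name, enabled in active.items() if enabled}

def substring_match(pattern, url):
    # Stand-in only: Privoxy's real patterns are described later.
    return pattern in url

SECTIONS = [
    (["+handle-as-image"], [".gif", ".png"]),
    (["+block"], ["ads.example.com"]),
    (["-block"], ["ads.example.com/allowed/"]),
]
```

Here a request for <literal>ads.example.com/banner.gif</literal> ends up with both <literal>handle-as-image</literal> and <literal>block</literal>, while <literal>ads.example.com/allowed/logo.gif</literal> keeps only <literal>handle-as-image</literal>, because the later <literal>-block</literal> section wins.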
1841 You can trace this process for any given URL by visiting <ulink
1842 url="http://config.privoxy.org/show-url-info">http://config.privoxy.org/show-url-info</ulink>.
1846 More detail on this is provided in the Appendix, <link linkend="ACTIONSANAT">
1847 Anatomy of an Action</link>.
1851 <!-- ~~~~~ New section ~~~~~ -->
1852 <sect2 id="af-patterns">
1853 <title>Patterns</title>
1855 As mentioned, <application>Privoxy</application> uses <quote>patterns</quote>
1856 to determine what actions might apply to which sites and pages your browser
1857 attempts to access. These <quote>patterns</quote> use wild card type
1858 <emphasis>pattern</emphasis> matching to achieve a high degree of
1859 flexibility. This allows one expression to be expanded and potentially match
1860 against many similar patterns.
1864 Generally, a <application>Privoxy</application> pattern has the form
1865 <literal><domain>/<path></literal>, where both the
1866 <literal><domain></literal> and <literal><path></literal> are
1867 optional. (This is why the special <literal>/</literal> pattern matches all
1868 URLs). Note that the protocol portion of the URL pattern (e.g.
1869 <literal>http://</literal>) should <emphasis>not</emphasis> be included in
1870 the pattern. This is assumed already!
1875 <term><literal>www.example.com/</literal></term>
1878 is a domain-only pattern and will match any request to <literal>www.example.com</literal>,
1879 regardless of which document on that server is requested.
1884 <term><literal>www.example.com</literal></term>
1887 means exactly the same. For domain-only patterns, the trailing <literal>/</literal> may
1893 <term><literal>www.example.com/index.html</literal></term>
1896 matches only the single document <literal>/index.html</literal>
1897 on <literal>www.example.com</literal>.
1902 <term><literal>/index.html</literal></term>
1905 matches the document <literal>/index.html</literal>, regardless of the domain,
1906 i.e. on <emphasis>any</emphasis> web server.
1911 <term><literal>index.html</literal></term>
1914 matches nothing, since it would be interpreted as a domain name and
1915 there is no top-level domain called <literal>.html</literal>.
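The way a pattern divides into its two optional parts can be shown with a tiny Python helper. The name <literal>split_pattern</literal> is hypothetical and the code only mirrors the examples listed above; it says nothing about how Privoxy stores patterns internally.

```python
def split_pattern(pattern):
    """Split a URL pattern into (domain_part, path_part); either may be empty."""
    # Everything before the first '/' is the domain part,
    # the slash and everything after it form the path part.
    domain, slash, path = pattern.partition("/")
    return domain, slash + path
```

An empty part places no restriction on the URL, so <literal>split_pattern("/index.html")</literal> gives an empty domain part (any server), and <literal>split_pattern("index.html")</literal> gives a domain part of <literal>index.html</literal> with no path, which is why that pattern matches nothing useful.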
1922 <!-- ~~~~~ New section ~~~~~ -->
1923 <sect3><title>The Domain Pattern</title>
1926 The matching of the domain part offers some flexible options: if the
1927 domain starts or ends with a dot, it becomes unanchored at that end.
1933 <term><literal>.example.com</literal></term>
1936 matches any domain that <emphasis>ENDS</emphasis> in
1937 <literal>.example.com</literal>
1942 <term><literal>www.</literal></term>
1945 matches any domain that <emphasis>STARTS</emphasis> with
1946 <literal>www.</literal>
1951 <term><literal>.example.</literal></term>
1954 matches any domain that <emphasis>CONTAINS</emphasis> <literal>.example.</literal>
1955 (Correctly speaking: It matches any FQDN that contains <literal>example</literal> as a domain.)
1962 Additionally, there are wild-cards that you can use in the domain names
1963 themselves. They work much like shell wild-cards: <quote>*</quote>
1964 stands for zero or more arbitrary characters, <quote>?</quote> stands for
1965 any single character, you can define character classes in square
1966 brackets and all of that can be freely mixed:
1971 <term><literal>ad*.example.com</literal></term>
1974 matches <quote>adserver.example.com</quote>,
1975 <quote>ads.example.com</quote>, etc., but not <quote>sfads.example.com</quote>
1980 <term><literal>*ad*.example.com</literal></term>
1983 matches all of the above, and then some.
1988 <term><literal>.?pix.com</literal></term>
1991 matches <literal>www.ipix.com</literal>,
1992 <literal>pictures.epix.com</literal>, <literal>a.b.c.d.e.upix.com</literal> etc.
1997 <term><literal>www[1-9a-ez].example.c*</literal></term>
2000 matches <literal>www1.example.com</literal>,
2001 <literal>www4.example.cc</literal>, <literal>wwwd.example.cy</literal>,
2002 <literal>wwwz.example.com</literal> etc., but <emphasis>not</emphasis>
2003 <literal>wwww.example.com</literal>.
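For readers who like to experiment, the domain rules above can be imitated in Python. This sketch compares domains label by label and borrows shell-style matching from the standard <literal>fnmatch</literal> module; both choices are assumptions that happen to reproduce the examples in this section, and Privoxy's own matcher may differ in corner cases. The function name <literal>domain_matches</literal> is invented for the example.

```python
from fnmatch import fnmatchcase

def domain_matches(pattern, host):
    """Check a host name against a domain pattern (illustrative model)."""
    open_left = pattern.startswith(".")   # leading dot: unanchored at the start
    open_right = pattern.endswith(".")    # trailing dot: unanchored at the end
    want = [label for label in pattern.lower().split(".") if label]
    have = host.lower().split(".")
    if not want:
        return True                       # empty domain part matches any host
    n = len(want)
    if n > len(have):
        return False
    if open_left and open_right:
        starts = range(len(have) - n + 1)  # may sit anywhere in the name
    elif open_left:
        starts = [len(have) - n]           # must cover the last labels
    elif open_right:
        starts = [0]                       # must cover the first labels
    elif n == len(have):
        starts = [0]                       # anchored at both ends
    else:
        return False
    return any(
        all(fnmatchcase(h, w) for h, w in zip(have[s:s + n], want))
        for s in starts
    )
```

With this model, <literal>domain_matches("ad*.example.com", "ads.example.com")</literal> is true and <literal>domain_matches("ad*.example.com", "sfads.example.com")</literal> is false, in line with the list above.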
2011 <!-- ~ End section ~ -->
2014 <!-- ~~~~~ New section ~~~~~ -->
2015 <sect3><title>The Path Pattern</title>
2018 <application>Privoxy</application> uses Perl compatible regular expressions
2019 (through the <ulink url="http://www.pcre.org/">PCRE</ulink> library) for
2024 There is an <link linkend="regex">Appendix</link> with a brief quick-start into regular
2025 expressions, and full (very technical) documentation on PCRE regex syntax is available on-line
2026 at <ulink url="http://www.pcre.org/man.txt">http://www.pcre.org/man.txt</ulink>.
2027 You might also find the Perl man page on regular expressions (<literal>man perlre</literal>)
2028 useful, which is available on-line at <ulink
2029 url="http://www.perldoc.com/perl5.6/pod/perlre.html">http://www.perldoc.com/perl5.6/pod/perlre.html</ulink>.
2033 Note that the path pattern is automatically left-anchored at the <quote>/</quote>,
2034 i.e. it matches as if it started with a <quote>^</quote> (regular expression speak
2035 for the beginning of a line).
2039 Please also note that matching in the path is <emphasis>CASE INSENSITIVE</emphasis>
2040 by default, but you can switch to case sensitive at any point in the pattern by using the
2041 <quote>(?-i)</quote> switch: <literal>www.example.com/(?-i)PaTtErN.*</literal> will match
2042 only documents whose path starts with <literal>PaTtErN</literal> in
2043 <emphasis>exactly</emphasis> this capitalization.
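The two properties just described (left-anchoring and case-insensitivity with an optional <quote>(?-i)</quote> switch) can be tried out with Python's <literal>re</literal> module. Note that Python's regular expressions are not PCRE: a bare <quote>(?-i)</quote> in mid-pattern is not accepted there, so this sketch rewrites it into the scoped form <literal>(?-i:...)</literal>. The helper name <literal>path_matches</literal> and that rewriting step are assumptions of the example, not something Privoxy does.

```python
import re

def path_matches(path_pattern, path):
    """Test a URL path against a path pattern (approximation using re)."""
    # Translate PCRE's "(?-i)" switch into Python's scoped "(?-i:...)" group,
    # so that everything after the switch is matched case-sensitively.
    head, switch, tail = path_pattern.partition("(?-i)")
    translated = head + (f"(?-i:{tail})" if switch else "")
    # re.match() only matches at the start of the string (left-anchored);
    # IGNORECASE provides the case-insensitive default.
    return re.match(translated, path, re.IGNORECASE) is not None
```

For instance, <literal>path_matches("/(?-i)PaTtErN.*", "/PaTtErN/index.html")</literal> succeeds, whereas the same pattern does not match <literal>/pattern/index.html</literal>.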
2049 <!-- ~ End section ~ -->
2052 <!-- ~~~~~ New section ~~~~~ -->
2054 <sect2 id="actions">
2055 <title>Actions</title>
2057 All actions are disabled by default, until they are explicitly enabled
2058 somewhere in an actions file. Actions are turned on if preceded with a
2059 <quote>+</quote>, and turned off if preceded with a <quote>-</quote>. So a
2060 <literal>+action</literal> means <quote>do that action</quote>, e.g.
2061 <literal>+block</literal> means <quote>please block URLs that match the
2062 following patterns</quote>, and <literal>-block</literal> means <quote>don't
2063 block URLs that match the following patterns, even if <literal>+block</literal>
2064 previously applied.</quote>
2069 Again, actions are invoked by placing them on a line, enclosed in curly braces and
2070 separated by whitespace, like in
2071 <literal>{+some-action -some-other-action{some-parameter}}</literal>,
2072 followed by a list of URL patterns, one per line, to which they apply.
2073 Together, the actions line and the following pattern lines make up a section
2074 of the actions file.
2078 There are three classes of actions:
2085 Boolean, i.e. the action can only be <quote>enabled</quote> or
2086 <quote>disabled</quote>. Syntax:
2090 +<replaceable class="function">name</replaceable> # enable action <replaceable class="parameter">name</replaceable>
2091 -<replaceable class="function">name</replaceable> # disable action <replaceable class="parameter">name</replaceable></screen>
2094 Example: <literal>+block</literal>
2101 Parameterized, where some value is required in order to enable this type of action.
2106 +<replaceable class="function">name</replaceable>{<replaceable class="parameter">param</replaceable>} # enable action and set parameter to <replaceable class="parameter">param</replaceable>,
2107 # overwriting parameter from previous match if necessary
2108 -<replaceable class="function">name</replaceable> # disable action. The parameter can be omitted</screen>
2111 Note that if the URL matches multiple positive forms of a parameterized action,
2112 the last match wins, i.e. the params from earlier matches are simply ignored.
2115 Example: <literal>+hide-user-agent{ Mozilla 1.0 }</literal>
2121 Multi-value. These look exactly like parameterized actions,
2122 but they behave differently: If the action applies multiple times to the
2123 same URL, but with different parameters, <emphasis>all</emphasis> the parameters
2124 from <emphasis>all</emphasis> matches are remembered. This is used for actions
2125 that can be executed for the same request repeatedly, like adding multiple
2126 headers, or filtering through multiple filters. Syntax:
2130 +<replaceable class="function">name</replaceable>{<replaceable class="parameter">param</replaceable>} # enable action and add <replaceable class="parameter">param</replaceable> to the list of parameters
2131 -<replaceable class="function">name</replaceable>{<replaceable class="parameter">param</replaceable>} # remove the parameter <replaceable class="parameter">param</replaceable> from the list of parameters
2132 # If it was the last one left, disable the action.
-<replaceable class="function">name</replaceable> # disable this action completely and remove all parameters from the list</screen>
2136 Examples: <literal>+add-header{X-Fun-Header: Some text}</literal> and
2137 <literal>+filter{html-annoyances}</literal>
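As a rough illustration of how the three classes differ, consider the following sketch.
The host names are hypothetical placeholders, and <quote>Lynx</quote> is just an arbitrary example value:
<screen>{+hide-user-agent{Mozilla 1.0} +filter{html-annoyances}}
.example.com

{+hide-user-agent{Lynx} +filter{js-annoyances} -block}
www.example.com</screen>
For a request to www.example.com both sections match. Assuming nothing else applies,
the parameterized <literal>hide-user-agent</literal> keeps only the later value, the
multi-value <literal>filter</literal> runs both filters, and the boolean
<literal>block</literal> is simply disabled.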
2145 If nothing is specified in any actions file, no <quote>actions</quote> are
2146 taken. So in this case <application>Privoxy</application> would just be a
2147 normal, non-blocking, non-anonymizing proxy. You must specifically enable the
2148 privacy and blocking features you need (although the provided default actions
2149 files will give a good starting point).
Later defined actions always override earlier ones, so exceptions
to any rules you make should come in the latter part of the file (or
in a file that is processed later when using multiple actions files). For
multi-valued actions, the actions are applied in the order they are specified.
Actions files are processed in the order they are defined in
<filename>config</filename> (the default installation has three actions
files). It is also quite possible for any given URL to match more than
one pattern and thus more than one set of actions!
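A minimal sketch of such an exception, assuming a hypothetical ad server under example.com:
<screen>{+block}
.ads.example.com

# Listed later, so it overrides the section above for this one host
{-block}
images.ads.example.com</screen>
Swapping the two sections would leave images.ads.example.com blocked, since the broader
<literal>+block</literal> section would then be the later match.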
2163 <!-- start actions listing -->
The list of valid <application>Privoxy</application> actions is:
2169 <!-- ********************************************************** -->
2170 <!-- Please note the below defined actions use id's that are -->
2171 <!-- probably linked from other places, so please don't change. -->
2173 <!-- ********************************************************** -->
2176 <!-- ~~~~~ New section ~~~~~ -->
2178 <sect3 renderas="sect4" id="add-header">
2179 <title>add-header</title>
2183 <term>Typical use:</term>
2185 <para>Confuse log analysis, custom applications</para>
2190 <term>Effect:</term>
2193 Sends a user defined HTTP header to the web server.
2200 <!-- boolean, parameterized, Multi-value -->
2202 <para>Multi-value.</para>
2207 <term>Parameter:</term>
2210 Any string value is possible. Validity of the defined HTTP headers is not checked.
2211 It is recommended that you use the <quote><literal>X-</literal></quote> prefix
2221 This action may be specified multiple times, in order to define multiple
2222 headers. This is rarely needed for the typical user. If you don't know what
2223 <quote>HTTP headers</quote> are, you definitely don't need to worry about this
2230 <term>Example usage:</term>
2233 <screen>+add-header{X-User-Tracking: sucks}</screen>
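Because the action is multi-value, several headers can be added in one section.
A sketch, with made-up header names and a placeholder host:
<screen>{+add-header{X-Example-One: first value} \
 +add-header{X-Example-Two: second value}}
.example.com</screen>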
2241 <!-- ~~~~~ New section ~~~~~ -->
2242 <sect3 renderas="sect4" id="block">
2243 <title>block</title>
2247 <term>Typical use:</term>
2249 <para>Block ads or other obnoxious content</para>
2254 <term>Effect:</term>
2257 Requests for URLs to which this action applies are blocked, i.e. the requests are not
2258 forwarded to the remote server, but answered locally with a substitute page or image,
2259 as determined by the <literal><link linkend="handle-as-image">handle-as-image</link></literal>
2260 and <literal><link linkend="set-image-blocker">set-image-blocker</link></literal> actions.
2267 <!-- boolean, parameterized, Multi-value -->
2269 <para>Boolean.</para>
2274 <term>Parameter:</term>
2284 <application>Privoxy</application> sends a special <quote>BLOCKED</quote> page
2285 for requests to blocked pages. This page contains links to find out why the request
2286 was blocked, and a click-through to the blocked content (the latter only if compiled with the
2287 force feature enabled). The <quote>BLOCKED</quote> page adapts to the available
2288 screen space -- it displays full-blown if space allows, or miniaturized and text-only
2289 if loaded into a small frame or window. If you are using <application>Privoxy</application>
2290 right now, you can take a look at the
2291 <ulink url="http://ads.bannerserver.example.com/nasty-ads/sponsor.html"><quote>BLOCKED</quote>
A very important exception occurs if <emphasis>both</emphasis>
<literal>block</literal> and <literal><link linkend="handle-as-image">handle-as-image</link></literal>
apply to the same request: it will then be replaced by an image. If
<literal><link linkend="set-image-blocker">set-image-blocker</link></literal>
(see below) also applies, the type of image will be determined by its parameter;
if not, the standard checkerboard pattern is sent.
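For instance, a section along the following lines would answer requests for a
(hypothetical) banner directory with a blank image rather than the checkerboard
pattern. This assumes the <quote>blank</quote> keyword described under
<literal>set-image-blocker</literal>:
<screen>{+block +handle-as-image +set-image-blocker{blank}}
ads.example.com/banners/</screen>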
2303 It is important to understand this process, in order
2304 to understand how <application>Privoxy</application> deals with
2305 ads and other unwanted content.
2308 The <literal><link linkend="filter">filter</link></literal>
2309 action can perform a very similar task, by <quote>blocking</quote>
2310 banner images and other content through rewriting the relevant URLs in the
2311 document's HTML source, so they don't get requested in the first place.
2312 Note that this is a totally different technique, and it's easy to confuse the two.
2318 <term>Example usage (section):</term>
2321 <screen>{+block} # Block and replace with "blocked" page
2322 .nasty-stuff.example.com
2324 {+block +handle-as-image} # Block and replace with image
2336 <!-- ~~~~~ New section ~~~~~ -->
2337 <sect3 renderas="sect4" id="content-type-overwrite">
2341 <title>content-type-overwrite</title>
2345 <term>Typical use:</term>
2347 <para>Stop useless download menus from popping up, or change the browser's rendering mode</para>
2352 <term>Effect:</term>
2355 Replaces the <quote>Content-Type:</quote> HTTP server header.
2362 <!-- Boolean, Parameterized, Multi-value -->
2364 <para>Parameterized.</para>
2369 <term>Parameter:</term>
2381 The <quote>Content-Type:</quote> HTTP server header is used by the
2382 browser to decide what to do with the document. The value of this
2383 header can cause the browser to open a download menu instead of
2384 displaying the document by itself, even if the document's format is
2385 supported by the browser.
2388 The declared content type can also affect which rendering mode
2389 the browser chooses. If XHTML is delivered as <quote>text/html</quote>,
2390 many browsers treat it as yet another broken HTML document.
If it is sent as <quote>application/xml</quote>, browsers with
XHTML support will only display it if the syntax is correct.
2395 If you see a web site that proudly uses XHTML buttons, but sets
2396 <quote>Content-Type: text/html</quote>, you can use Privoxy
2397 to overwrite it with <quote>application/xml</quote> and validate
2398 the web master's claim inside your XHTML-supporting browser.
2399 If the syntax is incorrect, the browser will complain loudly.
2402 You can also go the opposite direction: if your browser prints
2403 error messages instead of rendering a document falsely declared
2404 as XHTML, you can overwrite the content type with
<quote>text/html</quote> and have it rendered as a broken HTML document.
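A sketch of that opposite case, assuming a hypothetical site whose pages are
declared as XHTML but do not parse:
<screen>{+content-type-overwrite{text/html}}
www.example.org/broken-xhtml/</screen>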
2408 By default <literal>content-type-overwrite</literal> only replaces
2409 <quote>Content-Type:</quote> headers that look like some kind of text.
2410 If you want to overwrite it unconditionally, you have to combine it with
2411 <literal><link linkend="force-text-mode">force-text-mode</link></literal>.
This limitation exists for a reason; think twice before circumventing it.
2415 Most of the time it's easier to enable
2416 <literal><link linkend="filter-server-headers">filter-server-headers</link></literal>
2417 and replace this action with a custom regular expression. It allows you
2418 to activate it for every document of a certain site and it will still
2419 only replace the content types you aimed at.
2422 Of course you can apply <literal>content-type-overwrite</literal>
2423 to a whole site and then make URL based exceptions, but it's a lot
2424 more work to get the same precision.
2430 <term>Example usage (sections):</term>
2433 <screen># Check if www.example.net/ really uses valid XHTML
2434 {+content-type-overwrite {application/xml}}
2436 # but leave the content type unmodified if the URL looks like a style sheet
2437 {-content-type-overwrite}
www.example.net/.*\.css$
www.example.net/.*style
2448 <!-- ~~~~~ New section ~~~~~ -->
2449 <sect3 renderas="sect4" id="crunch-client-header">
<title>crunch-client-header</title>
2457 <term>Typical use:</term>
2459 <para>Remove a client header <application>Privoxy</application> has no dedicated action for.</para>
2464 <term>Effect:</term>
Deletes every header sent by the client that contains the string the user supplied as parameter.
2474 <!-- Boolean, Parameterized, Multi-value -->
2476 <para>Parameterized.</para>
2481 <term>Parameter:</term>
2493 This action allows you to block client headers for which no dedicated
2494 <application>Privoxy</application> action exists.
2495 <application>Privoxy</application> will remove every client header that
2496 contains the string you supplied as parameter.
2499 Regular expressions are <emphasis>not supported</emphasis> and you can't
2500 use this action to block different headers in the same request, unless
2501 they contain the same string.
2504 <literal>crunch-client-header</literal> is only meant for quick tests.
2505 If you have to block several different headers, or only want to modify
2506 parts of them, you should enable
2507 <literal><link linkend="filter-client-headers">filter-client-headers</link></literal>
2508 and create your own filter.
2512 Don't block any header without understanding the consequences.
2519 <term>Example usage (section):</term>
2522 <screen># Block the non-existent "Privacy-Violation:" client header
2523 {+crunch-client-header {Privacy-Violation:}}
2533 <!-- ~~~~~ New section ~~~~~ -->
2534 <sect3 renderas="sect4" id="crunch-if-none-match">
2535 <title>crunch-if-none-match</title>
2541 <term>Typical use:</term>
2543 <para>Prevent yet another way to track the user's steps between sessions.</para>
2548 <term>Effect:</term>
2551 Deletes the <quote>If-None-Match:</quote> HTTP client header.
2558 <!-- Boolean, Parameterized, Multi-value -->
2560 <para>Boolean.</para>
2565 <term>Parameter:</term>
2577 Removing the <quote>If-None-Match:</quote> HTTP client header
2578 is useful for filter testing, where you want to force a real
2579 reload instead of getting status code <quote>304</quote> which
2580 would cause the browser to use a cached copy of the page.
2583 It is also useful to make sure the header isn't used as a cookie
2587 Blocking the <quote>If-None-Match:</quote> header shouldn't cause any
2588 caching problems, as long as the <quote>If-Modified-Since:</quote> header
2589 isn't blocked as well.
2592 It is recommended to use this action together with
2593 <literal><link linkend="hide-if-modified-since">hide-if-modified-since</link></literal>
2595 <literal><link linkend="overwrite-last-modified">overwrite-last-modified</link></literal>.
2601 <term>Example usage (section):</term>
2604 <screen># Let the browser revalidate cached documents without being tracked across sessions
2605 {+hide-if-modified-since {-1} \
2606 +overwrite-last-modified {randomize} \
2607 +crunch-if-none-match}
2616 <!-- ~~~~~ New section ~~~~~ -->
2617 <sect3 renderas="sect4" id="crunch-incoming-cookies">
2618 <title>crunch-incoming-cookies</title>
2622 <term>Typical use:</term>
2625 Prevent the web server from setting any cookies on your system
2631 <term>Effect:</term>
2634 Deletes any <quote>Set-Cookie:</quote> HTTP headers from server replies.
2641 <!-- Boolean, Parameterized, Multi-value -->
2643 <para>Boolean.</para>
2648 <term>Parameter:</term>
2660 This action is only concerned with <emphasis>incoming</emphasis> cookies. For
2661 <emphasis>outgoing</emphasis> cookies, use
2662 <literal><link linkend="crunch-outgoing-cookies">crunch-outgoing-cookies</link></literal>.
2663 Use <emphasis>both</emphasis> to disable cookies completely.
2666 It makes <emphasis>no sense at all</emphasis> to use this action in conjunction
2667 with the <literal><link linkend="session-cookies-only">session-cookies-only</link></literal> action,
2668 since it would prevent the session cookies from being set. See also
2669 <literal><link linkend="filter-content-cookies">filter-content-cookies</link></literal>.
2675 <term>Example usage:</term>
2678 <screen>+crunch-incoming-cookies</screen>
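To disable cookies completely for a group of sites, both cookie actions can be
combined in one section. A sketch, with a placeholder domain:
<screen>{+crunch-incoming-cookies +crunch-outgoing-cookies}
.example.com</screen>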
2686 <!-- ~~~~~ New section ~~~~~ -->
2687 <sect3 renderas="sect4" id="crunch-server-header">
2688 <title>crunch-server-header</title>
2694 <term>Typical use:</term>
2696 <para>Remove a server header <application>Privoxy</application> has no dedicated action for.</para>
2701 <term>Effect:</term>
2704 Deletes every header sent by the server that contains the string the user supplied as parameter.
2711 <!-- Boolean, Parameterized, Multi-value -->
2713 <para>Parameterized.</para>
2718 <term>Parameter:</term>
2730 This action allows you to block server headers for which no dedicated
2731 <application>Privoxy</application> action exists. <application>Privoxy</application>
2732 will remove every server header that contains the string you supplied as parameter.
2735 Regular expressions are <emphasis>not supported</emphasis> and you can't
2736 use this action to block different headers in the same request, unless
2737 they contain the same string.
2740 <literal>crunch-server-header</literal> is only meant for quick tests.
2741 If you have to block several different headers, or only want to modify
2742 parts of them, you should enable
2743 <literal><link linkend="filter-server-headers">filter-server-headers</link></literal>
2744 and create your own filter.
2748 Don't block any header without understanding the consequences.
2755 <term>Example usage (section):</term>
2758 <screen># Crunch server headers that try to prevent caching
2759 {+crunch-server-header {no-cache}}
2768 <!-- ~~~~~ New section ~~~~~ -->
2769 <sect3 renderas="sect4" id="crunch-outgoing-cookies">
2770 <title>crunch-outgoing-cookies</title>
2774 <term>Typical use:</term>
2777 Prevent the web server from reading any cookies from your system
2783 <term>Effect:</term>
2786 Deletes any <quote>Cookie:</quote> HTTP headers from client requests.
2793 <!-- Boolean, Parameterized, Multi-value -->
2795 <para>Boolean.</para>
2800 <term>Parameter:</term>
2812 This action is only concerned with <emphasis>outgoing</emphasis> cookies. For
2813 <emphasis>incoming</emphasis> cookies, use
2814 <literal><link linkend="crunch-incoming-cookies">crunch-incoming-cookies</link></literal>.
2815 Use <emphasis>both</emphasis> to disable cookies completely.
2818 It makes <emphasis>no sense at all</emphasis> to use this action in conjunction
2819 with the <literal><link linkend="session-cookies-only">session-cookies-only</link></literal> action,
2820 since it would prevent the session cookies from being read.
2826 <term>Example usage:</term>
2829 <screen>+crunch-outgoing-cookies</screen>
2838 <!-- ~~~~~ New section ~~~~~ -->
2839 <sect3 renderas="sect4" id="deanimate-gifs">
2840 <title>deanimate-gifs</title>
2844 <term>Typical use:</term>
2846 <para>Stop those annoying, distracting animated GIF images.</para>
2851 <term>Effect:</term>
2854 De-animate GIF animations, i.e. reduce them to their first or last image.
2861 <!-- boolean, parameterized, Multi-value -->
2863 <para>Parameterized.</para>
2868 <term>Parameter:</term>
2871 <quote>last</quote> or <quote>first</quote>
2880 This will also shrink the images considerably (in bytes, not pixels!). If
2881 the option <quote>first</quote> is given, the first frame of the animation
2882 is used as the replacement. If <quote>last</quote> is given, the last
2883 frame of the animation is used instead, which probably makes more sense for
2884 most banner animations, but also has the risk of not showing the entire
2885 last frame (if it is only a delta to an earlier frame).
2888 You can safely use this action with patterns that will also match non-GIF
2889 objects, because no attempt will be made at anything that doesn't look like
2896 <term>Example usage:</term>
2899 <screen>+deanimate-gifs{last}</screen>
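If the last frame turns out to be an incomplete delta for some site, a later
section can switch that site to the first frame instead. A sketch, assuming a
hypothetical host:
<screen>{+deanimate-gifs{last}}
/

{+deanimate-gifs{first}}
banners.example.com</screen>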
2906 <!-- ~~~~~ New section ~~~~~ -->
2907 <sect3 renderas="sect4" id="downgrade-http-version">
2908 <title>downgrade-http-version</title>
2912 <term>Typical use:</term>
2914 <para>Work around (very rare) problems with HTTP/1.1</para>
2919 <term>Effect:</term>
2922 Downgrades HTTP/1.1 client requests and server replies to HTTP/1.0.
2929 <!-- boolean, parameterized, Multi-value -->
2931 <para>Boolean.</para>
2936 <term>Parameter:</term>
2948 This is a left-over from the time when <application>Privoxy</application>
2949 didn't support important HTTP/1.1 features well. It is left here for the
2950 unlikely case that you experience HTTP/1.1 related problems with some server
2951 out there. Not all (optional) HTTP/1.1 features are supported yet, so there
2952 is a chance you might need this action.
2958 <term>Example usage (section):</term>
2961 <screen>{+downgrade-http-version}
2962 problem-host.example.com</screen>
2970 <!-- ~~~~~ New section ~~~~~ -->
2971 <sect3 renderas="sect4" id="fast-redirects">
2972 <title>fast-redirects</title>
2976 <term>Typical use:</term>
2978 <para>Fool some click-tracking scripts and speed up indirect links.</para>
2983 <term>Effect:</term>
2986 Detects redirection URLs and redirects the browser without contacting
2987 the redirection server first.
2994 <!-- boolean, parameterized, Multi-value -->
2996 <para>Parameterized.</para>
3001 <term>Parameter:</term>
3006 <quote>simple-check</quote> to just search for the string <quote>http://</quote>
3007 to detect redirection URLs.
3012 <quote>check-decoded-url</quote> to decode URLs (if necessary) before searching
3013 for redirection URLs.
3024 Many sites, like yahoo.com, don't just link to other sites. Instead, they
3025 will link to some script on their own servers, giving the destination as a
3026 parameter, which will then redirect you to the final target. URLs
3027 resulting from this scheme typically look like:
3028 <quote>http://www.example.org/click-tracker.cgi?target=http%3a//www.example.net/</quote>.
3031 Sometimes, there are even multiple consecutive redirects encoded in the
3032 URL. These redirections via scripts make your web browsing more traceable,
3033 since the server from which you follow such a link can see where you go
to. Apart from that, valuable bandwidth and time are wasted, while your
3035 browser asks the server for one redirect after the other. Plus, it feeds
3039 This feature is currently not very smart and is scheduled for improvement.
3040 If it is enabled by default, you will have to create some exceptions to
3041 this action. It can lead to failures in several ways:
Not every URL with other URLs as parameters is evil.
Some sites offer a real service that requires this information to work.
For example, a validation service needs to know which document to validate.
3047 <literal>fast-redirects</literal> assumes that every URL parameter that
3048 looks like another URL is a redirection target, and will always redirect to
3049 the last one. Most of the time the assumption is correct, but if it isn't,
3050 the user gets redirected anyway.
3053 Another failure occurs if the URL contains other parameters after the URL parameter.
<quote>http://www.example.org/?redirect=http%3a//www.example.net/&amp;foo=bar</quote>
contains the redirection URL <quote>http://www.example.net/</quote>,
followed by another parameter. <literal>fast-redirects</literal> doesn't know that
and will cause a redirect to <quote>http://www.example.net/&amp;foo=bar</quote>.
3059 Depending on the target server configuration, the parameter will be silently ignored
3060 or lead to a <quote>page not found</quote> error. It is possible to fix these redirected
3061 requests with <literal><link linkend="filter-client-headers">filter-client-headers</link></literal>
3062 but it requires a little effort.
3065 To detect a redirection URL, <literal>fast-redirects</literal> only
3066 looks for the string <quote>http://</quote>, either in plain text
3067 (invalid but often used) or encoded as <quote>http%3a//</quote>.
3068 Some sites use their own URL encoding scheme, encrypt the address
of the target server or replace it with a database id. In these cases
3070 <literal>fast-redirects</literal> is fooled and the request reaches the
3071 redirection server where it probably gets logged.
3077 <term>Example usage:</term>
3080 <screen>+fast-redirects{simple-check}</screen>
3083 <screen>+fast-redirects{check-decoded-url}</screen>
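Since exceptions are likely to be needed, a setup might look like the following
sketch. The validator host is hypothetical and stands for any site that
legitimately takes URLs as parameters:
<screen>{+fast-redirects{check-decoded-url}}
/

{-fast-redirects}
validator.example.org</screen>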
3092 <!-- ~~~~~ New section ~~~~~ -->
3093 <sect3 renderas="sect4" id="filter">
3094 <title>filter</title>
3098 <term>Typical use:</term>
3100 <para>Get rid of HTML and JavaScript annoyances, banner advertisements (by size), do fun text replacements, etc.</para>
3105 <term>Effect:</term>
3108 All files of text-based type, most notably HTML and JavaScript, to which this
3109 action applies, are filtered on-the-fly through the specified regular expression
3110 based substitutions. (Note: as of version 3.0.3 plain text documents
3111 are exempted from filtering, because web servers often use the
3112 <literal>text/plain</literal> MIME type for all files whose type they
3120 <!-- boolean, parameterized, Multi-value -->
3122 <para>Parameterized.</para>
3127 <term>Parameter:</term>
3130 The name of a filter, as defined in the <link linkend="filter-file">filter file</link>.
3131 Filters can be defined in one or more files as defined by the
3132 <literal><link linkend="filterfile">filterfile</link></literal>
3133 option in the <link linkend="config">config file</link>.
3134 <filename>default.filter</filename> is the collection of filters
3135 supplied by the developers. Locally defined filters should go
3136 in their own file, such as <filename>user.filter</filename>.
3139 When used in its negative form,
3140 and without parameters, filtering is completely disabled.
3149 For your convenience, there are a number of pre-defined filters available
3150 in the distribution filter file that you can use. See the examples below for
3154 Filtering requires buffering the page content, which may appear to
3155 slow down page rendering since nothing is displayed until all content has
3156 passed the filters. (It does not really take longer, but seems that way
3157 since the page is not incrementally displayed.) This effect will be more
3158 noticeable on slower connections.
This is a very powerful feature, and <quote>rolling your own</quote>
filters requires knowledge of regular expressions and HTML.
3165 The amount of data that can be filtered is limited to the
3166 <literal><link linkend="buffer-limit">buffer-limit</link></literal>
3167 option in the main <link linkend="config">config file</link>. The
default is 4096 KB (4 MB). Once this limit is exceeded, the buffered
3169 data, and all pending data, is passed through unfiltered.
Unsuitable MIME types, such as zipped files, are not filtered at all.
(Again, only text-based types other than plain text are filtered.) Encrypted SSL data
3174 (from HTTPS servers) cannot be filtered either, since this would violate
3175 the integrity of the secure transaction. In some situations it might
3176 be necessary to protect certain text, like source code, from filtering
3177 by defining appropriate <literal>-filter</literal> sections.
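A sketch of such a section, assuming a hypothetical host that serves source code
which should reach the browser untouched:
<screen>{-filter}
cvs.example.org/viewcvs/</screen>
A single filter can be switched off in the same way, e.g. <literal>-filter{fun}</literal>,
while leaving the others active.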
3180 At this time, <application>Privoxy</application> cannot (yet!) uncompress compressed
3181 documents. If you want filtering to work on all documents, even those that
3182 would normally be sent compressed, use the
3183 <literal><link linkend="prevent-compression">prevent-compression</link></literal>
3184 action in conjunction with <literal>filter</literal>.
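For example, the two actions might be combined like this (the domain is a placeholder):
<screen>{+prevent-compression +filter{html-annoyances}}
.example.com</screen>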
3187 Filtering can achieve some of the same effects as the
3188 <literal><link linkend="block">block</link></literal>
3189 action, i.e. it can be used to block ads and banners. But the mechanism
works quite differently. One effective use is to block ad banners
3191 based on their size (see below), since many of these seem to be somewhat
3195 <link linkend="contact">Feedback</link> with suggestions for new or
3196 improved filters is particularly welcome!
The list below has only the names and a one-line description of each
3200 predefined filter. There are <link linkend="predefined-filters">more
3201 verbose explanations</link> of what these filters do in the <link
3202 linkend="filter-file">filter file chapter</link>.
3208 <term>Example usage (with filters from the distribution <filename>default.filter</filename> file).
3209 See <link linkend="PREDEFINED-FILTERS">the Predefined Filters section</link> for
3210 more explanation on each:</term>
3213 <anchor id="filter-js-annoyances">
3214 <screen>+filter{js-annoyances} # Get rid of particularly annoying JavaScript abuse</screen>
3217 <anchor id="filter-js-events">
3218 <screen>+filter{js-events} # Kill all JS event bindings (Radically destructive! Only for extra nasty sites)</screen>
3221 <anchor id="filter-html-annoyances">
3222 <screen>+filter{html-annoyances} # Get rid of particularly annoying HTML abuse</screen>
3225 <anchor id="filter-content-cookies">
3226 <screen>+filter{content-cookies} # Kill cookies that come in the HTML or JS content</screen>
3229 <anchor id="filter-refresh-tags">
3230 <screen>+filter{refresh-tags} # Kill automatic refresh tags (for dial-on-demand setups)</screen>
3233 <anchor id="filter-unsolicited-popups">
3234 <screen>+filter{unsolicited-popups} # Disable only unsolicited pop-up windows</screen>
3237 <anchor id="filter-all-popups">
3238 <screen>+filter{all-popups} # Kill all popups in JavaScript and HTML</screen>
3241 <anchor id="filter-img-reorder">
<screen>+filter{img-reorder} # Reorder attributes in &lt;img&gt; tags to make the banners-by-* filters more effective</screen>
3245 <anchor id="filter-banners-by-size">
3246 <screen>+filter{banners-by-size} # Kill banners by size</screen>
3249 <anchor id="filter-banners-by-link">
3250 <screen>+filter{banners-by-link} # Kill banners by their links to known clicktrackers</screen>
3253 <anchor id="filter-webbugs">
3254 <screen>+filter{webbugs} # Squish WebBugs (1x1 invisible GIFs used for user tracking)</screen>
3257 <anchor id="filter-tiny-textforms">
3258 <screen>+filter{tiny-textforms} # Extend those tiny textareas up to 40x80 and kill the hard wrap</screen>
3261 <anchor id="filter-jumping-windows">
3262 <screen>+filter{jumping-windows} # Prevent windows from resizing and moving themselves</screen>
3265 <anchor id="filter-frameset-borders">
3266 <screen>+filter{frameset-borders} # Give frames a border and make them resizable</screen>
3269 <anchor id="filter-demoronizer">
3270 <screen>+filter{demoronizer} # Fix MS's non-standard use of standard charsets</screen>
3273 <anchor id="filter-shockwave-flash">
3274 <screen>+filter{shockwave-flash} # Kill embedded Shockwave Flash objects</screen>
3277 <anchor id="filter-quicktime-kioskmode">
3278 <screen>+filter{quicktime-kioskmode} # Make Quicktime movies saveable</screen>
3281 <anchor id="filter-fun">
3282 <screen>+filter{fun} # Text replacements for subversive browsing fun!</screen>
3285 <anchor id="filter-crude-parental">
3286 <screen>+filter{crude-parental} # Crude parental filtering (demo only)</screen>
3289 <anchor id="filter-ie-exploits">
3290 <screen>+filter{ie-exploits} # Disable some known Internet Explorer bug exploits</screen>
3298 <!-- ~~~~~ New section ~~~~~ -->
3299 <sect3 renderas="sect4" id="force-text-mode">
3300 <title>force-text-mode</title>
3306 <term>Typical use:</term>
<para>Force <application>Privoxy</application> to treat a document as if it were in some kind of <emphasis>text</emphasis> format.</para>
3313 <term>Effect:</term>
3316 Declares a document as text, even if the <quote>Content-Type:</quote> isn't detected as such.
3323 <!-- Boolean, Parameterized, Multi-value -->
3325 <para>Boolean.</para>
3330 <term>Parameter:</term>
3342 As explained <literal><link linkend="filter">above</link></literal>,
3343 <application>Privoxy</application> tries to only filter files that are
3344 in some kind of text format. The same restrictions apply to
3345 <literal><link linkend="content-type-overwrite">content-type-overwrite</link></literal>.
3346 <literal>force-text-mode</literal> declares a document as text,
3347 without looking at the <quote>Content-Type:</quote> first.
3351 Think twice before activating this action. Filtering binary data
3352 with regular expressions can cause file damage.
3359 <term>Example usage:</term>
3372 <!-- ~~~~~ New section ~~~~~ -->
3373 <sect3 renderas="sect4" id="handle-as-empty-document">
3374 <title>handle-as-empty-document</title>
3380 <term>Typical use:</term>
3382 <para>Mark URLs that should be replaced by empty documents <emphasis>if they get blocked</emphasis></para>
3387 <term>Effect:</term>
3390 This action alone doesn't do anything noticeable. It just marks URLs.
3391 If the <literal><link linkend="block">block</link></literal> action <emphasis>also applies</emphasis>,
3392 the presence or absence of this mark decides whether an HTML <quote>blocked</quote>
3393 page, or an empty document will be sent to the client as a substitute for the blocked content.
3394 The <emphasis>empty</emphasis> document isn't literally empty, but actually contains a single space.
3401 <!-- Boolean, Parameterized, Multi-value -->
3403 <para>Boolean.</para>
3408 <term>Parameter:</term>
3420 Some browsers complain about syntax errors if JavaScript documents
3421 are blocked with <application>Privoxy's</application>
3422 default HTML page; this option can be used to silence them.
3425 The content type for the empty document can be specified with
3426 <literal><link linkend="content-type-overwrite">content-type-overwrite{}</link></literal>,
3427 but usually this isn't necessary.
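Should it be needed, the combination might look like the following sketch. The
content type shown is only an example value, and the pattern is hypothetical:
<screen>{+block +handle-as-empty-document +content-type-overwrite{text/javascript}}
example.org/.*\.js$</screen>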
3433 <term>Example usage:</term>
3436 <screen># Block all documents on example.org that end with ".js",
3437 # but send an empty document instead of the usual HTML message.
3438 {+block +handle-as-empty-document}
3448 <!-- ~~~~~ New section ~~~~~ -->
3449 <sect3 renderas="sect4" id="handle-as-image">
3450 <title>handle-as-image</title>
3454 <term>Typical use:</term>
<para>Mark URLs as belonging to images (so they'll be replaced by images <emphasis>if they get blocked</emphasis>)</para>
3461 <term>Effect:</term>
3464 This action alone doesn't do anything noticeable. It just marks URLs as images.
3465 If the <literal><link linkend="block">block</link></literal> action <emphasis>also applies</emphasis>,
3466 the presence or absence of this mark decides whether an HTML <quote>blocked</quote>
3467 page, or a replacement image (as determined by the <literal><link
3468 linkend="set-image-blocker">set-image-blocker</link></literal> action) will be sent to the
3469 client as a substitute for the blocked content.
3476 <!-- Boolean, Parameterized, Multi-value -->
3478 <para>Boolean.</para>
3483 <term>Parameter:</term>
3495 The below generic example section is actually part of <filename>default.action</filename>.
3496 It marks all URLs with well-known image file name extensions as images and should
Users will probably only want to use the <literal>handle-as-image</literal> action in conjunction with
3501 <literal><link linkend="block">block</link></literal>, to block sources of banners, whose URLs don't
3502 reflect the file type, like in the second example section.
3505 Note that you cannot treat HTML pages as images in most cases. For instance, (in-line) ad
3506 frames require an HTML page to be sent, or they won't display properly.
3507 Forcing <literal>handle-as-image</literal> in this situation will not replace the
3508 ad frame with an image, but lead to error messages.
3514 <term>Example usage (sections):</term>
3517 <screen># Generic image extensions:
3520 /.*\.(gif|jpg|jpeg|png|bmp|ico)$
3522 # These don't look like images, but they're banners and should be
3523 # blocked as images:
3525 {+block +handle-as-image}
3526 some.nasty-banner-server.com/junk.cgi?output=trash
3528 # Banner source! Who cares if they also have non-image content?
3538 <!-- ~~~~~ New section ~~~~~ -->
3539 <sect3 renderas="sect4" id="hide-accept-language">
3540 <title>hide-accept-language</title>
3546 <term>Typical use:</term>
3548 <para>Pretend to use different language settings.</para>
3553 <term>Effect:</term>
3556 Deletes or replaces the <quote>Accept-Language:</quote> HTTP header in client requests.
3563 <!-- Boolean, Parameterized, Multi-value -->
3565 <para>Parameterized.</para>
3570 <term>Parameter:</term>
3573 Keyword: <quote>block</quote>, or any user defined value.
3582 Faking the browser's language settings can be useful to make a
3583 foreign User-Agent set with
3584 <literal><link linkend="hide-user-agent">hide-user-agent</link></literal>
However, some sites with content in different languages check the
<quote>Accept-Language:</quote> header to decide which one to serve by default.
3590 Sometimes it isn't possible to later switch to another language without
3591 changing the <quote>Accept-Language:</quote> header first.
Therefore, it's a good idea to change the
<quote>Accept-Language:</quote> header only to languages you understand,
or to languages that aren't widespread.
3599 Before setting the <quote>Accept-Language:</quote> header
3600 to a rare language, you should consider that it helps to
3601 make your requests unique and thus easier to trace.
3602 If you don't plan to change this header frequently,
3603 you should stick to a common language.
3609 <term>Example usage (section):</term>
3612 <screen># Pretend to use Canadian language settings.
3613 {+hide-accept-language{en-ca} \
3614 +hide-user-agent{Mozilla/5.0 (X11; U; OpenBSD i386; en-CA; rv:1.8.0.4) Gecko/20060628 Firefox/1.5.0.4} \
3624 <!-- ~~~~~ New section ~~~~~ -->
3625 <sect3 renderas="sect4" id="hide-content-disposition">
3626 <title>hide-content-disposition</title>
3632 <term>Typical use:</term>
3634 <para>Prevent download menus for content you prefer to view inside the browser.</para>
3639 <term>Effect:</term>
3642 Deletes or replaces the <quote>Content-Disposition:</quote> HTTP header set by some servers.
3649 <!-- Boolean, Parameterized, Multi-value -->
3651 <para>Parameterized.</para>
3656 <term>Parameter:</term>
3659 Keyword: <quote>block</quote>, or any user defined value.
3668 Some servers set the <quote>Content-Disposition:</quote> HTTP header for
3669 documents they assume you want to save locally before viewing them.
3670 The <quote>Content-Disposition:</quote> header contains the file name
3671 the browser is supposed to use by default.
3674 In most browsers that understand this header, it makes it impossible to
3675 <emphasis>just view</emphasis> the document, without downloading it first,
3676 even if it's just a simple text file or an image.
3679 Removing the <quote>Content-Disposition:</quote> header helps
3680 to prevent this annoyance, but some browsers additionally check the
3681 <quote>Content-Type:</quote> header, before they decide if they can
3682 display a document without saving it first. In these cases, you have
3683 to change this header as well, before the browser stops displaying
3687 It is also possible to change the server's file name suggestion
3688 to another one, but in most cases it isn't worth the time to set
3695 <term>Example usage:</term>
<screen># Disarm the download link in Sourceforge's patch tracker
{ \
 +content-type-overwrite {text/plain}\
 +hide-content-disposition {block} }
.sourceforge.net/tracker/download.php</screen>
3710 <!-- ~~~~~ New section ~~~~~ -->
3711 <sect3 renderas="sect4" id="hide-if-modified-since">
3712 <title>hide-if-modified-since</title>
3718 <term>Typical use:</term>
3720 <para>Prevent yet another way to track the user's steps between sessions.</para>
3725 <term>Effect:</term>
3728 Deletes the <quote>If-Modified-Since:</quote> HTTP client header or modifies its value.
3735 <!-- Boolean, Parameterized, Multi-value -->
3737 <para>Parameterized.</para>
3742 <term>Parameter:</term>
3745 Keyword: <quote>block</quote>, or a user defined value that specifies a range of hours.
3754 Removing this header is useful for filter testing, where you want to force a real
3755 reload instead of getting status code <quote>304</quote>, which would cause the
3756 browser to use a cached copy of the page.
Instead of removing the header, <literal>hide-if-modified-since</literal> can
also add or subtract a random amount of time to/from the header's value.
You specify a range of hours from which the random factor is chosen, and
<application>Privoxy</application> does the rest. A negative value means
subtracting, a positive value adding.
Randomizing the value of the <quote>If-Modified-Since:</quote> header makes
sure it isn't used as a cookie replacement, but you will run into
caching problems if the random range is too large.
3771 It is a good idea to only use a small negative value and let
3772 <literal><link linkend="overwrite-last-modified">overwrite-last-modified</link></literal>
3773 handle the greater changes.
3776 It is also recommended to use this action together with
3777 <literal><link linkend="crunch-if-none-match">crunch-if-none-match</link></literal>.
3783 <term>Example usage (section):</term>
3786 <screen># Let the browser revalidate without being tracked across sessions
3787 {+hide-if-modified-since {-1}\
3788 +overwrite-last-modified {randomize}\
3789 +crunch-if-none-match}
3798 <!-- ~~~~~ New section ~~~~~ -->
3799 <sect3 renderas="sect4" id="hide-forwarded-for-headers">
3800 <title>hide-forwarded-for-headers</title>
3806 <term>Typical use:</term>
3808 <para>Improve privacy by hiding the true source of the request</para>
3813 <term>Effect:</term>
3816 Deletes any existing <quote>X-Forwarded-for:</quote> HTTP header from client requests,
3817 and prevents adding a new one.
3824 <!-- Boolean, Parameterized, Multi-value -->
3826 <para>Boolean.</para>
3831 <term>Parameter:</term>
3843 It is fairly safe to leave this on.
3846 This action is scheduled for improvement: It should be able to generate forged
3847 <quote>X-Forwarded-for:</quote> headers using random IP addresses from a specified network,
3848 to make successive requests from the same client look like requests from a pool of different
3849 users sharing the same proxy.
3855 <term>Example usage:</term>
3858 <screen>+hide-forwarded-for-headers</screen>
3866 <!-- ~~~~~ New section ~~~~~ -->
3867 <sect3 renderas="sect4" id="hide-from-header">
3868 <title>hide-from-header</title>
3872 <term>Typical use:</term>
3874 <para>Keep your (old and ill) browser from telling web servers your email address</para>
3879 <term>Effect:</term>
3882 Deletes any existing <quote>From:</quote> HTTP header, or replaces it with the
3890 <!-- Boolean, Parameterized, Multi-value -->
3892 <para>Parameterized.</para>
3897 <term>Parameter:</term>
3900 Keyword: <quote>block</quote>, or any user defined value.
3909 The keyword <quote>block</quote> will completely remove the header
3910 (not to be confused with the <literal><link linkend="block">block</link></literal>
Alternatively, you can specify any value you prefer to be sent to the web
3915 server. If you do, it is a matter of fairness not to use any address that
3916 is actually used by a real person.
3919 This action is rarely needed, as modern web browsers don't send
3920 <quote>From:</quote> headers anymore.
3926 <term>Example usage:</term>
3929 <screen>+hide-from-header{block}</screen> or
3930 <screen>+hide-from-header{spam-me-senseless@sittingduck.example.com}</screen>
3938 <!-- ~~~~~ New section ~~~~~ -->
3939 <sect3 renderas="sect4" id="hide-referrer">
3940 <title>hide-referrer</title>
3941 <anchor id="hide-referer">
3944 <term>Typical use:</term>
3946 <para>Conceal which link you followed to get to a particular site</para>
3951 <term>Effect:</term>
3954 Deletes the <quote>Referer:</quote> (sic) HTTP header from the client request,
3955 or replaces it with a forged one.
3962 <!-- Boolean, Parameterized, Multi-value -->
3964 <para>Parameterized.</para>
3969 <term>Parameter:</term>
3973 <para><quote>conditional-block</quote> to delete the header completely if the host has changed.</para>
3976 <para><quote>block</quote> to delete the header unconditionally.</para>
3979 <para><quote>forge</quote> to pretend to be coming from the homepage of the server we are talking to.</para>
3982 <para>Any other string to set a user defined referrer.</para>
<literal>conditional-block</literal> is the only parameter
that isn't easily detected in the server's log file. If it blocks the
3994 referrer, the request will look like the visitor used a bookmark or
3995 typed in the address directly.
3998 Leaving the referrer unmodified for requests on the same host
3999 allows the server owner to see the visitor's <quote>click path</quote>,
4000 but in most cases she could also get that information by comparing
4001 other parts of the log file: for example the User-Agent if it isn't
4002 a very common one, or the user's IP address if it doesn't change between
4006 Always blocking the referrer, or using a custom one, can lead to
4007 failures on servers that check the referrer before they answer any
4008 requests, in an attempt to prevent their valuable content from being
4009 embedded or linked to elsewhere.
4012 Both <literal>conditional-block</literal> and <literal>forge</literal>
4013 will work with referrer checks, as long as content and valid referring page
4014 are on the same host. Most of the time that's the case.
<literal>hide-referer</literal> is an alternate spelling of
<literal>hide-referrer</literal>, and the two can be freely
substituted for each other. (<quote>referrer</quote> is the
correct English spelling; however, the HTTP specification has a bug - it
requires it to be spelled as <quote>referer</quote>.)
4027 <term>Example usage:</term>
4030 <screen>+hide-referrer{forge}</screen> or
4031 <screen>+hide-referrer{http://www.yahoo.com/}</screen>
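As a further sketch, the following sections apply the advice above: they use
<literal>conditional-block</literal> everywhere and switch to <literal>forge</literal>
for one site that rejects requests arriving without a referrer from its own host.
The host name is a made-up placeholder, and the <quote>/</quote> pattern is assumed
to match all URLs; adapt both to your own setup.
<screen># Assumed default: drop the referrer only when the host changes
{+hide-referrer{conditional-block}}
/

# Hypothetical site that checks for a same-site referrer
{+hide-referrer{forge}}
.picky-image-host.example.com</screen>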
4039 <!-- ~~~~~ New section ~~~~~ -->
4040 <sect3 renderas="sect4" id="hide-user-agent">
4041 <title>hide-user-agent</title>
4045 <term>Typical use:</term>
4047 <para>Conceal your type of browser and client operating system</para>
4052 <term>Effect:</term>
4055 Replaces the value of the <quote>User-Agent:</quote> HTTP header
4056 in client requests with the specified value.
4063 <!-- Boolean, Parameterized, Multi-value -->
4065 <para>Parameterized.</para>
4070 <term>Parameter:</term>
4073 Any user-defined string.
4083 This can lead to problems on web sites that depend on looking at this header in
4084 order to customize their content for different browsers (which, by the
4085 way, is <emphasis>NOT</emphasis> the right thing to do: good web sites
4086 work browser-independently).
4088 <ulink url="http://www.javascriptkit.com/javaindex.shtml">smart way to do
4094 Using this action in multi-user setups or wherever different types of
4095 browsers will access the same <application>Privoxy</application> is
4096 <emphasis>not recommended</emphasis>. In single-user, single-browser
4097 setups, you might use it to delete your OS version information from
4098 the headers, because it is an invitation to exploit known bugs for your
4099 OS. It is also occasionally useful to forge this in order to access
4100 sites that won't let you in otherwise (though there may be a good
4101 reason in some cases). Example of this: some MSN sites will not
4102 let <application>Mozilla</application> enter, yet forging to a
4103 <application>Netscape 6.1</application> user-agent works just fine.
4104 (Must be just a silly MS goof, I'm sure :-).
4107 This action is scheduled for improvement.
4113 <term>Example usage:</term>
4116 <screen>+hide-user-agent{Netscape 6.1 (X11; I; Linux 2.4.18 i686)}</screen>
4124 <!-- ~~~~~ New section ~~~~~ -->
4125 <sect3 renderas="sect4" id="inspect-jpegs">
4126 <title>inspect-jpegs</title>
4132 <term>Typical use:</term>
4134 <para>To protect against the MS buffer over-run in JPEG processing</para>
4139 <term>Effect:</term>
4142 To protect against a known exploit
4149 <!-- Boolean, Parameterized, Multi-value -->
4151 <para>Boolean.</para>
4156 <term>Parameter:</term>
4168 See Microsoft Security Bulletin MS04-028. JPEG images are one of the most
4169 common image types found across the Internet. The exploit as described can
4170 allow execution of code on the target system, giving an attacker access
4171 to the system in question by merely planting an altered JPEG image, which
4172 would have no obvious indications of what lurks inside. This action
4173 prevents unwanted intrusion.
4180 <term>Example usage:</term>
4182 <para><screen>+inspect-jpegs</screen></para>
4191 <!-- ~~~~~ New section ~~~~~ -->
4192 <sect3 renderas="sect4" id="kill-popups">
4193 <title>kill-popups<anchor id="kill-popup"></title>
4197 <term>Typical use:</term>
4199 <para>Eliminate those annoying pop-up windows (deprecated)</para>
4204 <term>Effect:</term>
4207 While loading the document, replace JavaScript code that opens
4208 pop-up windows with (syntactically neutral) dummy code on the fly.
4215 <!-- Boolean, Parameterized, Multi-value -->
4217 <para>Boolean.</para>
4222 <term>Parameter:</term>
4234 This action is basically a built-in, hardwired special-purpose filter
4235 action, but there are important differences: For <literal>kill-popups</literal>,
4236 the document need not be buffered, so it can be incrementally rendered while
4237 downloading. But <literal>kill-popups</literal> doesn't catch as many pop-ups as
4239 linkend="FILTER-ALL-POPUPS">filter{<replaceable>all-popups</replaceable>}</link></literal>
4240 does and is not as smart as <literal><link
4241 linkend="FILTER-UNSOLICITED-POPUPS">filter{<replaceable>unsolicited-popups</replaceable>}</link>
4245 Think of it as a fast and efficient replacement for a filter that you
4246 can use if you don't want any filtering at all. Note that it doesn't make
4247 sense to combine it with any <literal><link linkend="filter">filter</link></literal> action,
4248 since as soon as one <literal><link linkend="filter">filter</link></literal> applies,
4249 the whole document needs to be buffered anyway, which destroys the advantage of
4250 the <literal>kill-popups</literal> action over its filter equivalent.
4253 Killing all pop-ups unconditionally is problematic. Many shops and banks rely on
pop-ups to display forms, shopping carts, etc., and the <literal><link
4255 linkend="FILTER-UNSOLICITED-POPUPS">filter{<replaceable>unsolicited-popups</replaceable>}</link>
4256 </literal> does a fairly good job of catching only the unwanted ones.
4259 If the only kind of pop-ups that you want to kill are exit consoles (those
<emphasis>really nasty</emphasis> windows that appear when you close another
4261 one), you might want to use
4263 linkend="filter">filter</link>{<replaceable>js-annoyances</replaceable>}</literal>
4269 An alternate spelling is <literal>+kill-popup</literal>, which is
4277 <term>Example usage:</term>
4279 <para><screen>+kill-popups</screen></para>
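<para>
 A minimal sketch of how the trade-offs described above might be expressed in an
 actions file: <literal>kill-popups</literal> serves as a cheap default, and the
 smarter filter takes over on a site that needs its pop-ups. The host name is a
 hypothetical placeholder, and the <quote>/</quote> pattern is assumed to match all URLs.
 <screen># Assumed default: no filtering, so the unbuffered kill-popups is enough
{+kill-popups}
/

# Hypothetical shop that relies on pop-ups: use the filter instead
{-kill-popups +filter{unsolicited-popups}}
.shop.example.com</screen>
</para>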
4286 <!-- ~~~~~ New section ~~~~~ -->
4287 <sect3 renderas="sect4" id="limit-connect">
4288 <title>limit-connect</title>
4292 <term>Typical use:</term>
4294 <para>Prevent abuse of <application>Privoxy</application> as a TCP proxy relay or disable SSL for untrusted sites</para>
4299 <term>Effect:</term>
Specifies the ports to which HTTP CONNECT requests are allowed.
4309 <!-- Boolean, Parameterized, Multi-value -->
4311 <para>Parameterized.</para>
4316 <term>Parameter:</term>
4319 A comma-separated list of ports or port ranges (the latter using dashes, with the minimum
4320 defaulting to 0 and the maximum to 65K).
4329 By default, i.e. if no <literal>limit-connect</literal> action applies,
4330 <application>Privoxy</application> only allows HTTP CONNECT
4331 requests to port 443 (the standard, secure HTTPS port). Use
4332 <literal>limit-connect</literal> if more fine-grained control is desired
4333 for some or all destinations.
The CONNECT method exists in HTTP to allow access to secure websites
4337 (<quote>https://</quote> URLs) through proxies. It works very simply:
4338 the proxy connects to the server on the specified port, and then
4339 short-circuits its connections to the client and to the remote server.
4340 This can be a big security hole, since CONNECT-enabled proxies can be
4341 abused as TCP relays very easily.
4344 <application>Privoxy</application> relays HTTPS traffic without seeing
4345 the decoded content. Websites can leverage this limitation to circumvent Privoxy's
4346 filters. By specifying an invalid port range you can disable HTTPS entirely.
4347 If you plan to disable SSL by default, consider enabling
<literal><link linkend="treat-forbidden-connects-like-blocks">treat-forbidden-connects-like-blocks</link></literal>
4349 as well, to be able to quickly create exceptions.
4355 <term>Example usages:</term>
4357 <!-- I had trouble getting the spacing to look right in my browser -->
4358 <!-- I probably have the wrong font setup, bollocks. -->
4359 <!-- Apparently the emphasis tag uses a proportional font no matter what -->
4361 <screen>+limit-connect{443} # This is the default and need not be specified.
4362 +limit-connect{80,443} # Ports 80 and 443 are OK.
4363 +limit-connect{-3, 7, 20-100, 500-} # Ports less than 3, 7, 20 to 100 and above 500 are OK.
4364 +limit-connect{-} # All ports are OK
4365 +limit-connect{,} # No HTTPS traffic is allowed</screen>
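The following sketch combines the advice above: CONNECT is refused by default,
forbidden connects are reported like ordinary blocks, and one trusted site gets
HTTPS back. The host name is a hypothetical placeholder, and the <quote>/</quote>
pattern is assumed to match all URLs.
<screen># Assumed default: no CONNECT requests at all, reported like ordinary blocks
{+limit-connect{,} +treat-forbidden-connects-like-blocks}
/

# Hypothetical exception: allow HTTPS on the standard port for a trusted site
{+limit-connect{443}}
.bank.example.com</screen>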
4372 <!-- ~~~~~ New section ~~~~~ -->
4373 <sect3 renderas="sect4" id="prevent-compression">
4374 <title>prevent-compression</title>
4378 <term>Typical use:</term>
4381 Ensure that servers send the content uncompressed, so it can be
4382 passed through <literal><link linkend="filter">filter</link></literal>s.
4388 <term>Effect:</term>
4391 Removes the Accept-Encoding header which can be used to ask for compressed transfer.
4398 <!-- Boolean, Parameterized, Multi-value -->
4400 <para>Boolean.</para>
4405 <term>Parameter:</term>
4417 More and more websites send their content compressed by default, which
4418 is generally a good idea and saves bandwidth. But for the <literal><link
4419 linkend="filter">filter</link></literal>, <literal><link linkend="deanimate-gifs">deanimate-gifs</link></literal>
4420 and <literal><link linkend="kill-popups">kill-popups</link></literal> actions to work,
4421 <application>Privoxy</application> needs access to the uncompressed data.
4422 Unfortunately, <application>Privoxy</application> can't yet(!) uncompress, filter, and
4423 re-compress the content on the fly. So if you want to ensure that all websites, including
4424 those that normally compress, can be filtered, you need to use this action.
4427 This will slow down transfers from those websites, though. If you use any of the above-mentioned
4428 actions, you will typically want to use <literal>prevent-compression</literal> in conjunction
4432 Note that some (rare) ill-configured sites don't handle requests for uncompressed
4433 documents correctly (they send an empty document body). If you use <literal>prevent-compression</literal>
by default, you'll have to add exceptions for those sites. See the example for how to do that.
4440 <term>Example usage (sections):</term>
<screen># Set default:
{+prevent-compression}
/ # Match all URLs
4448 # Make exceptions for ill sites:
4450 {-prevent-compression}
4452 www.pclinuxonline.com</screen>
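Since <literal>prevent-compression</literal> only matters where
<application>Privoxy</application> needs to read the content, it can also be
enabled just for the sites you filter. A minimal sketch, with a hypothetical host name:
<screen># Filter one site and make sure its content arrives uncompressed
{+filter{js-annoyances} +prevent-compression}
www.filtered-site.example.com</screen>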
4461 <!-- ~~~~~ New section ~~~~~ -->
4462 <sect3 renderas="sect4" id="overwrite-last-modified">
4463 <title>overwrite-last-modified</title>
4469 <term>Typical use:</term>
4471 <para>Prevent yet another way to track the user's steps between sessions.</para>
4476 <term>Effect:</term>
4479 Deletes the <quote>Last-Modified:</quote> HTTP server header or modifies its value.
4486 <!-- Boolean, Parameterized, Multi-value -->
4488 <para>Parameterized.</para>
4493 <term>Parameter:</term>
4496 One of the keywords: <quote>block</quote>, <quote>reset-to-request-time</quote>
4497 and <quote>randomize</quote>
4506 Removing the <quote>Last-Modified:</quote> header is useful for filter
4507 testing, where you want to force a real reload instead of getting status
4508 code <quote>304</quote>, which would cause the browser to reuse the old
4509 version of the page.
4512 The <quote>randomize</quote> option overwrites the value of the
4513 <quote>Last-Modified:</quote> header with a randomly chosen time
4514 between the original value and the current time. In theory the server
4515 could send each document with a different <quote>Last-Modified:</quote>
header to track visits without using cookies. <quote>Randomize</quote>
makes that impossible, while the browser can still revalidate cached documents.
4520 <quote>reset-to-request-time</quote> overwrites the value of the
4521 <quote>Last-Modified:</quote> header with the current time. You could use
4522 this option together with
<literal><link linkend="hide-if-modified-since">hide-if-modified-since</link></literal>
4524 to further customize your random range.
4527 The preferred parameter here is <quote>randomize</quote>. It is safe
4528 to use, as long as the time settings are more or less correct.
4529 If the server sets the <quote>Last-Modified:</quote> header to the time
4530 of the request, the random range becomes zero and the value stays the same.
4531 Therefore you should later randomize it a second time with
<literal><link linkend="hide-if-modified-since">hide-if-modified-since</link></literal>,
4536 It is also recommended to use this action together with
4537 <literal><link linkend="crunch-if-none-match">crunch-if-none-match</link></literal>.
4543 <term>Example usage:</term>
4546 <screen># Let the browser revalidate without being tracked across sessions
4547 {+hide-if-modified-since {-1}\
4548 +overwrite-last-modified {randomize}\
4549 +crunch-if-none-match}
4558 <!-- ~~~~~ New section ~~~~~ -->
4559 <sect3 renderas="sect4" id="redirect">
4560 <title>redirect</title>
4566 <term>Typical use:</term>
4569 Redirect requests to other sites.
4575 <term>Effect:</term>
4578 Convinces the browser that the requested document has been moved
4579 to another location and the browser should get it from there.
4586 <!-- Boolean, Parameterized, Multi-value -->
<para>Parameterized.</para>
4593 <term>Parameter:</term>
4605 This action is useful to replace whole documents with your own
4606 ones. For that to work, they have to be available on another server.
4609 You can do the same by combining the actions
4610 <literal><link linkend="block">block</link></literal>,
4611 <literal><link linkend="handle-as-image">handle-as-image</link></literal> and
4612 <literal><link linkend="set-image-blocker">set-image-blocker{URL}</link></literal>.
4613 It doesn't sound right for non-image documents, and that's why this action
4617 This action will be ignored if you use it together with
4618 <literal><link linkend="block">block</link></literal>.
4624 <term>Example usage:</term>
4627 <screen># Replace example.com's style sheet with another one
4628 {+redirect{http://localhost/css-replacements/example.com.css}}
4629 example.com/stylesheet.css</screen>
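For comparison, the image-based alternative mentioned above might look like the
following sketch. The banner host and the replacement URL are hypothetical placeholders.
<screen># Replace a blocked banner with an image of your own
{+block +handle-as-image +set-image-blocker{http://localhost/replacements/banner.png}}
ads.example.com/banners/</screen>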
4638 <!-- ~~~~~ New section ~~~~~ -->
4639 <sect3 renderas="sect4" id="send-vanilla-wafer">
4640 <title>send-vanilla-wafer</title>
4644 <term>Typical use:</term>
4647 Feed log analysis scripts with useless data.
4653 <term>Effect:</term>
4656 Sends a cookie with each request stating that you do not accept any copyright
4657 on cookies sent to you, and asking the site operator not to track you.
4664 <!-- Boolean, Parameterized, Multi-value -->
4666 <para>Boolean.</para>
4671 <term>Parameter:</term>
4683 The vanilla wafer is a (relatively) unique header and could conceivably be used to track you.
4686 This action is rarely used and not enabled in the default configuration.
4692 <term>Example usage:</term>
4695 <screen>+send-vanilla-wafer</screen>
4704 <!-- ~~~~~ New section ~~~~~ -->
4705 <sect3 renderas="sect4" id="send-wafer">
4706 <title>send-wafer</title>
4710 <term>Typical use:</term>
4713 Send custom cookies or feed log analysis scripts with even more useless data.
4719 <term>Effect:</term>
4722 Sends a custom, user-defined cookie with each request.
4729 <!-- Boolean, Parameterized, Multi-value -->
4731 <para>Multi-value.</para>
4736 <term>Parameter:</term>
4739 A string of the form <quote><replaceable class="option">name</replaceable>=<replaceable
4740 class="parameter">value</replaceable></quote>.
4749 Being multi-valued, multiple instances of this action can apply to the same request,
4750 resulting in multiple cookies being sent.
4753 This action is rarely used and not enabled in the default configuration.
4758 <term>Example usage (section):</term>
4761 <screen>{+send-wafer{UsingPrivoxy=true}}
4762 my-internal-testing-server.void</screen>
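Because the action is multi-valued, it can be repeated within one section. A short
sketch, in which the second cookie name and value are made up for illustration:
<screen># Both cookies are sent with each matching request
{+send-wafer{UsingPrivoxy=true} +send-wafer{Mood=curious}}
my-internal-testing-server.void</screen>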
4770 <!-- ~~~~~ New section ~~~~~ -->
4771 <sect3 renderas="sect4" id="session-cookies-only">
4772 <title>session-cookies-only</title>
4776 <term>Typical use:</term>
4779 Allow only temporary <quote>session</quote> cookies (for the current
4780 browser session <emphasis>only</emphasis>).
4786 <term>Effect:</term>
4789 Deletes the <quote>expires</quote> field from <quote>Set-Cookie:</quote>
4790 server headers. Most browsers will not store such cookies permanently and
4791 forget them in between sessions.
4798 <!-- Boolean, Parameterized, Multi-value -->
4800 <para>Boolean.</para>
4805 <term>Parameter:</term>
4817 This is less strict than <literal><link linkend="crunch-incoming-cookies">crunch-incoming-cookies</link></literal> /
4818 <literal><link linkend="crunch-outgoing-cookies">crunch-outgoing-cookies</link></literal> and allows you to browse
4819 websites that insist or rely on setting cookies, without compromising your privacy too badly.
4822 Most browsers will not permanently store cookies that have been processed by
4823 <literal>session-cookies-only</literal> and will forget about them between sessions.
4824 This makes profiling cookies useless, but won't break sites which require cookies so
4825 that you can log in for transactions. This is generally turned on for all
4826 sites, and is the recommended setting.
4829 It makes <emphasis>no sense at all</emphasis> to use <literal>session-cookies-only</literal>
4830 together with <literal><link linkend="crunch-incoming-cookies">crunch-incoming-cookies</link></literal> or
4831 <literal><link linkend="crunch-outgoing-cookies">crunch-outgoing-cookies</link></literal>. If you do, cookies
4832 will be plainly killed.
4835 Note that it is up to the browser how it handles such cookies without an <quote>expires</quote>
4836 field. If you use an exotic browser, you might want to try it out to be sure.
4839 This setting also has no effect on cookies that may have been stored
4840 previously by the browser before starting <application>Privoxy</application>.
4841 These would have to be removed manually.
4844 <application>Privoxy</application> also uses
4845 the <link linkend="filter-content-cookies">content-cookies filter</link>
to block some types of cookies. Content cookies are not affected by
4847 <literal>session-cookies-only</literal>.
4853 <term>Example usage:</term>
4856 <screen>+session-cookies-only</screen>
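A sketch of a section-based setup that follows the recommendation above and leaves
room for an exception. The host name is a hypothetical placeholder, and the
<quote>/</quote> pattern is assumed to match all URLs.
<screen># Assumed default: cookies last only for the current browser session
{+session-cookies-only}
/

# Hypothetical exception: a trusted site may keep its cookies between sessions
{-session-cookies-only}
.forum.example.org</screen>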
4864 <!-- ~~~~~ New section ~~~~~ -->
4865 <sect3 renderas="sect4" id="set-image-blocker">
4866 <title>set-image-blocker</title>
4870 <term>Typical use:</term>
4872 <para>Choose the replacement for blocked images</para>
4877 <term>Effect:</term>
4880 This action alone doesn't do anything noticeable. If <emphasis>both</emphasis>
4881 <literal><link linkend="block">block</link></literal> <emphasis>and</emphasis> <literal><link
4882 linkend="handle-as-image">handle-as-image</link></literal> <emphasis>also</emphasis>
4883 apply, i.e. if the request is to be blocked as an image,
4884 <emphasis>then</emphasis> the parameter of this action decides what will be
4885 sent as a replacement.
4892 <!-- Boolean, Parameterized, Multi-value -->
4894 <para>Parameterized.</para>
4899 <term>Parameter:</term>
4904 <quote>pattern</quote> to send a built-in checkerboard pattern image. The image is visually
4905 decent, scales very well, and makes it obvious where banners were busted.
4910 <quote>blank</quote> to send a built-in transparent image. This makes banners disappear
4911 completely, but makes it hard to detect where <application>Privoxy</application> has blocked
4912 images on a given page and complicates troubleshooting if <application>Privoxy</application>
4913 has blocked innocent images, like navigation icons.
4918 <quote><replaceable class="parameter">target-url</replaceable></quote> to
4919 send a redirect to <replaceable class="parameter">target-url</replaceable>. You can redirect
4920 to any image anywhere, even in your local filesystem via <quote>file:///</quote> URL.
(But note that not all browsers support redirecting to a local file system.)
4924 A good application of redirects is to use special <application>Privoxy</application>-built-in
4925 URLs, which send the built-in images, as <replaceable class="parameter">target-url</replaceable>.
4926 This has the same visual effect as specifying <quote>blank</quote> or <quote>pattern</quote> in
4927 the first place, but enables your browser to cache the replacement image, instead of requesting
4928 it over and over again.
4939 The URLs for the built-in images are <quote>http://config.privoxy.org/send-banner?type=<replaceable
4940 class="parameter">type</replaceable></quote>, where <replaceable class="parameter">type</replaceable> is
4941 either <quote>blank</quote> or <quote>pattern</quote>.
4944 There is a third (advanced) type, called <quote>auto</quote>. It is <emphasis>NOT</emphasis> to be
4945 used in <literal>set-image-blocker</literal>, but meant for use from <link linkend="filter-file">filters</link>.
4946 Auto will select the type of image that would have applied to the referring page, had it been an image.
4952 <term>Example usage:</term>
4958 <screen>+set-image-blocker{pattern}</screen>
4961 Redirect to the BSD devil:
4964 <screen>+set-image-blocker{http://www.freebsd.org/gifs/dae_up3.gif}</screen>
4967 Redirect to the built-in pattern for better caching:
4970 <screen>+set-image-blocker{http://config.privoxy.org/send-banner?type=pattern}</screen>
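Since the parameter only takes effect for requests that are blocked as images, a
complete setup needs all three actions. A minimal sketch with a hypothetical banner
host, assuming the <quote>/</quote> pattern matches all URLs:
<screen># Choose the replacement image once, for all URLs
{+set-image-blocker{pattern}}
/

# Hypothetical banner server: blocked as images, so the pattern is sent
{+block +handle-as-image}
banners.example.com</screen>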
4978 <!-- ~~~~~ New section ~~~~~ -->
4979 <sect3 renderas="sect4" id="treat-forbidden-connects-like-blocks">
4980 <title>treat-forbidden-connects-like-blocks</title>
4986 <term>Typical use:</term>
<para>Block forbidden connects with an easy-to-find error message.</para>
4993 <term>Effect:</term>
If this action is enabled, <application>Privoxy</application> no longer
distinguishes between forbidden connects and ordinary blocks.
5004 <!-- Boolean, Parameterized, Multi-value -->
<para>Boolean.</para>
5011 <term>Parameter:</term>
5021 By default <application>Privoxy</application> answers
5022 <link linkend="limit-connect">forbidden <quote>Connect</quote> requests</link>
5023 with a short error message inside the headers. If the browser doesn't display
5024 headers (most don't), you just see an empty page.
5027 With this action enabled, <application>Privoxy</application> displays
5028 the message that is used for ordinary blocks instead. If you decide
5029 to make an exception for the page in question, you can do so by
5030 following the <quote>See why</quote> link.
5033 For <quote>Connect</quote> requests the clients tell
5034 <application>Privoxy</application> which host they are interested
5035 in, but not which document they plan to get later. As a result, the
5036 <quote>Go there anyway</quote> link becomes rather useless:
5037 it lets the client request the home page of the forbidden host
5038 through unencrypted HTTP, still using the port of the last request.
5041 If you previously configured <application>Privoxy</application> to do the
5042 request through an SSL tunnel, everything will work. Most likely you haven't
5043 and the server will respond with an error message because it is expecting
5050 <term>Example usage:</term>
5053 <screen>+treat-forbidden-connects-like-blocks</screen>
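<para>
 Forbidden connects only occur where
 <literal><link linkend="limit-connect">limit-connect</link></literal>
 restricts the allowed ports, so the two actions are often seen together.
 A minimal sketch, assuming you want to allow tunnels to port 443 only
 (the port choice is an illustration, not a recommendation):
</para>
<para>
 <screen># Only allow CONNECT to port 443, and answer refused requests with
# the ordinary block page instead of a bare header message:
{ +limit-connect{443} +treat-forbidden-connects-like-blocks }
/</screen>
</para>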
5061 <!-- ~~~~~ New section ~~~~~ -->
5063 <title>Summary</title>
5065 Note that many of these actions have the potential to cause a page to
5066 misbehave, possibly even not to display at all. There are many ways
5067 a site designer may choose to design his site, and what HTTP header
5068 content, and other criteria, he may depend on. There is no way to have hard
5069 and fast rules for all sites. See the <link
5070 linkend="ACTIONSANAT">Appendix</link> for a brief example on troubleshooting
5076 <!-- ~~~~~ New section ~~~~~ -->
5077 <sect2 id="aliases">
5078 <title>Aliases</title>
5080 Custom <quote>actions</quote>, known to <application>Privoxy</application>
5081 as <quote>aliases</quote>, can be defined by combining other actions.
5082 These can in turn be invoked just like the built-in actions.
5083 Currently, an alias name can contain any character except space, tab,
5085 <quote>{</quote> and <quote>}</quote>, but we <emphasis>strongly
5086 recommend</emphasis> that you only use <quote>a</quote> to <quote>z</quote>,
5087 <quote>0</quote> to <quote>9</quote>, <quote>+</quote>, and <quote>-</quote>.
5088 Alias names are not case sensitive, and are not required to start with a
5089 <quote>+</quote> or <quote>-</quote> sign, since they are merely textually
5093 Aliases can be used throughout the actions file, but they <emphasis>must be
5094 defined in a special section at the top of the file!</emphasis>
5095 And there can only be one such section per actions file. Each actions file may
5096 have its own alias section, and the aliases defined in it are only visible
5100 There are two main reasons to use aliases: One is to save typing for frequently
5101 used combinations of actions, the other one is a gain in flexibility: If you
5102 decide once how you want to handle shops by defining an alias called
5103 <quote>shop</quote>, you can later change your policy on shops in
5104 <emphasis>one</emphasis> place, and your changes will take effect everywhere
5105 in the actions file where the <quote>shop</quote> alias is used. Calling aliases
5106 by their purpose also makes your actions files more readable.
5109 Currently, there is one big drawback to using aliases, though:
5110 <application>Privoxy</application>'s built-in web-based action file
5111 editor honors aliases when reading the actions files, but it expands
5112 them before writing. So the effects of your aliases are of course preserved,
5113 but the aliases themselves are lost when you edit sections that use aliases
5115 This is likely to change in future versions of <application>Privoxy</application>.
5119 Now let's define some aliases...
5124 # Useful custom aliases we can use later.
5126 # Note the (required!) section header line and that this section
5127 # must be at the top of the actions file!
5131 # These aliases just save typing later:
5132 # (Note that some already use other aliases!)
5134 +crunch-all-cookies = +<link linkend="CRUNCH-INCOMING-COOKIES">crunch-incoming-cookies</link> +<link linkend="CRUNCH-OUTGOING-COOKIES">crunch-outgoing-cookies</link>
5135 -crunch-all-cookies = -<link linkend="CRUNCH-INCOMING-COOKIES">crunch-incoming-cookies</link> -<link linkend="CRUNCH-OUTGOING-COOKIES">crunch-outgoing-cookies</link>
5136 block-as-image = +block +handle-as-image
5137 mercy-for-cookies = -crunch-all-cookies -<link linkend="SESSION-COOKIES-ONLY">session-cookies-only</link> -<link linkend="FILTER-CONTENT-COOKIES">filter{content-cookies}</link>
5139 # These aliases define combinations of actions
5140 # that are useful for certain types of sites:
5142 fragile = -<link linkend="BLOCK">block</link> -<link linkend="FILTER">filter</link> -crunch-all-cookies -<link linkend="FAST-REDIRECTS">fast-redirects</link> -<link linkend="HIDE-REFERER">hide-referrer</link> -<link linkend="KILL-POPUPS">kill-popups</link>
5143 shop = -crunch-all-cookies -<link linkend="FILTER-ALL-POPUPS">filter{all-popups}</link> -<link linkend="KILL-POPUPS">kill-popups</link>
5145 # Short names for other aliases, for really lazy people ;-)
5147 c0 = +crunch-all-cookies
5148 c1 = -crunch-all-cookies</screen>
5152 ...and put them to use. These sections would appear in the lower part of an
5153 actions file and define exceptions to the default actions (as specified further
5154 up for the <quote>/</quote> pattern):
5159 # These sites are either very complex or very keen on
5160 # user data and require minimal interference to work:
5163 .office.microsoft.com
5164 .windowsupdate.microsoft.com
5168 # Allow cookies (for setting and retrieving your customer data)
5172 .worldpay.com # for quietpc.com
5175 # These shops require pop-ups:
5177 {shop -kill-popups -filter{all-popups}}
5179 .overclockers.co.uk</screen>
5183 Aliases like <quote>shop</quote> and <quote>fragile</quote> are often used for
5184 <quote>problem</quote> sites that require some actions to be disabled
5185 in order to function properly.
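<para>
 The <quote>change it in one place</quote> benefit mentioned above can be
 sketched as follows. The alias name <literal>news</literal> and the host
 names are made up for this illustration:
</para>
<para>
 <screen>{{alias}}
# Hypothetical policy alias. Editing this single line changes the
# policy for every section further down that uses "news".
news = -crunch-incoming-cookies -crunch-outgoing-cookies

# ... further down in the same actions file:

{ news }
.news.example.com
.magazine.example.org</screen>
</para>
<para>
 If you later decide that these sites should not be filtered either, adding
 <literal>-filter</literal> to the alias definition would be enough; the
 sections that use the alias stay untouched.
</para>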
5191 <!-- ~~~~~ New section ~~~~~ -->
5192 <sect2 id="act-examples">
5193 <title>Actions Files Tutorial</title>
5195 The above chapters have shown <link linkend="actions-file">which actions files
5196 there are and how they are organized</link>, how actions are <link
5197 linkend="actions">specified</link> and <link linkend="actions-apply">applied
5198 to URLs</link>, how <link linkend="af-patterns">patterns</link> work, and how to
5199 define and use <link linkend="aliases">aliases</link>. Now, let's look at an
5200 example <filename>default.action</filename> and <filename>user.action</filename>
5201 file and see how all these pieces come together:
5204 <sect3><title>default.action</title>
5207 Every config file should start with a short comment stating its purpose:
5211 <screen># Sample default.action file <ijbswa-developers@lists.sourceforge.net></screen>
5215 Then, since this is the <filename>default.action</filename> file, the
5216 first section is a special section for internal use that you needn't
5217 change or worry about:
5222 ##########################################################################
5223 # Settings -- Don't change! For internal Privoxy use ONLY.
5224 ##########################################################################
5227 for-privoxy-version=3.0</screen>
5231 After that comes the (optional) alias section. We'll use the example
5232 section from the above <link linkend="aliases">chapter on aliases</link>,
5233 which also explains why and how aliases are used:
5238 ##########################################################################
5240 ##########################################################################
5243 # These aliases just save typing later:
5244 # (Note that some already use other aliases!)
5246 +crunch-all-cookies = +<link linkend="CRUNCH-INCOMING-COOKIES">crunch-incoming-cookies</link> +<link linkend="CRUNCH-OUTGOING-COOKIES">crunch-outgoing-cookies</link>
5247 -crunch-all-cookies = -<link linkend="CRUNCH-INCOMING-COOKIES">crunch-incoming-cookies</link> -<link linkend="CRUNCH-OUTGOING-COOKIES">crunch-outgoing-cookies</link>
5248 block-as-image = +block +handle-as-image
5249 mercy-for-cookies = -crunch-all-cookies -<link linkend="SESSION-COOKIES-ONLY">session-cookies-only</link> -<link linkend="FILTER-CONTENT-COOKIES">filter{content-cookies}</link>
5251 # These aliases define combinations of actions
5252 # that are useful for certain types of sites:
5254 fragile = -<link linkend="BLOCK">block</link> -<link linkend="FILTER">filter</link> -crunch-all-cookies -<link linkend="FAST-REDIRECTS">fast-redirects</link> -<link linkend="HIDE-REFERER">hide-referrer</link> -<link linkend="KILL-POPUPS">kill-popups</link>
5255 shop = -crunch-all-cookies -<link linkend="FILTER-ALL-POPUPS">filter{all-popups}</link> -<link linkend="KILL-POPUPS">kill-popups</link></screen>
5259 Now come the regular sections, i.e. sets of actions, accompanied
5260 by URL patterns to which they apply. Remember <emphasis>all actions
5261 are disabled when matching starts</emphasis>, so we have to explicitly
5262 enable the ones we want.
5266 The first regular section is probably the most important. It has only
5267 one pattern, <quote><literal>/</literal></quote>, but this pattern
5268 <link linkend="af-patterns">matches all URLs</link>. Therefore, the
5269 set of actions used in this <quote>default</quote> section <emphasis>will
5270 be applied to all requests as a start</emphasis>. It can be partly or
5271 wholly overridden by later matches further down this file, or in <filename>user.action</filename>,
5272 but it will still be largely responsible for your overall browsing
5277 Again, at the start of matching, all actions are disabled, so there is
5278 no real need to disable any actions here, but we will do that nonetheless,
5279 to have a complete listing for your reference. (Remember: a <quote>+</quote>
5280 preceding the action name enables the action, a <quote>-</quote> disables!).
5281 Also note how this long line has been made more readable by splitting it into
5282 multiple lines with line continuation.
5287 ##########################################################################
5288 # "Defaults" section:
5289 ##########################################################################
5291 -<link linkend="ADD-HEADER">add-header</link> \
5292 -<link linkend="BLOCK">block</link> \
5293 -<link linkend="CRUNCH-INCOMING-COOKIES">crunch-incoming-cookies</link> \
5294 -<link linkend="CRUNCH-OUTGOING-COOKIES">crunch-outgoing-cookies</link> \
5295 +<link linkend="DEANIMATE-GIFS">deanimate-gifs</link> \
5296 -<link linkend="DOWNGRADE-HTTP-VERSION">downgrade-http-version</link> \
5297 +<link linkend="FAST-REDIRECTS">fast-redirects</link> \
5298 +<link linkend="FILTER-JS-ANNOYANCES">filter{js-annoyances}</link> \
5299 -<link linkend="FILTER-JS-EVENTS">filter{js-events}</link> \
5300 +<link linkend="FILTER-HTML-ANNOYANCES">filter{html-annoyances}</link> \
5301 -<link linkend="FILTER-CONTENT-COOKIES">filter{content-cookies}</link> \
5302 +<link linkend="FILTER-REFRESH-TAGS">filter{refresh-tags}</link> \
5303 +<link linkend="FILTER-UNSOLICITED-POPUPS">filter{unsolicited-popups}</link> \
5304 -<link linkend="FILTER-ALL-POPUPS">filter{all-popups}</link> \
5305 +<link linkend="FILTER-IMG-REORDER">filter{img-reorder}</link> \
5306 +<link linkend="FILTER-BANNERS-BY-SIZE">filter{banners-by-size}</link> \
5307 -<link linkend="FILTER-BANNERS-BY-LINK">filter{banners-by-link}</link> \
5308 +<link linkend="FILTER-WEBBUGS">filter{webbugs}</link> \
5309 -<link linkend="FILTER-TINY-TEXTFORMS">filter{tiny-textforms}</link> \
5310 +<link linkend="FILTER-JUMPING-WINDOWS">filter{jumping-windows}</link> \
5311 -<link linkend="FILTER-FRAMESET-BORDERS">filter{frameset-borders}</link> \
5312 -<link linkend="FILTER-DEMORONIZER">filter{demoronizer}</link> \
5313 -<link linkend="FILTER-SHOCKWAVE-FLASH">filter{shockwave-flash}</link> \
5314 -<link linkend="FILTER-QUICKTIME-KIOSKMODE">filter{quicktime-kioskmode}</link> \
5315 -<link linkend="FILTER-FUN">filter{fun}</link> \
5316 -<link linkend="FILTER-CRUDE-PARENTAL">filter{crude-parental}</link> \
5317 +<link linkend="FILTER-IE-EXPLOITS">filter{ie-exploits}</link> \
5318 -<link linkend="HANDLE-AS-IMAGE">handle-as-image</link> \
5319 +<link linkend="HIDE-FORWARDED-FOR-HEADERS">hide-forwarded-for-headers</link> \
5320 +<link linkend="HIDE-FROM-HEADER">hide-from-header{block}</link> \
5321 +<link linkend="HIDE-REFERER">hide-referrer{forge}</link> \
5322 -<link linkend="HIDE-USER-AGENT">hide-user-agent</link> \
5323 -<link linkend="KILL-POPUPS">kill-popups</link> \
5324 -<link linkend="LIMIT-CONNECT">limit-connect</link> \
5325 +<link linkend="PREVENT-COMPRESSION">prevent-compression</link> \
5326 -<link linkend="SEND-VANILLA-WAFER">send-vanilla-wafer</link> \
5327 -<link linkend="SEND-WAFER">send-wafer</link> \
5328 +<link linkend="SESSION-COOKIES-ONLY">session-cookies-only</link> \
5329 +<link linkend="SET-IMAGE-BLOCKER">set-image-blocker{pattern}</link> \
5331 / # forward slash will match *all* potential URL patterns.</screen>
5335 The default behavior is now set. Note that some actions, like not hiding
5336 the user agent, are part of a <quote>general policy</quote> that applies
5337 universally and won't get any exceptions defined later. Other choices,
5338 like not blocking (which is <emphasis>understandably</emphasis> the
5339 default!) need exceptions, i.e. we need to specify explicitly what we
5340 want to block in later sections.
5344 The first of our specialized sections is concerned with <quote>fragile</quote>
5345 sites, i.e. sites that require minimum interference, because they are either
5346 very complex or very keen on tracking you (and have mechanisms in place that
5347 make them unusable for people who avoid being tracked). We will simply use
5348 our pre-defined <literal>fragile</literal> alias instead of stating the list
5349 of actions explicitly:
5354 ##########################################################################
5355 # Exceptions for sites that'll break under the default action set:
5356 ##########################################################################
5358 # "Fragile" Use a minimum set of actions for these sites (see alias above):
5361 .office.microsoft.com # surprise, surprise!
5362 .windowsupdate.microsoft.com</screen>
5366 Shopping sites are not as fragile, but they typically
5367 require cookies to log in, and pop-up windows for shopping
5368 carts or item details. Again, we'll use a pre-defined alias:
5377 .worldpay.com # for quietpc.com
5379 .scan.co.uk</screen>
5382 <!-- No longer needed BEGIN OF COMMENTED OUT BLOCK
5385 Then, there are sites which rely on pop-up windows (yuck!) to work.
5386 Since we made pop-up-killing our default above, we need to make exceptions
5387 now. <ulink url="http://www.mozilla.org/">Mozilla</ulink> users, who
5388 can turn on smart handling of unwanted pop-ups in their browsers, can
5390 -<literal><link linkend="FILTER-ALL-POPUPS">filter{popups}</link></literal> (and
5391 -<literal><link linkend="KILL-POPUPS">kill-popups</link></literal>) above
5392 and hence don't need this section. Anyway, disabling an already disabled
5393 action doesn't hurt, so we'll define our exceptions regardless of what was
5394 chosen in the defaults section:
5399 # These sites require pop-ups too :(
5401 { -<link linkend="KILL-POPUPS">kill-popups</link> -<link linkend="FILTER-ALL-POPUPS">filter{popups}</link> }
5404 .deutsche-bank-24.de</screen>
5407 END OF COMMENTED OUT BLOCK -->
5410 The <literal><link linkend="FAST-REDIRECTS">fast-redirects</link></literal>
5411 action, which we enabled by default above, breaks some sites. So disable
5412 it for popular sites where we know it misbehaves:
5417 { -<link linkend="FAST-REDIRECTS">fast-redirects</link> }
5421 .altavista.com/.*(like|url|link):http
5422 .altavista.com/trans.*urltext=http
5423 .nytimes.com</screen>
5427 It is important that <application>Privoxy</application> knows which
5428 URLs belong to images, so that <emphasis>if</emphasis> they are to
5429 be blocked, a substitute image can be sent, rather than an HTML page.
5430 Contacting the remote site to find out is not an option, since it
5431 would destroy the loading time advantage of banner blocking, and it
5432 would feed the advertisers (in terms of money <emphasis>and</emphasis>
5433 information). We can mark any URL as an image with the <literal><link
5434 linkend="handle-as-image">handle-as-image</link></literal> action,
5435 and marking all URLs that end in a known image file extension is a
5441 ##########################################################################
5443 ##########################################################################
5445 # Define which file types will be treated as images, in case they get
5446 # blocked further down this file:
5448 { +<link linkend="HANDLE-AS-IMAGE">handle-as-image</link> }
5449 /.*\.(gif|jpe?g|png|bmp|ico)$</screen>
5453 And then there are known banner sources. They often use scripts to
5454 generate the banners, so it won't be visible from the URL that the
5455 request is for an image. Hence we block them <emphasis>and</emphasis>
5456 mark them as images in one go, with the help of our
5457 <literal>block-as-image</literal> alias defined above. (We could of
5458 course just as well use <literal>+<link linkend="block">block</link>
5459 +<link linkend="handle-as-image">handle-as-image</link></literal> here.)
5460 Remember that the type of the replacement image is chosen by the
5461 <literal><link linkend="set-image-blocker">set-image-blocker</link></literal>
5462 action. Since all URLs have matched the default section with its
5463 <literal>+<link linkend="set-image-blocker">set-image-blocker</link>{pattern}</literal>
5464 action before, it still applies and needn't be repeated:
5469 # Known ad generators:
5474 .ad.*.doubleclick.net
5475 .a.yimg.com/(?:(?!/i/).)*$
5476 .a[0-9].yimg.com/(?:(?!/i/).)*$
5483 One of the most important jobs of <application>Privoxy</application>
5484 is to block banners. A huge bunch of them are already <quote>blocked</quote>
5485 by the <literal><link linkend="filter">filter</link>{banners-by-size}</literal>
5486 action, which we enabled above, and which deletes the references to banner
5487 images from the pages while they are loaded, so the browser doesn't request
5488 them anymore, and hence they don't need to be blocked here. But this naturally
5489 doesn't catch all banners, and some people choose not to use filters, so we
5490 need a comprehensive list of patterns for banner URLs here, and apply the
5491 <literal><link linkend="block">block</link></literal> action to them.
5494 First comes a bunch of generic patterns, which do most of the work, by
5495 matching typical domain and path name components of banners. Then comes
5496 a list of individual patterns for specific sites, which is omitted here
5497 to keep the example short:
5502 ##########################################################################
5503 # Block these fine banners:
5504 ##########################################################################
5505 { <link linkend="BLOCK">+block</link> }
5513 /.*count(er)?\.(pl|cgi|exe|dll|asp|php[34]?)
5514 /(?:.*/)?(publicite|werbung|rekla(ma|me|am)|annonse|maino(kset|nta|s)?)/
5516 # Site-specific patterns (abbreviated):
5518 .hitbox.com</screen>
5522 You wouldn't believe how many advertisers actually call their banner
5523 servers ads.<replaceable>company</replaceable>.com, or call the directory
5524 in which the banners are stored simply <quote>banners</quote>. So the above
5525 generic patterns are surprisingly effective.
5528 But being very generic, they necessarily also catch URLs that we don't want
5529 to block. The pattern <literal>.*ads.</literal> e.g. catches
5530 <quote>nasty-<emphasis>ads</emphasis>.nasty-corp.com</quote> as intended,
5531 but also <quote>downlo<emphasis>ads</emphasis>.sourceforge.net</quote> or
5532 <quote><emphasis>ads</emphasis>l.some-provider.net.</quote> So here come some
5533 well-known exceptions to the <literal>+<link linkend="BLOCK">block</link></literal>
5537 Note that these are exceptions to exceptions from the default! Consider the URL
5538 <quote>downloads.sourceforge.net</quote>: Initially, all actions are deactivated,
5539 so it wouldn't get blocked. Then comes the defaults section, which matches the
5540 URL, but just deactivates the <literal><link linkend="BLOCK">block</link></literal>
5541 action once again. Then it matches <literal>.*ads.</literal>, an exception to the
5542 general non-blocking policy, and suddenly
5543 <literal><link linkend="BLOCK">+block</link></literal> applies. And now, it'll match
5544 <literal>.*loads.</literal>, where <literal><link linkend="BLOCK">-block</link></literal>
5545 applies, so (unless it matches <emphasis>again</emphasis> further down) it ends up
5546 with no <literal><link linkend="BLOCK">block</link></literal> action applying.
5551 ##########################################################################
5552 # Save some innocent victims of the above generic block patterns:
5553 ##########################################################################
5557 { -<link linkend="BLOCK">block</link> }
5558 adv[io]*. # (for advogato.org and advice.*)
5559 adsl. # (has nothing to do with ads)
5560 ad[ud]*. # (adult.* and add.*)
5561 .edu # (universities don't host banners (yet!))
5562 .*loads. # (downloads, uploads etc)
5570 www.globalintersec.com/adv # (adv = advanced)
5571 www.ugu.com/sui/ugu/adv</screen>
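<para>
 The matching sequence described above can be condensed into the following
 sketch. It reduces the example file to the three sections that matter for
 <quote>downloads.sourceforge.net</quote> and leaves out all other actions
 and patterns:
</para>
<para>
 <screen># At the start of matching, block is off.

{ -block }
/          # matches every URL: block stays off

{ +block }
.*ads.     # matches "downloADS.": block is now on

{ -block }
.*loads.   # matches "downLOADS.": block is off again

# Unless a later section matches, the request is not blocked.</screen>
</para>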
5575 Filtering source code can have nasty side effects,
5576 so make an exception for our friends at sourceforge.net,
5577 and all paths with <quote>cvs</quote> in them. Note that
5578 <literal>-<link linkend="FILTER">filter</link></literal>
5579 disables <emphasis>all</emphasis> filters in one fell swoop!
5584 # Don't filter code!
5586 { -<link linkend="FILTER">filter</link> }
5588 .sourceforge.net</screen>
5592 The actual <filename>default.action</filename> is of course more
5593 comprehensive, but we hope this example made clear how it works.
5598 <sect3><title>user.action</title>
5601 So far we have been painting with a broad brush by setting general policies,
5602 which would be a reasonable starting point for many people. Now,
5603 you might want to be more specific and have customized rules that
5604 are more suitable to your personal habits and preferences. These would
5605 be for narrowly defined situations like your ISP or your bank, and should
5606 be placed in <filename>user.action</filename>, which is parsed after all other
5607 actions files and hence has the last word, overriding any previously
5608 defined actions. <filename>user.action</filename> is also a
5609 <emphasis>safe</emphasis> place for your personal settings, since
5610 <filename>default.action</filename> is actively maintained by the
5611 <application>Privoxy</application> developers and you'll probably want
5612 to install updated versions from time to time.
5616 So let's look at a few examples of things that one might typically do in
5617 <filename>user.action</filename>:
5621 <!-- brief sample user.action here -->
5625 # My user.action file. <fred@foobar.com></screen>
5629 As <link linkend="aliases">aliases</link> are local to the actions
5630 file that they are defined in, you can't use the ones from
5631 <filename>default.action</filename>, unless you repeat them here:
5636 # Aliases are local to the file they are defined in.
5637 # (Re-)define aliases for this file:
5641 # These aliases just save typing later, and the alias names should
5642 # be self-explanatory.
5644 +crunch-all-cookies = +crunch-incoming-cookies +crunch-outgoing-cookies
5645 -crunch-all-cookies = -crunch-incoming-cookies -crunch-outgoing-cookies
5646 allow-all-cookies = -crunch-all-cookies -session-cookies-only
5647 allow-popups = -filter{all-popups} -kill-popups
5648 +block-as-image = +block +handle-as-image
5649 -block-as-image = -block
5651 # These aliases define combinations of actions that are useful for
5652 # certain types of sites:
5654 fragile = -block -crunch-all-cookies -filter -fast-redirects -hide-referrer -kill-popups
5655 shop = -crunch-all-cookies allow-popups
5657 # Allow ads for selected useful free sites:
5659 allow-ads = -block -filter{banners-by-size} -filter{banners-by-link}</screen>
5665 Say you have accounts on some sites that you visit regularly, and
5666 you don't want to have to log in manually each time. So you'd like
5667 to allow persistent cookies for these sites. The
5668 <literal>allow-all-cookies</literal> alias defined above does exactly
5669 that, i.e. it disables crunching of cookies in any direction, and the
5670 processing of cookies to make them only temporary.
5675 { allow-all-cookies }
5681 .redhat.com</screen>
5685 Your bank is allergic to some filter, but you don't know which, so you disable them all:
5690 { -<link linkend="FILTER">filter</link> }
5691 .your-home-banking-site.com</screen>
5695 Some file types you may not want to filter for various reasons:
5700 # Technical documentation is likely to contain strings that might
5701 # erroneously get altered by the JavaScript-oriented filters:
5706 # And this stupid host sends streaming video with a wrong MIME type,
5707 # so that Privoxy thinks it is getting HTML and starts filtering:
5709 stupid-server.example.com/</screen>
5713 Example of a simple <link linkend="BLOCK">block</link> action. Say you've
5714 seen an ad on your favourite page on example.com that you want to get rid of.
5715 You have right-clicked the image, selected <quote>copy image location</quote>
5716 and pasted the URL below while removing the leading http://, into a
5717 <literal>{ +block }</literal> section. Note that <literal>{ +handle-as-image
5718 }</literal> need not be specified, since all URLs ending in
5719 <literal>.gif</literal> will be tagged as images by the general rules as set
5720 in <filename>default.action</filename> anyway:
5725 { +<link linkend="BLOCK">block</link> }
5726 www.example.com/nasty-ads/sponsor.gif
5727 another.popular.site.net/more/junk/here/</screen>
5731 The URLs of dynamically generated banners, especially from large banner
5732 farms, often don't use the well-known image file name extensions, which
5733 makes it impossible for <application>Privoxy</application> to guess
5734 the file type just by looking at the URL.
5735 You can use the <literal>+block-as-image</literal> alias defined above for
5737 Note that objects which match this rule but then turn out NOT to be an
5738 image are typically rendered as a <quote>broken image</quote> icon by the
5739 browser. Use cautiously.
5747 ar.atwola.com/</screen>
5751 Now you noticed that the default configuration breaks Forbes Magazine,
5752 but you were too lazy to find out which action is the culprit, and you
5753 were again too lazy to give <link linkend="contact">feedback</link>, so
5754 you just used the <literal>fragile</literal> alias on the site, and
5755 -- <emphasis>whoa!</emphasis> -- it worked. The <literal>fragile</literal>
5756 alias disables those actions that are most likely to break a site. It is also
5757 useful for testing whether it is <application>Privoxy</application>
5758 that is causing the problem or not.
5764 .forbes.com</screen>
5768 You like the <quote>fun</quote> text replacements in <filename>default.filter</filename>,
5769 but they are disabled in the distributed actions file. (My colleagues on the team just
5770 don't have a sense of humour, that's why! ;-) So you'd like to turn them on in your private,
5771 update-safe config, once and for all:
5776 { +<link linkend="filter-fun">filter{fun}</link> }
5777 / # For ALL sites!</screen>
5781 Note that the above is not really a good idea: There are exceptions
5782 to the filters in <filename>default.action</filename> for things that
5783 really shouldn't be filtered, like code on CVS->Web interfaces. Since
5784 <filename>user.action</filename> has the last word, these exceptions
5785 won't be valid for the <quote>fun</quote> filtering specified here.
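<para>
 One possible way around this, sketched here under the assumption that the
 only exception you care about is the <literal>.sourceforge.net</literal>
 one from the <filename>default.action</filename> example, is to repeat the
 exception after the catch-all section, so that it is matched last:
</para>
<para>
 <screen>{ +filter{fun} }
/ # For ALL sites ...

# ... except where filtering is known to do harm:
{ -filter }
.sourceforge.net</screen>
</para>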
5789 You might also worry about how your favourite free websites are
5790 funded, and find that they rely on displaying banner advertisements
5791 to survive. So you might want to specifically allow banners for those
5792 sites that you feel provide value to you:
5804 Note that <literal>allow-ads</literal> has been aliased to
5805 <literal>-<link linkend="block">block</link></literal>,
5806 <literal>-<link linkend="filter-banners-by-size">filter{banners-by-size}</link></literal>, and
5807 <literal>-<link linkend="filter-banners-by-link">filter{banners-by-link}</link></literal> above.
5811 <filename>user.action</filename> is generally the best place to define
5812 exceptions and additions to the default policies of
5813 <filename>default.action</filename>. For some actions, though, it is safe to set
5814 default policies here. So let's set a default policy to have a
5815 <quote>blank</quote> image as opposed to the checkerboard pattern for
5816 <emphasis>ALL</emphasis> sites. <quote>/</quote> of course matches all URL
5822 { +<link linkend="set-image-blocker">set-image-blocker{blank}</link> }
5823 / # ALL sites</screen>
5829 <!-- ~ End section ~ -->
5833 <!-- ~ End section ~ -->
5835 <!-- ~~~~~~~~ New section Header ~~~~~~~~~ -->
5837 <sect1 id="filter-file">
5838 <title>Filter Files</title>
5841 All text substitutions that can be invoked through the
5842 <literal><link linkend="filter">filter</link></literal> action
5843 must first be defined in a <quote>filter file</quote>, such as
5844 <filename>default.filter</filename>. Multiple filter files can be
5845 defined through the <literal>
5846 <link linkend="filterfile">filterfile</link></literal> config
5847 option. The filters as supplied by the developers will be found in
5848 <filename>default.filter</filename>. It is recommended that any locally
5849 defined or modified filters go in a separately defined file such as
5850 <filename>user.filter</filename>.
5855 Typical reasons for doing these kinds of substitutions are to eliminate
5856 common annoyances in HTML and JavaScript, such as pop-up windows,
5857 exit consoles, crippled windows without navigation tools, the
5858 infamous <BLINK> tag etc, to suppress images with certain
5859 width and height attributes (standard banner sizes or web-bugs),
5860 or just to have fun. The possibilities are endless.
5864 Filtering works on any text-based document type, including
5865 HTML, JavaScript, CSS etc. (all <literal>text/*</literal>
5866 MIME types, <emphasis>except</emphasis> <literal>text/plain</literal>).
5867 Substitutions are made at the source level, so if you want to <quote>roll
5868 your own</quote> filters, you should be familiar with HTML syntax.
5872 Just like the <link linkend="actions-file">actions files</link>, the
5873 filter file is organized in sections, which are called <emphasis>filters</emphasis>
5874 here. Each filter consists of a heading line, that starts with the
5875 <emphasis>keyword</emphasis> <literal>FILTER:</literal>, followed by
5876 the filter's <emphasis>name</emphasis>, and a short (one line)
5877 <emphasis>description</emphasis> of what it does. Below that line
5878 come the <emphasis>jobs</emphasis>, i.e. lines that define the actual
5879 text substitutions. By convention, the name of a filter
5880 should describe what the filter <emphasis>eliminates</emphasis>. The
5881 description is used in the <ulink url="http://config.privoxy.org/">web-based
5882 user interface</ulink>.
5886 Once a filter called <replaceable>name</replaceable> has been defined
5887 in the filter file, it can be invoked by using an action of the form
5888 +<literal><link linkend="filter">filter</link>{<replaceable>name</replaceable>}</literal>
5889 in any <link linkend="actions-file">actions file</link>.
5893 A filter header line for a filter called <quote>foo</quote> could look
5898 <screen>FILTER: foo Replace all "foo" with "bar"</screen>
5902 Below that line, and up to the next header line, come the jobs that
5903 define what text replacements the filter executes. They are specified
5904 in a syntax that imitates <ulink url="http://www.perl.org/">Perl</ulink>'s
5905 <literal>s///</literal> operator. If you are familiar with Perl, you
5906 will find this to be quite intuitive, and may want to look at the
5907 PCRS documentation for the subtle differences to Perl behaviour. Most
5908 notably, the non-standard option letter <literal>U</literal> is supported,
5909 which turns the default to ungreedy matching.
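<para>
 As a rough illustration of what the <literal>U</literal> option changes,
 consider the sketch below. The filter name <literal>greedy-demo</literal>
 and the sample text are made up for this example and are not part of
 <filename>default.filter</filename>. The two jobs are alternatives; use
 one or the other:
</para>
<para>
 <screen>FILTER: greedy-demo Hypothetical demo of greedy vs. ungreedy matching

# Sample input:  foo 1 bar foo 2 bar

# Alternative 1, greedy (the default): .* takes as much as it can,
# so a single match runs from the first "foo" to the last "bar".
s/foo.*bar/X/g

# Alternative 2, ungreedy (U): .* takes as little as it can,
# so each "foo ... bar" pair is matched on its own.
s/foo.*bar/X/Ug</screen>
</para>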
If you are new to regular expressions, you might want to take a look at
the <link linkend="regex">Appendix on regular expressions</link>, and
see the <ulink url="http://perldoc.com/perl5.6.1/pod/perl.html">Perl
manual</ulink> for
<ulink url="http://perldoc.com/perl5.6.1/pod/perlop.html#s-PATTERN-REPLACEMENT-egimosx">the
<literal>s///</literal> operator's syntax</ulink> and <ulink
url="http://perldoc.com/perl5.6.1/pod/perlre.html">Perl-style regular
expressions</ulink> in general.
The examples below might also help to get you started.
5925 <!-- ~~~~~~~~ New section Header ~~~~~~~~~ -->
5927 <sect2><title>Filter File Tutorial</title>
5929 Now, let's complete our <quote>foo</quote> filter. We have already defined
5930 the heading, but the jobs are still missing. Since all it does is to replace
5931 <quote>foo</quote> with <quote>bar</quote>, there is only one (trivial) job
5936 <screen>s/foo/bar/</screen>
5940 But wait! Didn't the comment say that <emphasis>all</emphasis> occurrences
5941 of <quote>foo</quote> should be replaced? Our current job will only take
5942 care of the first <quote>foo</quote> on each page. For global substitution,
5943 we'll need to add the <literal>g</literal> option:
5947 <screen>s/foo/bar/g</screen>
5951 Our complete filter now looks like this:
5954 <screen>FILTER: foo Replace all "foo" with "bar"
5955 s/foo/bar/g</screen>
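The difference between the job with and without the <literal>g</literal> option
can be tried out with a small Python sketch. This is an illustration only, not
how Privoxy runs its jobs: it assumes that <literal>re.sub</literal> with
<literal>count=1</literal> behaves like <literal>s/foo/bar/</literal> and that
<literal>re.sub</literal> without a count behaves like <literal>s/foo/bar/g</literal>.
The sample page text is made up.

```python
import re

# Hypothetical page content.
page = "foo fights foo"

# s/foo/bar/  : only the first occurrence is replaced.
first_only = re.sub(r"foo", "bar", page, count=1)

# s/foo/bar/g : every occurrence is replaced.
everywhere = re.sub(r"foo", "bar", page)

print(first_only)
print(everywhere)
```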
5959 Let's look at some real filters for more interesting examples. Here you see
5960 a filter that protects against some common annoyances that arise from JavaScript
5961 abuse. Let's look at its jobs one after the other:
5967 FILTER: js-annoyances Get rid of particularly annoying JavaScript abuse
5969 # Get rid of JavaScript referrer tracking. Test page: http://www.randomoddness.com/untitled.htm
5971 s|(<script.*)document\.referrer(.*</script>)|$1"Not Your Business!"$2|Usg</screen>
5975 Following the header line and a comment, you see the job. Note that it uses
5976 <literal>|</literal> as the delimiter instead of <literal>/</literal>, because
5977 the pattern contains a forward slash, which would otherwise have to be escaped
5978 by a backslash (<literal>\</literal>).
5982 Now, let's examine the pattern: it starts with the text <literal><script.*</literal>
5983 enclosed in parentheses. Since the dot matches any character, and <literal>*</literal>
5984 means: <quote>Match an arbitrary number of the element left of myself</quote>, this
5985 matches <quote><script</quote>, followed by <emphasis>any</emphasis> text, i.e.
5986 it matches the whole page, from the start of the first <script> tag.
5990 That's more than we want, but the pattern continues: <literal>document\.referrer</literal>
5991 matches only the exact string <quote>document.referrer</quote>. The dot needed to
5992 be <emphasis>escaped</emphasis>, i.e. preceded by a backslash, to take away its
5993 special meaning as a joker, and make it just a regular dot. So far, the meaning is:
Match from the start of the first <script> tag in the page, up to, and including,
5995 the text <quote>document.referrer</quote>, if <emphasis>both</emphasis> are present
5996 in the page (and appear in that order).
6000 But there's still more pattern to go. The next element, again enclosed in parentheses,
6001 is <literal>.*</script></literal>. You already know what <literal>.*</literal>
6002 means, so the whole pattern translates to: Match from the start of the first <script>
6003 tag in a page to the end of the last <script> tag, provided that the text
6004 <quote>document.referrer</quote> appears somewhere in between.
This is still not the whole story, since we have ignored the options and the parentheses.
The portions of the page matched by sub-patterns that are enclosed in parentheses will be
remembered and made available through the variables <literal>$1, $2, ...</literal> in
6011 the substitute. The <literal>U</literal> option switches to ungreedy matching, which means
6012 that the first <literal>.*</literal> in the pattern will only <quote>eat up</quote> all
6013 text in between <quote><script</quote> and the <emphasis>first</emphasis> occurrence
6014 of <quote>document.referrer</quote>, and that the second <literal>.*</literal> will
6015 only span the text up to the <emphasis>first</emphasis> <quote></script></quote>
6016 tag. Furthermore, the <literal>s</literal> option says that the match may span
6017 multiple lines in the page, and the <literal>g</literal> option again means that the
6018 substitution is global.
6022 So, to summarize, the pattern means: Match all scripts that contain the text
6023 <quote>document.referrer</quote>. Remember the parts of the script from
6024 (and including) the start tag up to (and excluding) the string
6025 <quote>document.referrer</quote> as <literal>$1</literal>, and the part following
6026 that string, up to and including the closing tag, as <literal>$2</literal>.
Now the pattern is deciphered, but wasn't this about substituting things? So
let's look at the substitute: <literal>$1"Not Your Business!"$2</literal> is
6032 easy to read: The text remembered as <literal>$1</literal>, followed by
6033 <literal>"Not Your Business!"</literal> (<emphasis>including</emphasis>
6034 the quotation marks!), followed by the text remembered as <literal>$2</literal>.
6035 This produces an exact copy of the original string, with the middle part
6036 (the <quote>document.referrer</quote>) replaced by <literal>"Not Your
6037 Business!"</literal>.
6041 The whole job now reads: Replace <quote>document.referrer</quote> by
6042 <literal>"Not Your Business!"</literal> wherever it appears inside a
6043 <script> tag. Note that this job won't break JavaScript syntax,
6044 since both the original and the replacement are syntactically valid
6045 string objects. The script just won't have access to the referrer
6046 information anymore.
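The walk-through above can be replayed with a rough Python equivalent of the job.
The sketch rests on several assumptions: the <literal>U</literal> option is written
out as <literal>.*?</literal>, the <literal>s</literal> option is mapped to
<literal>re.DOTALL</literal>, the <literal>g</literal> option is what
<literal>re.sub</literal> does by default, and <literal>\1</literal> and
<literal>\2</literal> stand in for <literal>$1</literal> and <literal>$2</literal>.
The function name and the sample page are made up for illustration.

```python
import re

# Approximation of:
#   s|(<script.*)document\.referrer(.*</script>)|$1"Not Your Business!"$2|Usg
JOB = re.compile(r"(<script.*?)document\.referrer(.*?</script>)", re.DOTALL)


def hide_referrer(html):
    """Replace document.referrer inside a script with a harmless string."""
    # \1 and \2 play the role of $1 and $2 in the Privoxy substitute.
    return JOB.sub(r'\1"Not Your Business!"\2', html)


# Hypothetical page: one mention outside a script, one inside.
page = (
    "<p>document.referrer outside a script stays untouched</p>\n"
    "<script>\n"
    "var r = document.referrer;\n"
    "</script>\n"
)

cleaned = hide_referrer(page)
print(cleaned)
```

Only the occurrence inside the script element is rewritten; the one in the
paragraph before it is left alone because the pattern has to start at
<literal><script</literal>.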
6050 We'll show you two other jobs from the JavaScript taming department, but
6051 this time only point out the constructs of special interest:
6056 # The status bar is for displaying link targets, not pointless blahblah
6058 s/window\.status\s*=\s*(['"]).*?\1/dUmMy=1/ig</screen>
6062 <literal>\s</literal> stands for whitespace characters (space, tab, newline,
6063 carriage return, form feed), so that <literal>\s*</literal> means: <quote>zero
6064 or more whitespace</quote>. The <literal>?</literal> in <literal>.*?</literal>
6065 makes this matching of arbitrary text ungreedy. (Note that the <literal>U</literal>
6066 option is not set). The <literal>['"]</literal> construct means: <quote>a single
6067 <emphasis>or</emphasis> a double quote</quote>. Finally, <literal>\1</literal> is
6068 a backreference to the first parenthesis just like <literal>$1</literal> above,
6069 with the difference that in the <emphasis>pattern</emphasis>, a backslash indicates
6070 a backreference, whereas in the <emphasis>substitute</emphasis>, it's the dollar.
6074 So what does this job do? It replaces assignments of single- or double-quoted
6075 strings to the <quote>window.status</quote> object with a dummy assignment
6076 (using a variable name that is hopefully odd enough not to conflict with
6077 real variables in scripts). Thus, it catches many cases where e.g. pointless
6078 descriptions are displayed in the status bar instead of the link target when
6079 you move your mouse over links.
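A minimal Python sketch of the same job shows the backreference at work. It assumes
that Python's <literal>re</literal> module treats <literal>\s</literal>,
<literal>.*?</literal> and <literal>\1</literal> the way described above, and it maps
the <literal>i</literal> option to <literal>re.IGNORECASE</literal>. The function name
and the sample markup are hypothetical.

```python
import re

# Approximation of: s/window\.status\s*=\s*(['"]).*?\1/dUmMy=1/ig
JOB = re.compile(r"""window\.status\s*=\s*(['"]).*?\1""", re.IGNORECASE)


def tame_status(markup):
    """Turn quoted assignments to window.status into a dummy assignment."""
    return JOB.sub("dUmMy=1", markup)


# Hypothetical link that writes a message into the status bar.
link = """<a href="/x" onmouseover="window.status = 'Click me!'; return true">x</a>"""

tamed = tame_status(link)
print(tamed)
```

Because <literal>\1</literal> must repeat whatever quote character opened the string,
the match ends at the closing single quote and not at the double quote that ends the
attribute.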
6084 # Kill OnUnload popups. Yummy. Test: http://www.zdnet.com/zdsubs/yahoo/tree/yfs.html
6086 s/(<body [^>]*)onunload(.*>)/$1never$2/iU</screen>
6091 <ulink url="http://www.w3.org/TR/2000/REC-DOM-Level-2-Events-20001113/events.html#Events-eventgroupings-htmlevents">OnUnload
6092 event binding</ulink> in the HTML DOM was a <emphasis>CRIME</emphasis>.
6093 When I close a browser window, I want it to close and die. Basta.
6094 This job replaces the <quote>onunload</quote> attribute in
6095 <quote><body></quote> tags with the dummy word <literal>never</literal>.
6096 Note that the <literal>i</literal> option makes the pattern matching
6097 case-insensitive. Also note that ungreedy matching alone doesn't always guarantee
6098 a minimal match: In the first parenthesis, we had to use <literal>[^>]*</literal>
6099 instead of <literal>.*</literal> to prevent the match from exceeding the
6100 <body> tag if it doesn't contain <quote>OnUnload</quote>, but the page's
6105 The last example is from the fun department:
6110 FILTER: fun Fun text replacements
6112 # Spice the daily news:
6114 s/microsoft(?!\.com)/MicroSuck/ig</screen>
6118 Note the <literal>(?!\.com)</literal> part (a so-called negative lookahead)
6119 in the job's pattern, which means: Don't match, if the string
6120 <quote>.com</quote> appears directly following <quote>microsoft</quote>
6121 in the page. This prevents links to microsoft.com from being trashed, while
6122 still replacing the word everywhere else.
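The effect of the negative lookahead can be checked with a short Python sketch.
It assumes that <literal>(?!...)</literal> behaves the same in Python's
<literal>re</literal> module as in the job above; the function name and sample
sentence are made up.

```python
import re

# Approximation of: s/microsoft(?!\.com)/MicroSuck/ig
JOB = re.compile(r"microsoft(?!\.com)", re.IGNORECASE)


def spice(text):
    """Replace the word unless it is directly followed by '.com'."""
    return JOB.sub("MicroSuck", text)


sample = "Microsoft said so on www.microsoft.com today."
spiced = spice(sample)
print(spiced)
```

The lookahead only inspects the text that follows; it does not consume it, so the
host name is left intact.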
6127 # Buzzword Bingo (example for extended regex syntax)
6129 s* industry[ -]leading \
6131 | customer[ -]focused \
6132 | market[ -]driven \
6133 | award[ -]winning # Comments are OK, too! \
6134 | high[ -]performance \
6135 | solutions[ -]based \
6139 *<font color="red"><b>BINGO!</b></font> \
6144 The <literal>x</literal> option in this job turns on extended syntax, and allows for
6145 e.g. the liberal use of (non-interpreted!) whitespace for nicer formatting.
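Python has a comparable mode, <literal>re.VERBOSE</literal>, which makes for a
convenient way to experiment with extended syntax. The sketch below is an
approximation under that assumption. It uses only the alternatives visible in the
job above, and the function name, replacement text and sample sentence are
hypothetical.

```python
import re

# In verbose mode, whitespace outside character classes is ignored and
# '#' starts a comment. The space inside [ -] is kept, because it sits
# inside a character class.
BUZZWORDS = re.compile(
    r"""
      industry[ -]leading
    | customer[ -]focused
    | market[ -]driven
    | award[ -]winning      # comments are allowed here, too
    | high[ -]performance
    | solutions[ -]based
    """,
    re.VERBOSE,
)


def bingo(text):
    """Replace every buzzword with a plain marker."""
    return BUZZWORDS.sub("BINGO!", text)


marked = bingo("our award-winning, market driven product")
print(marked)
```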
6153 <!-- ~~~~~~~~ New section Header ~~~~~~~~~ -->
6155 <sect2 id="predefined-filters"><title>The Pre-defined Filters</title>
6159 Note each filter is also listed in the +filter action section above. Please
6160 keep these listings in sync.
6165 The distribution <filename>default.filter</filename> file contains a selection of
6166 pre-defined filters for your convenience:
6171 <term><emphasis>js-annoyances</emphasis></term>
6174 The purpose of this filter is to get rid of particularly annoying JavaScript abuse.
replaces JavaScript references to the browser's referrer information
with the string "Not Your Business!". This complements the <literal><link
linkend="hide-referrer">hide-referrer</link></literal> action on the content level.
6186 removes the bindings to the DOM's
6187 <ulink url="http://www.w3.org/TR/2000/REC-DOM-Level-2-Events-20001113/events.html#Events-eventgroupings-htmlevents">unload
6188 event</ulink> which we feel has no right to exist and is responsible for most <quote>exit consoles</quote>, i.e.
6189 nasty windows that pop up when you close another one.
6194 removes code that causes new windows to be opened with undesired properties, such as being
6195 full-screen, non-resizable, without location, status or menu bar etc.
6204 <term><emphasis>js-events</emphasis></term>
This is a very radical measure. It removes virtually all JavaScript event bindings, which
means that scripts can no longer react to user actions such as mouse movements or clicks,
window resizing, etc.
6212 We <emphasis>strongly discourage</emphasis> using this filter as a default since it breaks
6213 many legitimate scripts. It is meant for use only on extra-nasty sites (should you really
6220 <term><emphasis>html-annoyances</emphasis></term>
6223 This filter will undo many common instances of HTML based abuse.
6226 The <literal>BLINK</literal> and <literal>MARQUEE</literal> tags
6227 are neutralized (yeah baby!), and browser windows will be created as
6228 resizable (as of course they should be!), and will have location,
6229 scroll and menu bars -- even if specified otherwise.
6235 <term><emphasis>content-cookies</emphasis></term>
6238 Most cookies are set in the HTTP dialogue, where they can be intercepted
6240 <literal><link linkend="crunch-incoming-cookies">crunch-incoming-cookies</link></literal>
6241 and <literal><link linkend="crunch-outgoing-cookies">crunch-outgoing-cookies</link></literal>
6242 actions. But web sites increasingly make use of HTML meta tags and JavaScript
6243 to sneak cookies to the browser on the content level.
6246 This filter disables HTML and JavaScript code that reads or sets cookies. Use
6247 it wherever you would also use the cookie crunch actions.
<term><emphasis>refresh-tags</emphasis></term>
6256 Disable any refresh tags if the interval is greater than nine seconds (so
6257 that redirections done via refresh tags are not destroyed). This is useful
6258 for dial-on-demand setups, or for those who find this HTML feature
6265 <term><emphasis>unsolicited-popups</emphasis></term>
6268 This filter attempts to prevent only <quote>unsolicited</quote> pop-up
6269 windows from opening, yet still allow pop-up windows that the user
6270 has explicitly chosen to open. It was added in version 3.0.1,
6271 as an improvement over earlier such filters.
6274 Technical note: The filter works by redefining the window.open JavaScript
6275 function to a dummy function during the loading and rendering phase of each
6276 HTML page access, and restoring the function afterwards.
6282 <term><emphasis>all-popups</emphasis></term>
6285 Attempt to prevent <emphasis>all</emphasis> pop-up windows from opening.
6286 Note this should be used with more discretion than the above, since it is
6287 more likely to break some sites that require pop-ups for normal usage. Use
6294 <term><emphasis>img-reorder</emphasis></term>
6297 This is a helper filter that has no value if used alone. It makes the
6298 <literal>banners-by-size</literal> and <literal>banners-by-link</literal>
6299 (see below) filters more effective and should be enabled together with them.
6305 <term><emphasis>banners-by-size</emphasis></term>
6308 This filter removes image tags purely based on what size they are. Fortunately
6309 for us, many ads and banner images tend to conform to certain standardized
6310 sizes, which makes this filter quite effective for ad stripping purposes.
6313 Occasionally this filter will cause false positives on images that are not ads,
6314 but just happen to be of one of the standard banner sizes.
6320 <term><emphasis>banners-by-link</emphasis></term>
6323 This is an experimental filter that attempts to kill any banners if
6324 their URLs seem to point to known or suspected click trackers. It is currently
6325 not of much value and is not recommended for use by default.
6331 <term><emphasis>webbugs</emphasis></term>
Webbugs are small, invisible images (technically 1x1 GIF images) that
are used to track users across websites and collect information on them.
As an HTML page is loaded by the browser, an embedded image tag causes the
browser to contact a third-party site, disclosing the tracking information
through the requested URL and/or cookies for that third-party domain, without
the user ever becoming aware of the interaction with the third-party site.
6340 HTML-ized spam also uses a similar technique to verify email addresses.
6343 This filter removes the HTML code that loads such <quote>webbugs</quote>.
6349 <term><emphasis>tiny-textforms</emphasis></term>
6352 A rather special-purpose filter that can be used to enlarge textareas (those
6353 multi-line text boxes in web forms) and turn off hard word wrap in them.
6354 It was written for the sourceforge.net tracker system where such boxes are
6355 a nuisance, but it can be handy on other sites, too.
6358 It is not recommended to use this filter as a default.
6364 <term><emphasis>jumping-windows</emphasis></term>
Many consider windows that move or resize themselves to be abusive. This filter
6368 neutralizes the related JavaScript code. Note that some sites might not display
6369 or behave as intended when using this filter.
6375 <term><emphasis>frameset-borders</emphasis></term>
6378 Some web designers seem to assume that everyone in the world will view their
6379 web sites using the same browser brand and version, screen resolution etc,
6380 because only that assumption could explain why they'd use static frame sizes,
6381 yet prevent their frames from being resized by the user, should they be too
6382 small to show their whole content.
6385 This filter removes the related HTML code. It should only be applied to sites
6392 <term><emphasis>demoronizer</emphasis></term>
6395 Many Microsoft products that generate HTML use non-standard extensions (read:
6396 violations) of the ISO 8859-1 aka Latin-1 character set. This can cause those
6397 HTML documents to display with errors on standard-compliant platforms.
6400 This filter translates the MS-only characters into Latin-1 equivalents.
6401 It is not necessary when using MS products, and will cause corruption of
6402 all documents that use 8-bit character sets other than Latin-1. It's mostly
worthwhile for Europeans on non-MS platforms, if weird garbage characters
6404 sometimes appear on some pages, or user agents that don't correct for this on
My version of Mozilla (ancient) shows little square boxes for quote
6408 characters, and apostrophes on moronized pages. So many pages have this, I
6409 can read them fine now. HB 08/27/06
6416 <term><emphasis>shockwave-flash</emphasis></term>
A filter for Shockwave haters. As the name suggests, this filter strips code
out of web pages that is used to embed Shockwave Flash objects.
6428 <term><emphasis>quicktime-kioskmode</emphasis></term>
6431 Change HTML code that embeds Quicktime objects so that kioskmode, which
6432 prevents saving, is disabled.
6438 <term><emphasis>fun</emphasis></term>
6441 Text replacements for subversive browsing fun. Make fun of your favorite
6442 Monopolist or play buzzword bingo.
6448 <term><emphasis>crude-parental</emphasis></term>
6451 A demonstration-only filter that shows how <application>Privoxy</application>
6452 can be used to delete web content on a keyword basis.
6458 <term><emphasis>ie-exploits</emphasis></term>
6461 A collection of text replacements to disable malicious HTML and JavaScript
6462 code that exploits known security holes in Internet Explorer.
6465 Presently, it only protects against Nimda and a cross-site scripting bug, and
6466 would need active maintenance to provide more substantial protection.
6472 <term><emphasis>site-specifics</emphasis></term>
6475 Some web sites have very specific problems, the cure for which doesn't apply
6476 anywhere else, or could even cause damage on other sites.
6479 This is a collection of such site-specific cures which should only be applied
6480 to the sites they were intended for, which is what the supplied
6481 <filename>default.action</filename> file does. Users shouldn't need to change
6482 anything regarding this filter.
6487 <varlistentry id="filter-server-headers">
6488 <term><emphasis>filter-server-headers</emphasis></term>
6495 <varlistentry id="filter-client-headers">
6496 <term><emphasis>filter-client-headers</emphasis></term>
6505 <term><emphasis> </emphasis></term>
6519 <!-- ~ End section ~ -->
6523 <!-- ~~~~~ New section ~~~~~ -->
6525 <sect1 id="templates">
6526 <title>Templates</title>
6528 All <application>Privoxy</application> built-in pages, i.e. error pages such as the
6529 <ulink url="http://show-the-404-error.page"><quote>404 - No Such Domain</quote>
6530 error page</ulink>, the <ulink
url="http://ads.bannerserver.example.com/nasty-ads/sponsor.html"><quote>BLOCKED</quote>
page</ulink>,
6533 and all pages of its <ulink url="http://config.privoxy.org/">web-based
6534 user interface</ulink>, are generated from <emphasis>templates</emphasis>.
6535 (<application>Privoxy</application> must be running for the above links to work as
6540 These templates are stored in a subdirectory of the <link linkend="confdir">configuration
6541 directory</link> called <filename>templates</filename>. On Unixish platforms,
6543 <ulink url="file:///etc/privoxy/templates/"><filename>/etc/privoxy/templates/</filename></ulink>.
6547 The templates are basically normal HTML files, but with place-holders (called symbols
6548 or exports), which <application>Privoxy</application> fills at run time. You can
6549 edit the templates with a normal text editor, should you want to customize them.
6550 (<emphasis>Not recommended for the casual user</emphasis>). Note that
6551 just like in configuration files, lines starting with <literal>#</literal> are
6552 ignored when the templates are filled in.
6556 The place-holders are of the form <literal>@name@</literal>, and you will
6557 find a list of available symbols, which vary from template to template,
6558 in the comments at the start of each file. Note that these comments are not
6559 always accurate, and that it's probably best to look at the existing HTML
6560 code to find out which symbols are supported and what they are filled in with.
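How such place-holders get filled can be pictured with a small Python sketch.
It is not Privoxy's implementation, only an illustration of the idea under stated
assumptions: symbols are looked up in a plain dictionary, lines starting with
<literal>#</literal> are dropped, and unknown symbols are left untouched (a choice
made for this sketch). The function name, the symbol names and the template text
are all hypothetical.

```python
import re


def fill_template(template, exports):
    """Fill @name@ place-holders from the exports dictionary."""
    # Lines starting with '#' are comments and are dropped.
    kept = [line for line in template.splitlines() if not line.startswith("#")]

    def lookup(match):
        # Unknown symbols are left as they are (an assumption of this sketch).
        return exports.get(match.group(1), match.group(0))

    return re.sub(r"@([A-Za-z0-9-]+)@", lookup, "\n".join(kept))


# Hypothetical template with a leading comment line.
template = "# symbols used here: greeting, target\n<p>@greeting@, @target@! (@unknown@)</p>"

page = fill_template(template, {"greeting": "Hello", "target": "world"})
print(page)
```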
6564 A special application of this substitution mechanism is to make whole
6565 blocks of HTML code disappear when a specific symbol is set. We use this
6566 for many purposes, one of them being to include the beta warning in all
6567 our user interface (CGI) pages when <application>Privoxy</application>
6568 is in an alpha or beta development stage:
6573 <!-- @if-unstable-start -->
6575 ... beta warning HTML code goes here ...
6577 <!-- if-unstable-end@ --></screen>
6581 If the "unstable" symbol is set, everything in between and including
6582 <literal>@if-unstable-start</literal> and <literal>if-unstable-end@</literal>
6583 will disappear, leaving nothing but an empty comment:
6587 <screen><!-- --></screen>
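The block removal described above can be imitated with one ungreedy substitution.
The Python sketch below is a simplified illustration, not Privoxy's actual code;
the function name and the sample template are made up, and only the marker syntax
is taken from the example above.

```python
import re


def kill_block(template, name):
    """Remove everything from @if-NAME-start up to and including if-NAME-end@."""
    marker = re.escape(name)
    pattern = re.compile(
        r"@if-" + marker + r"-start.*?if-" + marker + r"-end@",
        re.DOTALL,  # the block usually spans several lines
    )
    return pattern.sub("", template)


# Hypothetical template fragment.
template = (
    "<h1>Status</h1>\n"
    "<!-- @if-unstable-start -->\n"
    "<p>beta warning</p>\n"
    "<!-- if-unstable-end@ -->\n"
)

result = kill_block(template, "unstable")
print(result)
```

What survives is the opening of the first comment and the end of the second, which
together form a single empty HTML comment.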
6591 There's also an if-then-else construct and an <literal>#include</literal>
mechanism, but you'll surely find out if you are inclined to edit the
6597 All templates refer to a style located at
6598 <ulink url="http://config.privoxy.org/send-stylesheet"><literal>http://config.privoxy.org/send-stylesheet</literal></ulink>.
6599 This is, of course, locally served by <application>Privoxy</application>
6600 and the source for it can be found and edited in the
6601 <filename>cgi-style.css</filename> template.
6606 <!-- ~ End section ~ -->
6610 <!-- ~~~~~ New section ~~~~~ -->
6612 <sect1 id="contact"><title>Contacting the Developers, Bug Reporting and Feature
6615 <!-- Include contacting.sgml boilerplate: -->
6617 <!-- end boilerplate -->
6621 <!-- ~ End section ~ -->
6624 <!-- ~~~~~ New section ~~~~~ -->
6625 <sect1 id="copyright"><title><application>Privoxy</application> Copyright, License and History</title>
6627 <!-- Include copyright.sgml: -->
6629 <!-- end copyright -->
6631 <!-- ~~~~~ New section ~~~~~ -->
6632 <sect2><title>License</title>
6633 <!-- Include copyright.sgml: -->
6635 <!-- end copyright -->
6637 <!-- ~ End section ~ -->
6640 <!-- ~~~~~ New section ~~~~~ -->
6642 <sect2 id="history"><title>History</title>
6643 <!-- Include history.sgml: -->
6645 <!-- end history -->
6648 <sect2 id="authors"><title>Authors</title>
6649 <!-- Include p-authors.sgml: -->
6651 <!-- end authors -->
6656 <!-- ~ End section ~ -->
6659 <!-- ~~~~~ New section ~~~~~ -->
6660 <sect1 id="seealso"><title>See Also</title>
6661 <!-- Include seealso.sgml: -->
6663 <!-- end seealso -->
6668 <!-- ~~~~~ New section ~~~~~ -->
6669 <sect1 id="appendix"><title>Appendix</title>
6672 <!-- ~~~~~ New section ~~~~~ -->
6674 <title>Regular Expressions</title>
6676 <application>Privoxy</application> uses Perl-style <quote>regular
6677 expressions</quote> in its <link linkend="actions-file">actions
6678 files</link> and <link linkend="filter-file">filter file</link>,
6679 through the <ulink url="http://www.pcre.org/">PCRE</ulink> and
6682 <ulink url="http://www.oesterhelt.org/pcrs/">PCRS</ulink> libraries.
6688 If you are reading this, you probably don't understand what <quote>regular
6689 expressions</quote> are, or what they can do. So this will be a very brief
6690 introduction only. A full explanation would require a <ulink
6691 url="http://www.oreilly.com/catalog/regex/">book</ulink> ;-)
6695 Regular expressions provide a language to describe patterns that can be
run against strings of characters (letters, numbers, etc.), to see if they
6697 match the string or not. The patterns are themselves (sometimes complex)
6698 strings of literal characters, combined with wild-cards, and other special
6699 characters, called meta-characters. The <quote>meta-characters</quote> have
6700 special meanings and are used to build complex patterns to be matched against.
6701 Perl Compatible Regular Expressions are an especially convenient
6702 <quote>dialect</quote> of the regular expression language.
6706 To make a simple analogy, we do something similar when we use wild-card
6707 characters when listing files with the <command>dir</command> command in DOS.
6708 <literal>*.*</literal> matches all filenames. The <quote>special</quote>
6709 character here is the asterisk which matches any and all characters. We can be
6710 more specific and use <literal>?</literal> to match just individual
characters. So <quote>dir file?.txt</quote> would match
6712 <quote>file1.txt</quote>, <quote>file2.txt</quote>, etc. We are pattern
6713 matching, using a similar technique to <quote>regular expressions</quote>!
6717 Regular expressions do essentially the same thing, but are much, much more
powerful. There are, however, many more <quote>special characters</quote> and ways of
building complex patterns. Let's look at a few of the common ones,
6720 and then some examples:
6725 <emphasis>.</emphasis> - Matches any single character, e.g. <quote>a</quote>,
6726 <quote>A</quote>, <quote>4</quote>, <quote>:</quote>, or <quote>@</quote>.
6728 </simplelist></para>
6732 <emphasis>?</emphasis> - The preceding character or expression is matched ZERO or ONE
6735 </simplelist></para>
6739 <emphasis>+</emphasis> - The preceding character or expression is matched ONE or MORE
6742 </simplelist></para>
6746 <emphasis>*</emphasis> - The preceding character or expression is matched ZERO or MORE
6749 </simplelist></para>
6753 <emphasis>\</emphasis> - The <quote>escape</quote> character denotes that
6754 the following character should be taken literally. This is used where one of the
6755 special characters (e.g. <quote>.</quote>) needs to be taken literally and
6756 not as a special meta-character. Example: <quote>example\.com</quote>, makes
6757 sure the period is recognized only as a period (and not expanded to its
6758 meta-character meaning of any single character).
6760 </simplelist></para>
6764 <emphasis>[]</emphasis> - Characters enclosed in brackets will be matched if
6765 any of the enclosed characters are encountered. For instance, <quote>[0-9]</quote>
6766 matches any numeric digit (zero through nine). As an example, we can combine
this with <quote>+</quote> to match any digit one or more times: <quote>[0-9]+</quote>.
6769 </simplelist></para>
<emphasis>()</emphasis> - Parentheses are used to group a sub-expression,
6774 or multiple sub-expressions.
6776 </simplelist></para>
6780 <emphasis>|</emphasis> - The <quote>bar</quote> character works like an
6781 <quote>or</quote> conditional statement. A match is successful if the
6782 sub-expression on either side of <quote>|</quote> matches. As an example:
6783 <quote>/(this|that) example/</quote> uses grouping and the bar character
6784 and would match either <quote>this example</quote> or <quote>that
6785 example</quote>, and nothing else.
6787 </simplelist></para>
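The meta-characters listed above can be tried out one by one. The following Python
sketch assumes that Python's <literal>re</literal> module agrees with PCRE for these
basic constructs; the helper name and the sample strings are made up for
illustration.

```python
import re


def matches(pattern, text):
    """True if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


checks = {
    # . matches any single character
    ".": matches(r"a.c", "a@c"),
    # ? makes the preceding character optional
    "?": matches(r"colou?r", "color") and matches(r"colou?r", "colour"),
    # + needs at least one repetition
    "+": matches(r"[0-9]+", "2006") and not matches(r"[0-9]+", ""),
    # * allows zero repetitions
    "*": matches(r"ab*", "a") and matches(r"ab*", "abbb"),
    # \ takes away the special meaning of the dot
    "\\": matches(r"example\.com", "example.com")
    and not matches(r"example\.com", "exampleXcom"),
    # [] matches any one of the enclosed characters
    "[]": matches(r"[0-9]", "7") and not matches(r"[0-9]", "x"),
    # () groups, | means "or"
    "()|": matches(r"(this|that) example", "that example")
    and not matches(r"(this|that) example", "other example"),
}

for symbol, ok in checks.items():
    print(symbol, ok)
```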
These are just some of the ones you are likely to use when matching URLs with
<application>Privoxy</application>, and this is a long way from a definitive
list. This is enough to get us started with a few simple examples which may
6793 be more illuminating:
6797 <emphasis><literal>/.*/banners/.*</literal></emphasis> - A simple example
6798 that uses the common combination of <quote>.</quote> and <quote>*</quote> to
6799 denote any character, zero or more times. In other words, any string at all.
6800 So we start with a literal forward slash, then our regular expression pattern
(<quote>.*</quote>), another literal forward slash, the string
6802 <quote>banners</quote>, another forward slash, and lastly another
6803 <quote>.*</quote>. We are building
6804 a directory path here. This will match any file with the path that has a
6805 directory named <quote>banners</quote> in it. The <quote>.*</quote> matches
6806 any characters, and this could conceivably be more forward slashes, so it
6807 might expand into a much longer looking path. For example, this could match:
<quote>/eye/hate/spammers/banners/annoy_me_please.gif</quote>, or just
<quote>//banners/annoying.html</quote>, or almost an infinite number of other
6810 possible combinations, just so it has <quote>banners</quote> in the path
And now something a little more complex:
6819 <emphasis><literal>/.*/adv((er)?ts?|ertis(ing|ements?))?/</literal></emphasis> -
6820 We have several literal forward slashes again (<quote>/</quote>), so we are
6821 building another expression that is a file path statement. We have another
6822 <quote>.*</quote>, so we are matching against any conceivable sub-path, just so
6823 it matches our expression. The only true literal that <emphasis>must
6824 match</emphasis> our pattern is <application>adv</application>, together with
6825 the forward slashes. What comes after the <quote>adv</quote> string is the
6830 Remember the <quote>?</quote> means the preceding expression (either a
6831 literal character or anything grouped with <quote>(...)</quote> in this case)
6832 can exist or not, since this means either zero or one match. So
6833 <quote>((er)?ts?|ertis(ing|ements?))</quote> is optional, as are the
6834 individual sub-expressions: <quote>(er)</quote>,
6835 <quote>(ing|ements?)</quote>, and the <quote>s</quote>. The <quote>|</quote>
6836 means <quote>or</quote>. We have two of those. For instance,
6837 <quote>(ing|ements?)</quote>, can expand to match either <quote>ing</quote>
6838 <emphasis>OR</emphasis> <quote>ements?</quote>. What is being done here, is an
6839 attempt at matching as many variations of <quote>advertisement</quote>, and
6840 similar, as possible. So this would expand to match just <quote>adv</quote>,
6841 or <quote>advert</quote>, or <quote>adverts</quote>, or
6842 <quote>advertising</quote>, or <quote>advertisement</quote>, or
6843 <quote>advertisements</quote>. You get the idea. But it would not match
6844 <quote>advertizements</quote> (with a <quote>z</quote>). We could fix that by
6845 changing our regular expression to:
6846 <quote>/.*/adv((er)?ts?|erti(s|z)(ing|ements?))?/</quote>, which would then match
<emphasis><literal>/.*/advert[0-9]+\.(gif|jpe?g)</literal></emphasis> - Again
another path statement with forward slashes. Anything in the square brackets
<quote>[]</quote> can be matched. This is using <quote>0-9</quote> as a
shorthand expression to mean any digit zero through nine. It is the same as
saying <quote>0123456789</quote>. So any digit matches. The <quote>+</quote>
means one or more of the preceding expression must be included. The preceding
expression here is what is in the square brackets -- in this case, any digit
zero through nine. Then, at the end, we have a grouping: <quote>(gif|jpe?g)</quote>.
This includes a <quote>|</quote>, so this needs to match the expression on
either side of that bar character. A simple <quote>gif</quote> on one side, and the other
side will in turn match either <quote>jpeg</quote> or <quote>jpg</quote>,
since the <quote>?</quote> means the letter <quote>e</quote> is optional and
can be matched once or not at all. So we are building an expression here to
match a GIF or JPEG image file. It must include the literal
6865 string <quote>advert</quote>, then one or more digits, and a <quote>.</quote>
6866 (which is now a literal, and not a special character, since it is escaped
6867 with <quote>\</quote>), and lastly either <quote>gif</quote>, or
6868 <quote>jpeg</quote>, or <quote>jpg</quote>. Some possible matches would
6869 include: <quote>//advert1.jpg</quote>,
6870 <quote>/nasty/ads/advert1234.gif</quote>,
6871 <quote>/banners/from/hell/advert99.jpg</quote>. It would not match
6872 <quote>advert1.gif</quote> (no leading slash), or
6873 <quote>/adverts232.jpg</quote> (the expression does not include an
6874 <quote>s</quote>), or <quote>/advert1.jsp</quote> (<quote>jsp</quote> is not
6875 in the expression anywhere).
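The two more complex patterns above can be checked against sample paths with a few
lines of Python. This is a sketch under assumptions: Python's <literal>re</literal>
module stands in for PCRE, and the pattern is anchored at the start of the path only
(which is what <literal>re.match</literal> does). The helper name and the
<quote>/site/...</quote> paths are made up; the other paths are the ones named in the
text.

```python
import re

# The two path patterns discussed above, copied verbatim.
ADV = re.compile(r"/.*/adv((er)?ts?|ertis(ing|ements?))?/")
ADVERT = re.compile(r"/.*/advert[0-9]+\.(gif|jpe?g)")


def hits(pattern, path):
    """True if the pattern matches, starting at the beginning of the path."""
    return pattern.match(path) is not None


advert_results = {
    path: hits(ADVERT, path)
    for path in [
        "//advert1.jpg",
        "/nasty/ads/advert1234.gif",
        "/banners/from/hell/advert99.jpg",
        "advert1.gif",       # no leading slash
        "/adverts232.jpg",   # the pattern has no "s"
        "/advert1.jsp",      # "jsp" is not in the pattern
    ]
}

adv_results = {
    path: hits(ADV, path)
    for path in [
        "/site/adv/",
        "/site/advertising/",
        "/site/advertizements/",  # spelled with a "z"
    ]
}

for path, ok in {**advert_results, **adv_results}.items():
    print(path, ok)
```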
6879 We are barely scratching the surface of regular expressions here so that you
6880 can understand the default <application>Privoxy</application>
6881 configuration files, and maybe use this knowledge to customize your own
6882 installation. There is much, much more that can be done with regular
6883 expressions. Now that you know enough to get started, you can learn more on
6888 More reading on Perl Compatible Regular Expressions:
6889 <ulink url="http://www.perldoc.com/perl5.6/pod/perlre.html">http://www.perldoc.com/perl5.6/pod/perlre.html</ulink>
6893 For information on regular expression based substitutions and their applications
6894 in filters, please see the <link linkend="filter-file">filter file tutorial</link>
6899 <!-- ~ End section ~ -->
6902 <!-- ~~~~~ New section ~~~~~ -->
6904 <title><application>Privoxy</application>'s Internal Pages</title>
6907 Since <application>Privoxy</application> proxies each requested
6908 web page, it is easy for <application>Privoxy</application> to
6909 trap certain special URLs. In this way, we can talk directly to
6910 <application>Privoxy</application>, and see how it is
6911 configured, see how our rules are being applied, change these
6912 rules and other configuration options, and even turn
6913 <application>Privoxy's</application> filtering off, all with
6919 The URLs listed below are the special ones that allow direct access
6920 to <application>Privoxy</application>. Of course,
6921 <application>Privoxy</application> must be running to access these. If
6922 not, you will get a friendly error message. Internet access is not
6935 <ulink url="http://config.privoxy.org/">http://config.privoxy.org/</ulink>
6939 There is a shortcut: <ulink url="http://p.p/">http://p.p/</ulink> (But it
6940 doesn't provide a fall-back to a real page, in case the request is not
6941 sent through <application>Privoxy</application>)
6947 Show information about the current configuration, including viewing and
6948 editing of actions files:
6952 <ulink url="http://config.privoxy.org/show-status">http://config.privoxy.org/show-status</ulink>
6959 Show the source code version numbers:
6963 <ulink url="http://config.privoxy.org/show-version">http://config.privoxy.org/show-version</ulink>
6970 Show the browser's request headers:
6974 <ulink url="http://config.privoxy.org/show-request">http://config.privoxy.org/show-request</ulink>
6981 Show which actions apply to a URL and why:
6985 <ulink url="http://config.privoxy.org/show-url-info">http://config.privoxy.org/show-url-info</ulink>
6992 Toggle Privoxy on or off. In this case, <quote>Privoxy</quote> continues
6993 to run, but only as a pass-through proxy, with no actions taking place:
6997 <ulink url="http://config.privoxy.org/toggle">http://config.privoxy.org/toggle</ulink>
7001 Short cuts. Turn off, then on:
7005 <ulink url="http://config.privoxy.org/toggle?set=disable">http://config.privoxy.org/toggle?set=disable</ulink>
7010 <ulink url="http://config.privoxy.org/toggle?set=enable">http://config.privoxy.org/toggle?set=enable</ulink>
7019 These may be bookmarked for quick reference. See next.
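
For scripted access, the URLs above can be built and requested from a small program. The following is a minimal sketch, not part of <application>Privoxy</application>: the function names are invented for this example, and the proxy address <quote>http://127.0.0.1:8118</quote> is an assumption that must be adjusted to your own listen address. As with a browser, the request only reaches the internal pages if it is sent through <application>Privoxy</application>.

```python
from urllib.parse import urlencode
from urllib.request import ProxyHandler, build_opener

# Host name that Privoxy traps for its internal pages (from the list above).
CONFIG_HOST = "http://config.privoxy.org"


def internal_url(page="", **params):
    """Build the URL of an internal page, e.g. internal_url("toggle", set="disable")."""
    url = CONFIG_HOST + "/" + page
    if params:
        url += "?" + urlencode(params)
    return url


def fetch_through_privoxy(url, proxy="http://127.0.0.1:8118"):
    """Fetch a URL through the proxy. The proxy address is an assumption."""
    opener = build_opener(ProxyHandler({"http": proxy}))
    with opener.open(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Only the URLs are printed here; nothing is requested.
    print(internal_url())
    print(internal_url("show-status"))
    print(internal_url("toggle", set="disable"))
    print(internal_url("toggle", set="enable"))
```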
7023 <sect3 id="bookmarklets">
7024 <title>Bookmarklets</title>
7026 Below are some <quote>bookmarklets</quote> to allow you to easily access a
7027 <quote>mini</quote> version of some of <application>Privoxy's</application>
7028 special pages. They are designed for MS Internet Explorer, but should work
7029 equally well in Netscape, Mozilla, and other browsers which support
7030 JavaScript. They are designed to run directly from your bookmarks - not by
7031 clicking the links below (although that should work for testing).
7034 To save them, right-click the link and choose <quote>Add to Favorites</quote>
7035 (IE) or <quote>Add Bookmark</quote> (Netscape). You will get a warning that
7036 the bookmark <quote>may not be safe</quote> - just click OK. Then you can run the
7037 Bookmarklet directly from your favorites/bookmarks. For even faster access,
7038 you can put them on the <quote>Links</quote> bar (IE) or the <quote>Personal
7039 Toolbar</quote> (Netscape), and run them with a single click.
7048 url="javascript:void(window.open('http://config.privoxy.org/toggle?mini=y&amp;set=enabled','ijbstatus','width=250,height=100,resizable=yes,scrollbars=no,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());">Privoxy - Enable</ulink>
7055 url="javascript:void(window.open('http://config.privoxy.org/toggle?mini=y&amp;set=disabled','ijbstatus','width=250,height=100,resizable=yes,scrollbars=no,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());">Privoxy - Disable</ulink>
7062 url="javascript:void(window.open('http://config.privoxy.org/toggle?mini=y&amp;set=toggle','ijbstatus','width=250,height=100,resizable=yes,scrollbars=no,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());">Privoxy - Toggle Privoxy</ulink> (Toggles between enabled and disabled)
7069 url="javascript:void(window.open('http://config.privoxy.org/toggle?mini=y','ijbstatus','width=250,height=2,resizable=yes,scrollbars=no,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());">Privoxy - View Status</ulink>
7075 <ulink url="javascript:w=Math.floor(screen.width/2);h=Math.floor(screen.height*0.9);void(window.open('http://www.privoxy.org/actions/index.php?url='+escape(location.href),'Feedback','screenx='+w+',width='+w+',height='+h+',scrollbars=yes,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());">Privoxy - Submit Actions File Feedback</ulink>
7081 <ulink url="javascript:void(window.open('http://config.privoxy.org/show-url-info?url='+escape(location.href),'Why').focus());">Privoxy - Why?</ulink>
7088 Credit: The site which gave us the general idea for these bookmarklets is
7089 <ulink url="http://www.bookmarklets.com/">www.bookmarklets.com</ulink>. They
7090 have more information about bookmarklets.
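
The <quote>Why?</quote> bookmarklet simply appends the address of the current page to the <quote>show-url-info</quote> URL. The same link can be built outside the browser, as in the following sketch. The function name <quote>why_url</quote> is hypothetical, and Python's <quote>quote()</quote> encodes more characters than the JavaScript <quote>escape()</quote> call used in the bookmarklet, so the resulting string may differ in detail from what the browser produces.

```python
from urllib.parse import quote


def why_url(page_url):
    """Return the show-url-info URL that asks Privoxy about page_url."""
    # safe="" percent-encodes every reserved character, including "/" and ":".
    encoded = quote(page_url, safe="")
    return "http://config.privoxy.org/show-url-info?url=" + encoded


if __name__ == "__main__":
    print(why_url("http://www.rhapsodyk.net/adsl/HOWTO/"))
```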
7099 <!-- ~~~~~ New section ~~~~~ -->
7101 <title>Chain of Events</title>
7103 Let's take a quick look at the basic sequence of events when a web page is
7104 requested by your browser and <application>Privoxy</application> is on duty:
7111 First, your web browser requests a web page. The browser knows to send
7112 the request to <application>Privoxy</application>, which will, in turn,
7113 relay the request to the remote web server after passing the following
7119 <application>Privoxy</application> traps any request for its own internal CGI
7120 pages (e.g. http://p.p/) and sends the CGI page back to the browser.
7125 Next, <application>Privoxy</application> checks to see if the URL
7127 linkend="BLOCK"><quote>+block</quote></link> patterns. If
7128 so, the URL is then blocked, and the remote web server will not be contacted.
7129 <link linkend="HANDLE-AS-IMAGE"><quote>+handle-as-image</quote></link>
7130 is then checked and if it does not match, an
7131 HTML <quote>BLOCKED</quote> page is sent back. Otherwise, if it does match,
7132 an image is returned. The type of image depends on the setting of <link
7133 linkend="SET-IMAGE-BLOCKER"><quote>+set-image-blocker</quote></link>
7134 (blank, checkerboard pattern, or an HTTP redirect to an image elsewhere).
7139 Untrusted URLs are blocked. If URLs are being added to the
7140 <filename>trust</filename> file, then that is done.
7145 If the URL pattern matches the <link
7146 linkend="FAST-REDIRECTS"><quote>+fast-redirects</quote></link> action,
7147 it is then processed. Unwanted parts of the requested URL are stripped.
7152 Now the rest of the client browser's request headers are processed. If any
7153 of these match any of the relevant actions (e.g. <link
7154 linkend="HIDE-USER-AGENT"><quote>+hide-user-agent</quote></link>,
7155 etc.), headers are suppressed or forged as determined by these actions and
7161 Now the web server starts sending its response back (i.e. typically a web page and related
7167 First, the server headers are read and processed to determine, among other
7168 things, the MIME type (document type) and encoding. The headers are then
7169 filtered as determined by the
7170 <link linkend="CRUNCH-INCOMING-COOKIES"><quote>+crunch-incoming-cookies</quote></link>,
7171 <link linkend="SESSION-COOKIES-ONLY"><quote>+session-cookies-only</quote></link>,
7172 and <link linkend="DOWNGRADE-HTTP-VERSION"><quote>+downgrade-http-version</quote></link>
7178 If the <link linkend="KILL-POPUPS"><quote>+kill-popups</quote></link>
7179 action applies, and it is an HTML or JavaScript document, the popup-code in the
7180 response is filtered on-the-fly as it is received.
7185 If a <link linkend="FILTER"><quote>+filter</quote></link>
7187 linkend="DEANIMATE-GIFS"><quote>+deanimate-gifs</quote></link>
7188 action applies (and the document type fits the action), the rest of the page is
7189 read into memory (up to a configurable limit). Then the filter rules (from
7190 <filename>default.filter</filename>) are processed against the buffered
7191 content. Filters are applied in the order they are specified in one of the
7192 filter files. Animated GIFs, if present, are
7193 reduced to either the first or last frame, depending on the action
7194 setting. The entire page, which is now filtered, is then sent by
7195 <application>Privoxy</application> back to your browser.
7198 If neither <link linkend="FILTER"><quote>+filter</quote></link>
7200 linkend="DEANIMATE-GIFS"><quote>+deanimate-gifs</quote></link>
7201 matches, then <application>Privoxy</application> passes the raw data through
7202 to the client browser as it becomes available.
7207 As the browser receives the (probably filtered) page content, it
7208 reads and then requests any URLs that may be embedded within the page
7209 source, e.g. ad images, stylesheets, JavaScript, other HTML documents (e.g.
7210 frames), sounds, etc. For each of these objects, the browser issues a new
7211 request. And each such request is in turn processed as above. Note that a
7212 complex web page may have many such embedded URLs.
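
The decisions in the sequence above can be summarized as a toy model. This is a deliberately simplified sketch and not <application>Privoxy</application>'s actual code: the function names and the dictionary of actions are invented for illustration, and the trust file, <quote>+fast-redirects</quote> and header processing steps are left out.

```python
def handle_request(url, actions):
    """Return a short description of what is sent back for a request.

    `actions` is a hypothetical dict of the actions that apply to the URL,
    e.g. {"block": True, "handle-as-image": True, "set-image-blocker": "pattern"}.
    """
    # Requests for the internal CGI pages are answered by the proxy itself.
    if url.startswith(("http://p.p/", "http://config.privoxy.org/")):
        return "internal CGI page"
    # Blocked URLs never reach the remote web server.
    if actions.get("block"):
        if actions.get("handle-as-image"):
            # The kind of image depends on the set-image-blocker setting.
            return "image: " + actions.get("set-image-blocker", "unspecified")
        return "HTML BLOCKED page"
    # Everything else is relayed (after header processing, omitted here).
    return "relayed to remote server"


def response_mode(actions, filterable):
    """Decide between buffering the response body and streaming it."""
    wants_filtering = bool(actions.get("filter") or actions.get("deanimate-gifs"))
    if wants_filtering and filterable:
        return "buffered, filtered, then sent"
    return "passed through as it arrives"


if __name__ == "__main__":
    ad = {"block": True, "handle-as-image": True, "set-image-blocker": "pattern"}
    print(handle_request("http://p.p/", {}))
    print(handle_request("http://ad.doubleclick.net/banner.gif", ad))
    print(handle_request("http://example.com/", {}))
    print(response_mode({"filter": ["html-annoyances"]}, filterable=True))
```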
7222 <!-- ~~~~~ New section ~~~~~ -->
7223 <sect2 id="actionsanat">
7224 <title>Anatomy of an Action</title>
7227 The way <application>Privoxy</application> applies
7228 <link linkend="ACTIONS">actions</link> and <link linkend="FILTER">filters</link>
7229 to any given URL can be complex, and it is not always easy to
7230 understand what is happening. Sometimes we need to be able to
7231 <emphasis>see</emphasis> just what <application>Privoxy</application> is
7232 doing, especially if something <application>Privoxy</application> is doing
7233 is inadvertently causing us a problem. It can be a little daunting to look at
7234 the actions and filters files themselves, since they tend to be filled with
7235 <link linkend="regex">regular expressions</link> whose consequences are not
7240 One quick test to see whether <application>Privoxy</application> is causing a
7241 problem is to disable it temporarily. This should be the first troubleshooting
7242 step. See <link linkend="bookmarklets">the Bookmarklets</link> section for a quick
7243 and easy way to do this (be sure to flush caches afterward!). Looking at the
7244 logs is a good idea too.
7248 <application>Privoxy</application> also provides the
7249 <ulink url="http://config.privoxy.org/show-url-info">http://config.privoxy.org/show-url-info</ulink>
7250 page that can show us very specifically how <application>actions</application>
7251 are being applied to any given URL. This is a big help for troubleshooting.
7255 First, enter one URL (or partial URL) at the prompt, and then
7256 <application>Privoxy</application> will tell us
7257 how the current configuration will handle it. This will not
7258 help with filtering effects (i.e. the <link
7259 linkend="FILTER"><quote>+filter</quote></link> action) from
7260 one of the filter files since these are handled very
7261 differently and are not so easy to trap! It also will not tell you about any other
7262 URLs that may be embedded within the URL you are testing. For instance, images
7263 such as ads are expressed as URLs within the raw page source of HTML pages. So
7264 you will only get info for the actual URL that is pasted into the prompt area
7265 -- not any sub-URLs. If you want to know about embedded URLs like ads, you
7266 will have to dig those out of the HTML source. Use your browser's <quote>View
7267 Page Source</quote> option for this. Or right click on the ad, and grab the
7272 Let's try an example, <ulink url="http://google.com">google.com</ulink>,
7273 and look at it one section at a time:
7278 Matches for http://google.com:
7280 In file: default.action <guibutton>[ View ]</guibutton> <guibutton>[ Edit ]</guibutton>
7284 -crunch-outgoing-cookies
7285 -crunch-incoming-cookies
7286 +deanimate-gifs{last}
7287 -downgrade-http-version
7291 -filter{shockwave-flash}
7292 -filter{crude-parental}
7293 +filter{html-annoyances}
7294 +filter{js-annoyances}
7295 +filter{content-cookies}
7297 +filter{refresh-tags}
7299 +filter{banners-by-size}
7300 +hide-forwarded-for-headers
7301 +hide-from-header{block}
7302 +hide-referer{forge}
7307 +prevent-compression
7310 +session-cookies-only
7311 +set-image-blocker{pattern} }
7314 { -session-cookies-only }
7320 In file: user.action <guibutton>[ View ]</guibutton> <guibutton>[ Edit ]</guibutton>
7321 (no matches in this file)
7326 This tells us how we have defined our
7327 <link linkend="ACTIONS"><quote>actions</quote></link>, and
7328 which ones match for our example, <quote>google.com</quote>. The first listing
7329 is any matches for the <filename>standard.action</filename> file. No hits at
7330 all here on <quote>standard</quote>. Then next is <quote>default</quote>, or
7331 our <filename>default.action</filename> file. The large, multi-line listing
7332 is how the actions are set to match for all URLs, i.e. our default settings.
7333 If you look at your <quote>actions</quote> file, this would be the section
7334 just below the <quote>aliases</quote> section near the top. This will apply to
7335 all URLs as signified by the single forward slash at the end of the listing
7336 -- <quote>/</quote>.
7340 But we can define additional actions that would be exceptions to these general
7341 rules, and then list specific URLs (or patterns) that these exceptions would
7342 apply to. Last match wins. Just below this then are two explicit matches for
7343 <quote>.google.com</quote>. The first is negating our previous cookie setting,
7345 linkend="SESSION-COOKIES-ONLY"><quote>+session-cookies-only</quote></link>
7346 (i.e. not persistent). So we will allow persistent cookies for google. The
7347 second turns <emphasis>off</emphasis> any
7349 linkend="FAST-REDIRECTS"><quote>+fast-redirects</quote></link>
7350 action, allowing this to take place unmolested. Note that there is a leading
7351 dot here -- <quote>.google.com</quote>. This will also match any hosts and
7352 sub-domains in the google.com domain, such as
7353 <quote>www.google.com</quote>. So, apparently, we have these two actions
7354 defined somewhere in the lower part of our <filename>default.action</filename>
7355 file, and <quote>google.com</quote> is referenced somewhere in these latter
7360 Then, for our <filename>user.action</filename> file, we again have no hits.
7364 And finally we pull it all together in the bottom section and summarize how
7365 <application>Privoxy</application> is applying all its <quote>actions</quote>
7366 to <quote>google.com</quote>:
7377 -crunch-outgoing-cookies
7378 -crunch-incoming-cookies
7379 +deanimate-gifs{last}
7380 -downgrade-http-version
7384 -filter{shockwave-flash}
7385 -filter{crude-parental}
7386 +filter{html-annoyances}
7387 +filter{js-annoyances}
7388 +filter{content-cookies}
7390 +filter{refresh-tags}
7392 +filter{banners-by-size}
7393 +hide-forwarded-for-headers
7394 +hide-from-header{block}
7395 +hide-referer{forge}
7400 +prevent-compression
7403 -session-cookies-only
7404 +set-image-blocker{pattern}
7409 Notice that the only differences from the previous listing are
7410 <quote>fast-redirects</quote> and <quote>session-cookies-only</quote>,
7411 which are turned off specifically for this site in our configuration.
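
The <quote>last match wins</quote> behaviour can be illustrated with a small model. This is a sketch under simplifying assumptions, not how <application>Privoxy</application> is implemented: host matching is reduced to the leading-dot rule described above, the catch-all pattern <quote>/</quote> is treated as matching everything, and the section contents are shortened, illustrative values rather than a copy of <filename>default.action</filename>.

```python
def host_matches(pattern, host):
    """Simplified domain match: a leading dot also matches sub-domains."""
    if pattern == "/":
        return True  # the catch-all pattern applies to every URL
    if pattern.startswith("."):
        bare = pattern[1:]
        return host == bare or host.endswith(pattern)
    return host == pattern


def effective_actions(sections, host):
    """Merge all matching sections in file order; later sections override."""
    result = {}
    for actions, patterns in sections:
        if any(host_matches(p, host) for p in patterns):
            result.update(actions)
    return result


# Illustrative sections: general defaults first, exceptions afterwards.
SECTIONS = [
    ({"session-cookies-only": True, "deanimate-gifs": "last"}, ["/"]),
    ({"session-cookies-only": False}, [".google.com"]),
    ({"fast-redirects": False}, [".google.com"]),
]

if __name__ == "__main__":
    print(effective_actions(SECTIONS, "www.google.com"))
    print(effective_actions(SECTIONS, "example.org"))
```

Because the exceptions come after the defaults, their settings replace the general ones for matching hosts, while all other hosts keep the defaults.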
7415 Now another example, <quote>ad.doubleclick.net</quote>:
7421 { +block +handle-as-image }
7424 { +block +handle-as-image }
7427 { +block +handle-as-image }
7433 We'll just show the interesting part here, the explicit matches. It is
7434 matched three different times, each time as <quote>+block +handle-as-image</quote>,
7435 which is the expanded form of one of our aliases that had been defined as:
7436 <quote>+imageblock</quote>. (<link
7437 linkend="ALIASES"><quote>Aliases</quote></link> are defined in
7438 the first section of the actions file and typically used to combine more
7443 Any one of these would have done the trick and blocked this as an unwanted
7444 image. This is unnecessarily redundant since the last case effectively
7445 would also cover the first. No point in taking chances with these guys
7446 though ;-) Note that if you want an ad or obnoxious
7447 URL to be invisible, it should be defined as <quote>ad.doubleclick.net</quote>
7448 is done here -- as both a <link
7449 linkend="BLOCK"><quote>+block</quote></link>
7450 <emphasis>and</emphasis> an
7452 linkend="HANDLE-AS-IMAGE"><quote>+handle-as-image</quote></link>.
7453 The custom alias <quote>+imageblock</quote> just simplifies the process and makes
7458 One last example. Let's try <quote>http://www.rhapsodyk.net/adsl/HOWTO/</quote>.
7459 This one is giving us problems. We are getting a blank page. Hmmm ...
7465 Matches for http://www.rhapsodyk.net/adsl/HOWTO/:
7467 In file: default.action <guibutton>[ View ]</guibutton> <guibutton>[ Edit ]</guibutton>
7471 -crunch-incoming-cookies
7472 -crunch-outgoing-cookies
7474 -downgrade-http-version
7476 +filter{html-annoyances}
7477 +filter{js-annoyances}
7478 +filter{kill-popups}
7481 +filter{banners-by-size}
7484 +hide-forwarded-for-headers
7485 +hide-from-header{block}
7486 +hide-referer{forge}
7490 +prevent-compression
7493 +session-cookies-only
7494 +set-image-blocker{blank} }
7497 { +block +handle-as-image }
7503 Oops, the <quote>/adsl/</quote> is matching <quote>/ads</quote>! But
7504 we did not want this at all! Now we see why we get the blank page. We could
7505 now add a new action below this that explicitly does <emphasis>not</emphasis>
7506 block (<quote>{-block}</quote>) paths with <quote>adsl</quote>. There are
7507 various ways to handle such exceptions. Example:
7519 Now the page displays ;-) Be sure to flush your browser's caches when
7520 making such changes. Or, try using <literal>Shift+Reload</literal>.
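
Why <quote>/ads</quote> catches <quote>/adsl/</quote>, and why a later exception fixes it, can be seen in a few lines. This is a sketch only: it assumes that path patterns behave like regular expressions anchored at the start of the path, and the exception pattern <quote>/adsl</quote> is a hypothetical choice made for this illustration.

```python
import re


def path_matches(pattern, path):
    """Assumption: a path pattern is a regular expression anchored at the start."""
    return re.match(pattern, path) is not None


def is_blocked(rules, path):
    """Apply (pattern, block) rules in order; the last matching rule wins."""
    blocked = False
    for pattern, block in rules:
        if path_matches(pattern, path):
            blocked = block
    return blocked


# Before: only the general blocking rule exists.
RULES_BEFORE = [("/ads", True)]
# After: a hypothetical exception placed below the blocking rule.
RULES_AFTER = [("/ads", True), ("/adsl", False)]

if __name__ == "__main__":
    print(is_blocked(RULES_BEFORE, "/adsl/HOWTO/"))
    print(is_blocked(RULES_AFTER, "/adsl/HOWTO/"))
    print(is_blocked(RULES_AFTER, "/ads/banner.gif"))
```

The exception only affects paths that start with <quote>/adsl</quote>; ordinary <quote>/ads/</quote> paths are still blocked.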
7524 But now what about a situation where we get no explicit matches like
7531 { +block +handle-as-image }
7537 That actually was very telling and pointed us quickly to where the problem
7538 was. If you don't get this kind of match, then it means one of the default
7539 rules in the first section is causing the problem. This would require some
7540 guesswork, and maybe a little trial and error to isolate the offending rule.
7541 One likely cause would be one of the <quote>{+filter}</quote> actions. These
7542 tend to be harder to troubleshoot. Try adding the URL for the site to one of the
7543 aliases that turn off <quote>+filter</quote>:
7551 .worldpay.com # for quietpc.com
7559 <quote>{shop}</quote> is an <quote>alias</quote> that expands to
7560 <quote>{ -filter -session-cookies-only }</quote>.
7561 Or you could do your own exception to negate filtering:
7574 This would turn off all filtering for that site. This would probably be most
7575 appropriately put in <filename>user.action</filename>, for local site
7580 Images that are inexplicably being blocked may well be hitting the
7581 <quote>+filter{banners-by-size}</quote> rule, which assumes
7582 that images of certain sizes are ad banners (works well most of the time
7583 since these tend to be standardized).
7587 <quote>{fragile}</quote> is an alias that disables most actions. This can be
7588 used as a last resort for problem sites. Remember to flush caches! If this
7589 still does not work, you will have to go through the remaining actions one by
7590 one to find which one(s) is causing the problem.
7599 This program is free software; you can redistribute it
7600 and/or modify it under the terms of the GNU General
7601 Public License as published by the Free Software
7602 Foundation; either version 2 of the License, or (at
7603 your option) any later version.
7605 This program is distributed in the hope that it will
7606 be useful, but WITHOUT ANY WARRANTY; without even the
7607 implied warranty of MERCHANTABILITY or FITNESS FOR A
7608 PARTICULAR PURPOSE. See the GNU General Public
7609 License for more details.
7611 The GNU General Public License should be included with
7612 this file. If not, you can view it at
7613 http://www.gnu.org/copyleft/gpl.html
7614 or write to the Free Software Foundation, Inc., 59
7615 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
7617 $Log: user-manual.sgml,v $
7618 Revision 2.13 2006/08/22 11:04:59 hal9
7619 Silence warnings and errors. This should build now. New filters were only
7620 stubbed in. More to be done.
7622 Revision 2.12 2006/08/14 08:40:39 fabiankeil
7623 Documented new actions that were part of
7624 the "minor Privoxy improvements".
7626 Revision 2.11 2006/07/18 14:48:51 david__schmidt
7627 Reorganizing the repository: swapping out what was HEAD (the old 3.1 branch)
7628 with what was really the latest development (the v_3_0_branch branch)
7630 Revision 1.123.2.43 2005/05/23 09:59:10 hal9
7633 Revision 1.123.2.42 2004/12/04 14:39:57 hal9
7634 Fix two minor typos per bug SF report.
7636 Revision 1.123.2.41 2004/03/23 12:58:42 oes
7639 Revision 1.123.2.40 2004/02/27 12:48:49 hal9
7640 Add comment re: redirecting to local file system for set-image-blocker may
7641 is dependent on browser.
7643 Revision 1.123.2.39 2004/01/30 22:31:40 oes
7644 Added a hint re bookmarklets to Quickstart section
7646 Revision 1.123.2.38 2004/01/30 16:47:51 oes
7647 Some minor clarifications
7649 Revision 1.123.2.37 2004/01/29 22:36:11 hal9
7650 Updates for no longer filtering text/plain, and demoronizer default settings,
7651 and copyright notice dates.
7653 Revision 1.123.2.36 2003/12/10 02:26:26 hal9
7654 Changed the demoronizer filter description.
7656 Revision 1.123.2.35 2003/11/06 13:36:37 oes
7657 Updated link to nightly CVS tarball
7659 Revision 1.123.2.34 2003/06/26 23:50:16 hal9
7660 Add a small bit on filtering and problems re: source code being corrupted.
7662 Revision 1.123.2.33 2003/05/08 18:17:33 roro
7663 Use apt-get instead of dpkg to install Debian package, which is more
7664 solid, uses the correct and most recent Debian version automatically.
7666 Revision 1.123.2.32 2003/04/11 03:13:57 hal9
7667 Add small note about only one filterfile (as opposed to multiple actions
7670 Revision 1.123.2.31 2003/03/26 02:03:43 oes
7671 Updated hard-coded copyright dates
7673 Revision 1.123.2.30 2003/03/24 12:58:56 hal9
7674 Add new section on Predefined Filters.
7676 Revision 1.123.2.29 2003/03/20 02:45:29 hal9
7677 More problems with \-\-chroot causing markup problems :(
7679 Revision 1.123.2.28 2003/03/19 00:35:24 hal9
7680 Manual edit of revision log because 'chroot' (even inside a comment) was
7681 causing Docbook to hang here (due to double hyphen and the processor thinking
7684 Revision 1.123.2.27 2003/03/18 19:37:14 oes
7685 s/Advanced|Radical/Adventuresome/g to avoid complaints re fun filter
7687 Revision 1.123.2.26 2003/03/17 16:50:53 oes
7688 Added documentation for new chroot option
7690 Revision 1.123.2.25 2003/03/15 18:36:55 oes
7691 Adapted to the new filters
7693 Revision 1.123.2.24 2002/11/17 06:41:06 hal9
7694 Move default profiles table from FAQ to U-M, and other minor related changes.
7697 Revision 1.123.2.23 2002/10/21 02:32:01 hal9
7698 Updates to the user.action examples section. A few new ones.
7700 Revision 1.123.2.22 2002/10/12 00:51:53 hal9
7701 Add demoronizer to filter section.
7703 Revision 1.123.2.21 2002/10/10 04:09:35 hal9
7704 s/Advanced/Radical/ and added very brief note.
7706 Revision 1.123.2.20 2002/10/10 03:49:21 hal9
7707 Add notes to session-cookies-only and Quickstart about pre-existing
7708 cookies. Also, note content-cookies work differently.
7710 Revision 1.123.2.19 2002/09/26 01:25:36 hal9
7711 More explanation on Privoxy patterns, more on content-cookies and SSL.
7713 Revision 1.123.2.18 2002/08/22 23:47:58 hal9
7714 Add 'Documentation' to Privoxy Menu shot in Configuration section to match
7717 Revision 1.123.2.17 2002/08/18 01:13:05 hal9
7718 Spell checked (only one typo this time!).
7720 Revision 1.123.2.16 2002/08/09 19:20:54 david__schmidt
7721 Update to Mac OSX startup script name
7723 Revision 1.123.2.15 2002/08/07 17:32:11 oes
7724 Converted some internal links from ulink to link for PDF creation; no content changed
7726 Revision 1.123.2.14 2002/08/06 09:16:13 oes
7727 Nits re: actions file download
7729 Revision 1.123.2.13 2002/08/02 18:23:19 g_sauthoff
7730 Just 2 small corrections to the Gentoo sections
7732 Revision 1.123.2.12 2002/08/02 18:17:21 g_sauthoff
7733 Added 2 Gentoo sections
7735 Revision 1.123.2.11 2002/07/26 15:20:31 oes
7736 - Added version info to title
7737 - Added info on new filters
7738 - Revised parts of the filter file tutorial
7739 - Added info on where to get updated actions files
7741 Revision 1.123.2.10 2002/07/25 21:42:29 hal9
7742 Add brief notes on not proxying non-HTTP protocols.
7744 Revision 1.123.2.9 2002/07/11 03:40:28 david__schmidt
7746 Updated Mac OSX sections due to installation location change
7748 Revision 1.123.2.8 2002/06/09 16:36:32 hal9
7749 Clarifications on filtering and MIME. Hardcode 'latest release' in index.html.
7751 Revision 1.123.2.7 2002/06/09 00:29:34 hal9
7752 Touch ups on filtering, in actions section and Anatomy.
7754 Revision 1.123.2.6 2002/06/06 23:11:03 hal9
7755 Fix broken link. Linkchecked all docs.
7757 Revision 1.123.2.5 2002/05/29 02:01:02 hal9
7758 This is break out of the entire config section from u-m, so it can
7759 eventually be used to generate the comments, etc in the main config file
7760 so that these are in sync with each other.
7762 Revision 1.123.2.4 2002/05/27 03:28:45 hal9
7763 Ooops missed something from David.
7765 Revision 1.123.2.3 2002/05/27 03:23:17 hal9
7766 Fix FIXMEs for OS2 and OSX startup. Fix Redhat typos (should be Red Hat).
7767 That's a wrap, I think.
7769 Revision 1.123.2.2 2002/05/26 19:02:09 hal9
7770 Move Amiga stuff around to take of FIXME in start up section.
7772 Revision 1.123.2.1 2002/05/26 17:04:25 hal9
7773 -Spellcheck, very minor edits, and sync across branches
7775 Revision 1.123 2002/05/24 23:19:23 hal9
7776 Include new image (Proxy setup). More fun with guibutton.
7777 Minor corrections/clarifications here and there.
7779 Revision 1.122 2002/05/24 13:24:08 oes
7780 Added Bookmarklet for one-click pre-filled access to show-url-info
7782 Revision 1.121 2002/05/23 23:20:17 oes
7783 - Changed more (all?) references to actions to the
7784 <literal><link> style.
7785 - Small fixes in the actions chapter
7786 - Small clarifications in the quickstart to ad blocking
7787 - Removed <emphasis> from <title>s since the new doc CSS
7788 renders them red (bad in TOC).
7790 Revision 1.120 2002/05/23 19:16:43 roro
7791 Correct Debian specials (installation and startup).
7793 Revision 1.119 2002/05/22 17:17:05 oes
7796 Revision 1.118 2002/05/21 04:54:55 hal9
7797 -New Section: Quickstart to Ad Blocking
7798 -Reformat Actions Anatomy to match new CGI layout
7800 Revision 1.117 2002/05/17 13:56:16 oes
7801 - Reworked & extended Templates chapter
7802 - Small changes to Regex appendix
7803 - #included authors.sgml into (C) and hist chapter
7805 Revision 1.116 2002/05/17 03:23:46 hal9
7806 Fixing merge conflict in Quickstart section.
7808 Revision 1.115 2002/05/16 16:25:00 oes
7809 Extended the Filter File chapter & minor fixes
7811 Revision 1.114 2002/05/16 09:42:50 oes
7812 More ulink->link, added some hints to Quickstart section
7814 Revision 1.113 2002/05/15 21:07:25 oes
7815 Extended and further commented the example actions files
7817 Revision 1.112 2002/05/15 03:57:14 hal9
7818 Spell check. A few minor edits here and there for better syntax and
7821 Revision 1.111 2002/05/14 23:01:36 oes
7824 Revision 1.110 2002/05/14 19:10:45 oes
7825 Restored alphabetical order of actions
7827 Revision 1.109 2002/05/14 17:23:11 oes
7828 Renamed the prevent-*-cookies actions, extended aliases section and moved it before the example AFs
7830 Revision 1.108 2002/05/14 15:29:12 oes
7831 Completed proofreading the actions chapter
7833 Revision 1.107 2002/05/12 03:20:41 hal9
7834 Small clarifications for 127.0.0.1 vs localhost for listen-address since this
7835 apparently an important distinction for some OS's.
7837 Revision 1.106 2002/05/10 01:48:20 hal9
7838 This is mostly proposed copyright/licensing additions and changes. Docs
7839 are still GPL, but licensing and copyright are more visible. Also, copyright
7840 changed in doc header comments (eliminate references to JB except FAQ).
7842 Revision 1.105 2002/05/05 20:26:02 hal9
7843 Sorting out license vs copyright in these docs.
7845 Revision 1.104 2002/05/04 08:44:45 swa
7848 Revision 1.103 2002/05/04 00:40:53 hal9
7849 -Remove the TOC first page kludge. It's fixed proper now in ldp.dsl.in.
7850 -Some minor additions to Quickstart.
7852 Revision 1.102 2002/05/03 17:46:00 oes
7853 Further proofread & reactivated short build instructions
7855 Revision 1.101 2002/05/03 03:58:30 hal9
7856 Move the user-manual config directive to top of section. Add note about
7857 Privoxy needing read permissions for configs, and write for logs.
7859 Revision 1.100 2002/04/29 03:05:55 hal9
7860 Add clarification on differences of new actions files.
7862 Revision 1.99 2002/04/28 16:59:05 swa
7863 more structure in starting section
7865 Revision 1.98 2002/04/28 05:43:59 hal9
7866 This is the break up of configuration.html into multiple files. This
7867 will probably break links elsewhere :(
7869 Revision 1.97 2002/04/27 21:04:42 hal9
7870 -Rewrite of Actions File example.
7871 -Add section for user-manual directive in config.
7873 Revision 1.96 2002/04/27 05:32:00 hal9
7874 -Add short section to Filter Files to tie in with +filter action.
7875 -Start rewrite of examples in Actions Examples (not finished).
7877 Revision 1.95 2002/04/26 17:23:29 swa
7878 bookmarks cleaned, changed structure of user manual, screen and programlisting cleanups, and numerous other changes that I forgot
7880 Revision 1.94 2002/04/26 05:24:36 hal9
7881 -Add most of Andreas suggestions to Chain of Events section.
7882 -A few other minor corrections and touch up.
7884 Revision 1.92 2002/04/25 18:55:13 hal9
7885 More catchups on new actions files, and new actions names.
7886 Other assorted cleanups, and minor modifications.
7888 Revision 1.91 2002/04/24 02:39:31 hal9
7889 Add 'Chain of Events' section.
7891 Revision 1.90 2002/04/23 21:41:25 hal9
7892 Linuxconf is deprecated on RH, substitute chkconfig.
7894 Revision 1.89 2002/04/23 21:05:28 oes
7895 Added hint for startup on Red Hat
7897 Revision 1.88 2002/04/23 05:37:54 hal9
7898 Add AmigaOS install stuff.
7900 Revision 1.87 2002/04/23 02:53:15 david__schmidt
7901 Updated OSX installation section
7902 Added a few English tweaks here an there
7904 Revision 1.86 2002/04/21 01:46:32 hal9
7905 Re-write actions section.
7907 Revision 1.85 2002/04/18 21:23:23 hal9
7908 Fix ugly typo (mine).
7910 Revision 1.84 2002/04/18 21:17:13 hal9
7911 Spell Redhat correctly (ie Red Hat). A few minor grammar corrections.
7913 Revision 1.83 2002/04/18 18:21:12 oes
7914 Added RPM install detail
7916 Revision 1.82 2002/04/18 12:04:50 oes
7919 Revision 1.81 2002/04/18 11:50:24 oes
7920 Extended Install section - needs fixing by packagers
7922 Revision 1.80 2002/04/18 10:45:19 oes
7923 Moved text to buildsource.sgml, renamed some filters, details
7925 Revision 1.79 2002/04/18 03:18:06 hal9
7926 Spellcheck, and minor touchups.
7928 Revision 1.78 2002/04/17 18:04:16 oes
7931 Revision 1.77 2002/04/17 13:51:23 oes
7932 Proofreading, part one
7934 Revision 1.76 2002/04/16 04:25:51 hal9
7935 -Added 'Note to Upgraders' and re-ordered the 'Quickstart' section.
7936 -Note about proxy may need requests to re-read config files.
7938 Revision 1.75 2002/04/12 02:08:48 david__schmidt
7939 Remove OS/2 building info... it is already in the developer-manual
7941 Revision 1.74 2002/04/11 00:54:38 hal9
7942 Add small section on submitting actions.
7944 Revision 1.73 2002/04/10 18:45:15 swa
7947 Revision 1.72 2002/04/10 04:06:19 hal9
7948 Added actions feedback to Bookmarklets section
7950 Revision 1.71 2002/04/08 22:59:26 hal9
7951 Version update. Spell chkconfig correctly :)
7953 Revision 1.70 2002/04/08 20:53:56 swa
7956 Revision 1.69 2002/04/06 05:07:29 hal9
7957 -Add privoxy-man-page.sgml, for man page.
7958 -Add authors.sgml for AUTHORS (and p-authors.sgml)
7959 -Reworked various aspects of various docs.
7960 -Added additional comments to sub-docs.
7962 Revision 1.68 2002/04/04 18:46:47 swa
7963 consistent look. reuse of copyright, history et al.
7965 Revision 1.67 2002/04/04 17:27:57 swa
7966 more single files to be included at multiple points. makes maintaining easier
7968 Revision 1.66 2002/04/04 06:48:37 hal9
7969 Structural changes to allow for conditional inclusion/exclusion of content
7970 based on entity toggles, e.g. 'entity % p-not-stable "INCLUDE"'. And
7971 definition of internal entities, e.g. 'entity p-version "2.9.13"' that will
7972 eventually be set by Makefile.
7973 More boilerplate text for use across multiple docs.
7975 Revision 1.65 2002/04/03 19:52:07 swa
7976 enhance squid section due to user suggestion
7978 Revision 1.64 2002/04/03 03:53:43 hal9
7979 A few minor bug fixes, and touch ups. Ready for review.
7981 Revision 1.63 2002/04/01 16:24:49 hal9
7982 Define entities to include boilerplate text. See doc/source/*.
7984 Revision 1.62 2002/03/30 04:15:53 hal9
7985 - Fix privoxy.org/config links.
7986 - Paste in Bookmarklets from Toggle page.
7987 - Move Quickstart nearer top, and minor rework.
7989 Revision 1.61 2002/03/29 01:31:08 hal9
7992 Revision 1.60 2002/03/27 01:57:34 hal9
7993 Added more to Anatomy section.
7995 Revision 1.59 2002/03/27 00:54:33 hal9
7996 Touch up intro for new name.
7998 Revision 1.58 2002/03/26 22:29:55 swa
7999 we have a new homepage!
8001 Revision 1.57 2002/03/24 20:33:30 hal9
8002 A few minor catch ups with name change.
8004 Revision 1.56 2002/03/24 16:17:06 swa
8005 configure needs to be generated.
8007 Revision 1.55 2002/03/24 16:08:08 swa
8008 we are too lazy to make a block-built
8009 privoxy logo. hence removed the option.
8011 Revision 1.54 2002/03/24 15:46:20 swa
8012 name change related issue.
8014 Revision 1.53 2002/03/24 11:51:00 swa
8015 name change. changed filenames.
8017 Revision 1.52 2002/03/24 11:01:06 swa
8020 Revision 1.51 2002/03/23 15:13:11 swa
8021 renamed every reference to the old name with foobar.
8022 fixed "application foobar application" tag, fixed
8023 "the foobar" with "foobar". left junkbustser in cvs
8024 comments and remarks to history untouched.
8026 Revision 1.50 2002/03/23 05:06:21 hal9
8029 Revision 1.49 2002/03/21 17:01:05 hal9
8030 New section in Appendix.
8032 Revision 1.48 2002/03/12 06:33:01 hal9
8033 Catching up to Andreas and re_filterfile changes.
8035 Revision 1.47 2002/03/11 13:13:27 swa
8036 correct feedback channels
8038 Revision 1.46 2002/03/10 00:51:08 hal9
8039 Added section on JB internal pages in Appendix.
8041 Revision 1.45 2002/03/09 17:43:53 swa
8044 Revision 1.44 2002/03/09 17:08:48 hal9
8045 New section on Jon's actions file editor, and move some stuff around.
8047 Revision 1.43 2002/03/08 00:47:32 hal9
8048 Added imageblock{pattern}.
8050 Revision 1.42 2002/03/07 18:16:55 swa
8053 Revision 1.41 2002/03/07 16:46:43 hal9
8054 Fix a few markup problems for jade.
8056 Revision 1.40 2002/03/07 16:28:39 swa
8057 provide correct feedback channels
8059 Revision 1.39 2002/03/06 16:19:28 hal9
8060 Note on perceived filtering slowdown per FR.
8062 Revision 1.38 2002/03/05 23:55:14 hal9
8063 Stupid I did it again. Double hyphen in comment breaks jade.
8065 Revision 1.37 2002/03/05 23:53:49 hal9
8066 jade barfs on '- -' embedded in comments. - -user option broke it.
8068 Revision 1.36 2002/03/05 22:53:28 hal9
8069 Add new - - user option.
8071 Revision 1.35 2002/03/05 00:17:27 hal9
8072 Added section on command line options.
8074 Revision 1.34 2002/03/04 19:32:07 oes
8075 Changed default port to 8118
8077 Revision 1.33 2002/03/03 19:46:13 hal9
8078 Emphasis on where/how to report bugs, etc
8080 Revision 1.32 2002/03/03 09:26:06 joergs
8081 AmigaOS changes, config is now loaded from PROGDIR: instead of
8082 AmiTCP:db/junkbuster/ if no configuration file is specified on the
8085 Revision 1.31 2002/03/02 22:45:52 david__schmidt
8088 Revision 1.30 2002/03/02 22:00:14 hal9
8089 Updated 'New Features' list. Ran through spell-checker.
8091 Revision 1.29 2002/03/02 20:34:07 david__schmidt
8092 Update OS/2 build section
8094 Revision 1.28 2002/02/24 14:34:24 jongfoster
8095 Formatting changes. Now changing the doctype to DocBook XML 4.1
8096 will work - no other changes are needed.
8098 Revision 1.27 2002/01/11 14:14:32 hal9
8099 Added a very short section on Templates
8101 Revision 1.26 2002/01/09 20:02:50 hal9
8102 Fix bug re: auto-detect config file changes.
8104 Revision 1.25 2002/01/09 18:20:30 hal9
8105 Touch ups for *.action files.
8107 Revision 1.24 2001/12/02 01:13:42 hal9
8110 Revision 1.23 2001/12/02 00:20:41 hal9
8111 Updates for recent changes.
8113 Revision 1.22 2001/11/05 23:57:51 hal9
8114 Minor update for startup now daemon mode.
8116 Revision 1.21 2001/10/31 21:11:03 hal9
8117 Correct 2 minor errors
8119 Revision 1.18 2001/10/24 18:45:26 hal9
8120 *** empty log message ***
8122 Revision 1.17 2001/10/24 17:10:55 hal9
8123 Catching up with Jon's recent work, and a few other things.
8125 Revision 1.16 2001/10/21 17:19:21 swa
8126 wrong url in documentation
8128 Revision 1.15 2001/10/14 23:46:24 hal9
8129 Various minor changes. Fleshed out SEE ALSO section.
8131 Revision 1.13 2001/10/10 17:28:33 hal9
8134 Revision 1.12 2001/09/28 02:57:04 hal9
8137 Revision 1.11 2001/09/28 02:25:20 hal9
8140 Revision 1.9 2001/09/27 23:50:29 hal9
8141 A few changes. A short section on regular expression in appendix.
8143 Revision 1.8 2001/09/25 00:34:59 hal9
8144 Some additions, and re-arranging.
8146 Revision 1.7 2001/09/24 14:31:36 hal9
8149 Revision 1.6 2001/09/24 14:10:32 hal9
8150 Including David's OS/2 installation instructions.
8152 Revision 1.2 2001/09/13 15:27:40 swa
8155 Revision 1.1 2001/09/12 15:36:41 swa
8156 source files for junkbuster documentation
8158 Revision 1.3 2001/09/10 17:43:59 swa
8159 first proposal of a structure.
8161 Revision 1.2 2001/06/13 14:28:31 swa
8162 docs should have an author.
8164 Revision 1.1 2001/06/13 14:20:37 swa
8165 first import of project's documentation for the webserver.