1 <!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook V3.1//EN" [
2 <!entity % dummy "IGNORE">
3 <!entity supported SYSTEM "supported.sgml">
4 <!entity newfeatures SYSTEM "newfeatures.sgml">
5 <!entity p-intro SYSTEM "privoxy.sgml">
6 <!entity seealso SYSTEM "seealso.sgml">
7 <!entity buildsource SYSTEM "buildsource.sgml">
8 <!entity contacting SYSTEM "contacting.sgml">
9 <!entity history SYSTEM "history.sgml">
10 <!entity copyright SYSTEM "copyright.sgml">
11 <!entity license SYSTEM "license.sgml">
12 <!entity p-authors SYSTEM "p-authors.sgml">
13 <!entity config SYSTEM "p-config.sgml">
14 <!entity p-version "2.9.15">
15 <!entity p-status "beta">
16 <!entity % p-authors-formal "INCLUDE"> <!-- include additional text, etc -->
17 <!entity % p-not-stable "INCLUDE">
18 <!entity % p-stable "IGNORE">
19 <!entity % p-text "IGNORE"> <!-- define we are not a text only doc -->
20 <!entity % p-doc "INCLUDE"> <!-- and we are a formal doc -->
21 <!entity % p-readme "IGNORE">
22 <!entity % user-man "IGNORE">
23 <!entity % config-file "IGNORE">
24 <!entity % p-supp-userman "IGNORE"> <!-- Omit some from supported.sgml -->
<!entity my-copy "&copy;"> <!-- kludge for docbook2man -->
26 <!entity % draft "IGNORE"> <!-- WIP stuff -->
29 File : $Source: /cvsroot/ijbswa/current/doc/source/user-manual.sgml,v $
32 This file belongs into
33 ijbswa.sourceforge.net:/home/groups/i/ij/ijbswa/htdocs/
35 $Id: user-manual.sgml,v 1.123.2.9 2002/07/11 03:40:28 david__schmidt Exp $
37 Copyright (C) 2001, 2002 Privoxy Developers <developers@privoxy.org>
40 ========================================================================
41 NOTE: Please read developer-manual/documentation.html before touching
42 anything in this, or other Privoxy documentation.
43 ========================================================================
50 <title>Privoxy User Manual</title>
54 <!-- Completely the wrong markup, but very little is allowed -->
55 <!-- in this part of an article. FIXME -->
56 <link linkend="copyright">Copyright</link> &my-copy; 2001, 2002 by
57 <ulink url="http://www.privoxy.org">Privoxy Developers</ulink>
61 <pubdate>$Id: user-manual.sgml,v 1.123.2.9 2002/07/11 03:40:28 david__schmidt Exp $</pubdate>
65 Note: the following should generate a separate page, and a live link to it,
66 all nicely done. But it doesn't for some mysterious reason. Please leave
67 commented unless it can be fixed proper. For the time being, the
68 copyright/license declarations will be in their own sgml.
75 <holder>Privoxy Developers</holder>
78 <legalnotice id="legalnotice">
80 text goes here ........
92 This is here to keep vim syntax file from breaking :/
93 If I knew enough to fix it, I would.
94 PLEASE DO NOT REMOVE! HB: hal@foobox.net
100 The <citetitle>User Manual</citetitle> gives users information on how to
101 install, configure and use <ulink
102 url="http://www.privoxy.org/"><application>Privoxy</application></ulink>.
105 <!-- Include privoxy.sgml boilerplate: -->
107 <!-- end privoxy.sgml -->
110 You can find the latest version of the <citetitle>User Manual</citetitle> at <ulink
111 url="http://www.privoxy.org/user-manual/">http://www.privoxy.org/user-manual/</ulink>.
112 Please see the <ulink url="contact.html">Contact section</ulink> on how to
113 contact the developers.
117 <!-- Feel free to send a note to the developers at <email>ijbswa-developers@lists.sourceforge.net</email>. -->
123 <!-- ~~~~~ New section ~~~~~ -->
124 <sect1 label="1" id="introduction"><title>Introduction</title>
126 This documentation is included with the current &p-status; version of
127 <application>Privoxy</application>, v.&p-version;<![%p-not-stable;[,
128 and is mostly complete at this point. The most up to date reference for the
129 time being is still the comments in the source files and in the individual
130 configuration files. Development of version 3.0 is currently nearing
131 completion, and includes many significant changes and enhancements over
132 earlier versions. The target release date for
133 stable v3.0 is <quote>soon</quote> ;-)]]>.
136 <!-- include only in non-stable versions -->
139 Since this is a &p-status; version, not all new features are well tested. This
140 documentation may be slightly out of sync as a result (especially with
141 CVS sources). And there <emphasis>may be</emphasis> bugs, though hopefully
146 <!-- ~~~~~ New section ~~~~~ -->
147 <sect2 id="features"><title>Features</title>
149 In addition to <application>Internet Junkbuster's</application> traditional
150 features of ad and banner blocking and cookie management,
151 <application>Privoxy</application> provides new features<![%p-not-stable;[,
152 some of them currently under development]]>:
154 <!-- Include newfeatures.sgml boilerplate here: -->
156 <!-- end boilerplate -->
161 <!-- ~ End section ~ -->
164 <!-- ~~~~~ New section ~~~~~ -->
165 <sect1 id="installation"><title>Installation</title>
168 <application>Privoxy</application> is available both in convenient pre-compiled
169 packages for a wide range of operating systems, and as raw source code.
170 For most users, we recommend using the packages, which can be downloaded from our
171 <ulink url="http://sourceforge.net/projects/ijbswa/">Privoxy Project
176 Note: If you have a previous <application>Junkbuster</application> or
177 <application>Privoxy</application> installation on your system, you
178 will need to remove it. On some platforms, this may be done for you as part
179 of their installation procedure. (See below for your platform). In any case
<emphasis>be sure to back up your old configuration if it is valuable to
181 you.</emphasis> See the <link linkend="upgradersnote">note to
182 upgraders</link> section below.
185 <!-- ~~~~~ New section ~~~~~ -->
186 <sect2 id="installation-packages"><title>Binary Packages</title>
188 How to install the binary packages depends on your operating system:
191 <!-- ~~~~~ New section ~~~~~ -->
192 <sect3 id="installation-pack-rpm"><title>Red Hat, SuSE and Conectiva RPMs</title>
195 RPMs can be installed with <literal>rpm -Uvh privoxy-&p-version;-1.rpm</literal>,
196 and will use <filename>/etc/privoxy</filename> for the location
197 of configuration files.
201 Note that on Red Hat, <application>Privoxy</application> will
202 <emphasis>not</emphasis> be automatically started on system boot. You will
203 need to enable that using <command>chkconfig</command>,
204 <command>ntsysv</command>, or similar methods. Note that SuSE will
205 automatically start Privoxy in the boot process.
209 If you have problems with failed dependencies, try rebuilding the SRC RPM:
210 <literal>rpm --rebuild privoxy-&p-version;-1.src.rpm</literal>. This
211 will use your locally installed libraries and RPM version.
215 Also note that if you have a <application>Junkbuster</application> RPM installed
216 on your system, you need to remove it first, because the packages conflict.
217 Otherwise, RPM will try to remove <application>Junkbuster</application>
218 automatically, before installing <application>Privoxy</application>.
222 <!-- ~~~~~ New section ~~~~~ -->
223 <sect3 id="installation-deb"><title>Debian</title>
225 DEBs can be installed with <literal>dpkg -i
226 privoxy_&p-version;-1.deb</literal>, and will use
227 <filename>/etc/privoxy</filename> for the location of configuration
232 <!-- ~~~~~ New section ~~~~~ -->
233 <sect3 id="installation-pack-win"><title>Windows</title>
236 Just double-click the installer, which will guide you through
237 the installation process. You will find the configuration files
in the directory where you installed Privoxy. Privoxy does not
use the Windows registry.
243 <!-- ~~~~~ New section ~~~~~ -->
244 <sect3 id="installation-pack-bintgz"><title>Solaris, NetBSD, FreeBSD, HP-UX</title>
247 Create a new directory, <literal>cd</literal> to it, then unzip and
248 untar the archive. For the most part, you'll have to figure out where
249 things go. <!-- FIXME, more info needed? -->
253 <!-- ~~~~~ New section ~~~~~ -->
254 <sect3 id="installation-os2"><title>OS/2</title>
257 First, make sure that no previous installations of
258 <application>Junkbuster</application> and / or
259 <application>Privoxy</application> are left on your
260 system. Check that no <application>Junkbuster</application>
261 or <application>Privoxy</application> objects are in
267 Then, just double-click the WarpIN self-installing archive, which will
268 guide you through the installation process. A shadow of the
269 <application>Privoxy</application> executable will be placed in your
270 startup folder so it will start automatically whenever OS/2 starts.
274 The directory you choose to install <application>Privoxy</application>
275 into will contain all of the configuration files.
279 <!-- ~~~~~ New section ~~~~~ -->
<sect3 id="installation-mac"><title>Mac OS X</title>
Unzip the downloaded package by double-clicking on the file
in the Finder (or on the desktop, if you downloaded it there). The
284 Privoxy.pkg package should appear after unzipping. Then,
285 double-click on that Privoxy.pkg package installer icon and follow
286 the installation process.
287 <application>Privoxy</application> will be installed in the folder
288 <literal>/Library/Privoxy</literal>.
289 It will run automatically whenever you start up. To prevent it from
290 running automatically, remove or rename the folder
291 <literal>/Library/StartupItems/Privoxy</literal>.
294 To run Privoxy by hand, double-click on
295 <literal>RunPrivoxy.command</literal>.
296 To run Privoxy from Terminal, execute
297 <literal>/Library/Privoxy/RunPrivoxy.command</literal>.
301 <!-- ~~~~~ New section ~~~~~ -->
302 <sect3 id="installation-amiga"><title>AmigaOS</title>
304 Copy and then unpack the <filename>lha</filename> archive to a suitable location.
All necessary files will be installed into the <application>Privoxy</application>
306 directory, including all configuration and log files. To uninstall, just
307 remove this directory.
312 <!-- ~~~~~ New section ~~~~~ -->
313 <sect2 id="installation-source"><title>Building from Source</title>
316 The most convenient way to obtain the <application>Privoxy</application> sources
317 is to download the source tarball from our <ulink url="http://sf.net/projects/ijbswa/">project
322 If you like to live on the bleeding edge and are not afraid of using
323 possibly unstable development versions, you can check out the up-to-the-minute
324 version directly from <ulink url="http://sourceforge.net/cvs/?group_id=11118">the
325 CVS repository</ulink> or simply download <ulink
326 url="http://cvs.sourceforge.net/cvstarballs/ijbswa-cvsroot.tar.gz">the nightly CVS
330 <!-- include buildsource.sgml boilerplate: -->
332 <!-- end boilerplate -->
338 <!-- ~ End section ~ -->
340 <!-- ~~~~~ New section ~~~~~ -->
341 <sect1 id="upgradersnote">
342 <title>Note to Upgraders</title>
344 There are very significant changes from earlier
345 <application>Junkbuster</application> versions to the current
346 <application>Privoxy</application>. The number, names, syntax, and
347 purposes of configuration files have substantially changed.
348 <application>Junkbuster 2.0.x</application> configuration
files will not migrate; <application>Junkbuster 2.9.x</application>
350 and <application>Privoxy</application> configurations will need to be
351 ported. The functionalities of the old <filename>blockfile</filename>,
352 <filename>cookiefile</filename> and <filename>imagelist</filename>
353 are now combined into the <link linkend="actions-file"><quote>actions
354 files</quote></link>.
<filename>default.action</filename> is the main actions file. Local
exceptions are best put into <filename>user.action</filename>.
359 A <link linkend="filter-file"><quote>filter file</quote></link> (typically
360 <filename>default.filter</filename>) is new as of <application>Privoxy
361 2.9.x</application>, and provides some of the new sophistication (explained
362 below). <filename>config</filename> is much the same as before.
365 If upgrading from a 2.0.x version, you will have to use the new config
366 files, and possibly adapt any personal rules from your older files.
367 When porting personal rules over from the old <filename>blockfile</filename>
368 to the new actions files, please note that even the pattern syntax has
369 changed. If upgrading from 2.9.x development versions, it is still
370 recommended to use the new configuration files.
373 A quick list of things to be aware of before upgrading:
381 The default listening port is now 8118 due to a conflict with another
387 Some installers may remove earlier versions completely. Save any
388 important configuration files!
393 <application>Privoxy</application> is controllable with a web browser
394 at the special URL: <ulink
395 url="http://config.privoxy.org/">http://config.privoxy.org/</ulink>
396 (Shortcut: <ulink url="http://p.p/">http://p.p/</ulink>). Many
397 aspects of configuration can be done here, including temporarily disabling
398 <application>Privoxy</application>.
403 The primary configuration files for cookie management, ad and banner
404 blocking, and many other aspects of <application>Privoxy</application>
405 configuration are the <link linkend="actions-file">actions
406 files</link>. It is strongly recommended to become familiar with the new
407 actions concept below, before modifying these files. Locally defined rules
408 should go into <filename>user.action</filename>.
413 <!-- I think it is best to keep this somewhat vague, in case -->
414 <!-- the situation changes under our feet. -->
415 Some installers may not automatically start
416 <application>Privoxy</application> after installation.
424 <!-- ~~~~~ New section ~~~~~ -->
425 <sect1 id="quickstart"><title>Quickstart to Using <application>Privoxy</application></title>
If upgrading from a version before 2.9.16, please back up any configuration
432 files. See the <link linkend="upgradersnote">Note to Upgraders</link> Section.
438 Install <application>Privoxy</application>. See the <link
linkend="installation">Installation Section</link> above for platform-specific
446 Advanced users and those who want to offer <application>Privoxy</application>
447 service to more than just their local machine should check the <link
448 linkend="config">main config file</link>, especially the <link
449 linkend="access-control">security-relevant</link> options. These are
456 Start <application>Privoxy</application>, if the installation program has
457 not done this already (may vary according to platform). See the section
458 <link linkend="startup">Starting <application>Privoxy</application></link>.
464 Set your browser to use <application>Privoxy</application> as HTTP and
465 HTTPS proxy by setting the proxy configuration for address of
466 <literal>127.0.0.1</literal> and port <literal>8118</literal>.
467 (<application>Junkbuster</application> and earlier versions of
468 <application>Privoxy</application> used port 8000.) See the section <link
469 linkend="startup">Starting <application>Privoxy</application></link> below
470 for more details on this.
476 Flush your browser's disk and memory caches, to remove any cached ad images.
482 A default installation should provide a reasonable starting point for
483 most. There will undoubtedly be occasions where you will want to adjust the
484 configuration, but that can be dealt with as the need arises. Little
485 to no initial configuration is required in most cases.
488 See the <link linkend="configuration">Configuration section</link> for more
489 configuration options, and how to customize your installation.
490 <![%draft;[ You might also want to look at the <link
491 linkend="quickstart-ad-blocking">next section</link> for a quick
492 introduction to how <application>Privoxy</application> blocks ads and
499 If you experience ads that slipped through, innocent images that are
500 blocked, or otherwise feel the need to fine-tune
501 <application>Privoxy's</application> behaviour, take a look at the <link
502 linkend="actions-file">actions files</link>. As a quick start, you might
503 find the <link linkend="act-examples">richly commented examples</link>
504 helpful. You can also view and edit the actions files through the <ulink
505 url="http://config.privoxy.org">web-based user interface</ulink>. The
506 Appendix <quote><link linkend="actionsanat">Anatomy of an
Action</link></quote> has hints on how to debug actions that
508 <quote>misbehave</quote>.
514 Please see the section <link linkend="contact">Contacting the
515 Developers</link> on how to report bugs or problems with websites or to get
522 Now enjoy surfing with enhanced comfort and privacy!
530 <!-- ~~~~~ New section ~~~~~ -->
532 <sect2 id="quickstart-ad-blocking">
533 <title>Quickstart to Ad Blocking</title>
535 NOTE: This section is deliberately redundant for those that don't
536 want to read the whole thing (which is getting lengthy).
539 Ad blocking is but one of <application>Privoxy's</application>
540 array of features. Many of these features are for the technically minded advanced
541 user. But, ad and banner blocking is surely common ground for everybody.
544 This section will provide a quick summary of ad blocking so
545 you can get up to speed quickly without having to read the more extensive
546 information provided below, though this is highly recommended.
549 First a bit of a warning ... blocking ads is much like blocking SPAM: the
550 more aggressive you are about it, the more likely you are to block
things that were not intended. So there is a trade-off here. If you want
extremely ad-free browsing, be prepared to deal with more
553 <quote>problem</quote> sites, and to spend more time adjusting the
554 configuration to solve these unintended consequences. In short, there is
555 not an easy way to eliminate <emphasis>all</emphasis> ads. Either take
556 the easy way and settle for <emphasis>most</emphasis> ads blocked with the
557 default configuration, or jump in and tweak it for your personal surfing
558 habits and preferences.
Secondly, a brief explanation of <application>Privoxy's</application>
<quote>actions</quote>. <quote>Actions</quote>, in this context, are
563 the directives we use to tell <application>Privoxy</application> to perform
564 some task relating to HTTP transactions (i.e. web browsing). We tell
565 <application>Privoxy</application> to take some <quote>action</quote>. Each
566 action has a unique name and function. While there are many potential
567 <application>actions</application> in <application>Privoxy's</application>
568 arsenal, only a few are used for ad blocking. <link
569 linkend="actions">Actions</link>, and <link linkend="actions-file">action
570 configuration files</link>, are explained in depth below.
573 Actions are specified in <application>Privoxy's</application> configuration,
574 followed by one or more URLs to which the action should apply. URLs
575 can actually be URL type <link linkend="af-patterns">patterns</link> that use
576 wildcards so they can apply potentially to a range of similar URLs. The
577 actions, together with the URL patterns are called a section.
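<para>
 As a minimal sketch of such a section, an entry in an actions file might
 look like the following. The host name <literal>ads.example.com</literal>
 is a placeholder chosen for illustration, not a real ad server:
</para>
<para>
 <screen>
# The actions go inside the braces; the URL patterns they apply to
# follow, one per line. The host below is a hypothetical example.
{ +block }
ads.example.com
 </screen>
</para>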
580 When you connect to a website, the full URL will either match one or more
581 of the sections as defined in <application>Privoxy's</application> configuration,
582 or not. If so, then <application>Privoxy</application> will perform the
583 respective actions. If not, then nothing special happens. Furthermore, web
584 pages may contain embedded, secondary URLs that your web browser will
585 use to load additional components of the page, as it parses the
original page's HTML content. An ad image, for instance, is just a URL
587 embedded in the page somewhere. The image itself may be on the same server,
588 or a server somewhere else on the Internet. Complex web pages will have many
593 The actions we need to know about for ad blocking are: <literal><link
594 linkend="block">block</link></literal>, <literal><link
595 linkend="handle-as-image">handle-as-image</link></literal>, and
596 <literal><link linkend="set-image-blocker">set-image-blocker</link></literal>:
604 <literal><link linkend="block">block</link></literal> - this action stops
605 any contact between your browser and any URL patterns that match this
606 action's configuration. It can be used for blocking ads, but also anything
607 that is determined to be unwanted. By itself, it simply stops any
608 communication with the remote server and sends <application>Privoxy</application>'s
own built-in BLOCKED page instead to let you know what has happened.
615 <literal><link linkend="handle-as-image">handle-as-image</link></literal> -
616 tells <application>Privoxy</application> to treat this URL as an image.
617 <application>Privoxy</application>'s default configuration already does this
618 for all common image types (e.g. GIF), but there are many situations where this
619 is not so easy to determine. So we'll force it in these cases. This is particularly
620 important for ad blocking, since only if we know that it's an image of
621 some kind, can we replace it with an image of our choosing, instead of the
622 <application>Privoxy</application> BLOCKED page (which would only result in
623 a <quote>broken image</quote> icon). There are some limitations to this
624 though. For instance, you can't just brute-force an image substitution for
625 an entire HTML page in most situations.
632 linkend="set-image-blocker">set-image-blocker</link></literal> - tells
633 <application>Privoxy</application> what to display in place of an ad image that
634 has hit a block rule. For this to come into play, the URL must match a
635 <literal><link linkend="block">block</link></literal> action somewhere in the
636 configuration, <emphasis>and</emphasis>, it must also match an
637 <literal><link linkend="handle-as-image">handle-as-image</link></literal> action.
640 The configuration options on what to display instead of the ad are:
644 <emphasis>pattern</emphasis> - a checkerboard pattern, so that an ad
645 replacement is obvious. This is the default.
650 <emphasis>blank</emphasis> - A very small empty GIF image is displayed.
651 This is the so-called <quote>invisible</quote> configuration option.
<emphasis>http://&lt;URL&gt;</emphasis> - A redirect to any image anywhere
657 of the user's choosing (advanced usage).
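<para>
 Taken together, the three actions described above might be combined as in
 the following sketch. The host name is a placeholder, and
 <literal>blank</literal> is just one of the image-blocker choices listed
 above; adapt both to your own situation:
</para>
<para>
 <screen>
# Hypothetical ad server: block it, treat whatever it serves as an
# image, and show a blank image in place of each blocked ad.
{ +block +handle-as-image +set-image-blocker{blank} }
ads.example.com
 </screen>
</para>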
666 The quickest way to adjust any of these settings is with your browser through
667 the special <application>Privoxy</application> editor at <ulink
668 url="http://config.privoxy.org/show-status">http://config.privoxy.org/show-status</ulink>
(shortcut: <ulink url="http://p.p/show-status">http://p.p/show-status</ulink>). This
670 is an internal page, and does not require Internet access. Select the
671 appropriate <quote>actions</quote> file, and click
672 <quote><guibutton>Edit</guibutton></quote>. It is best to put personal or
local preferences in <filename>user.action</filename> since this file is not
meant to be overwritten during upgrades, and will override the settings in
675 other files. Here you can insert new <quote>actions</quote>, and URLs for ad
676 blocking or other purposes, and make other adjustments to the configuration.
677 <application>Privoxy</application> will detect these changes automatically.
681 A quick and simple step by step example:
689 Right click on the ad image to be blocked, then select
690 <quote><guimenuitem>Copy Link Location</guimenuitem></quote> from the
698 url="http://config.privoxy.org/show-status">http://config.privoxy.org/show-status</ulink>
703 Find <filename>user.action</filename> in the top section, and click
704 on <quote><guibutton>Edit</guibutton></quote>:
707 <!-- image of editor and actions files selections -->
709 <figure pgwide="0" float="0"><title>Actions Files in Use</title>
712 <imagedata fileref="../images/files-in-use.jpg" format="jpg">
715 <phrase>[ Screenshot of Actions Files in Use ]</phrase>
724 You should have a section with only
725 <literal><link linkend="block">block</link></literal> listed under
726 <quote>Actions:</quote>.
If not, click an <quote><guibutton>Insert new section below</guibutton></quote>
728 button, and in the new section that just appeared, click the
729 <guibutton>Edit</guibutton> button right under the word <quote>Actions:</quote>.
730 This will bring up a list of all actions. Find
731 <literal><link linkend="block">block</link></literal> near the top, and click
732 in the <quote>Enabled</quote> column, then <quote><guibutton>Submit</guibutton></quote>
738 Now, in the <literal><link linkend="block">block</link></literal> actions section,
739 click the <quote><guibutton>Add</guibutton></quote> button, and paste the URL the
740 browser got from <quote><guimenuitem>Copy Link Location</guimenuitem></quote>.
741 Remove the <literal>http://</literal> at the beginning of the URL. Then, click
742 <quote><guibutton>Submit</guibutton></quote> (or
743 <quote><guibutton>OK</guibutton></quote> if in a pop-up window).
748 Now go back to the original page, and press <keycap>SHIFT-Reload</keycap>
749 (or flush all browser caches). The image should be gone now.
757 This is a very crude and simple example. There might be good reasons to use a
758 wildcard pattern match to include potentially similar images from the same
759 site. For a more extensive explanation of <quote>patterns</quote>, and
760 the entire actions concept, see <link linkend="actions-file">the Actions
Advanced users who want to hand-edit their config files might now want
to go to the <link linkend="act-examples">Actions Files Tutorial</link>.
767 The ideas explained therein also apply to the web-based editor.
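<para>
 As a rough sketch of the wildcard approach, a hand-edited section in
 <filename>user.action</filename> could cover a whole family of ad URLs at
 once. The domain and path below are placeholders, and the details of the
 <link linkend="af-patterns">pattern</link> syntax should be checked against
 the patterns section; treat this as an illustration rather than a tested
 rule:
</para>
<para>
 <screen>
{ +block }
# Any host in the (hypothetical) example.com domain,
# and any GIF image below its /banners/ directory.
.example.com/banners/.*\.gif
 </screen>
</para>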
774 <!-- ~ End section ~ -->
777 <!-- ~~~~~ New section ~~~~~ -->
779 <title>Starting <application>Privoxy</application></title>
781 Before launching <application>Privoxy</application> for the first time, you
782 will want to configure your browser(s) to use
<application>Privoxy</application> as an HTTP and HTTPS proxy. The default is
784 127.0.0.1 (or localhost) for the proxy address, and port 8118 (earlier versions
785 used port 8000). This is the one configuration step that must be done!
788 Please note that <application>Privoxy</application> can only proxy HTTP and
789 HTTPS traffic. It will not work with FTP or other protocols.
792 <!-- image of Mozilla Proxy configuration -->
794 <figure pgwide="0" float="0"><title>Proxy Configuration (Mozilla)</title>
797 <imagedata fileref="../images/proxy_setup.jpg" format="jpg">
800 <phrase>[ Screenshot of Mozilla Proxy Configuration ]</phrase>
807 With <application>Netscape</application> (and
808 <application>Mozilla</application>), this can be set under:
812 <!-- Mix ascii and gui art, something for everybody -->
813 <!-- spacing on this is tricky -->
814 <guibutton>Edit</guibutton>
816 <guibutton>Preferences</guibutton>
818 <guibutton>Advanced</guibutton>
820 <guibutton>Proxies</guibutton>
822 <guibutton>HTTP Proxy</guibutton>
826 For <application>Internet Explorer</application>:
830 <!-- Mix ascii and gui art, something for everybody -->
831 <!-- spacing on this is tricky -->
832 <guibutton>Tools</guibutton>
834 <guibutton>Internet Properties</guibutton>
836 <guibutton>Connections</guibutton>
838 <guibutton>LAN Settings</guibutton>
842 Then, check <quote>Use Proxy</quote> and fill in the appropriate info
843 (Address: 127.0.0.1, Port: 8118). Include HTTPS (SSL), if you want HTTPS
848 After doing this, flush your browser's disk and memory caches to force a
849 re-reading of all pages and to get rid of any ads that may be cached. You
850 are now ready to start enjoying the benefits of using
851 <application>Privoxy</application>!
855 <application>Privoxy</application> is typically started by specifying the
856 main configuration file to be used on the command line. If no configuration
857 file is specified on the command line, <application>Privoxy</application>
858 will look for a file named <filename>config</filename> in the current
directory (except on Win32, where it will try <filename>config.txt</filename>).
862 <sect2 id="start-redhat">
863 <title>Red Hat and Conectiva</title>
We use a script. Note that Red Hat does not start Privoxy upon booting by
866 default. It will use the file <filename>/etc/privoxy/config</filename> as
867 its main configuration file.
871 # /etc/rc.d/init.d/privoxy start
876 <sect2 id="start-debian">
877 <title>Debian</title>
We use a script. Note that Debian starts Privoxy upon booting by
880 default. It will use the file
881 <filename>/etc/privoxy/config</filename> as its main configuration
886 # /etc/init.d/privoxy start
891 <sect2 id="start-suse">
894 We use a script. It will use the file <filename>/etc/privoxy/config</filename>
895 as its main configuration file. Note that SuSE starts Privoxy upon booting
905 <sect2 id="start-windows">
906 <title>Windows</title>
908 Click on the Privoxy Icon to start Privoxy. If no configuration file is
909 specified on the command line, <application>Privoxy</application> will look
910 for a file named <filename>config.txt</filename>. Note that Windows will
automatically start Privoxy upon booting your PC.
915 <sect2 id="start-unices">
916 <title>Solaris, NetBSD, FreeBSD, HP-UX and others</title>
918 Example Unix startup command:
922 # /usr/sbin/privoxy /etc/privoxy/config
927 <sect2 id="start-os2">
930 During installation, <application>Privoxy</application> is configured to
931 start automatically when the system restarts. You can start it manually by
932 double-clicking on the <application>Privoxy</application> icon in the
933 <application>Privoxy</application> folder.
937 <sect2 id="start-macosx">
<title>Mac OS X</title>
940 During installation, <application>Privoxy</application> is configured to
941 start automatically when the system restarts. To run Privoxy by hand,
942 double-click on the <literal>RunPrivoxy.command</literal> icon in the
943 <literal>/Library/Privoxy</literal> folder. Or, type this command
948 /Library/Privoxy/RunPrivoxy.command
952 If you are not logged in as an administrator, you will be asked for the
953 administrator password when starting <application>Privoxy</application>
959 <sect2 id="start-amigaos">
960 <title>AmigaOS</title>
Start <application>Privoxy</application> (with RUN &lt;&gt;NIL:) in your
963 <filename>startnet</filename> script (AmiTCP), in
964 <filename>s:user-startup</filename> (RoadShow), as startup program in your
965 startup script (Genesis), or as startup action (Miami and MiamiDx).
966 <application>Privoxy</application> will automatically quit when you quit your
967 TCP/IP stack (just ignore the harmless warning your TCP/IP stack may display that
968 <application>Privoxy</application> is still running).
975 See the section <link linkend="cmdoptions">Command line options</link> for
979 must find a better place for this paragraph
982 The included default configuration files should give a reasonable starting
983 point. Most of the per site configuration is done in the
<ulink url="actions-file.html"><quote>actions</quote></ulink> files. These files
define the cookie actions, ad and banner blocking, and other
aspects of <application>Privoxy</application> configuration. There are several
987 such files included, with varying levels of aggressiveness.
991 You will probably want to keep an eye out for sites for which you may prefer
992 persistent cookies, and add these to your actions configuration as needed. By
993 default, most of these will be accepted only during the current browser
994 session (aka <quote>session cookies</quote>), unless you add them to the
995 configuration. If you want the browser to handle this instead, you will need
996 to edit <filename>user.action</filename> (or through the web based interface)
997 and disable this feature. If you use more than one browser, it would make
998 more sense to let <application>Privoxy</application> handle this. In which
999 case, the browser(s) should be set to accept all cookies.
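<para>
 As a minimal sketch of such an exception, the following section could be
 added to <filename>user.action</filename>. The host names are hypothetical
 placeholders, and the sketch assumes that
 <literal>+session-cookies-only</literal> is enabled by the default rules.
</para>
<screen># Let these trusted sites keep persistent cookies
{ -session-cookies-only }
.my-bank.example.com
webmail.example.net</screen>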
1003 Another feature where you will probably want to define exceptions for trusted
1004 sites is the popup-killing (through the <ulink
1005 url="actions-file.html#KILL-POPUPS"><quote>+kill-popups</quote></ulink> and
1007 url="actions-file.html#FILTER-POPUPS"><quote>+filter{popups}</quote></ulink>
1008 actions), because your favorite shopping, banking, or leisure site may need
1009 popups (explained below).
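<para>
 An exception of this kind might look like the following sketch. The host
 names are hypothetical, and it assumes both popup actions are switched on
 by the default rules.
</para>
<screen># Allow popups on sites that need them
{ -kill-popups -filter{popups} }
.my-bank.example.com
shop.example.net</screen>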
1013 <application>Privoxy</application> is HTTP/1.1 compliant, but not all of
1014 the optional 1.1 features are as yet supported. In the unlikely event that
1015 you experience inexplicable problems with browsers that use HTTP/1.1 by default
1016 (like <application>Mozilla</application> or recent versions of I.E.), you might
1017 try to force HTTP/1.0 compatibility. For Mozilla, look under <literal>Edit ->
1018 Preferences -> Debug -> Networking</literal>.
1019 Alternatively, set the <quote>+downgrade-http-version</quote> config option in
1020 <filename>default.action</filename> which will downgrade your browser's HTTP
1021 requests from HTTP/1.1 to HTTP/1.0 before processing them.
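<para>
 A sketch of what such a section could look like follows. Limiting the
 downgrade to a single site is an assumption made here for illustration,
 and the host name is hypothetical. Using the pattern
 <literal>/</literal> instead would apply the downgrade to all URLs.
</para>
<screen># Downgrade requests to HTTP/1.0 for one troublesome site only
{ +downgrade-http-version }
.troublesome-site.example.com</screen>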
1025 After running <application>Privoxy</application> for a while, you can
1026 start to fine tune the configuration to suit your personal, or site,
1027 preferences and requirements. There are many, many aspects that can
1028 be customized. <quote>Actions</quote>
1029 can be adjusted by pointing your browser to
1030 <ulink url="http://config.privoxy.org/">http://config.privoxy.org/</ulink>
1031 (shortcut: <ulink url="http://p.p/">http://p.p/</ulink>),
1032 and then following the link to <quote>View & Change the Current Configuration</quote>.
1033 (This is an internal page and does not require Internet access.)
1037 In fact, various aspects of <application>Privoxy</application>
1038 configuration can be viewed from this page, including
1039 current configuration parameters, source code version numbers,
1040 the browser's request headers, and <quote>actions</quote> that apply
1041 to a given URL. In addition to the actions file
1042 editor mentioned above, <application>Privoxy</application> can also
1043 be turned <quote>on</quote> and <quote>off</quote> (toggled) from this page.
1047 If you encounter problems, try loading the page without
1048 <application>Privoxy</application>. If that helps, enter the URL where
1049 you have the problems into <ulink url="http://p.p/show-url-info">the browser
1050 based rule tracing utility</ulink>. See which rules apply and why, and
1051 then try turning them off for that site one after the other, until the problem
1052 is gone. When you have found the culprit, you might want to turn the rest on
1057 If the above paragraph sounds like gibberish to you, you might want to <ulink
1058 url="actions-file.html#ACTIONSFILE">read more about the actions concept</ulink>
1059 or even dive deep into the <ulink url="appendix.html#ACTIONSANAT">Appendix
1064 If you can't get rid of the problem at all, think you've found a bug in
1065 Privoxy, want to propose a new feature or smarter rules, please see the
1066 section <ulink url="contact.html"><quote>Contacting the
1067 Developers</quote></ulink> below.
1072 <!-- ~~~~~ New section ~~~~~ -->
1073 <sect2 id="cmdoptions">
1074 <title>Command Line Options</title>
1076 <application>Privoxy</application> may be invoked with the following
1077 command-line options:
1085 <emphasis>--version</emphasis>
1088 Print version info and exit. Unix only.
1093 <emphasis>--help</emphasis>
1096 Print short usage info and exit. Unix only.
1101 <emphasis>--no-daemon</emphasis>
1104 Don't become a daemon, i.e. don't fork and become process group
1105 leader, and don't detach from controlling tty. Unix only.
1110 <emphasis>--pidfile FILE</emphasis>
1114 On startup, write the process ID to <emphasis>FILE</emphasis>. Delete the
1115 <emphasis>FILE</emphasis> on exit. Failure to create or delete the
1116 <emphasis>FILE</emphasis> is non-fatal. If no <emphasis>FILE</emphasis>
1117 option is given, no PID file will be used. Unix only.
1122 <emphasis>--user USER[.GROUP]</emphasis>
1126 After (optionally) writing the PID file, assume the user ID of
1127 <emphasis>USER</emphasis>, and if included the GID of GROUP. Exit if the
1128 privileges are not sufficient to do so. Unix only.
1133 <emphasis>configfile</emphasis>
1136 If no <emphasis>configfile</emphasis> is included on the command line,
1137 <application>Privoxy</application> will look for a file named
1138 <quote>config</quote> in the current directory (except on Win32
1139 where it will look for <quote>config.txt</quote> instead). Specify
1140 the full path to avoid confusion. If no config file is found,
1141 <application>Privoxy</application> will fail to start.
1152 <!-- ~ End section ~ -->
1155 <!-- ~~~~~ New section ~~~~~ -->
1156 <sect1 id="configuration"><title><application>Privoxy</application> Configuration</title>
1158 All <application>Privoxy</application> configuration is stored
1159 in text files. These files can be edited with a text editor.
1160 Many important aspects of <application>Privoxy</application> can
1161 also be controlled easily with a web browser.
1165 <!-- ~~~~~ New section ~~~~~ -->
1168 <title>Controlling <application>Privoxy</application> with Your Web Browser</title>
1170 <application>Privoxy</application>'s user interface can be reached through the special
1171 URL <ulink url="http://config.privoxy.org/">http://config.privoxy.org/</ulink>
1172 (shortcut: <ulink url="http://p.p/">http://p.p/</ulink>),
1173 which is a built-in page and works without Internet access.
1174 You will see the following section:
1178 <!-- Needs to be put in a table and colorized -->
1181 <bridgehead renderas="sect2"> Privoxy Menu</bridgehead>
1185 ▪ <ulink url="http://config.privoxy.org/show-status">View & change the current configuration</ulink>
1188 ▪ <ulink url="http://config.privoxy.org/show-version">View the source code version numbers</ulink>
1191 ▪ <ulink url="http://config.privoxy.org/show-request">View the request headers.</ulink>
1194 ▪ <ulink url="http://config.privoxy.org/show-url-info">Look up which actions apply to a URL and why</ulink>
1197 ▪ <ulink url="http://config.privoxy.org/toggle">Toggle Privoxy on or off</ulink>
1205 This should be self-explanatory. Note the first item leads to an editor for the
1206 <link linkend="actions-file">actions files</link>, which is where the ad, banner,
1207 cookie, and URL blocking magic is configured as well as other advanced features of
1208 <application>Privoxy</application>. This is an easy way to adjust various
1209 aspects of <application>Privoxy</application> configuration. The actions
1210 file, and other configuration files, are explained in detail below.
1214 <quote>Toggle Privoxy On or Off</quote> is handy for sites that might
1215 have problems with your current actions and filters. You can in fact use
1216 it as a test to see whether it is <application>Privoxy</application>
1217 causing the problem or not. <application>Privoxy</application> continues
1218 to run as a proxy in this case, but all manipulation is disabled, i.e.
1219 <application>Privoxy</application> acts like a normal forwarding proxy. There
1220 is even a toggle <link linkend="bookmarklets">Bookmarklet</link> offered, so
1221 that you can toggle <application>Privoxy</application> with one click from
1227 <!-- ~ End section ~ -->
1232 <!-- ~~~~~ New section ~~~~~ -->
1234 <sect2 id="confoverview">
1235 <title>Configuration Files Overview</title>
1237 For Unix, *BSD and Linux, all configuration files are located in
1238 <filename>/etc/privoxy/</filename> by default. For MS Windows, OS/2, and
1239 AmigaOS these are all in the same directory as the
1240 <application>Privoxy</application> executable. <![%p-not-stable;[ The names
1241 and number of configuration files have changed from previous versions, and are
1242 subject to change as development progresses.]]>
1246 The installed defaults provide a reasonable starting point, though
1247 some settings may be aggressive by some standards. For the time being, the
1248 principal configuration files are:
1256 The <link linkend="config">main configuration file</link> is named <filename>config</filename>
1257 on Linux, Unix, BSD, OS/2, and AmigaOS and <filename>config.txt</filename>
1258 on Windows. This is a required file.
1264 <filename>default.action</filename> (the main <link linkend="actions-file">actions file</link>)
1265 is used to define which <quote>actions</quote> relating to banner-blocking, images, pop-ups,
1266 content modification, cookie handling, etc. should be applied by default. It also defines many
1267 exceptions (both positive and negative) from this default set of actions that enable
1268 <application>Privoxy</application> to selectively eliminate the junk, and only the junk, on
1269 as many websites as possible.
1272 Multiple actions files may be defined in <filename>config</filename>. These
1273 are processed in the order they are defined. Local customizations and locally
1274 preferred exceptions to the default policies as defined in
1275 <filename>default.action</filename> (which you will most probably want
1276 to define sooner or later) are probably best applied in
1277 <filename>user.action</filename>, where you can preserve them across
1278 upgrades. <filename>standard.action</filename> is for
1279 <application>Privoxy's</application> internal use.
1282 There is also a web based editor that can be accessed from
1284 url="http://config.privoxy.org/show-status">http://config.privoxy.org/show-status</ulink>
1286 url="http://p.p/show-status">http://p.p/show-status</ulink>) for the
1287 various actions files.
1293 <filename>default.filter</filename> (the <link linkend="filter-file">filter
1294 file</link>) can be used to re-write the raw page content, including
1295 viewable text as well as embedded HTML and JavaScript, and whatever else
1296 lurks on any given web page. The filtering jobs are only pre-defined here;
1297 whether to apply them or not is up to the actions files.
1305 All files use the <quote><literal>#</literal></quote> character to denote a
1306 comment (the rest of the line will be ignored) and understand line continuation
1307 through placing a backslash ("<literal>\</literal>") as the very last character
1308 in a line. If the <literal>#</literal> is preceded by a backslash, it loses
1309 its special function. Placing a <literal>#</literal> in front of an otherwise
1310 valid configuration line to prevent it from being interpreted is called "commenting
1315 The actions files and <filename>default.filter</filename>
1316 can use Perl style <link linkend="regex">regular expressions</link> for
1317 maximum flexibility.
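<para>
 The comment and line continuation rules described above can be sketched as
 follows, here in an actions file. The host names and the header text are
 hypothetical and serve only to illustrate the syntax.
</para>
<screen># A full-line comment: everything after the hash is ignored.
{ +block }    # A trailing comment after an actions line
.ads.example.com
# .tracker.example.net   (commented out, so this pattern is inactive)

# A backslash as the very last character continues the line.
# An escaped hash (\#) is taken literally instead of starting a comment.
{ +filter{html-annoyances} \
  +add-header{X-Note: ticket \#42} }
.example.org</screen>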
1321 After making any changes, there is no need to restart
1322 <application>Privoxy</application> in order for the changes to take
1323 effect. <application>Privoxy</application> detects such changes
1324 automatically. Note, however, that it may take one or two additional
1325 requests for the change to take effect. When changing the listening address
1326 of <application>Privoxy</application>, these <quote>wake up</quote> requests
1327 must obviously be sent to the <emphasis>old</emphasis> listening address.
1332 While under development, the configuration content is subject to change.
1333 The below documentation may not be accurate by the time you read this.
1334 Also, what constitutes a <quote>default</quote> setting may change, so
1335 please check all your configuration files on important issues.
1341 <!-- ~ End section ~ -->
1344 <!-- ~~~~~~~~ New section Header ~~~~~~~~~ -->
1346 <!-- **************************************************** -->
1347 <!-- Include config.sgml here -->
1348 <!-- This is where the entire config file is detailed. -->
1350 <!-- end include -->
1353 <!-- ~ End section ~ -->
1357 <!-- ~~~~~~~~ New section Header ~~~~~~~~~ -->
1359 <sect1 id="actions-file"><title>Actions Files</title>
1362 The actions files are used to define what actions
1363 <application>Privoxy</application> takes for which URLs, and thus determine
1364 how ad images, cookies and various other aspects of HTTP content and
1365 transactions are handled, and on which sites (or even parts thereof). There
1366 are three such files included with <application>Privoxy</application> (as of
1367 version 2.9.15), with differing purposes:
1374 <filename>default.action</filename> - is the primary action file
1375 that sets the initial values for all actions. It is intended to
1376 provide a base level of functionality for
1377 <application>Privoxy's</application> array of features. So it is
1378 a set of broad rules that should work reasonably well for users everywhere.
1379 This is the file that the developers are keeping updated, and making
1385 <filename>user.action</filename> - is intended to be for local site
1386 preferences and exceptions. As an example, if your ISP or your bank
1387 has specific requirements, and needs special handling, this kind of
1388 thing should go here. This file will not be upgraded.
1393 <filename>standard.action</filename> - is used by the web based editor,
1394 to set various pre-defined sets of rules for the default actions section
1395 in <filename>default.action</filename>. These have increasing levels of
1396 aggressiveness <emphasis>and have no influence on your browsing unless
1397 you select them explicitly in the editor</emphasis>. It is not recommend
1405 The list of actions files to be used are defined in the main configuration
1406 file, and are processed in the order they are defined. The content of these
1407 can all be viewed and edited from <ulink
1408 url="http://config.privoxy.org/show-status">http://config.privoxy.org/show-status</ulink>.
1412 An actions file typically has multiple sections. If you want to use
1413 <quote>aliases</quote> in an actions file, you have to place the (optional)
1414 <link linkend="aliases">alias section</link> at the top of that file.
1415 Then comes the default set of rules which will apply universally to all
1416 sites and pages (be <emphasis>very careful</emphasis> with using such a
1417 universal set in <filename>user.action</filename> or any other actions file after
1418 <filename>default.action</filename>, because it will override the result
1419 from consulting any previous file). And then below that,
1420 exceptions to the defined universal policies. You can regard
1421 <filename>user.action</filename> as an appendix to <filename>default.action</filename>,
1422 with the advantage that it is a separate file, which makes preserving your
1423 personal settings across <application>Privoxy</application> upgrades easier.
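<para>
 The layout described above could be sketched like this. The alias name,
 the choice of default actions, and the host name are illustrative
 assumptions, not the contents of the shipped files.
</para>
<screen># 1. Optional alias section, which must come first
{{alias}}
+crunch-all-cookies = +crunch-incoming-cookies +crunch-outgoing-cookies

# 2. Default set of rules, applied to all URLs
{ +session-cookies-only +filter{popups} }
/

# 3. Exceptions to the defaults
{ -filter{popups} }
.my-bank.example.com</screen>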
1427 Actions can be used to block anything you want, including ads, banners, or
1428 just some obnoxious URL that you would rather not see. Cookies can be accepted
1429 or rejected, or accepted only during the current browser session (i.e. not
1430 written to disk), content can be modified, JavaScripts tamed, user-tracking
1431 fooled, and much more. See below for a <link linkend="actions">complete list
1435 <!-- ~~~~~ New section ~~~~~ -->
1437 <title>Finding the Right Mix</title>
1439 Note that some <link linkend="actions">actions</link>, like cookie suppression
1440 or script disabling, may render some sites unusable that rely on these
1441 techniques to work properly. Finding the right mix of actions is not always easy and
1442 certainly a matter of personal taste. In general, it can be said that the more
1443 <quote>aggressive</quote> your default settings (in the top section of the
1444 actions file) are, the more exceptions for <quote>trusted</quote> sites you
1445 will have to make later. If, for example, you want to kill popup windows per
1446 default, you'll have to make exceptions from that rule for sites that you
1447 regularly use and that require popups for actually useful content, like maybe
1448 your bank, favorite shop, or newspaper.
1452 We have tried to provide you with reasonable rules to start from in the
1453 distribution actions files. But there is no general rule of thumb on these
1454 things. There just are too many variables, and sites are constantly changing.
1455 Sooner or later you will want to change the rules (and read this chapter again :).
1459 <!-- ~~~~~ New section ~~~~~ -->
1461 <title>How to Edit</title>
1463 The easiest way to edit the actions files is with a browser by
1464 using our browser-based editor, which can be reached from <ulink
1465 url="http://config.privoxy.org/show-status">http://config.privoxy.org/show-status</ulink>.
1466 The editor allows both fine-grained control over every single feature on a
1467 per-URL basis, and easy choosing from wholesale sets of defaults like
1468 <quote>Cautious</quote>, <quote>Medium</quote> or <quote>Advanced</quote>.
1472 If you prefer plain text editing to GUIs, you can of course also directly edit the
1473 actions files. Look at <filename>default.action</filename> which is richly
1479 <sect2 id="actions-apply">
1480 <title>How Actions are Applied to URLs</title>
1482 Actions files are divided into sections. There are special sections,
1483 like the <quote><link linkend="aliases">alias</link></quote> sections which will
1484 be discussed later. For now let's concentrate on regular sections: They have a
1485 heading line (often split across multiple lines for readability) which consists
1486 of a list of actions, separated by whitespace and enclosed in curly braces.
1487 Below that, there is a list of URL patterns, each on a separate line.
1491 To determine which actions apply to a request, the URL of the request is
1492 compared to all patterns in each actions file. Every time it matches, the list of
1493 applicable actions for the URL is incrementally updated, using the heading
1494 of the section in which the pattern is located. If multiple matches for
1495 the same URL set the same action differently, the last match wins. If not,
1496 the effects are aggregated. E.g. a URL might match a regular section with
1497 a heading line of <literal>{
1498 +<ulink url="actions-file.html#HANDLE-AS-IMAGE">handle-as-image</ulink> }</literal>,
1499 then later another one with just <literal>{
1500 +<ulink url="actions-file.html#BLOCK">block</ulink> }</literal>, resulting
1501 in <emphasis>both</emphasis> actions being applied.
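<para>
 A small sketch of this aggregation, using a hypothetical host name: a
 request for <literal>http://ads.example.com/banner.gif</literal> matches
 the pattern in both sections below, so it ends up with both
 <literal>+handle-as-image</literal> and <literal>+block</literal> set.
</para>
<screen># First match: anything that looks like an image, on any host
{ +handle-as-image }
/.*\.(gif|jpe?g|png)$

# Second match: everything on this host
{ +block }
ads.example.com</screen>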
1505 You can trace this process for any given URL by visiting <ulink
1506 url="http://config.privoxy.org/show-url-info">http://config.privoxy.org/show-url-info</ulink>.
1510 More detail on this is provided in the Appendix, <link linkend="ACTIONSANAT">
1511 Anatomy of an Action</link>.
1515 <!-- ~~~~~ New section ~~~~~ -->
1516 <sect2 id="af-patterns">
1517 <title>Patterns</title>
1519 Generally, a pattern has the form <literal><domain>/<path></literal>,
1520 where both the <literal><domain></literal> and <literal><path></literal>
1521 are optional. (This is why the pattern <literal>/</literal> matches all URLs).
1526 <term><literal>www.example.com/</literal></term>
1529 is a domain-only pattern and will match any request to <literal>www.example.com</literal>,
1530 regardless of which document on that server is requested.
1535 <term><literal>www.example.com</literal></term>
1538 means exactly the same. For domain-only patterns, the trailing <literal>/</literal> may
1544 <term><literal>www.example.com/index.html</literal></term>
1547 matches only the single document <literal>/index.html</literal>
1548 on <literal>www.example.com</literal>.
1553 <term><literal>/index.html</literal></term>
1556 matches the document <literal>/index.html</literal>, regardless of the domain,
1557 i.e. on <emphasis>any</emphasis> web server.
1562 <term><literal>index.html</literal></term>
1565 matches nothing, since it would be interpreted as a domain name and
1566 there is no top-level domain called <literal>.html</literal>.
1573 <!-- ~~~~~ New section ~~~~~ -->
1574 <sect3><title>The Domain Pattern</title>
1577 The matching of the domain part offers some flexible options: if the
1578 domain starts or ends with a dot, it becomes unanchored at that end.
1584 <term><literal>.example.com</literal></term>
1587 matches any domain that <emphasis>ENDS</emphasis> in
1588 <literal>.example.com</literal>
1593 <term><literal>www.</literal></term>
1596 matches any domain that <emphasis>STARTS</emphasis> with
1597 <literal>www.</literal>
1602 <term><literal>.example.</literal></term>
1605 matches any domain that <emphasis>CONTAINS</emphasis> <literal>.example.</literal>
1606 (Correctly speaking: It matches any FQDN that contains <literal>example</literal> as a domain.)
1613 Additionally, there are wild-cards that you can use in the domain names
1614 themselves. They work much like shell wild-cards: <quote>*</quote>
1615 stands for zero or more arbitrary characters, <quote>?</quote> stands for
1616 any single character, you can define character classes in square
1617 brackets and all of that can be freely mixed:
1622 <term><literal>ad*.example.com</literal></term>
1625 matches <quote>adserver.example.com</quote>,
1626 <quote>ads.example.com</quote>, etc., but not <quote>sfads.example.com</quote>
1631 <term><literal>*ad*.example.com</literal></term>
1634 matches all of the above, and then some.
1639 <term><literal>.?pix.com</literal></term>
1642 matches <literal>www.ipix.com</literal>,
1643 <literal>pictures.epix.com</literal>, <literal>a.b.c.d.e.upix.com</literal> etc.
1648 <term><literal>www[1-9a-ez].example.c*</literal></term>
1651 matches <literal>www1.example.com</literal>,
1652 <literal>www4.example.cc</literal>, <literal>wwwd.example.cy</literal>,
1653 <literal>wwwz.example.com</literal> etc., but <emphasis>not</emphasis>
1654 <literal>wwww.example.com</literal>.
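<para>
 Put together in a section, domain patterns of these kinds might be used as
 in the following sketch. All host names are hypothetical, and the comments
 restate the matching rules given above.
</para>
<screen>{ +block }
.ads.example.com          # any domain that ends in .ads.example.com
ad*.example.net           # adserver.example.net, ads.example.net, etc.
banner[0-9].example.org   # banner0.example.org through banner9.example.org</screen>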
1662 <!-- ~ End section ~ -->
1665 <!-- ~~~~~ New section ~~~~~ -->
1666 <sect3><title>The Path Pattern</title>
1669 <application>Privoxy</application> uses Perl compatible regular expressions
1670 (through the <ulink url="http://www.pcre.org/">PCRE</ulink> library) for
1675 There is an <link linkend="regex">Appendix</link> with a brief quick-start into regular
1676 expressions, and full (very technical) documentation on PCRE regex syntax is available on-line
1677 at <ulink url="http://www.pcre.org/man.txt">http://www.pcre.org/man.txt</ulink>.
1678 You might also find the Perl man page on regular expressions (<literal>man perlre</literal>)
1679 useful, which is available on-line at <ulink
1680 url="http://www.perldoc.com/perl5.6/pod/perlre.html">http://www.perldoc.com/perl5.6/pod/perlre.html</ulink>.
1684 Note that the path pattern is automatically left-anchored at the <quote>/</quote>,
1685 i.e. it matches as if it started with a <quote>^</quote> (regular expression speak
1686 for the beginning of a line).
1690 Please also note that matching in the path is <emphasis>CASE INSENSITIVE</emphasis>
1691 by default, but you can switch to case sensitive at any point in the pattern by using the
1692 <quote>(?-i)</quote> switch: <literal>www.example.com/(?-i)PaTtErN.*</literal> will match
1693 only documents whose path starts with <literal>PaTtErN</literal> in
1694 <emphasis>exactly</emphasis> this capitalization.
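<para>
 The following sketch shows a few path patterns in a section. The host name
 and paths are hypothetical, and the comments reflect our reading of the
 anchoring and case rules described above.
</para>
<screen>{ +block }
# Paths on www.example.com that start with /ads/, in any capitalization
www.example.com/ads/
# Documents on any host whose path ends in .swf
/.*\.swf$
# Case sensitive from the (?-i) switch onward
www.example.com/(?-i)PaTtErN.*</screen>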
1700 <!-- ~ End section ~ -->
1703 <!-- ~~~~~ New section ~~~~~ -->
1705 <sect2 id="actions">
1706 <title>Actions</title>
1708 All actions are disabled by default, until they are explicitly enabled
1709 somewhere in an actions file. Actions are turned on if preceded with a
1710 <quote>+</quote>, and turned off if preceded with a <quote>-</quote>. So a
1711 <literal>+action</literal> means <quote>do that action</quote>, e.g.
1712 <literal>+block</literal> means <quote>please block URLs that match the
1713 following patterns</quote>, and <literal>-block</literal> means <quote>don't
1714 block URLs that match the following patterns, even if <literal>+block</literal>
1715 previously applied.</quote>
1720 Again, actions are invoked by placing them on a line, enclosed in curly braces and
1721 separated by whitespace, like in
1722 <literal>{+some-action -some-other-action{some-parameter}}</literal>,
1723 followed by a list of URL patterns, one per line, to which they apply.
1724 Together, the actions line and the following pattern lines make up a section
1725 of the actions file.
1729 There are three classes of actions:
1736 Boolean, i.e. the action can only be <quote>enabled</quote> or
1737 <quote>disabled</quote>. Syntax:
1741 +<replaceable class="function">name</replaceable> # enable action <replaceable class="parameter">name</replaceable>
1742 -<replaceable class="function">name</replaceable> # disable action <replaceable class="parameter">name</replaceable></screen>
1745 Example: <literal>+block</literal>
1752 Parameterized, where some value is required in order to enable this type of action.
1757 +<replaceable class="function">name</replaceable>{<replaceable class="parameter">param</replaceable>} # enable action and set parameter to <replaceable class="parameter">param</replaceable>,
1758 # overwriting parameter from previous match if necessary
1759 -<replaceable class="function">name</replaceable> # disable action. The parameter can be omitted</screen>
1762 Note that if the URL matches multiple positive forms of a parameterized action,
1763 the last match wins, i.e. the params from earlier matches are simply ignored.
1766 Example: <literal>+hide-user-agent{ Mozilla 1.0 }</literal>
1772 Multi-value. These look exactly like parameterized actions,
1773 but they behave differently: If the action applies multiple times to the
1774 same URL, but with different parameters, <emphasis>all</emphasis> the parameters
1775 from <emphasis>all</emphasis> matches are remembered. This is used for actions
1776 that can be executed for the same request repeatedly, like adding multiple
1777 headers, or filtering through multiple filters. Syntax:
1781 +<replaceable class="function">name</replaceable>{<replaceable class="parameter">param</replaceable>} # enable action and add <replaceable class="parameter">param</replaceable> to the list of parameters
1782 -<replaceable class="function">name</replaceable>{<replaceable class="parameter">param</replaceable>} # remove the parameter <replaceable class="parameter">param</replaceable> from the list of parameters
1783 # If it was the last one left, disable the action.
1784 <replaceable class="parameter">-name</replaceable> # disable this action completely and remove all parameters from the list</screen>
1787 Examples: <literal>+add-header{X-Fun-Header: Some text}</literal> and
1788 <literal>+filter{html-annoyances}</literal>
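<para>
 To illustrate how the three classes differ when a URL matches more than
 once, consider this sketch with hypothetical host names and parameter
 values. A request to <literal>www.example.com</literal> matches both
 sections.
</para>
<screen>{ +block +hide-user-agent{Mozilla 1.0} +filter{html-annoyances} }
.example.com

{ +hide-user-agent{Lynx} +filter{popups} -filter{html-annoyances} }
www.example.com

# Result for www.example.com:
#   block            enabled (Boolean, set by the first section)
#   hide-user-agent  Lynx (parameterized, the last match wins)
#   filter           popups only (multi-value: popups was added and
#                    html-annoyances was removed by the second section)</screen>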
1796 If nothing is specified in any actions file, no <quote>actions</quote> are
1797 taken. So in this case <application>Privoxy</application> would just be a
1798 normal, non-blocking, non-anonymizing proxy. You must specifically enable the
1799 privacy and blocking features you need (although the provided default actions
1800 files will give a good starting point).
1804 Later defined actions always override earlier ones. So exceptions
1805 to any rules you make should come in the latter part of the file (or
1806 in a file that is processed later when using multiple actions files). For
1807 multi-valued actions, the actions are applied in the order they are specified.
1808 Actions files are processed in the order they are defined in
1809 <filename>config</filename> (the default installation has three actions
1810 files). It is also quite possible for any given URL to match more than
1811 one pattern and thus more than one set of actions!
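<para>
 As a sketch of an exception placed in a later file, assume the two
 sections below live in different actions files and that
 <filename>user.action</filename> is listed after
 <filename>default.action</filename> in <filename>config</filename>. The
 host names are hypothetical.
</para>
<screen># In default.action: block the whole domain
{ +block }
.example.com

# In user.action, processed later: exempt one host again
{ -block }
www.example.com</screen>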
1814 <!-- start actions listing -->
1816 The list of valid <application>Privoxy</application> actions is:
1820 <!-- ********************************************************** -->
1821 <!-- Please note the below defined actions use id's that are -->
1822 <!-- probably linked from other places, so please don't change. -->
1824 <!-- ********************************************************** -->
1827 <!-- ~~~~~ New section ~~~~~ -->
1829 <sect3 renderas="sect4" id="add-header">
1830 <title>add-header</title>
1834 <term>Typical use:</term>
1836 <para>Confuse log analysis, custom applications</para>
1841 <term>Effect:</term>
1844 Sends a user defined HTTP header to the web server.
1851 <!-- boolean, parameterized, Multi-value -->
1853 <para>Multi-value.</para>
1858 <term>Parameter:</term>
1861 Any string value is possible. Validity of the defined HTTP headers is not checked.
1862 It is recommended that you use the <quote><literal>X-</literal></quote> prefix
1872 This action may be specified multiple times, in order to define multiple
1873 headers. This is rarely needed for the typical user. If you don't know what
1874 <quote>HTTP headers</quote> are, you definitely don't need to worry about this
1881 <term>Example usage:</term>
1884 <screen>+add-header{X-User-Tracking: sucks}</screen>
1892 <!-- ~~~~~ New section ~~~~~ -->
1893 <sect3 renderas="sect4" id="block">
1894 <title>block</title>
1898 <term>Typical use:</term>
1900 <para>Block ads or other obnoxious content</para>
1905 <term>Effect:</term>
1908 Requests for URLs to which this action applies are blocked, i.e. the requests are not
1909 forwarded to the remote server, but answered locally with a substitute page or image,
1910 as determined by the <literal><link linkend="handle-as-image">handle-as-image</link></literal>
1911 and <literal><link linkend="set-image-blocker">set-image-blocker</link></literal> actions.
1918 <!-- boolean, parameterized, Multi-value -->
1920 <para>Boolean.</para>
1925 <term>Parameter:</term>
1935 <application>Privoxy</application> sends a special <quote>BLOCKED</quote> page
1936 for requests to blocked pages. This page contains links to find out why the request
1937 was blocked, and a click-through to the blocked content (the latter only if compiled with the
1938 force feature enabled). The <quote>BLOCKED</quote> page adapts to the available
1939 screen space -- it displays full-blown if space allows, or miniaturized and text-only
1940 if loaded into a small frame or window. If you are using <application>Privoxy</application>
1941 right now, you can take a look at the
1942 <ulink url="http://ads.bannerserver.example.com/nasty-ads/sponsor.html"><quote>BLOCKED</quote>
1946 A very important exception occurs if <emphasis>both</emphasis>
1947 <literal>block</literal> and <literal><link linkend="handle-as-image">handle-as-image</link></literal>
1948 apply to the same request: it will then be replaced by an image. If
1949 <literal><link linkend="set-image-blocker">set-image-blocker</link></literal>
1950 (see below) also applies, the type of image will be determined by its parameter;
1951 if not, the standard checkerboard pattern is sent.
1954 It is important to understand this process, in order
1955 to understand how <application>Privoxy</application> deals with
1956 ads and other unwanted content.
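<para>
 A sketch of the three actions working together, with a hypothetical host
 name: requests matching the pattern are blocked and answered with an
 image, and the parameter <literal>blank</literal> selects a transparent
 image in place of the checkerboard pattern.
</para>
<screen># Block, treat as image, and send a blank image as the substitute
{ +block +handle-as-image +set-image-blocker{blank} }
.ad-images.example.com</screen>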
1959 The <literal><link linkend="filter">filter</link></literal>
1960 action can perform a very similar task, by <quote>blocking</quote>
1961 banner images and other content through rewriting the relevant URLs in the
1962 document's HTML source, so they don't get requested in the first place.
1963 Note that this is a totally different technique, and it's easy to confuse the two.
1969 <term>Example usage (section):</term>
1972 <screen>{+block} # Block and replace with "blocked" page
1973 .nasty-stuff.example.com
1975 {+block +handle-as-image} # Block and replace with image
1986 <!-- ~~~~~ New section ~~~~~ -->
1987 <sect3 renderas="sect4" id="crunch-incoming-cookies">
1988 <title>crunch-incoming-cookies</title>
1992 <term>Typical use:</term>
1995 Prevent the web server from setting any cookies on your system
2001 <term>Effect:</term>
2004 Deletes any <quote>Set-Cookie:</quote> HTTP headers from server replies.
2011 <!-- Boolean, Parameterized, Multi-value -->
2013 <para>Boolean.</para>
2018 <term>Parameter:</term>
2030 This action is only concerned with <emphasis>incoming</emphasis> cookies. For
2031 <emphasis>outgoing</emphasis> cookies, use
2032 <literal><link linkend="crunch-outgoing-cookies">crunch-outgoing-cookies</link></literal>.
2033 Use <emphasis>both</emphasis> to disable cookies completely.
2036 It makes <emphasis>no sense at all</emphasis> to use this action in conjunction
2037 with the <literal><link linkend="session-cookies-only">session-cookies-only</link></literal> action,
2038 since it would prevent the session cookies from being set.
2044 <term>Example usage:</term>
2047 <screen>+crunch-incoming-cookies</screen>
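As a minimal sketch (the host name is a placeholder, not taken from the
distribution files), a section that disables cookies completely for one
site by combining this action with its outgoing counterpart could look like this:
<screen>{+crunch-incoming-cookies +crunch-outgoing-cookies}
.tracker.example.com</screen>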
2055 <!-- ~~~~~ New section ~~~~~ -->
2056 <sect3 renderas="sect4" id="crunch-outgoing-cookies">
2057 <title>crunch-outgoing-cookies</title>
2061 <term>Typical use:</term>
2064 Prevent the web server from reading any cookies from your system
2070 <term>Effect:</term>
2073 Deletes any <quote>Cookie:</quote> HTTP headers from client requests.
2080 <!-- Boolean, Parameterized, Multi-value -->
2082 <para>Boolean.</para>
2087 <term>Parameter:</term>
2099 This action is only concerned with <emphasis>outgoing</emphasis> cookies. For
2100 <emphasis>incoming</emphasis> cookies, use
2101 <literal><link linkend="crunch-incoming-cookies">crunch-incoming-cookies</link></literal>.
2102 Use <emphasis>both</emphasis> to disable cookies completely.
2105 It makes <emphasis>no sense at all</emphasis> to use this action in conjunction
2106 with the <literal><link linkend="session-cookies-only">session-cookies-only</link></literal> action,
2107 since it would prevent the session cookies from being read.
2113 <term>Example usage:</term>
2116 <screen>+crunch-outgoing-cookies</screen>
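A sketch of the opposite case, assuming outgoing cookies are crunched by
default and one (placeholder) site must still be able to read its cookies back:
<screen>{+crunch-outgoing-cookies}
/

{-crunch-outgoing-cookies}
.forum.example.org</screen>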
2125 <!-- ~~~~~ New section ~~~~~ -->
2126 <sect3 renderas="sect4" id="deanimate-gifs">
2127 <title>deanimate-gifs</title>
2131 <term>Typical use:</term>
2133 <para>Stop those annoying, distracting animated GIF images.</para>
2138 <term>Effect:</term>
2141 De-animate GIF animations, i.e. reduce them to their first or last image.
2148 <!-- boolean, parameterized, Multi-value -->
2150 <para>Parameterized.</para>
2155 <term>Parameter:</term>
2158 <quote>last</quote> or <quote>first</quote>
2167 This will also shrink the images considerably (in bytes, not pixels!). If
2168 the option <quote>first</quote> is given, the first frame of the animation
2169 is used as the replacement. If <quote>last</quote> is given, the last
2170 frame of the animation is used instead, which probably makes more sense for
2171 most banner animations, but also has the risk of not showing the entire
2172 last frame (if it is only a delta to an earlier frame).
2175 You can safely use this action with patterns that will also match non-GIF
2176 objects, because no attempt will be made at anything that doesn't look like
2183 <term>Example usage:</term>
2186 <screen>+deanimate-gifs{last}</screen>
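In section form, a sketch that keeps only the first frame for everything
served by a hypothetical banner host (the pattern may match non-GIF objects
as well, which is harmless as noted above):
<screen>{+deanimate-gifs{first}}
.banners.example.net</screen>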
2193 <!-- ~~~~~ New section ~~~~~ -->
2194 <sect3 renderas="sect4" id="downgrade-http-version">
2195 <title>downgrade-http-version</title>
2199 <term>Typical use:</term>
2201 <para>Work around (very rare) problems with HTTP/1.1</para>
2206 <term>Effect:</term>
2209 Downgrades HTTP/1.1 client requests and server replies to HTTP/1.0.
2216 <!-- boolean, parameterized, Multi-value -->
2218 <para>Boolean.</para>
2223 <term>Parameter:</term>
2235 This is a left-over from the time when <application>Privoxy</application>
2236 didn't support important HTTP/1.1 features well. It is left here for the
2237 unlikely case that you experience HTTP/1.1 related problems with some server
2238 out there. Not all (optional) HTTP/1.1 features are supported yet, so there
2239 is a chance you might need this action.
2245 <term>Example usage (section):</term>
2248 <screen>{+downgrade-http-version}
2249 problem-host.example.com</screen>
2257 <!-- ~~~~~ New section ~~~~~ -->
2258 <sect3 renderas="sect4" id="fast-redirects">
2259 <title>fast-redirects</title>
2263 <term>Typical use:</term>
2265 <para>Fool some click-tracking scripts and speed up indirect links</para>
2270 <term>Effect:</term>
2273 Cut off all but the last valid URL from requests.
2280 <!-- boolean, parameterized, Multi-value -->
2282 <para>Boolean.</para>
2287 <term>Parameter:</term>
2299 Many sites, like yahoo.com, don't just link to other sites. Instead, they
2300 will link to some script on their own servers, giving the destination as a
2301 parameter, which will then redirect you to the final target. URLs
2302 resulting from this scheme typically look like:
2303 <emphasis>http://some.place/click-tracker.cgi?target=http://some.where.else</emphasis>.
2306 Sometimes, there are even multiple consecutive redirects encoded in the
2307 URL. These redirections via scripts make your web browsing more traceable,
2308 since the server from which you follow such a link can see where you go
2309 to. Apart from that, valuable bandwidth and time are wasted while your
2310 browser asks the server for one redirect after the other. Plus, it feeds
2314 This feature is currently not very smart and is scheduled for improvement.
2315 It is likely to break some sites. Expect to need many exceptions
2316 to this action if it is enabled by default in
2317 <filename>default.action</filename>. Some sites just don't work without
2324 <term>Example usage:</term>
2327 <screen>{+fast-redirects}</screen>
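To illustrate: with this action in force, a request for
http://some.place/click-tracker.cgi?target=http://some.where.else
would be reduced to http://some.where.else. Since the action tends to break
sites, a typical setup might enable it for all URLs and then list exceptions.
The following is only a sketch, and the host name is a placeholder:
<screen>{+fast-redirects}
/

{-fast-redirects}
login.example.org</screen>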
2336 <!-- ~~~~~ New section ~~~~~ -->
2337 <sect3 renderas="sect4" id="filter">
2338 <title>filter</title>
2342 <term>Typical use:</term>
2344 <para>Get rid of HTML and JavaScript annoyances, banner advertisements (by size), do fun text replacements, etc.</para>
2349 <term>Effect:</term>
2352 Text documents, including HTML and JavaScript, to which this action
2353 applies, are filtered on-the-fly through the specified regular expression
2354 based substitutions.
2361 <!-- boolean, parameterized, Multi-value -->
2363 <para>Parameterized.</para>
2368 <term>Parameter:</term>
2371 The name of a filter, as defined in the <link linkend="filter-file">filter file</link>
2372 (typically <filename>default.filter</filename>, set by the
2373 <literal><link linkend="filterfile">filterfile</link></literal>
2374 option in the <link linkend="config">config file</link>). Filtering
2375 can be completely disabled by using <literal>-filter</literal> without a parameter.
2384 For your convenience, there are a number of pre-defined filters available
2385 in the distribution filter file that you can use. See the examples below for
2389 This is potentially a very powerful feature! But <quote>rolling your own</quote>
2390 filters requires a knowledge of regular expressions and HTML.
2393 Filtering requires buffering the page content, which may appear to
2394 slow down page rendering since nothing is displayed until all content has
2395 passed the filters. (It does not really take longer, but seems that way
2396 since the page is not incrementally displayed.) This effect will be more
2397 noticeable on slower connections.
2400 The amount of data that can be filtered is limited to the
2401 <literal><link linkend="buffer-limit">buffer-limit</link></literal>
2402 option in the main <link linkend="config">config file</link>. The
2403 default is 4096 KB (4 MB). Once this limit is exceeded, the buffered
2404 data, and all pending data, is passed through unfiltered. Inappropriate
2405 MIME types are not filtered.
2408 At this time, <application>Privoxy</application> cannot (yet!) uncompress compressed
2409 documents. If you want filtering to work on all documents, even those that
2410 would normally be sent compressed, use the
2411 <literal><link linkend="prevent-compression">prevent-compression</link></literal>
2412 action in conjunction with <literal>filter</literal>.
2415 Filtering can achieve some of the same effects as the
2416 <literal><link linkend="block">block</link></literal>
2417 action, i.e. it can be used to block ads and banners. But the mechanism
2418 works quite differently. One effective use is to block ad banners
2419 based on their size (see below), since many of these seem to be somewhat
2423 <link linkend="contact">Feedback</link> with suggestions for new or
2424 improved filters is particularly welcome!
2430 <term>Example usage (with filters from the distribution <filename>default.filter</filename> file):</term>
2433 <anchor id="filter-html-annoyances">
2434 <screen>+filter{html-annoyances} # Get rid of particularly annoying HTML abuse.</screen>
2437 <anchor id="filter-js-annoyances">
2438 <screen>+filter{js-annoyances} # Get rid of particularly annoying JavaScript abuse</screen>
2441 <anchor id="filter-banners-by-size">
2442 <screen>+filter{banners-by-size} # Kill banners based on their size for this page (<emphasis>very</emphasis> efficient!)</screen>
2445 <anchor id="filter-content-cookies">
2446 <screen>+filter{content-cookies} # Kill cookies that come sneaking in the HTML or JS content</screen>
2449 <anchor id="filter-popups">
2450 <screen>+filter{popups} # Kill all popups in JS and HTML</screen>
2453 <anchor id="filter-webbugs">
2454 <screen>+filter{webbugs} # Squish WebBugs (1x1 invisible GIFs used for user tracking)</screen>
2457 <anchor id="filter-fun">
2458 <screen>+filter{fun} # Text replacements for subversive browsing fun!</screen>
2461 <anchor id="filter-frameset-borders">
2462 <screen>+filter{frameset-borders} # Give frames a border and make them resizeable</screen>
2465 <anchor id="filter-refresh-tags">
2466 <screen>+filter{refresh-tags} # Kill automatic refresh tags (for dial-on-demand setups)</screen>
2469 <anchor id="filter-nimda">
2470 <screen>+filter{nimda} # Remove Nimda (virus) code.</screen>
2473 <anchor id="filter-shockwave-flash">
2474 <screen>+filter{shockwave-flash} # Kill embedded Shockwave Flash objects</screen>
2477 <anchor id="filter-crude-parental">
2478 <screen>+filter{crude-parental} # Kill all web pages that contain the words "sex" or "warez"</screen>
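Several filters can be combined in one section. The following sketch, which
assumes you want these two filters for all URLs, also enables
prevent-compression so that compressed documents can be filtered too:
<screen>{+prevent-compression +filter{html-annoyances} +filter{banners-by-size}}
/</screen>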
2486 <!-- ~~~~~ New section ~~~~~ -->
2487 <sect3 renderas="sect4" id="handle-as-image">
2488 <title>handle-as-image</title>
2492 <term>Typical use:</term>
2494 <para>Mark URLs as belonging to images (so they'll be replaced by images <emphasis>if they get blocked</emphasis>)</para>
2499 <term>Effect:</term>
2502 This action alone doesn't do anything noticeable. It just marks URLs as images.
2503 If the <literal><link linkend="block">block</link></literal> action <emphasis>also applies</emphasis>,
2504 the presence or absence of this mark decides whether an HTML <quote>blocked</quote>
2505 page, or a replacement image (as determined by the <literal><link
2506 linkend="set-image-blocker">set-image-blocker</link></literal> action) will be sent to the
2507 client as a substitute for the blocked content.
2514 <!-- Boolean, Parameterized, Multi-value -->
2516 <para>Boolean.</para>
2521 <term>Parameter:</term>
2533 The below generic example section is actually part of <filename>default.action</filename>.
2534 It marks all URLs with well-known image file name extensions as images and should
2538 Users will probably only want to use the handle-as-image action in conjunction with
2539 <literal><link linkend="block">block</link></literal>, to block sources of banners, whose URLs don't
2540 reflect the file type, like in the second example section.
2543 Note that you cannot treat HTML pages as images in most cases. For instance, (in-line) ad
2544 frames require an HTML page to be sent, or they won't display properly.
2545 Forcing <literal>handle-as-image</literal> in this situation will not replace the
2546 ad frame with an image, but lead to error messages.
2552 <term>Example usage (sections):</term>
2555 <screen># Generic image extensions:
2558 /.*\.(gif|jpg|jpeg|png|bmp|ico)$
2560 # These don't look like images, but they're banners and should be
2561 # blocked as images:
2563 {+block +handle-as-image}
2564 some.nasty-banner-server.com/junk.cgi?output=trash
2566 # Banner source! Who cares if they also have non-image content?
2576 <!-- ~~~~~ New section ~~~~~ -->
2577 <sect3 renderas="sect4" id="hide-forwarded-for-headers">
2578 <title>hide-forwarded-for-headers</title>
2582 <term>Typical use:</term>
2584 <para>Improve privacy by hiding the true source of the request</para>
2589 <term>Effect:</term>
2592 Deletes any existing <quote>X-Forwarded-for:</quote> HTTP header from client requests,
2593 and prevents adding a new one.
2600 <!-- Boolean, Parameterized, Multi-value -->
2602 <para>Boolean.</para>
2607 <term>Parameter:</term>
2619 It is fairly safe to leave this on.
2622 This action is scheduled for improvement: It should be able to generate forged
2623 <quote>X-Forwarded-for:</quote> headers using random IP addresses from a specified network,
2624 to make successive requests from the same client look like requests from a pool of different
2625 users sharing the same proxy.
2631 <term>Example usage:</term>
2634 <screen>+hide-forwarded-for-headers</screen>
2642 <!-- ~~~~~ New section ~~~~~ -->
2643 <sect3 renderas="sect4" id="hide-from-header">
2644 <title>hide-from-header</title>
2648 <term>Typical use:</term>
2650 <para>Keep your (old and ill) browser from telling web servers your email address</para>
2655 <term>Effect:</term>
2658 Deletes any existing <quote>From:</quote> HTTP header, or replaces it with the
2666 <!-- Boolean, Parameterized, Multi-value -->
2668 <para>Parameterized.</para>
2673 <term>Parameter:</term>
2676 Keyword: <quote>block</quote>, or any user defined value.
2685 The keyword <quote>block</quote> will completely remove the header
2686 (not to be confused with the <literal><link linkend="block">block</link></literal>
2690 Alternatively, you can specify any value you prefer to be sent to the web
2691 server. If you do, it is a matter of fairness not to use any address that
2692 is actually used by a real person.
2695 This action is rarely needed, as modern web browsers don't send
2696 <quote>From:</quote> headers anymore.
2702 <term>Example usage:</term>
2705 <screen>+hide-from-header{block}</screen> or
2706 <screen>+hide-from-header{spam-me-senseless@sittingduck.example.com}</screen>
2714 <!-- ~~~~~ New section ~~~~~ -->
2715 <sect3 renderas="sect4" id="hide-referrer">
2716 <title>hide-referrer</title>
2717 <anchor id="hide-referer">
2720 <term>Typical use:</term>
2722 <para>Conceal which link you followed to get to a particular site</para>
2727 <term>Effect:</term>
2730 Deletes the <quote>Referer:</quote> (sic) HTTP header from the client request,
2731 or replaces it with a forged one.
2738 <!-- Boolean, Parameterized, Multi-value -->
2740 <para>Parameterized.</para>
2745 <term>Parameter:</term>
2749 <para><quote>block</quote> to delete the header completely.</para>
2752 <para><quote>forge</quote> to pretend to be coming from the homepage of the server we are talking to.</para>
2755 <para>Any other string to set a user defined referrer.</para>
2765 <quote>forge</quote> is the preferred option here, since some servers will
2766 not send images back otherwise, in an attempt to prevent their valuable
2767 content from being embedded elsewhere (and hence, without being surrounded
2768 by <emphasis>their</emphasis> banners).
2771 <literal>hide-referer</literal> is an alternate spelling of
2772 <literal>hide-referrer</literal>, and the two can be freely
2773 substituted for each other. (<quote>referrer</quote> is the
2774 correct English spelling; however, the HTTP specification has a bug: it
2775 requires it to be spelled <quote>referer</quote>.)
2781 <term>Example usage:</term>
2784 <screen>+hide-referrer{forge}</screen> or
2785 <screen>+hide-referrer{http://www.yahoo.com/}</screen>
2793 <!-- ~~~~~ New section ~~~~~ -->
2794 <sect3 renderas="sect4" id="hide-user-agent">
2795 <title>hide-user-agent</title>
2799 <term>Typical use:</term>
2801 <para>Conceal your type of browser and client operating system</para>
2806 <term>Effect:</term>
2809 Replaces the value of the <quote>User-Agent:</quote> HTTP header
2810 in client requests with the specified value.
2817 <!-- Boolean, Parameterized, Multi-value -->
2819 <para>Parameterized.</para>
2824 <term>Parameter:</term>
2827 Any user-defined string.
2837 This breaks many web sites that depend on looking at this header in order
2838 to customize their content for different browsers (which, by the
2839 way, is <emphasis>NOT</emphasis> a <ulink
2840 url="http://www.javascriptkit.com/javaindex.shtml">smart way to do
2845 Using this action in multi-user setups or wherever different types of
2846 browsers will access the same <application>Privoxy</application> is
2847 <emphasis>not recommended</emphasis>. In single-user, single-browser
2848 setups, you might use it to delete your OS version information from
2849 the headers, because it is an invitation to exploit known bugs for your
2850 OS. It is also occasionally useful to forge this in order to access
2851 sites that won't let you in otherwise (though there may be a good
2852 reason in some cases). Example of this: some MSN sites will not
2853 let <application>Mozilla</application> enter, yet forging to a
2854 <application>Netscape 6.1</application> user-agent works just fine.
2855 (Must be just a silly MS goof, I'm sure :-).
2858 This action is scheduled for improvement.
2864 <term>Example usage:</term>
2867 <screen>+hide-user-agent{Netscape 6.1 (X11; I; Linux 2.4.18 i686)}</screen>
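Because forging this header can break sites, it is usually safer to limit it
to the hosts that need it. A sketch, using a placeholder host name for a site
that refuses your real browser:
<screen>{+hide-user-agent{Netscape 6.1 (X11; I; Linux 2.4.18 i686)}}
.picky-site.example.com</screen>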
2875 <!-- ~~~~~ New section ~~~~~ -->
2876 <sect3 renderas="sect4" id="kill-popups">
2877 <title>kill-popups<anchor id="kill-popup"></title>
2881 <term>Typical use:</term>
2883 <para>Eliminate those annoying pop-up windows</para>
2888 <term>Effect:</term>
2891 While loading the document, replace JavaScript code that opens
2892 pop-up windows with (syntactically neutral) dummy code on the fly.
2899 <!-- Boolean, Parameterized, Multi-value -->
2901 <para>Boolean.</para>
2906 <term>Parameter:</term>
2918 This action is basically a built-in, hardwired, special-purpose <literal><link linkend="filter">filter</link></literal>
2919 action, but there are important differences: For <literal>kill-popups</literal>,
2920 the document need not be buffered, so it can be incrementally rendered while
2921 downloading. But <literal>kill-popups</literal> doesn't catch as many pop-ups as
2923 linkend="filter">filter</link>{<replaceable>popups</replaceable>}</literal>
2927 Think of it as a fast and efficient replacement for a filter that you
2928 can use if you don't want any filtering at all. Note that it doesn't make
2929 sense to combine it with any <literal><link linkend="filter">filter</link></literal> action,
2930 since as soon as one <literal><link linkend="filter">filter</link></literal> applies,
2931 the whole document needs to be buffered anyway, which destroys the advantage of
2932 the <literal>kill-popups</literal> action over its filter equivalent.
2935 Killing all pop-ups is a dangerous business. Many shops and banks rely on
2936 pop-ups to display forms, shopping carts etc, and killing only the unwanted pop-ups
2937 would require artificial intelligence in <application>Privoxy</application>.
2938 If the only kind of pop-ups that you want to kill are exit consoles (those
2939 <emphasis>really nasty</emphasis> windows that appear when you close another
2940 one), you might want to use
2942 linkend="filter">filter</link>{<replaceable>js-annoyances</replaceable>}</literal>
2948 An alternate spelling is <literal>+kill-popup</literal>, which is
2956 <term>Example usage:</term>
2958 <para><screen>+kill-popups</screen></para>
2965 <!-- ~~~~~ New section ~~~~~ -->
2966 <sect3 renderas="sect4" id="limit-connect">
2967 <title>limit-connect</title>
2971 <term>Typical use:</term>
2973 <para>Prevent abuse of <application>Privoxy</application> as a TCP proxy relay</para>
2978 <term>Effect:</term>
2981 Specifies to which ports HTTP CONNECT requests are allowable.
2988 <!-- Boolean, Parameterized, Multi-value -->
2990 <para>Parameterized.</para>
2995 <term>Parameter:</term>
2998 A comma-separated list of ports or port ranges (the latter using dashes, with the minimum
2999 defaulting to 0 and the maximum to 65535).
3008 By default, i.e. if no <literal>limit-connect</literal> action applies,
3009 <application>Privoxy</application> only allows HTTP CONNECT
3010 requests to port 443 (the standard, secure HTTPS port). Use
3011 <literal>limit-connect</literal> if more fine-grained control is desired
3012 for some or all destinations.
3015 The CONNECT method exists in HTTP to allow access to secure websites
3016 (<quote>https://</quote> URLs) through proxies. It works very simply:
3017 the proxy connects to the server on the specified port, and then
3018 short-circuits its connections to the client and to the remote server.
3019 This can be a big security hole, since CONNECT-enabled proxies can be
3020 abused as TCP relays very easily.
3023 If you don't know what any of this means, there probably is no reason to
3024 change this one, since the default is already very restrictive.
3030 <term>Example usages:</term>
3032 <!-- I had trouble getting the spacing to look right in my browser -->
3033 <!-- I probably have the wrong font setup, bollocks. -->
3034 <!-- Apparently the emphasis tag uses a proportional font no matter what -->
3036 <screen>+limit-connect{443} # This is the default and need not be specified.
3037 +limit-connect{80,443} # Ports 80 and 443 are OK.
3038 +limit-connect{-3, 7, 20-100, 500-} # Ports less than 3, 7, 20 to 100 and above 500 are OK.
3039 +limit-connect{-} # All ports are OK (gaping security hole!)</screen>
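In section form, the action is applied to URL patterns like any other. A
sketch that additionally permits CONNECT to port 8443 for a single host
(both the host name and the extra port are placeholders):
<screen>{+limit-connect{443,8443}}
secure.example.com</screen>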
3046 <!-- ~~~~~ New section ~~~~~ -->
3047 <sect3 renderas="sect4" id="prevent-compression">
3048 <title>prevent-compression</title>
3052 <term>Typical use:</term>
3055 Ensure that servers send the content uncompressed, so it can be
3056 passed through <literal><link linkend="filter">filter</link></literal>s
3062 <term>Effect:</term>
3065 Adds a header to the request that asks for uncompressed transfer.
3072 <!-- Boolean, Parameterized, Multi-value -->
3074 <para>Boolean.</para>
3079 <term>Parameter:</term>
3091 More and more websites send their content compressed by default, which
3092 is generally a good idea and saves bandwidth. But for the <literal><link
3093 linkend="filter">filter</link></literal>, <literal><link linkend="deanimate-gifs">deanimate-gifs</link></literal>
3094 and <literal><link linkend="kill-popups">kill-popups</link></literal> actions to work,
3095 <application>Privoxy</application> needs access to the uncompressed data.
3096 Unfortunately, <application>Privoxy</application> can't yet(!) uncompress, filter, and
3097 re-compress the content on the fly. So if you want to ensure that all websites, including
3098 those that normally compress, can be filtered, you need to use this action.
3101 This will slow down transfers from those websites, though. If you use any of the above-mentioned
3102 actions, you will typically want to use <literal>prevent-compression</literal> in conjunction
3106 Note that some (rare) ill-configured sites don't handle requests for uncompressed
3107 documents correctly (they send an empty document body). If you use <literal>prevent-compression</literal>
3108 by default, you'll have to add exceptions for those sites. See the example for how to do that.
3114 <term>Example usage (sections):</term>
3117 <screen># Set default:
3119 {+prevent-compression}
3122 # Make exceptions for ill sites:
3124 {-prevent-compression}
3126 www.pclinuxonline.com</screen>
3135 <!-- ~~~~~ New section ~~~~~ -->
3136 <sect3 renderas="sect4" id="send-vanilla-wafer">
3137 <title>send-vanilla-wafer</title>
3141 <term>Typical use:</term>
3144 Feed log analysis scripts with useless data.
3150 <term>Effect:</term>
3153 Sends a cookie with each request stating that you do not accept any copyright
3154 on cookies sent to you, and asking the site operator not to track you.
3161 <!-- Boolean, Parameterized, Multi-value -->
3163 <para>Boolean.</para>
3168 <term>Parameter:</term>
3180 The vanilla wafer is a (relatively) unique header and could conceivably be used to track you.
3183 This action is rarely used and not enabled in the default configuration.
3189 <term>Example usage:</term>
3192 <screen>+send-vanilla-wafer</screen>
3201 <!-- ~~~~~ New section ~~~~~ -->
3202 <sect3 renderas="sect4" id="send-wafer">
3203 <title>send-wafer</title>
3207 <term>Typical use:</term>
3210 Send custom cookies or feed log analysis scripts with even more useless data.
3216 <term>Effect:</term>
3219 Sends a custom, user-defined cookie with each request.
3226 <!-- Boolean, Parameterized, Multi-value -->
3228 <para>Multi-value.</para>
3233 <term>Parameter:</term>
3236 A string of the form <quote><replaceable class="option">name</replaceable>=<replaceable
3237 class="parameter">value</replaceable></quote>.
3246 Being multi-valued, multiple instances of this action can apply to the same request,
3247 resulting in multiple cookies being sent.
3250 This action is rarely used and not enabled in the default configuration.
3255 <term>Example usage (section):</term>
3258 <screen>{+send-wafer{UsingPrivoxy=true}}
3259 my-internal-testing-server.void</screen>
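Since the action is multi-valued, it can be given more than once in the same
section. A sketch that sends two cookies (the second cookie name and value
are made up for illustration):
<screen>{+send-wafer{UsingPrivoxy=true} +send-wafer{Mood=curious}}
my-internal-testing-server.void</screen>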
3267 <!-- ~~~~~ New section ~~~~~ -->
3268 <sect3 renderas="sect4" id="session-cookies-only">
3269 <title>session-cookies-only</title>
3273 <term>Typical use:</term>
3276 Allow only temporary <quote>session</quote> cookies (for the current browser session <emphasis>only</emphasis>).
3282 <term>Effect:</term>
3285 Deletes the <quote>expires</quote> field from <quote>Set-Cookie:</quote> server headers.
3286 Most browsers will not store such cookies permanently and will forget them between sessions.
3293 <!-- Boolean, Parameterized, Multi-value -->
3295 <para>Boolean.</para>
3300 <term>Parameter:</term>
3312 This is less strict than <literal><link linkend="crunch-incoming-cookies">crunch-incoming-cookies</link></literal> /
3313 <literal><link linkend="crunch-outgoing-cookies">crunch-outgoing-cookies</link></literal> and allows you to browse
3314 websites that insist on or rely on setting cookies, without compromising your privacy too badly.
3317 Most browsers will not permanently store cookies that have been processed by
3318 <literal>session-cookies-only</literal> and will forget about them between sessions.
3319 This makes profiling cookies useless, but won't break sites which require cookies so
3320 that you can log in for transactions. This is generally turned on for all
3321 sites, and is the recommended setting.
3324 It makes <emphasis>no sense at all</emphasis> to use <literal>session-cookies-only</literal>
3325 together with <literal><link linkend="crunch-incoming-cookies">crunch-incoming-cookies</link></literal> or
3326 <literal><link linkend="crunch-outgoing-cookies">crunch-outgoing-cookies</link></literal>. If you do, cookies
3327 will simply be killed.
3330 Note that it is up to the browser how it handles such cookies without an <quote>expires</quote>
3331 field. If you use an exotic browser, you might want to try it out to be sure.
3337 <term>Example usage:</term>
3340 <screen>+session-cookies-only</screen>
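A sketch of the recommended setup, with the action turned on for all sites
and one exception for a (placeholder) site where you want cookies to persist
between sessions:
<screen>{+session-cookies-only}
/

{-session-cookies-only}
.my-trusted-shop.example.com</screen>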
3348 <!-- ~~~~~ New section ~~~~~ -->
3349 <sect3 renderas="sect4" id="set-image-blocker">
3350 <title>set-image-blocker</title>
3354 <term>Typical use:</term>
3356 <para>Choose the replacement for blocked images</para>
3361 <term>Effect:</term>
3364 This action alone doesn't do anything noticeable. If <emphasis>both</emphasis>
3365 <literal><link linkend="block">block</link></literal> <emphasis>and</emphasis> <literal><link
3366 linkend="handle-as-image">handle-as-image</link></literal> <emphasis>also</emphasis>
3367 apply, i.e. if the request is to be blocked as an image,
3368 <emphasis>then</emphasis> the parameter of this action decides what will be
3369 sent as a replacement.
3376 <!-- Boolean, Parameterized, Multi-value -->
3378 <para>Parameterized.</para>
3383 <term>Parameter:</term>
3388 <quote>pattern</quote> to send a built-in checkerboard pattern image. The image is visually
3389 decent, scales very well, and makes it obvious where banners were busted.
3394 <quote>blank</quote> to send a built-in transparent image. This makes banners disappear
3395 completely, but makes it hard to detect where <application>Privoxy</application> has blocked
3396 images on a given page and complicates troubleshooting if <application>Privoxy</application>
3397 has blocked innocent images, like navigation icons.
3402 <quote><replaceable class="parameter">target-url</replaceable></quote> to
3403 send a redirect to <replaceable class="parameter">target-url</replaceable>. You can redirect
3404 to any image anywhere, even in your local filesystem (via <quote>file:///</quote> URL).
3407 A good application of redirects is to use special <application>Privoxy</application>-built-in
3408 URLs, which send the built-in images, as <replaceable class="parameter">target-url</replaceable>.
3409 This has the same visual effect as specifying <quote>blank</quote> or <quote>pattern</quote> in
3410 the first place, but enables your browser to cache the replacement image, instead of requesting
3411 it over and over again.
3422 The URLs for the built-in images are <quote>http://config.privoxy.org/send-banner?type=<replaceable
3423 class="parameter">type</replaceable></quote>, where <replaceable class="parameter">type</replaceable> is
3424 either <quote>blank</quote> or <quote>pattern</quote>.
3427 There is a third (advanced) type, called <quote>auto</quote>. It is <emphasis>NOT</emphasis> to be
3428 used in <literal>set-image-blocker</literal>, but meant for use from <link linkend="filter-file">filters</link>.
3429 Auto will select the type of image that would have applied to the referring page, had it been an image.
3435 <term>Example usage:</term>
3441 <screen>+set-image-blocker{pattern}</screen>
3444 Redirect to the BSD devil:
3447 <screen>+set-image-blocker{http://www.freebsd.org/gifs/dae_up3.gif}</screen>
3450 Redirect to the built-in pattern for better caching:
3453 <screen>+set-image-blocker{http://config.privoxy.org/send-banner?type=pattern}</screen>
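Since the parameter only takes effect for requests that are blocked as images,
a complete section needs all three actions to apply. A sketch with a
placeholder host name:
<screen>{+block +handle-as-image +set-image-blocker{blank}}
ads.example.net</screen>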
3461 <!-- ~~~~~ New section ~~~~~ -->
3463 <title>Summary</title>
3465 Note that many of these actions have the potential to cause a page to
3466 misbehave, possibly even not to display at all. Site designers build
3467 their sites in many different ways, and may depend on particular HTTP header
3468 content and other criteria. There are no hard
3469 and fast rules that work for all sites. See the <link
3470 linkend="ACTIONSANAT">Appendix</link> for a brief example on troubleshooting
3476 <!-- ~~~~~ New section ~~~~~ -->
3477 <sect2 id="aliases">
3478 <title>Aliases</title>
3480 Custom <quote>actions</quote>, known to <application>Privoxy</application>
3481 as <quote>aliases</quote>, can be defined by combining other actions.
3482 These can in turn be invoked just like the built-in actions.
3483 Currently, an alias name can contain any character except space, tab,
3485 <quote>{</quote> and <quote>}</quote>, but we <emphasis>strongly
3486 recommend</emphasis> that you only use <quote>a</quote> to <quote>z</quote>,
3487 <quote>0</quote> to <quote>9</quote>, <quote>+</quote>, and <quote>-</quote>.
3488 Alias names are not case sensitive, and are not required to start with a
3489 <quote>+</quote> or <quote>-</quote> sign, since they are merely textually
3493 Aliases can be used throughout the actions file, but they <emphasis>must be
3494 defined in a special section at the top of the file!</emphasis>
3495 And there can only be one such section per actions file. Each actions file may
3496 have its own alias section, and the aliases defined in it are only visible
3500 There are two main reasons to use aliases: One is to save typing for frequently
3501 used combinations of actions, the other one is a gain in flexibility: If you
3502 decide once how you want to handle shops by defining an alias called
3503 <quote>shop</quote>, you can later change your policy on shops in
3504 <emphasis>one</emphasis> place, and your changes will take effect everywhere
3505 in the actions file where the <quote>shop</quote> alias is used. Calling aliases
3506 by their purpose also makes your actions files more readable.
3509 Currently, there is one big drawback to using aliases, though:
3510 <application>Privoxy</application>'s built-in web-based action file
3511 editor honors aliases when reading the actions files, but it expands
3512 them before writing. So the effects of your aliases are of course preserved,
3513 but the aliases themselves are lost when you edit sections that use aliases
3515 This is likely to change in future versions of <application>Privoxy</application>.
3519 Now let's define some aliases...
3524 # Useful custom aliases we can use later.
3526 # Note the (required!) section header line and that this section
3527 # must be at the top of the actions file!
3531 # These aliases just save typing later:
3532 # (Note that some already use other aliases!)
3534 +crunch-all-cookies = +crunch-incoming-cookies +crunch-outgoing-cookies
3535 -crunch-all-cookies = -crunch-incoming-cookies -crunch-outgoing-cookies
3536 block-as-image = +block +handle-as-image
3537 mercy-for-cookies = -crunch-all-cookies -session-cookies-only
3539 # These aliases define combinations of actions
3540 # that are useful for certain types of sites:
3542 fragile = -block -crunch-all-cookies -filter -fast-redirects -hide-referer -kill-popups
3543 shop = -crunch-all-cookies -filter{popups} -kill-popups
3545 # Short names for other aliases, for really lazy people ;-)
3547 c0 = +crunch-all-cookies
3548 c1 = -crunch-all-cookies</screen>
3552 ...and put them to use. These sections would appear in the lower part of an
3553 actions file and define exceptions to the default actions (as specified further
3554 up for the <quote>/</quote> pattern):
3559 # These sites are either very complex or very keen on
3560 # user data and require minimal interference to work:
3563 .office.microsoft.com
3564 .windowsupdate.microsoft.com
3568 # Allow cookies (for setting and retrieving your customer data)
3572 .worldpay.com # for quietpc.com
3575 # These shops require pop-ups:
3577 {shop -kill-popups -filter{popups}}
3579 .overclockers.co.uk</screen>
3583 Aliases like <quote>shop</quote> and <quote>fragile</quote> are often used for
3584 <quote>problem</quote> sites that require some actions to be disabled
3585 in order to function properly.
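<para>
 As a minimal sketch of the <quote>change it in one place</quote> idea, a
 complete but tiny actions file might look like the following. The alias
 name <literal>news</literal> and both host names are made up for
 illustration; only the actions themselves are taken from the examples above:
</para>
<para>
 <screen>
{{alias}}
# Hypothetical alias: be lenient with cookies and pop-up filtering
news = -crunch-incoming-cookies -crunch-outgoing-cookies -filter{popups}

# Hypothetical sites that should get the "news" treatment
{ news }
.news.example.com
.example.org</screen>
</para>
<para>
 If you later change the right-hand side of the <literal>news</literal>
 definition, the handling of both sites changes with it.
</para>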
3589 <!-- ~~~~~ New section ~~~~~ -->
3590 <sect2 id="act-examples">
3591 <title>Actions Files Tutorial</title>
3593 The above chapters have shown <link linkend="actions-file">which actions files
3594 there are and how they are organized</link>, how actions are <link
3595 linkend="actions">specified</link> and <link linkend="actions-apply">applied
3596 to URLs</link>, how <link linkend="af-patterns">patterns</link> work, and how to
3597 define and use <link linkend="aliases">aliases</link>. Now, let's look at an
3598 example <filename>default.action</filename> and <filename>user.action</filename>
3599 file and see how all these pieces come together:
3602 <sect3><title>default.action</title>
3605 Every config file should start with a short comment stating its purpose:
<screen># Sample default.action file &lt;developers@privoxy.org&gt;</screen>
3613 Then, since this is the <filename>default.action</filename> file, the
3614 first section is a special section for internal use that you needn't
3615 change or worry about:
3620 ##########################################################################
3621 # Settings -- Don't change! For internal Privoxy use ONLY.
3622 ##########################################################################
3625 for-privoxy-version=3.0</screen>
3629 After that comes the (optional) alias section. We'll use the example
3630 section from the above <link linkend="aliases">chapter on aliases</link>,
3631 that also explains why and how aliases are used:
3636 ##########################################################################
3638 ##########################################################################
3641 # These aliases just save typing later:
3642 # (Note that some already use other aliases!)
3644 +crunch-all-cookies = +crunch-incoming-cookies +crunch-outgoing-cookies
3645 -crunch-all-cookies = -crunch-incoming-cookies -crunch-outgoing-cookies
3646 block-as-image = +block +handle-as-image
3647 mercy-for-cookies = -crunch-all-cookies -session-cookies-only
3649 # These aliases define combinations of actions
3650 # that are useful for certain types of sites:
3652 fragile = -block -crunch-all-cookies -filter -fast-redirects -hide-referer -kill-popups
3653 shop = mercy-for-cookies -filter{popups} -kill-popups</screen>
3657 Now come the regular sections, i.e. sets of actions, accompanied
3658 by URL patterns to which they apply. Remember <emphasis>all actions
3659 are disabled when matching starts</emphasis>, so we have to explicitly
3660 enable the ones we want.
3664 The first regular section is probably the most important. It has only
3665 one pattern, <quote><literal>/</literal></quote>, but this pattern
3666 <link linkend="af-patterns">matches all URLs</link>. Therefore, the
3667 set of actions used in this <quote>default</quote> section <emphasis>will
3668 be applied to all requests as a start</emphasis>. It can be partly or
3669 wholly overridden by later matches further down this file, or in user.action,
3670 but it will still be largely responsible for your overall browsing
3675 Again, at the start of matching, all actions are disabled, so there is
3676 no real need to disable any actions here, but we will do that nonetheless,
3677 to have a complete listing for your reference. (Remember: a <quote>+</quote>
3678 preceding the action name enables the action, a <quote>-</quote> disables!).
3679 Also note how this long line has been made more readable by splitting it into
3680 multiple lines with line continuation.
3685 ##########################################################################
3686 # "Defaults" section:
3687 ##########################################################################
3689 -<link linkend="ADD-HEADER">add-header</link> \
3690 -<link linkend="BLOCK">block</link> \
3691 -<link linkend="CRUNCH-INCOMING-COOKIES">crunch-incoming-cookies</link> \
3692 -<link linkend="CRUNCH-OUTGOING-COOKIES">crunch-outgoing-cookies</link> \
3693 +<link linkend="DEANIMATE-GIFS">deanimate-gifs</link> \
3694 -<link linkend="DOWNGRADE-HTTP-VERSION">downgrade-http-version</link> \
3695 +<link linkend="FAST-REDIRECTS">fast-redirects</link> \
3696 +<link linkend="FILTER-HTML-ANNOYANCES">filter{html-annoyances}</link> \
3697 +<link linkend="FILTER-JS-ANNOYANCES">filter{js-annoyances}</link> \
3698 -<link linkend="FILTER-CONTENT-COOKIES">filter{content-cookies}</link> \
3699 +<link linkend="FILTER-POPUPS">filter{popups}</link> \
3700 +<link linkend="FILTER-WEBBUGS">filter{webbugs}</link> \
3701 -<link linkend="FILTER-REFRESH-TAGS">filter{refresh-tags}</link> \
3702 -<link linkend="FILTER-FUN">filter{fun}</link> \
3703 +<link linkend="FILTER-NIMDA">filter{nimda}</link> \
3704 +<link linkend="FILTER-BANNERS-BY-SIZE">filter{banners-by-size}</link> \
3705 -<link linkend="FILTER-SHOCKWAVE-FLASH">filter{shockwave-flash}</link> \
3706 -<link linkend="FILTER-CRUDE-PARENTAL">filter{crude-parental}</link> \
3707 -<link linkend="HANDLE-AS-IMAGE">handle-as-image</link> \
3708 +<link linkend="HIDE-FORWARDED-FOR-HEADERS">hide-forwarded-for-headers</link> \
3709 +<link linkend="HIDE-FROM-HEADER">hide-from-header{block}</link> \
3710 +<link linkend="HIDE-REFERER">hide-referrer{forge}</link> \
3711 -<link linkend="HIDE-USER-AGENT">hide-user-agent</link> \
3712 -<link linkend="KILL-POPUPS">kill-popups</link> \
3713 -<link linkend="LIMIT-CONNECT">limit-connect</link> \
3714 +<link linkend="PREVENT-COMPRESSION">prevent-compression</link> \
3715 -<link linkend="SEND-VANILLA-WAFER">send-vanilla-wafer</link> \
3716 -<link linkend="SEND-WAFER">send-wafer</link> \
3717 +<link linkend="SESSION-COOKIES-ONLY">session-cookies-only</link> \
3718 +<link linkend="SET-IMAGE-BLOCKER">set-image-blocker{pattern}</link> \
3720 / # forward slash will match *all* potential URL patterns.</screen>
3724 The default behavior is now set. Note that some actions, like not hiding
3725 the user agent, are part of a <quote>general policy</quote> that applies
3726 universally and won't get any exceptions defined later. Other choices,
3727 like not blocking (which is <emphasis>understandably</emphasis> the
3728 default!) need exceptions, i.e. we need to specify explicitly what we
3729 want to block in later sections.
3730 We will also want to make exceptions from our general pop-up-killing,
3731 and use our defined aliases for that.
3735 The first of our specialized sections is concerned with <quote>fragile</quote>
3736 sites, i.e. sites that require minimum interference, because they are either
3737 very complex or very keen on tracking you (and have mechanisms in place that
3738 make them unusable for people who avoid being tracked). We will simply use
3739 our pre-defined <literal>fragile</literal> alias instead of stating the list
3740 of actions explicitly:
3745 ##########################################################################
3746 # Exceptions for sites that'll break under the default action set:
3747 ##########################################################################
3749 # "Fragile" Use a minimum set of actions for these sites (see alias above):
3752 .office.microsoft.com # surprise, surprise!
3753 .windowsupdate.microsoft.com</screen>
3757 Shopping sites are not as fragile, but they typically
3758 require cookies to log in, and pop-up windows for shopping
3759 carts or item details. Again, we'll use a pre-defined alias:
3768 .worldpay.com # for quietpc.com
3770 .scan.co.uk</screen>
3774 Then, there are sites which rely on pop-up windows (yuck!) to work.
3775 Since we made pop-up-killing our default above, we need to make exceptions
3776 now. <ulink url="http://www.mozilla.org/">Mozilla</ulink> users, who
3777 can turn on smart handling of unwanted pop-ups in their browsers, can
3779 -<literal><link linkend="FILTER-POPUPS">filter{popups}</link></literal> (and
3780 -<literal><link linkend="KILL-POPUPS">kill-popups</link></literal>) above
3781 and hence don't need this section. Anyway, disabling an already disabled
3782 action doesn't hurt, so we'll define our exceptions regardless of what was
3783 chosen in the defaults section:
3788 # These sites require pop-ups too :(
3790 { -<link linkend="KILL-POPUPS">kill-popups</link> -<link linkend="FILTER-POPUPS">filter{popups}</link> }
3793 .deutsche-bank-24.de</screen>
3797 The <literal><link linkend="FAST-REDIRECTS">fast-redirects</link></literal>
action, which we enabled by default above, breaks some sites. So disable
3799 it for popular sites where we know it misbehaves:
3804 { -<link linkend="FAST-REDIRECTS">fast-redirects</link> }
3808 .altavista.com/.*(like|url|link):http
3809 .altavista.com/trans.*urltext=http
3810 .nytimes.com</screen>
3814 It is important that <application>Privoxy</application> knows which
3815 URLs belong to images, so that <emphasis>if</emphasis> they are to
3816 be blocked, a substitute image can be sent, rather than an HTML page.
3817 Contacting the remote site to find out is not an option, since it
3818 would destroy the loading time advantage of banner blocking, and it
3819 would feed the advertisers (in terms of money <emphasis>and</emphasis>
3820 information). We can mark any URL as an image with the <literal><link
3821 linkend="handle-as-image">handle-as-image</link></literal> action,
3822 and marking all URLs that end in a known image file extension is a
3828 ##########################################################################
3830 ##########################################################################
3832 # Define which file types will be treated as images, in case they get
3833 # blocked further down this file:
3835 { +<link linkend="HANDLE-AS-IMAGE">handle-as-image</link> }
3836 /.*\.(gif|jpe?g|png|bmp|ico)$</screen>
3840 And then there are known banner sources. They often use scripts to
3841 generate the banners, so it won't be visible from the URL that the
3842 request is for an image. Hence we block them <emphasis>and</emphasis>
3843 mark them as images in one go, with the help of our
3844 <literal>block-as-image</literal> alias defined above. (We could of
3845 course just as well use <literal>+<link linkend="block">block</link>
3846 +<link linkend="handle-as-image">handle-as-image</link></literal> here.)
3847 Remember that the type of the replacement image is chosen by the
3848 <literal><link linkend="set-image-blocker">set-image-blocker</link></literal>
3849 action. Since all URLs have matched the default section with its
3850 <literal>+<link linkend="set-image-blocker">set-image-blocker</link>{pattern}</literal>
3851 action before, it still applies and needn't be repeated:
3856 # Known ad generators:
3861 .ad.*.doubleclick.net
3862 .a.yimg.com/(?:(?!/i/).)*$
3863 .a[0-9].yimg.com/(?:(?!/i/).)*$
3870 One of the most important jobs of <application>Privoxy</application>
3871 is to block banners. A huge bunch of them are already <quote>blocked</quote>
3872 by the <literal><link linkend="filter">filter</link>{banners-by-size}</literal>
3873 action, which we enabled above, and which deletes the references to banner
3874 images from the pages while they are loaded, so the browser doesn't request
3875 them anymore, and hence they don't need to be blocked here. But this naturally
3876 doesn't catch all banners, and some people choose not to use filters, so we
3877 need a comprehensive list of patterns for banner URLs here, and apply the
3878 <literal><link linkend="block">block</link></literal> action to them.
3881 First comes a bunch of generic patterns, which do most of the work, by
3882 matching typical domain and path name components of banners. Then comes
3883 a list of individual patterns for specific sites, which is omitted here
3884 to keep the example short:
3889 ##########################################################################
3890 # Block these fine banners:
3891 ##########################################################################
3892 { <link linkend="BLOCK">+block</link> }
3900 /.*count(er)?\.(pl|cgi|exe|dll|asp|php[34]?)
3901 /(?:.*/)?(publicite|werbung|rekla(ma|me|am)|annonse|maino(kset|nta|s)?)/
3903 # Site-specific patterns (abbreviated):
3905 .hitbox.com</screen>
3909 You wouldn't believe how many advertisers actually call their banner
3910 servers ads.<replaceable>company</replaceable>.com, or call the directory
3911 in which the banners are stored simply <quote>banners</quote>. So the above
3912 generic patterns are surprisingly effective.
3915 But being very generic, they necessarily also catch URLs that we don't want
3916 to block. The pattern <literal>.*ads.</literal> e.g. catches
3917 <quote>nasty-<emphasis>ads</emphasis>.nasty-corp.com</quote> as intended,
but also <quote>downlo<emphasis>ads</emphasis>.sourceforge.net</quote> or
3919 <quote><emphasis>ads</emphasis>l.some-provider.net.</quote> So here come some
3920 well-known exceptions to the <literal>+<link linkend="BLOCK">block</link></literal>
3924 Note that these are exceptions to exceptions from the default! Consider the URL
<quote>downloads.sourceforge.net</quote>: Initially, all actions are deactivated,
3926 so it wouldn't get blocked. Then comes the defaults section, which matches the
3927 URL, but just deactivates the <literal><link linkend="BLOCK">block</link></literal>
3928 action once again. Then it matches <literal>.*ads.</literal>, an exception to the
3929 general non-blocking policy, and suddenly
3930 <literal><link linkend="BLOCK">+block</link></literal> applies. And now, it'll match
3931 <literal>.*loads.</literal>, where <literal><link linkend="BLOCK">-block</link></literal>
3932 applies, so (unless it matches <emphasis>again</emphasis> further down) it ends up
3933 with no <literal><link linkend="BLOCK">block</link></literal> action applying.
3938 ##########################################################################
3939 # Save some innocent victims of the above generic block patterns:
3940 ##########################################################################
3944 { -<link linkend="BLOCK">block</link> }
3945 adv[io]*. # (for advogato.org and advice.*)
3946 adsl. # (has nothing to do with ads)
3947 ad[ud]*. # (adult.* and add.*)
3948 .edu # (universities don't host banners (yet!))
3949 .*loads. # (downloads, uploads etc)
3957 www.globalintersec.com/adv # (adv = advanced)
3958 www.ugu.com/sui/ugu/adv</screen>
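<para>
 The walk-through above can also be written down as a short trace. This is
 only an illustration of the order of evaluation, not output produced by
 <application>Privoxy</application>:
</para>
<para>
 <screen>
# URL: downloads.sourceforge.net
#
# step                        pattern     action    block is
# start of matching           (none)      (none)    off
# defaults section            /           -block    off
# generic banner patterns     .*ads.      +block    on
# exceptions to the above     .*loads.    -block    off
#
# Result: the request is not blocked.</screen>
</para>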
3962 Filtering source code can have nasty side effects,
3963 so make an exception for our friends at sourceforge.net,
3964 and all paths with <quote>cvs</quote> in them. Note that
3965 <literal>-<link linkend="FILTER">filter</link></literal>
3966 disables <emphasis>all</emphasis> filters in one fell swoop!
3971 # Don't filter code!
3973 { -<link linkend="FILTER">filter</link> }
3975 .sourceforge.net</screen>
3979 The actual <filename>default.action</filename> is of course more
3980 comprehensive, but we hope this example made clear how it works.
3985 <sect3><title>user.action</title>
3988 So far we are painting with a broad brush by setting general policies,
3989 which would be a reasonable starting point for many people. Now,
3990 you might want to be more specific and have customized rules that
3991 are more suitable to your personal habits and preferences. These would
3992 be for narrowly defined situations like your ISP or your bank, and should
3993 be placed in <filename>user.action</filename>, which is parsed after all other
actions files and hence has the last word, overriding any previously
3995 defined actions. <filename>user.action</filename> is also a
3996 <emphasis>safe</emphasis> place for your personal settings, since
3997 <filename>default.action</filename> is actively maintained by the
3998 <application>Privoxy</application> developers and you'll probably want
3999 to install updated versions from time to time.
4003 So let's look at a few examples of things that one might typically do in
4004 <filename>user.action</filename>:
4008 <!-- brief sample user.action here -->
# My user.action file. &lt;fred@foobar.com&gt;</screen>
4016 As <link linkend="aliases">aliases</link> are local to the actions
4017 file that they are defined in, you can't use the ones from
4018 <filename>default.action</filename>, unless you repeat them here:
4023 # (Re-)define aliases for this file:
4026 -crunch-all-cookies = -crunch-incoming-cookies -crunch-outgoing-cookies
4027 mercy-for-cookies = -crunch-all-cookies -session-cookies-only
4028 fragile = -block -crunch-all-cookies -filter -fast-redirects -hide-referer -kill-popups
4029 shop = mercy-for-cookies -filter{popups} -kill-popups
4030 allow-ads = -block -filter{banners-by-size} # (see below)</screen>
4035 Say you have accounts on some sites that you visit regularly, and
4036 you don't want to have to log in manually each time. So you'd like
4037 to allow persistent cookies for these sites. The
4038 <literal>mercy-for-cookies</literal> alias defined above does exactly
4039 that, i.e. it disables crunching of cookies in any direction, and
4040 processing of cookies to make them temporary.
4045 { mercy-for-cookies }
4050 .redhat.com</screen>
4054 Your bank needs popups and is allergic to some filter, but you don't
4055 know which, so you disable them all:
4060 { -<link linkend="FILTER">filter</link> -<link linkend="KILL-POPUPS">kill-popups</link> }
4061 .your-home-banking-site.com</screen>
4065 While browsing the web with <application>Privoxy</application> you
4066 noticed some ads that sneaked through, but you were too lazy to
4067 report them through our fine and easy <link linkend="contact">feedback</link>
4068 system, so you have added them here:
4073 { +<link linkend="BLOCK">block</link> }
4074 www.a-popular-site.com/some/unobvious/path
4075 another.popular.site.net/more/junk/here/</screen>
4079 Note that, assuming the banners in the above example have regular image
4080 extensions (most do),
4081 <literal>+<link linkend="HANDLE-AS-IMAGE">handle-as-image</link></literal>
4082 need not be specified, since all URLs ending in these extensions will
4083 already have been tagged as images in the relevant section of
4084 <filename>default.action</filename> by now.
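<para>
 If, on the other hand, a banner is generated by a script, so that its URL
 does not end in an image extension, you could mark it as an image yourself.
 A sketch, with a made-up host and path:
</para>
<para>
 <screen>
{ +<link linkend="BLOCK">block</link> +<link linkend="HANDLE-AS-IMAGE">handle-as-image</link> }
ads.example.net/cgi-bin/banner   # hypothetical script-generated banner</screen>
</para>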
4088 Then you noticed that the default configuration breaks Forbes Magazine,
4089 but you were too lazy to find out which action is the culprit, and you
4090 were again too lazy to give <link linkend="contact">feedback</link>, so
4091 you just used the <literal>fragile</literal> alias on the site, and
4092 -- whoa! -- it worked:
4098 .forbes.com</screen>
4102 You like the <quote>fun</quote> text replacements in <filename>default.filter</filename>,
4103 but it is disabled in the distributed actions file. (My colleagues on the team just
4104 don't have a sense of humour, that's why! ;-). So you'd like to turn it on in your private,
4105 update-safe config, once and for all:
4110 { +<link linkend="filter-fun">filter{fun}</link> }
4111 / # For ALL sites!</screen>
4115 Note that the above is not really a good idea: There are exceptions
4116 to the filters in <filename>default.action</filename> for things that
really shouldn't be filtered, like code on CVS-&gt;Web interfaces. Since
4118 <filename>user.action</filename> has the last word, these exceptions
4119 won't be valid for the <quote>fun</quote> filtering specified here.
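<para>
 One possible way around this, sketched here under the assumption that you
 want the same <literal>.sourceforge.net</literal> exception as in the
 <filename>default.action</filename> example, is to repeat that exception
 in <filename>user.action</filename> after the catch-all section, so that
 it is the last match:
</para>
<para>
 <screen>
{ +<link linkend="filter-fun">filter{fun}</link> }
/

# Repeat the exception from default.action; it matches later and so wins
{ -<link linkend="FILTER">filter</link> }
.sourceforge.net</screen>
</para>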
4123 Finally, you might think about how your favourite free websites are
4124 funded, and find that they rely on displaying banner advertisements
4125 to survive. So you might want to specifically allow banners for those
4126 sites that you feel provide value to you:
4138 Note that <literal>allow-ads</literal> has been aliased to
4139 <literal>-<link linkend="block">block</link></literal>
4140 <literal>-<link linkend="filter-banners-by-size">filter{banners-by-size}</link></literal>
4146 <!-- ~ End section ~ -->
4150 <!-- ~ End section ~ -->
4152 <!-- ~~~~~~~~ New section Header ~~~~~~~~~ -->
4154 <sect1 id="filter-file">
4155 <title>The Filter File</title>
4158 All text substitutions that can be invoked through the
4159 <literal><link linkend="filter">filter</link></literal> action
4160 must first be defined in the filter file, which is typically
4161 called <filename>default.filter</filename> and which can be
4162 selected through the <literal>
4163 <link linkend="filterfile">filterfile</link></literal> config
4168 Typical reasons for doing such substitutions are to eliminate
4169 common annoyances in HTML and JavaScript, such as pop-up windows,
4170 exit consoles, crippled windows without navigation tools, the
infamous &lt;BLINK&gt; tag etc., to suppress images with certain
4172 width and height attributes (standard banner sizes or web-bugs),
4173 or just to have fun. The possibilities are endless.
4177 Filtering works on any text-based document type, including plain
4178 text, HTML, JavaScript, CSS etc. (all <literal>text/*</literal>
4179 MIME types). Substitutions are made at the source level, so if
4180 you want to <quote>roll your own</quote> filters, you should be
4181 familiar with HTML syntax.
4185 Just like the <link linkend="actions-file">actions files</link>, the
4186 filter file is organized in sections, which are called <emphasis>filters</emphasis>
4187 here. Each filter consists of a heading line, that starts with the
4188 <emphasis>keyword</emphasis> <literal>FILTER:</literal>, followed by
4189 the filter's <emphasis>name</emphasis>, and a short (one line)
4190 <emphasis>description</emphasis> of what it does. Below that line
4191 come the <emphasis>jobs</emphasis>, i.e. lines that define the actual
4192 text substitutions. By convention, the name of a filter
4193 should describe what the filter <emphasis>eliminates</emphasis>. The
description is used in the <ulink url="http://config.privoxy.org/">web-based
4195 user interface</ulink>.
4199 Once a filter called <replaceable>name</replaceable> has been defined
4200 in the filter file, it can be invoked by using an action of the form
4201 +<literal><link linkend="filter">filter</link>{<replaceable>name</replaceable>}</literal>
4202 in any <link linkend="actions-file">actions file</link>.
4206 A filter header line for a filter called <quote>foo</quote> could look
4211 <screen>FILTER: foo Replace all "foo" with "bar"</screen>
4215 Below that line, and up to the next header line, come the jobs that
4216 define what text replacements the filter executes. They are specified
4217 in a syntax that imitates <ulink url="http://www.perl.org/">Perl</ulink>'s
4218 <literal>s///</literal> operator. If you are familiar with Perl, you
4219 will find this to be quite intuitive, and may want to look at the
4220 <ulink url="http://www.oesterhelt.org/pcrs/pcrs.3.html">PCRS man page</ulink>
for the subtle differences from Perl behaviour. Most notably, the non-standard
4222 option letter <literal>U</literal> is supported, which turns the default
4223 to ungreedy matching.
4227 If you are new to regular expressions, you might want to take a look at
4228 the <link linkend="regex">Appendix on regular expressions</link>, and
4229 see the <ulink url="http://perldoc.com/perl5.6.1/pod/perl.html">Perl
4231 <ulink url="http://perldoc.com/perl5.6.1/pod/perlop.html#s-PATTERN-REPLACEMENT-egimosx">the
4232 <literal>s///</literal> operator's syntax</ulink> and <ulink
4233 url="http://perldoc.com/perl5.6.1/pod/perlre.html">Perl-style regular
4234 expressions</ulink> in general.
4235 The below examples might also help to get you started.
4238 <!-- ~~~~~~~~ New section Header ~~~~~~~~~ -->
4240 <sect2><title>Filter File Tutorial</title>
4242 Now, let's complete our <quote>foo</quote> filter. We have already defined
4243 the heading, but the jobs are still missing. Since all it does is to replace
4244 <quote>foo</quote> with <quote>bar</quote>, there is only one (trivial) job
4249 <screen>s/foo/bar/</screen>
But wait! Didn't the description say that <emphasis>all</emphasis> occurrences
4254 of <quote>foo</quote> should be replaced? Our current job will only take
4255 care of the first <quote>foo</quote> on each page. For global substitution,
4256 we'll need to add the <literal>g</literal> option:
4260 <screen>s/foo/bar/g</screen>
4264 Our complete filter now looks like this:
4267 <screen>FILTER: foo Replace all "foo" with "bar"
4268 s/foo/bar/g</screen>
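<para>
 To try the filter out, you could enable it for a single site in one of
 your actions files. The host name below is only a placeholder:
</para>
<para>
 <screen>
{ +<link linkend="filter">filter</link>{foo} }
.example.com</screen>
</para>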
4272 Let's look at some real filters for more interesting examples. Here you see
4273 a filter that protects against some common annoyances that arise from JavaScript
4274 abuse. Let's look at its jobs one after the other:
4280 FILTER: js-annoyances Get rid of particularly annoying JavaScript abuse
4282 # Get rid of JavaScript referrer tracking. Test page: http://www.randomoddness.com/untitled.htm
s|(&lt;script.*)document\.referrer(.*&lt;/script&gt;)|$1"Not Your Business!"$2|Usg</screen>
4288 Following the header line and a comment, you see the job. Note that it uses
4289 <literal>|</literal> as the delimiter instead of <literal>/</literal>, because
4290 the pattern contains a forward slash, which would otherwise have to be escaped
4291 by a backslash (<literal>\</literal>).
Now, let's examine the pattern: it starts with the text <literal>&lt;script.*</literal>
enclosed in parentheses. Since the dot matches any character, and <literal>*</literal>
means: <quote>Match an arbitrary number of the element left of myself</quote>, this
matches <quote>&lt;script</quote>, followed by <emphasis>any</emphasis> text, i.e.
it matches the whole page, from the start of the first &lt;script&gt; tag.
4303 That's more than we want, but the pattern continues: <literal>document\.referrer</literal>
4304 matches only the exact string <quote>document.referrer</quote>. The dot needed to
4305 be <emphasis>escaped</emphasis>, i.e. preceded by a backslash, to take away its
4306 special meaning as a joker, and make it just a regular dot. So far, the meaning is:
Match from the start of the first &lt;script&gt; tag in the page, up to, and including,
4308 the text <quote>document.referrer</quote>, if <emphasis>both</emphasis> are present
4309 in the page (and appear in that order).
4313 But there's still more pattern to go. The next element, again enclosed in parentheses,
is <literal>.*&lt;/script&gt;</literal>. You already know what <literal>.*</literal>
means, so the whole pattern translates to: Match from the start of the first &lt;script&gt;
tag in a page to the end of the last &lt;/script&gt; tag, provided that the text
4317 <quote>document.referrer</quote> appears somewhere in between.
4321 This is still not the whole story, since we have ignored the options and the parentheses:
4322 The portions of the page matched by sub-patterns that are enclosed in parentheses, will be
4323 remembered and be available through the variables <literal>$1, $2, ...</literal> in
4324 the substitute. The <literal>U</literal> option switches to ungreedy matching, which means
4325 that the first <literal>.*</literal> in the pattern will only <quote>eat up</quote> all
text in between <quote>&lt;script</quote> and the <emphasis>first</emphasis> occurrence
of <quote>document.referrer</quote>, and that the second <literal>.*</literal> will
only span the text up to the <emphasis>first</emphasis> <quote>&lt;/script&gt;</quote>
4329 tag. Furthermore, the <literal>s</literal> option says that the match may span
4330 multiple lines in the page, and the <literal>g</literal> option again means that the
4331 substitution is global.
4335 So, to summarize, the pattern means: Match all scripts that contain the text
4336 <quote>document.referrer</quote>. Remember the parts of the script from
4337 (and including) the start tag up to (and excluding) the string
4338 <quote>document.referrer</quote> as <literal>$1</literal>, and the part following
4339 that string, up to and including the closing tag, as <literal>$2</literal>.
4343 Now the pattern is deciphered, but wasn't this about substituting things? So
let's look at the substitute: <literal>$1"Not Your Business!"$2</literal> is
4345 easy to read: The text remembered as <literal>$1</literal>, followed by
4346 <literal>"Not Your Business!"</literal> (<emphasis>including</emphasis>
4347 the quotation marks!), followed by the text remembered as <literal>$2</literal>.
4348 This produces an exact copy of the original string, with the middle part
4349 (the <quote>document.referrer</quote>) replaced by <literal>"Not Your
4350 Business!"</literal>.
4354 The whole job now reads: Replace <quote>document.referrer</quote> by
4355 <literal>"Not Your Business!"</literal> wherever it appears inside a
&lt;script&gt; tag. Note that this job won't break JavaScript syntax,
4357 since both the original and the replacement are syntactically valid
4358 string objects. The script just won't have access to the referrer
4359 information anymore.
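<para>
 To make this concrete, consider a small, made-up page fragment and what it
 would look like after the job has run:
</para>
<para>
 <screen>
Before:
  &lt;script&gt;var r = document.referrer;&lt;/script&gt;
  &lt;p&gt;Some text&lt;/p&gt;
  &lt;script&gt;alert("hello");&lt;/script&gt;

After:
  &lt;script&gt;var r = "Not Your Business!";&lt;/script&gt;
  &lt;p&gt;Some text&lt;/p&gt;
  &lt;script&gt;alert("hello");&lt;/script&gt;</screen>
</para>
<para>
 Because of the <literal>U</literal> option, the match ends at the first
 closing tag, so the second script is left alone.
</para>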
4363 We'll show you two other jobs from the JavaScript taming department, but
4364 this time only point out the constructs of special interest:
4369 # The status bar is for displaying link targets, not pointless blahblah
4371 s/window\.status\s*=\s*['"].*?['"]/dUmMy=1/ig</screen>
4375 <literal>\s</literal> stands for whitespace characters (space, tab, newline,
4376 carriage return, form feed), so that <literal>\s*</literal> means: <quote>zero
4377 or more whitespace</quote>. The <literal>?</literal> in <literal>.*?</literal>
4378 makes this matching of arbitrary text ungreedy. (Note that the <literal>U</literal>
4379 option is not set). The <literal>['"]</literal> construct means: <quote>a single
4380 <emphasis>or</emphasis> a double quote</quote>.
4384 So what does this job do? It replaces assignments of single- or double-quoted
4385 strings to the <quote>window.status</quote> object with a dummy assignment
4386 (using a variable name that is hopefully odd enough not to conflict with
4387 real variables in scripts). Thus, it catches many cases where e.g. pointless
4388 descriptions are displayed in the status bar instead of the link target when
4389 you move your mouse over links.
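The same job can be tried out in Python, whose regular expression syntax is close enough to PCRE for this pattern. This is only a sketch for experimenting; the function name <literal>tame_status</literal> is made up for the example, and <literal>re.IGNORECASE</literal> stands in for the <literal>i</literal> option.

```python
import re

# Same pattern as in the job above. Python's sub() replaces every match,
# which corresponds to the "g" option.
STATUS_JOB = re.compile(r"""window\.status\s*=\s*['"].*?['"]""", re.IGNORECASE)

def tame_status(script):
    # Swap the status bar assignment for the harmless dummy assignment.
    return STATUS_JOB.sub("dUmMy=1", script)

print(tame_status("window.status = 'Click here!'; return true;"))
```

Because <literal>.*?</literal> is ungreedy, the match stops at the first closing quote, so the rest of the statement survives.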
4394 # Kill OnUnload popups. Yummy. Test: http://www.zdnet.com/zdsubs/yahoo/tree/yfs.html
4396 s/(<body .*)onunload(.*>)/$1never$2/iU</screen>
4401 <ulink url="http://www.w3.org/TR/2000/REC-DOM-Level-2-Events-20001113/events.html#Events-eventgroupings-htmlevents">OnUnload
4402 event binding</ulink> in the HTML DOM was a <emphasis>CRIME</emphasis>.
4403 When I close a browser window, I want it to close and die. Basta.
4404 This job replaces the <quote>onunload</quote> attribute in
4405 <quote><body></quote> tags with the dummy word <literal>never</literal>.
4406 Note that the <literal>i</literal> option makes the pattern matching
4411 The last example is from the fun department:
4416 FILTER: fun Fun text replacements
4418 # Spice the daily news:
4420 s/microsoft(?!\.com)/MicroSuck/ig</screen>
4424 Note the <literal>(?!\.com)</literal> part (a so-called negative lookahead)
4425 in the job's pattern, which means: Don't match, if the string
4426 <quote>.com</quote> appears directly following <quote>microsoft</quote>
4427 in the page. This prevents links to microsoft.com from being trashed, while
4428 still replacing the word everywhere else.
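Negative lookahead works the same way in Python, so the effect is easy to check with a short sketch (the function name <literal>spice</literal> is invented for this example):

```python
import re

def spice(text):
    # (?!\.com) refuses the match when ".com" follows directly, so host
    # names like www.microsoft.com are left intact.
    return re.sub(r"microsoft(?!\.com)", "MicroSuck", text, flags=re.IGNORECASE)

print(spice("Microsoft said so on www.microsoft.com today"))
```

Note that the lookahead only inspects the following text; it does not become part of the match, so nothing after the word is replaced.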
4433 # Buzzword Bingo (example for extended regex syntax)
4435 s* industry[ -]leading \
4437 | award[ -]winning # Comments are OK, too! \
4438 | high[ -]performance \
4439 | solutions[ -]based \
4443 *<font color="red"><b>BINGO!</b></font> \
4448 The <literal>x</literal> option in this job turns on extended syntax, and allows for
4449 e.g. the liberal use of (non-interpreted!) whitespace for nicer formatting.
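Python offers a comparable extended syntax through <literal>re.VERBOSE</literal>, which makes for a convenient playground. The sketch below uses only the alternatives visible above and assumes case-insensitive matching; it is not a copy of the complete job.

```python
import re

# Whitespace and comments in the pattern are ignored in verbose mode, but a
# space inside a character class such as [ -] still counts.
BUZZWORDS = re.compile(r"""
      industry[ -]leading
    | award[ -]winning        # Comments are OK, too!
    | high[ -]performance
    | solutions[ -]based
""", re.VERBOSE | re.IGNORECASE)

BINGO = '<font color="red"><b>BINGO!</b></font>'

def bingo(text):
    return BUZZWORDS.sub(BINGO, text)

print(bingo("an award-winning, industry leading tool"))
```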
4458 <!-- ~ End section ~ -->
4462 <!-- ~~~~~ New section ~~~~~ -->
4464 <sect1 id="templates">
4465 <title>Templates</title>
4467 All <application>Privoxy</application> built-in pages, i.e. error pages such as the
4468 <ulink url="http://show-the-404-error.page"><quote>404 - No Such Domain</quote>
4469 error page</ulink>, the <ulink
4470 url="http://ads.bannerserver.example.com/nasty-ads/sponsor.html"><quote>BLOCKED</quote>
4472 and all pages of its <ulink url="http://config.privoxy.org/">web-based
4473 user interface</ulink>, are generated from <emphasis>templates</emphasis>.
4474 (<application>Privoxy</application> must be running for the above links to work as
4479 These templates are stored in a subdirectory of the <link linkend="confdir">configuration
4480 directory</link> called <filename>templates</filename>. On Unixish platforms,
4482 <ulink url="file:///etc/privoxy/templates/"><filename>/etc/privoxy/templates/</filename></ulink>.
4486 The templates are basically normal HTML files, but with place-holders (called symbols
4487 or exports), which <application>Privoxy</application> fills at run time. You can
4488 edit the templates with a normal text editor, should you want to customize them.
4489 (<emphasis>Not recommended for the casual user</emphasis>). Note that
4490 just like in configuration files, lines starting with <literal>#</literal> are
4491 ignored when the templates are filled in.
4495 The place-holders are of the form <literal>@name@</literal>, and you will
4496 find a list of available symbols, which vary from template to template,
4497 in the comments at the start of each file. Note that these comments are not
4498 always accurate, and that it's probably best to look at the existing HTML
4499 code to find out which symbols are supported and what they are filled in with.
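To illustrate the idea (and only the idea), here is a small Python sketch of place-holder filling. It is not how Privoxy implements it, and the symbol name <literal>version</literal> is a hypothetical example rather than a documented export.

```python
import re

def fill_template(template, symbols):
    # Lines starting with "#" are ignored, just like in configuration files.
    lines = [ln for ln in template.splitlines() if not ln.startswith("#")]
    text = "\n".join(lines)
    # Replace each @name@ with its value; unknown symbols are left untouched.
    return re.sub(r"@([A-Za-z0-9-]+)@",
                  lambda m: symbols.get(m.group(1), m.group(0)),
                  text)

template = "# hypothetical template\n<h1>Privoxy @version@</h1>"
print(fill_template(template, {"version": "2.9.15"}))
```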
4503 A special application of this substitution mechanism is to make whole
4504 blocks of HTML code disappear when a specific symbol is set. We use this
4505 for many purposes, one of them being to include the beta warning in all
4506 our user interface (CGI) pages when <application>Privoxy</application>
is in an alpha or beta development stage:
4512 <!-- @if-unstable-start -->
4514 ... beta warning HTML code goes here ...
4516 <!-- if-unstable-end@ --></screen>
4520 If the "unstable" symbol is set, everything in between and including
4521 <literal>@if-unstable-start</literal> and <literal>if-unstable-end@</literal>
4522 will disappear, leaving nothing but an empty comment:
4526 <screen><!-- --></screen>
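The block removal can be pictured with a few lines of Python. This is a sketch of the text transformation only, with an invented function name; it makes no claim about how Privoxy decides when to remove a block or how it does so internally.

```python
import re

def remove_block(template, name):
    # Delete everything between and including the start and end markers.
    marker = re.escape(name)
    pattern = rf"@if-{marker}-start.*?if-{marker}-end@"
    return re.sub(pattern, "", template, flags=re.DOTALL)

page = ("<!-- @if-unstable-start -->\n"
        "... beta warning HTML code goes here ...\n"
        "<!-- if-unstable-end@ -->")
print(remove_block(page, "unstable"))
```

Since the markers sit inside HTML comments, what remains is the opening of the first comment and the end of the second, which together form one empty comment.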
There's also an if-then-else construct and an <literal>#include</literal>
mechanism, but you'll surely find out about those if you are inclined to edit the
4536 All templates refer to a style located at
4537 <ulink url="http://config.privoxy.org/send-stylesheet"><literal>http://config.privoxy.org/send-stylesheet</literal></ulink>.
4538 This is, of course, locally served by <application>Privoxy</application>
4539 and the source for it can be found and edited in the
4540 <filename>cgi-style.css</filename> template.
4545 <!-- ~ End section ~ -->
4549 <!-- ~~~~~ New section ~~~~~ -->
4551 <sect1 id="contact"><title>Contacting the Developers, Bug Reporting and Feature
4554 <!-- Include contacting.sgml boilerplate: -->
4556 <!-- end boilerplate -->
4560 <!-- ~ End section ~ -->
4563 <!-- ~~~~~ New section ~~~~~ -->
4564 <sect1 id="copyright"><title><application>Privoxy</application> Copyright, License and History</title>
4566 <!-- Include copyright.sgml: -->
4568 <!-- end copyright -->
4570 <!-- ~~~~~ New section ~~~~~ -->
4571 <sect2><title>License</title>
4572 <!-- Include copyright.sgml: -->
4574 <!-- end copyright -->
4576 <!-- ~ End section ~ -->
4579 <!-- ~~~~~ New section ~~~~~ -->
4581 <sect2 id="history"><title>History</title>
4582 <!-- Include history.sgml: -->
4584 <!-- end history -->
4587 <sect2 id="authors"><title>Authors</title>
4588 <!-- Include p-authors.sgml: -->
4590 <!-- end authors -->
4595 <!-- ~ End section ~ -->
4598 <!-- ~~~~~ New section ~~~~~ -->
4599 <sect1 id="seealso"><title>See Also</title>
4600 <!-- Include seealso.sgml: -->
4602 <!-- end seealso -->
4607 <!-- ~~~~~ New section ~~~~~ -->
4608 <sect1 id="appendix"><title>Appendix</title>
4611 <!-- ~~~~~ New section ~~~~~ -->
4613 <title>Regular Expressions</title>
4615 <application>Privoxy</application> uses Perl-style <quote>regular
4616 expressions</quote> in its <link linkend="actions-file">actions
4617 files</link> and <link linkend="filter-file">filter file</link>,
4618 through the <ulink url="http://www.pcre.org/">PCRE</ulink> and
4619 <ulink url="http://www.oesterhelt.org/pcrs/">PCRS</ulink> libraries.
4623 If you are reading this, you probably don't understand what <quote>regular
4624 expressions</quote> are, or what they can do. So this will be a very brief
4625 introduction only. A full explanation would require a <ulink
4626 url="http://www.oreilly.com/catalog/regex/">book</ulink> ;-)
4630 Regular expressions provide a language to describe patterns that can be
run against strings of characters (letters, numbers, etc.), to see if they
4632 match the string or not. The patterns are themselves (sometimes complex)
4633 strings of literal characters, combined with wild-cards, and other special
4634 characters, called meta-characters. The <quote>meta-characters</quote> have
4635 special meanings and are used to build complex patterns to be matched against.
4636 Perl Compatible Regular Expressions are an especially convenient
4637 <quote>dialect</quote> of the regular expression language.
4641 To make a simple analogy, we do something similar when we use wild-card
4642 characters when listing files with the <command>dir</command> command in DOS.
4643 <literal>*.*</literal> matches all filenames. The <quote>special</quote>
4644 character here is the asterisk which matches any and all characters. We can be
4645 more specific and use <literal>?</literal> to match just individual
characters. So <quote>dir file?.txt</quote> would match
4647 <quote>file1.txt</quote>, <quote>file2.txt</quote>, etc. We are pattern
4648 matching, using a similar technique to <quote>regular expressions</quote>!
4652 Regular expressions do essentially the same thing, but are much, much more
4653 powerful. There are many more <quote>special characters</quote> and ways of
4654 building complex patterns however. Let's look at a few of the common ones,
4655 and then some examples:
4660 <emphasis>.</emphasis> - Matches any single character, e.g. <quote>a</quote>,
4661 <quote>A</quote>, <quote>4</quote>, <quote>:</quote>, or <quote>@</quote>.
4663 </simplelist></para>
4667 <emphasis>?</emphasis> - The preceding character or expression is matched ZERO or ONE
4670 </simplelist></para>
4674 <emphasis>+</emphasis> - The preceding character or expression is matched ONE or MORE
4677 </simplelist></para>
4681 <emphasis>*</emphasis> - The preceding character or expression is matched ZERO or MORE
4684 </simplelist></para>
4688 <emphasis>\</emphasis> - The <quote>escape</quote> character denotes that
4689 the following character should be taken literally. This is used where one of the
4690 special characters (e.g. <quote>.</quote>) needs to be taken literally and
4691 not as a special meta-character. Example: <quote>example\.com</quote>, makes
4692 sure the period is recognized only as a period (and not expanded to its
4693 meta-character meaning of any single character).
4695 </simplelist></para>
4699 <emphasis>[]</emphasis> - Characters enclosed in brackets will be matched if
4700 any of the enclosed characters are encountered. For instance, <quote>[0-9]</quote>
4701 matches any numeric digit (zero through nine). As an example, we can combine
this with <quote>+</quote> to match any digit one or more times: <quote>[0-9]+</quote>.
4704 </simplelist></para>
4708 <emphasis>()</emphasis> - parentheses are used to group a sub-expression,
4709 or multiple sub-expressions.
4711 </simplelist></para>
4715 <emphasis>|</emphasis> - The <quote>bar</quote> character works like an
4716 <quote>or</quote> conditional statement. A match is successful if the
4717 sub-expression on either side of <quote>|</quote> matches. As an example:
4718 <quote>/(this|that) example/</quote> uses grouping and the bar character
4719 and would match either <quote>this example</quote> or <quote>that
4720 example</quote>, and nothing else.
4722 </simplelist></para>
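If you have Python at hand, the meta-characters above are easy to experiment with, since its <literal>re</literal> module understands all of them. The helper below is just a sketch for playing around; <literal>re.fullmatch</literal> is used so that the whole string has to match the pattern.

```python
import re

def matches(pattern, text):
    # True only if the pattern matches the entire string.
    return re.fullmatch(pattern, text) is not None

examples = [
    (r"b.t",                 "bit",           True),   # . is any single character
    (r"colou?r",             "color",         True),   # ? means zero or one "u"
    (r"[0-9]+",              "2002",          True),   # + means one or more digits
    (r"[0-9]+",              "",              False),  # ... so at least one is needed
    (r"[0-9]*",              "",              True),   # * also accepts zero digits
    (r"example\.com",        "exampleXcom",   False),  # escaped: a literal period
    (r"example.com",         "exampleXcom",   True),   # unescaped: any character
    (r"(this|that) example", "that example",  True),   # grouping plus "or"
    (r"(this|that) example", "other example", False),
]

for pattern, text, expected in examples:
    print(pattern, repr(text), matches(pattern, text) == expected)
```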
4725 These are just some of the ones you are likely to use when matching URLs with
<application>Privoxy</application>, and are a long way from a definitive
4727 list. This is enough to get us started with a few simple examples which may
4728 be more illuminating:
4732 <emphasis><literal>/.*/banners/.*</literal></emphasis> - A simple example
4733 that uses the common combination of <quote>.</quote> and <quote>*</quote> to
4734 denote any character, zero or more times. In other words, any string at all.
4735 So we start with a literal forward slash, then our regular expression pattern
4736 (<quote>.*</quote>) another literal forward slash, the string
4737 <quote>banners</quote>, another forward slash, and lastly another
4738 <quote>.*</quote>. We are building
4739 a directory path here. This will match any file with the path that has a
4740 directory named <quote>banners</quote> in it. The <quote>.*</quote> matches
4741 any characters, and this could conceivably be more forward slashes, so it
4742 might expand into a much longer looking path. For example, this could match:
4743 <quote>/eye/hate/spammers/banners/annoy_me_please.gif</quote>, or just
4744 <quote>/banners/annoying.html</quote>, or almost an infinite number of other
4745 possible combinations, just so it has <quote>banners</quote> in the path
And now something a little more complex:
4754 <emphasis><literal>/.*/adv((er)?ts?|ertis(ing|ements?))?/</literal></emphasis> -
4755 We have several literal forward slashes again (<quote>/</quote>), so we are
4756 building another expression that is a file path statement. We have another
4757 <quote>.*</quote>, so we are matching against any conceivable sub-path, just so
4758 it matches our expression. The only true literal that <emphasis>must
match</emphasis> our pattern is <quote>adv</quote>, together with
4760 the forward slashes. What comes after the <quote>adv</quote> string is the
4765 Remember the <quote>?</quote> means the preceding expression (either a
4766 literal character or anything grouped with <quote>(...)</quote> in this case)
4767 can exist or not, since this means either zero or one match. So
4768 <quote>((er)?ts?|ertis(ing|ements?))</quote> is optional, as are the
4769 individual sub-expressions: <quote>(er)</quote>,
4770 <quote>(ing|ements?)</quote>, and the <quote>s</quote>. The <quote>|</quote>
4771 means <quote>or</quote>. We have two of those. For instance,
4772 <quote>(ing|ements?)</quote>, can expand to match either <quote>ing</quote>
4773 <emphasis>OR</emphasis> <quote>ements?</quote>. What is being done here, is an
4774 attempt at matching as many variations of <quote>advertisement</quote>, and
4775 similar, as possible. So this would expand to match just <quote>adv</quote>,
4776 or <quote>advert</quote>, or <quote>adverts</quote>, or
4777 <quote>advertising</quote>, or <quote>advertisement</quote>, or
4778 <quote>advertisements</quote>. You get the idea. But it would not match
4779 <quote>advertizements</quote> (with a <quote>z</quote>). We could fix that by
4780 changing our regular expression to:
4781 <quote>/.*/adv((er)?ts?|erti(s|z)(ing|ements?))?/</quote>, which would then match
4786 <emphasis><literal>/.*/advert[0-9]+\.(gif|jpe?g)</literal></emphasis> - Again
4787 another path statement with forward slashes. Anything in the square brackets
<quote>[]</quote> can be matched. This is using <quote>0-9</quote> as a
shorthand expression to mean any digit zero through nine. It is the same as
saying <quote>0123456789</quote>. So any digit matches. The <quote>+</quote>
means one or more of the preceding expression must be included. The preceding
expression here is what is in the square brackets -- in this case, any digit
zero through nine. Then, at the end, we have a grouping: <quote>(gif|jpe?g)</quote>.
4794 This includes a <quote>|</quote>, so this needs to match the expression on
4795 either side of that bar character also. A simple <quote>gif</quote> on one side, and the other
4796 side will in turn match either <quote>jpeg</quote> or <quote>jpg</quote>,
4797 since the <quote>?</quote> means the letter <quote>e</quote> is optional and
can be matched once or not at all. So we are building an expression here to
match GIF or JPEG image files. It must include the literal
4800 string <quote>advert</quote>, then one or more digits, and a <quote>.</quote>
4801 (which is now a literal, and not a special character, since it is escaped
4802 with <quote>\</quote>), and lastly either <quote>gif</quote>, or
4803 <quote>jpeg</quote>, or <quote>jpg</quote>. Some possible matches would
4804 include: <quote>//advert1.jpg</quote>,
4805 <quote>/nasty/ads/advert1234.gif</quote>,
4806 <quote>/banners/from/hell/advert99.jpg</quote>. It would not match
4807 <quote>advert1.gif</quote> (no leading slash), or
4808 <quote>/adverts232.jpg</quote> (the expression does not include an
4809 <quote>s</quote>), or <quote>/advert1.jsp</quote> (<quote>jsp</quote> is not
4810 in the expression anywhere).
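The claims made for the last two patterns can be checked with a short Python sketch. It assumes that a pattern has to cover the whole path, which is a simplification chosen for this example and not a statement about how Privoxy anchors its path patterns.

```python
import re

def path_matches(pattern, path):
    # Simplifying assumption: the pattern must match the complete path.
    return re.fullmatch(pattern, path) is not None

ADV = r"/.*/adv((er)?ts?|ertis(ing|ements?))?/"
ADV_Z = r"/.*/adv((er)?ts?|erti(s|z)(ing|ements?))?/"
ADVERT_IMG = r"/.*/advert[0-9]+\.(gif|jpe?g)"

for word in ("adv", "advert", "adverts", "advertising", "advertisements"):
    print(word, path_matches(ADV, "/x/" + word + "/"))

# The spelling with "z" only matches the extended pattern.
print(path_matches(ADV, "/x/advertizements/"), path_matches(ADV_Z, "/x/advertizements/"))

for path in ("//advert1.jpg", "/nasty/ads/advert1234.gif", "advert1.gif",
             "/adverts232.jpg", "/advert1.jsp"):
    print(path, path_matches(ADVERT_IMG, path))
```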
4814 We are barely scratching the surface of regular expressions here so that you
4815 can understand the default <application>Privoxy</application>
4816 configuration files, and maybe use this knowledge to customize your own
4817 installation. There is much, much more that can be done with regular
4818 expressions. Now that you know enough to get started, you can learn more on
More reading on Perl Compatible Regular Expressions:
4824 <ulink url="http://www.perldoc.com/perl5.6/pod/perlre.html">http://www.perldoc.com/perl5.6/pod/perlre.html</ulink>
4828 For information on regular expression based substitutions and their applications
4829 in filters, please see the <link linkend="filter-file">filter file tutorial</link>
4834 <!-- ~ End section ~ -->
4837 <!-- ~~~~~ New section ~~~~~ -->
4839 <title><application>Privoxy</application>'s Internal Pages</title>
4842 Since <application>Privoxy</application> proxies each requested
4843 web page, it is easy for <application>Privoxy</application> to
4844 trap certain special URLs. In this way, we can talk directly to
4845 <application>Privoxy</application>, and see how it is
4846 configured, see how our rules are being applied, change these
4847 rules and other configuration options, and even turn
4848 <application>Privoxy's</application> filtering off, all with
4854 The URLs listed below are the special ones that allow direct access
4855 to <application>Privoxy</application>. Of course,
4856 <application>Privoxy</application> must be running to access these. If
4857 not, you will get a friendly error message. Internet access is not
4870 <ulink url="http://config.privoxy.org/">http://config.privoxy.org/</ulink>
4874 There is a shortcut: <ulink url="http://p.p/">http://p.p/</ulink> (But it
4875 doesn't provide a fall-back to a real page, in case the request is not
4876 sent through <application>Privoxy</application>)
4882 Show information about the current configuration, including viewing and
4883 editing of actions files:
4887 <ulink url="http://config.privoxy.org/show-status">http://config.privoxy.org/show-status</ulink>
4894 Show the source code version numbers:
4898 <ulink url="http://config.privoxy.org/show-version">http://config.privoxy.org/show-version</ulink>
4905 Show the browser's request headers:
4909 <ulink url="http://config.privoxy.org/show-request">http://config.privoxy.org/show-request</ulink>
4916 Show which actions apply to a URL and why:
4920 <ulink url="http://config.privoxy.org/show-url-info">http://config.privoxy.org/show-url-info</ulink>
4927 Toggle Privoxy on or off. In this case, <quote>Privoxy</quote> continues
4928 to run, but only as a pass-through proxy, with no actions taking place:
4932 <ulink url="http://config.privoxy.org/toggle">http://config.privoxy.org/toggle</ulink>
4936 Short cuts. Turn off, then on:
4940 <ulink url="http://config.privoxy.org/toggle?set=disable">http://config.privoxy.org/toggle?set=disable</ulink>
4945 <ulink url="http://config.privoxy.org/toggle?set=enable">http://config.privoxy.org/toggle?set=enable</ulink>
4954 These may be bookmarked for quick reference. See next.
4958 <sect3 id="bookmarklets">
4959 <title>Bookmarklets</title>
4961 Below are some <quote>bookmarklets</quote> to allow you to easily access a
4962 <quote>mini</quote> version of some of <application>Privoxy's</application>
4963 special pages. They are designed for MS Internet Explorer, but should work
4964 equally well in Netscape, Mozilla, and other browsers which support
4965 JavaScript. They are designed to run directly from your bookmarks - not by
4966 clicking the links below (although that should work for testing).
4969 To save them, right-click the link and choose <quote>Add to Favorites</quote>
4970 (IE) or <quote>Add Bookmark</quote> (Netscape). You will get a warning that
4971 the bookmark <quote>may not be safe</quote> - just click OK. Then you can run the
4972 Bookmarklet directly from your favorites/bookmarks. For even faster access,
4973 you can put them on the <quote>Links</quote> bar (IE) or the <quote>Personal
4974 Toolbar</quote> (Netscape), and run them with a single click.
4983 url="javascript:void(window.open('http://config.privoxy.org/toggle?mini=y&set=enabled','ijbstatus','width=250,height=100,resizable=yes,scrollbars=no,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());">Privoxy - Enable</ulink>
4990 url="javascript:void(window.open('http://config.privoxy.org/toggle?mini=y&set=disabled','ijbstatus','width=250,height=100,resizable=yes,scrollbars=no,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());">Privoxy - Disable</ulink>
4997 url="javascript:void(window.open('http://config.privoxy.org/toggle?mini=y&set=toggle','ijbstatus','width=250,height=100,resizable=yes,scrollbars=no,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());">Privoxy - Toggle Privoxy</ulink> (Toggles between enabled and disabled)
url="javascript:void(window.open('http://config.privoxy.org/toggle?mini=y','ijbstatus','width=250,height=2,resizable=yes,scrollbars=no,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());">Privoxy - View Status</ulink>
5010 <ulink url="javascript:w=Math.floor(screen.width/2);h=Math.floor(screen.height*0.9);void(window.open('http://www.privoxy.org/actions/index.php?url='+escape(location.href),'Feedback','screenx='+w+',width='+w+',height='+h+',scrollbars=yes,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());">Privoxy - Submit Actions File Feedback</ulink>
5015 <ulink url="javascript:void(window.open('http://config.privoxy.org/show-url-info?url='+escape(location.href),'Why').focus());">Privoxy - Why?</ulink>
5022 Credit: The site which gave us the general idea for these bookmarklets is
5023 <ulink url="http://www.bookmarklets.com">www.bookmarklets.com</ulink>. They
5024 have more information about bookmarklets.
5033 <!-- ~~~~~ New section ~~~~~ -->
5035 <title>Chain of Events</title>
5037 Let's take a quick look at the basic sequence of events when a web page is
5038 requested by your browser and <application>Privoxy</application> is on duty:
5045 First, your web browser requests a web page. The browser knows to send
5046 the request to <application>Privoxy</application>, which will in turn,
5047 relay the request to the remote web server after passing the following
5053 <application>Privoxy</application> traps any request for its own internal CGI
pages (e.g. http://p.p/) and sends the CGI page back to the browser.
5059 Next, <application>Privoxy</application> checks to see if the URL
5061 linkend="BLOCK"><quote>+block</quote></link> patterns. If
5062 so, the URL is then blocked, and the remote web server will not be contacted.
5063 <link linkend="HANDLE-AS-IMAGE"><quote>+handle-as-image</quote></link>
5064 is then checked and if it does not match, an
5065 HTML <quote>BLOCKED</quote> page is sent back. Otherwise, if it does match,
5066 an image is returned. The type of image depends on the setting of <link
5067 linkend="SET-IMAGE-BLOCKER"><quote>+set-image-blocker</quote></link>
5068 (blank, checkerboard pattern, or an HTTP redirect to an image elsewhere).
5073 Untrusted URLs are blocked. If URLs are being added to the
5074 <filename>trust</filename> file, then that is done.
5079 If the URL pattern matches the <link
5080 linkend="FAST-REDIRECTS"><quote>+fast-redirects</quote></link> action,
5081 it is then processed. Unwanted parts of the requested URL are stripped.
5086 Now the rest of the client browser's request headers are processed. If any
5087 of these match any of the relevant actions (e.g. <link
5088 linkend="HIDE-USER-AGENT"><quote>+hide-user-agent</quote></link>,
5089 etc.), headers are suppressed or forged as determined by these actions and
5095 Now the web server starts sending its response back (i.e. typically a web page and related
5101 First, the server headers are read and processed to determine, among other
5102 things, the MIME type (document type) and encoding. The headers are then
filtered as determined by the
5104 <link linkend="CRUNCH-INCOMING-COOKIES"><quote>+crunch-incoming-cookies</quote></link>,
5105 <link linkend="SESSION-COOKIES-ONLY"><quote>+session-cookies-only</quote></link>,
5106 and <link linkend="DOWNGRADE-HTTP-VERSION"><quote>+downgrade-http-version</quote></link>
5112 If the <link linkend="KILL-POPUPS"><quote>+kill-popups</quote></link>
5113 action applies, and it is an HTML or JavaScript document, the popup-code in the
5114 response is filtered on-the-fly as it is received.
5119 If a <link linkend="FILTER"><quote>+filter</quote></link>
5121 linkend="DEANIMATE-GIFS"><quote>+deanimate-gifs</quote></link>
5122 action applies (and the document type fits the action), the rest of the page is
5123 read into memory (up to a configurable limit). Then the filter rules (from
5124 <filename>default.filter</filename>) are processed against the buffered
5125 content. Filters are applied in the order they are specified in the
5126 <filename>default.filter</filename> file. Animated GIFs, if present, are
5127 reduced to either the first or last frame, depending on the action
setting. The entire page, which is now filtered, is then sent by
5129 <application>Privoxy</application> back to your browser.
5132 If neither <link linkend="FILTER"><quote>+filter</quote></link>
5134 linkend="DEANIMATE-GIFS"><quote>+deanimate-gifs</quote></link>
5135 matches, then <application>Privoxy</application> passes the raw data through
5136 to the client browser as it becomes available.
5141 As the browser receives the now (probably filtered) page content, it
5142 reads and then requests any URLs that may be embedded within the page
5143 source, e.g. ad images, stylesheets, JavaScript, other HTML documents (e.g.
5144 frames), sounds, etc. For each of these objects, the browser issues a new
5145 request. And each such request is in turn processed as above. Note that a
5146 complex web page may have many such embedded URLs.
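The blocking decision described earlier in this list can be pictured as a small decision function. The Python sketch below is an illustration with invented names and return strings, not a description of Privoxy's internals; the dictionary simply records which actions apply to the URL.

```python
def answer_for(actions):
    # actions: hypothetical dict of the actions that apply to this URL.
    if not actions.get("block"):
        return "forward to remote server"
    if not actions.get("handle-as-image"):
        return "HTML BLOCKED page"
    # Blocked and treated as an image: the kind of image depends on the
    # set-image-blocker setting (blank, pattern, or a redirect).
    return "image: " + actions.get("set-image-blocker", "unspecified")

print(answer_for({"block": True}))
print(answer_for({"block": True, "handle-as-image": True,
                  "set-image-blocker": "blank"}))
```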
5156 <!-- ~~~~~ New section ~~~~~ -->
5157 <sect2 id="actionsanat">
5158 <title>Anatomy of an Action</title>
The way <application>Privoxy</application> applies
<link linkend="ACTIONS">actions</link> and <link linkend="FILTER">filters</link>
to any given URL can be complex, and it is not always easy to understand
what is happening. Sometimes we need to be able to
<emphasis>see</emphasis> just what <application>Privoxy</application> is
doing, especially if something <application>Privoxy</application> is doing
is inadvertently causing us a problem. It can be a little daunting to look at
5168 the actions and filters files themselves, since they tend to be filled with
5169 <link linkend="regex">regular expressions</link> whose consequences are not
5174 One quick test to see if <application>Privoxy</application> is causing a problem
5175 or not, is to disable it temporarily. This should be the first troubleshooting
step. See <link linkend="bookmarklets">the Bookmarklets</link> section for a quick
5177 and easy way to do this (be sure to flush caches afterward!). Looking at the
5178 logs is a good idea too.
5182 <application>Privoxy</application> also provides the
5183 <ulink url="http://config.privoxy.org/show-url-info">http://config.privoxy.org/show-url-info</ulink>
5184 page that can show us very specifically how <application>actions</application>
5185 are being applied to any given URL. This is a big help for troubleshooting.
5189 First, enter one URL (or partial URL) at the prompt, and then
5190 <application>Privoxy</application> will tell us
5191 how the current configuration will handle it. This will not
5192 help with filtering effects (i.e. the <link
5193 linkend="FILTER"><quote>+filter</quote></link> action) from
5194 the <filename>default.filter</filename> file since this is handled very
differently and is not so easy to trap! It also will not tell you about any other
5196 URLs that may be embedded within the URL you are testing. For instance, images
5197 such as ads are expressed as URLs within the raw page source of HTML pages. So
5198 you will only get info for the actual URL that is pasted into the prompt area
5199 -- not any sub-URLs. If you want to know about embedded URLs like ads, you
5200 will have to dig those out of the HTML source. Use your browser's <quote>View
5201 Page Source</quote> option for this. Or right click on the ad, and grab the
5206 Let's try an example, <ulink url="http://google.com">google.com</ulink>,
5207 and look at it one section at a time:
5212 Matches for http://google.com:
5214 In file: default.action <guibutton>[ View ]</guibutton> <guibutton>[ Edit ]</guibutton>
5218 -crunch-outgoing-cookies
5219 -crunch-incoming-cookies
5220 +deanimate-gifs{last}
5221 -downgrade-http-version
5225 -filter{shockwave-flash}
5226 -filter{crude-parental}
5227 +filter{html-annoyances}
5228 +filter{js-annoyances}
5229 +filter{content-cookies}
5231 +filter{refresh-tags}
5233 +filter{banners-by-size}
5234 +hide-forwarded-for-headers
5235 +hide-from-header{block}
5236 +hide-referer{forge}
5241 +prevent-compression
5244 +session-cookies-only
5245 +set-image-blocker{pattern} }
5248 { -session-cookies-only }
5254 In file: user.action <guibutton>[ View ]</guibutton> <guibutton>[ Edit ]</guibutton>
5255 (no matches in this file)
5260 This tells us how we have defined our
5261 <link linkend="ACTIONS"><quote>actions</quote></link>, and
5262 which ones match for our example, <quote>google.com</quote>. The first listing
5263 is any matches for the <filename>standard.action</filename> file. No hits at
5264 all here on <quote>standard</quote>. Then next is <quote>default</quote>, or
5265 our <filename>default.action</filename> file. The large, multi-line listing,
5266 is how the actions are set to match for all URLs, i.e. our default settings.
5267 If you look at your <quote>actions</quote> file, this would be the section
5268 just below the <quote>aliases</quote> section near the top. This will apply to
5269 all URLs as signified by the single forward slash at the end of the listing
5270 -- <quote>/</quote>.
5274 But we can define additional actions that would be exceptions to these general
5275 rules, and then list specific URLs (or patterns) that these exceptions would
5276 apply to. Last match wins. Just below this then are two explicit matches for
5277 <quote>.google.com</quote>. The first is negating our previous cookie setting,
5279 linkend="SESSION-COOKIES-ONLY"><quote>+session-cookies-only</quote></link>
5280 (i.e. not persistent). So we will allow persistent cookies for google. The
5281 second turns <emphasis>off</emphasis> any
5283 linkend="FAST-REDIRECTS"><quote>+fast-redirects</quote></link>
5284 action, allowing this to take place unmolested. Note that there is a leading
dot here -- <quote>.google.com</quote>. This will also match any hosts and
sub-domains in the google.com domain, such as
5287 <quote>www.google.com</quote>. So, apparently, we have these two actions
5288 defined somewhere in the lower part of our <filename>default.action</filename>
5289 file, and <quote>google.com</quote> is referenced somewhere in these latter
5294 Then, for our <filename>user.action</filename> file, we again have no hits.
5298 And finally we pull it all together in the bottom section and summarize how
5299 <application>Privoxy</application> is applying all its <quote>actions</quote>
5300 to <quote>google.com</quote>:
5311 -crunch-outgoing-cookies
5312 -crunch-incoming-cookies
5313 +deanimate-gifs{last}
5314 -downgrade-http-version
5318 -filter{shockwave-flash}
5319 -filter{crude-parental}
5320 +filter{html-annoyances}
5321 +filter{js-annoyances}
5322 +filter{content-cookies}
5324 +filter{refresh-tags}
5326 +filter{banners-by-size}
5327 +hide-forwarded-for-headers
5328 +hide-from-header{block}
5329 +hide-referer{forge}
5334 +prevent-compression
5337 -session-cookies-only
5338 +set-image-blocker{pattern}
Notice that the only differences from the previous listing are in
5344 <quote>fast-redirects</quote> and <quote>session-cookies-only</quote>.
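The <quote>last match wins</quote> rule behind this summary can be sketched in a few lines of Python. The data structures are invented for the illustration, and the defaults shown are only a small, assumed subset of the listing above.

```python
def effective_actions(sections):
    # Apply the matching sections in file order; later ones override earlier ones.
    final = {}
    for section in sections:
        final.update(section)
    return final

# True stands for "+action", False for "-action" (illustrative subset only).
defaults = {"session-cookies-only": True, "fast-redirects": True,
            "hide-referer": "forge"}
google = [{"session-cookies-only": False}, {"fast-redirects": False}]

print(effective_actions([defaults] + google))
```

Everything that the later sections do not mention keeps its default value, which is why the final listing looks so similar to the first one.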
5348 Now another example, <quote>ad.doubleclick.net</quote>:
5354 { +block +handle-as-image }
5357 { +block +handle-as-image }
5360 { +block +handle-as-image }
5366 We'll just show the interesting part here, the explicit matches. It is
matched three different times, each time as <quote>+block +handle-as-image</quote>,
5368 which is the expanded form of one of our aliases that had been defined as:
5369 <quote>+imageblock</quote>. (<link
5370 linkend="ALIASES"><quote>Aliases</quote></link> are defined in
5371 the first section of the actions file and typically used to combine more
5376 Any one of these would have done the trick and blocked this as an unwanted
5377 image. This is redundant, since the last case effectively
5378 also covers the first. No point in taking chances with these guys
5379 though ;-) Note that if you want an ad or obnoxious
5380 URL to be invisible, it should be defined as <quote>ad.doubleclick.net</quote>
5381 is here -- as both a <link
5382 linkend="BLOCK"><quote>+block</quote></link>
5383 <emphasis>and</emphasis> an
5385 linkend="HANDLE-AS-IMAGE"><quote>+handle-as-image</quote></link>.
5386 The custom alias <quote>+imageblock</quote> just simplifies the process and makes
5391 One last example. Let's try <quote>http://www.rhapsodyk.net/adsl/HOWTO/</quote>.
5392 This one is giving us problems. We are getting a blank page. Hmmm ...
5398 Matches for http://www.rhapsodyk.net/adsl/HOWTO/:
5400 In file: default.action <guibutton>[ View ]</guibutton> <guibutton>[ Edit ]</guibutton>
5404 -crunch-incoming-cookies
5405 -crunch-outgoing-cookies
5407 -downgrade-http-version
5409 +filter{html-annoyances}
5410 +filter{js-annoyances}
5411 +filter{kill-popups}
5414 +filter{banners-by-size}
5417 +hide-forwarded-for-headers
5418 +hide-from-header{block}
5419 +hide-referer{forge}
5423 +prevent-compression
5426 +session-cookies-only
5427 +set-image-blocker{blank} }
5430 { +block +handle-as-image }
5436 Oops, the <quote>/adsl/</quote> is matching <quote>/ads</quote>! But
5437 we did not want this at all! Now we see why we get the blank page. We could
5438 now add a new action below this that explicitly does <emphasis>not</emphasis>
5439 block (<quote>{-block}</quote>) paths with <quote>adsl</quote>. There are
5440 various ways to handle such exceptions. Example:
5452 Now the page displays ;-) Be sure to flush your browser's caches when
5453 making such changes. Or, try using <literal>Shift+Reload</literal>.
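Why /ads catches /adsl/, and why a later exception fixes it, can be sketched as follows. This is a rough model and not Privoxy's implementation: the RULES list and the resolve function are hypothetical, and the sketch assumes that path patterns behave like regular expressions anchored only at the start of the path, and that sections are applied in order so that later ones override earlier ones.

```python
import re

# Hypothetical, simplified rule list: (path pattern, actions to set).
RULES = [
    (r"/ads", {"block": True, "handle-as-image": True}),
    (r"/adsl", {"block": False}),  # the exception, placed below the block rule
]


def resolve(path, rules=RULES):
    """Apply every matching rule in order; later rules override earlier ones."""
    state = {}
    for pattern, actions in rules:
        # re.match anchors at the start only, so "/ads" also matches "/adsl/...".
        if re.match(pattern, path):
            state.update(actions)
    return state


if __name__ == "__main__":
    print(resolve("/adsl/HOWTO/", RULES[:1]))  # block rule only
    print(resolve("/adsl/HOWTO/"))             # block rule plus exception
    print(resolve("/ads/banner.gif"))
```

With only the first rule, the /adsl/HOWTO/ path ends up blocked; with the exception listed after it, block is switched off for that path, while /ads/banner.gif stays blocked.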
5457 But now what about a situation where we get no explicit matches like
5464 { +block +handle-as-image }
5470 That actually was very telling and pointed us quickly to where the problem
5471 was. If you don't get this kind of match, then it means one of the default
5472 rules in the first section is causing the problem. This would require some
5473 guesswork, and maybe a little trial and error to isolate the offending rule.
5474 One likely cause would be one of the <quote>{+filter}</quote> actions. These
5475 tend to be harder to troubleshoot. Try adding the URL for the site to one of
5476 the aliases that turn off <quote>+filter</quote>:
5484 .worldpay.com # for quietpc.com
5492 <quote>{shop}</quote> is an <quote>alias</quote> that expands to
5493 <quote>{ -filter -session-cookies-only }</quote>.
5494 Or you could write your own exception to negate filtering:
5507 This would turn off all filtering for that site. This would probably be most
5508 appropriately put in <filename>user.action</filename>, for local site
5513 Images that are inexplicably being blocked may well be hitting the
5514 <quote>+filter{banners-by-size}</quote> rule, which assumes
5515 that images of certain sizes are ad banners (works well most of the time
5516 since these tend to be standardized).
5520 <quote>{fragile}</quote> is an alias that disables most actions. This can be
5521 used as a last resort for problem sites. Remember to flush caches! If this
5522 still does not work, you will have to go through the remaining actions one by
5523 one to find which one(s) is causing the problem.
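The alias expansions mentioned in this section (+imageblock and shop) amount to a simple substitution, which the sketch below illustrates. It is a toy model under stated assumptions, not how Privoxy parses its actions files: the ALIASES table and the expand function are hypothetical, and only the two expansions spelled out in the text are included.

```python
# Hypothetical alias table, using only the expansions given in the text.
ALIASES = {
    "+imageblock": ["+block", "+handle-as-image"],
    "shop": ["-filter", "-session-cookies-only"],
}


def expand(actions):
    """Replace each alias with the actions it stands for; keep the rest as-is."""
    expanded = []
    for action in actions:
        expanded.extend(ALIASES.get(action, [action]))
    return expanded


if __name__ == "__main__":
    print(expand(["+imageblock"]))
    print(expand(["shop", "+hide-referer{forge}"]))
```

Seen this way, the three matches for ad.doubleclick.net all reduce to the same pair of actions, which is why any one of them would have been enough.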
5532 This program is free software; you can redistribute it
5533 and/or modify it under the terms of the GNU General
5534 Public License as published by the Free Software
5535 Foundation; either version 2 of the License, or (at
5536 your option) any later version.
5538 This program is distributed in the hope that it will
5539 be useful, but WITHOUT ANY WARRANTY; without even the
5540 implied warranty of MERCHANTABILITY or FITNESS FOR A
5541 PARTICULAR PURPOSE. See the GNU General Public
5542 License for more details.
5544 The GNU General Public License should be included with
5545 this file. If not, you can view it at
5546 http://www.gnu.org/copyleft/gpl.html
5547 or write to the Free Software Foundation, Inc., 59
5548 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
5550 $Log: user-manual.sgml,v $
5551 Revision 1.123.2.9 2002/07/11 03:40:28 david__schmidt
5553 Updated Mac OSX sections due to installation location change
5555 Revision 1.123.2.8 2002/06/09 16:36:32 hal9
5556 Clarifications on filtering and MIME. Hardcode 'latest release' in index.html.
5558 Revision 1.123.2.7 2002/06/09 00:29:34 hal9
5559 Touch ups on filtering, in actions section and Anatomy.
5561 Revision 1.123.2.6 2002/06/06 23:11:03 hal9
5562 Fix broken link. Linkchecked all docs.
5564 Revision 1.123.2.5 2002/05/29 02:01:02 hal9
5565 This is break out of the entire config section from u-m, so it can
5566 eventually be used to generate the comments, etc in the main config file
5567 so that these are in sync with each other.
5569 Revision 1.123.2.4 2002/05/27 03:28:45 hal9
5570 Ooops missed something from David.
5572 Revision 1.123.2.3 2002/05/27 03:23:17 hal9
5573 Fix FIXMEs for OS2 and OSX startup. Fix Redhat typos (should be Red Hat).
5574 That's a wrap, I think.
5576 Revision 1.123.2.2 2002/05/26 19:02:09 hal9
5577 Move Amiga stuff around to take of FIXME in start up section.
5579 Revision 1.123.2.1 2002/05/26 17:04:25 hal9
5580 -Spellcheck, very minor edits, and sync across branches
5582 Revision 1.123 2002/05/24 23:19:23 hal9
5583 Include new image (Proxy setup). More fun with guibutton.
5584 Minor corrections/clarifications here and there.
5586 Revision 1.122 2002/05/24 13:24:08 oes
5587 Added Bookmarklet for one-click pre-filled access to show-url-info
5589 Revision 1.121 2002/05/23 23:20:17 oes
5590 - Changed more (all?) references to actions to the
5591 <literal><link> style.
5592 - Small fixes in the actions chapter
5593 - Small clarifications in the quickstart to ad blocking
5594 - Removed <emphasis> from <title>s since the new doc CSS
5595 renders them red (bad in TOC).
5597 Revision 1.120 2002/05/23 19:16:43 roro
5598 Correct Debian specials (installation and startup).
5600 Revision 1.119 2002/05/22 17:17:05 oes
5603 Revision 1.118 2002/05/21 04:54:55 hal9
5604 -New Section: Quickstart to Ad Blocking
5605 -Reformat Actions Anatomy to match new CGI layout
5607 Revision 1.117 2002/05/17 13:56:16 oes
5608 - Reworked & extended Templates chapter
5609 - Small changes to Regex appendix
5610 - #included authors.sgml into (C) and hist chapter
5612 Revision 1.116 2002/05/17 03:23:46 hal9
5613 Fixing merge conflict in Quickstart section.
5615 Revision 1.115 2002/05/16 16:25:00 oes
5616 Extended the Filter File chapter & minor fixes
5618 Revision 1.114 2002/05/16 09:42:50 oes
5619 More ulink->link, added some hints to Quickstart section
5621 Revision 1.113 2002/05/15 21:07:25 oes
5622 Extended and further commented the example actions files
5624 Revision 1.112 2002/05/15 03:57:14 hal9
5625 Spell check. A few minor edits here and there for better syntax and
5628 Revision 1.111 2002/05/14 23:01:36 oes
5631 Revision 1.110 2002/05/14 19:10:45 oes
5632 Restored alphabetical order of actions
5634 Revision 1.109 2002/05/14 17:23:11 oes
5635 Renamed the prevent-*-cookies actions, extended aliases section and moved it before the example AFs
5637 Revision 1.108 2002/05/14 15:29:12 oes
5638 Completed proofreading the actions chapter
5640 Revision 1.107 2002/05/12 03:20:41 hal9
5641 Small clarifications for 127.0.0.1 vs localhost for listen-address since this
5642 apparently an important distinction for some OS's.
5644 Revision 1.106 2002/05/10 01:48:20 hal9
5645 This is mostly proposed copyright/licensing additions and changes. Docs
5646 are still GPL, but licensing and copyright are more visible. Also, copyright
5647 changed in doc header comments (eliminate references to JB except FAQ).
5649 Revision 1.105 2002/05/05 20:26:02 hal9
5650 Sorting out license vs copyright in these docs.
5652 Revision 1.104 2002/05/04 08:44:45 swa
5655 Revision 1.103 2002/05/04 00:40:53 hal9
5656 -Remove the TOC first page kludge. It's fixed proper now in ldp.dsl.in.
5657 -Some minor additions to Quickstart.
5659 Revision 1.102 2002/05/03 17:46:00 oes
5660 Further proofread & reactivated short build instructions
5662 Revision 1.101 2002/05/03 03:58:30 hal9
5663 Move the user-manual config directive to top of section. Add note about
5664 Privoxy needing read permissions for configs, and write for logs.
5666 Revision 1.100 2002/04/29 03:05:55 hal9
5667 Add clarification on differences of new actions files.
5669 Revision 1.99 2002/04/28 16:59:05 swa
5670 more structure in starting section
5672 Revision 1.98 2002/04/28 05:43:59 hal9
5673 This is the break up of configuration.html into multiple files. This
5674 will probably break links elsewhere :(
5676 Revision 1.97 2002/04/27 21:04:42 hal9
5677 -Rewrite of Actions File example.
5678 -Add section for user-manual directive in config.
5680 Revision 1.96 2002/04/27 05:32:00 hal9
5681 -Add short section to Filter Files to tie in with +filter action.
5682 -Start rewrite of examples in Actions Examples (not finished).
5684 Revision 1.95 2002/04/26 17:23:29 swa
5685 bookmarks cleaned, changed structure of user manual, screen and programlisting cleanups, and numerous other changes that I forgot
5687 Revision 1.94 2002/04/26 05:24:36 hal9
5688 -Add most of Andreas suggestions to Chain of Events section.
5689 -A few other minor corrections and touch up.
5691 Revision 1.92 2002/04/25 18:55:13 hal9
5692 More catchups on new actions files, and new actions names.
5693 Other assorted cleanups, and minor modifications.
5695 Revision 1.91 2002/04/24 02:39:31 hal9
5696 Add 'Chain of Events' section.
5698 Revision 1.90 2002/04/23 21:41:25 hal9
5699 Linuxconf is deprecated on RH, substitute chkconfig.
5701 Revision 1.89 2002/04/23 21:05:28 oes
5702 Added hint for startup on Red Hat
5704 Revision 1.88 2002/04/23 05:37:54 hal9
5705 Add AmigaOS install stuff.
5707 Revision 1.87 2002/04/23 02:53:15 david__schmidt
5708 Updated OSX installation section
5709 Added a few English tweaks here an there
5711 Revision 1.86 2002/04/21 01:46:32 hal9
5712 Re-write actions section.
5714 Revision 1.85 2002/04/18 21:23:23 hal9
5715 Fix ugly typo (mine).
5717 Revision 1.84 2002/04/18 21:17:13 hal9
5718 Spell Redhat correctly (ie Red Hat). A few minor grammar corrections.
5720 Revision 1.83 2002/04/18 18:21:12 oes
5721 Added RPM install detail
5723 Revision 1.82 2002/04/18 12:04:50 oes
5726 Revision 1.81 2002/04/18 11:50:24 oes
5727 Extended Install section - needs fixing by packagers
5729 Revision 1.80 2002/04/18 10:45:19 oes
5730 Moved text to buildsource.sgml, renamed some filters, details
5732 Revision 1.79 2002/04/18 03:18:06 hal9
5733 Spellcheck, and minor touchups.
5735 Revision 1.78 2002/04/17 18:04:16 oes
5738 Revision 1.77 2002/04/17 13:51:23 oes
5739 Proofreading, part one
5741 Revision 1.76 2002/04/16 04:25:51 hal9
5742 -Added 'Note to Upgraders' and re-ordered the 'Quickstart' section.
5743 -Note about proxy may need requests to re-read config files.
5745 Revision 1.75 2002/04/12 02:08:48 david__schmidt
5746 Remove OS/2 building info... it is already in the developer-manual
5748 Revision 1.74 2002/04/11 00:54:38 hal9
5749 Add small section on submitting actions.
5751 Revision 1.73 2002/04/10 18:45:15 swa
5754 Revision 1.72 2002/04/10 04:06:19 hal9
5755 Added actions feedback to Bookmarklets section
5757 Revision 1.71 2002/04/08 22:59:26 hal9
5758 Version update. Spell chkconfig correctly :)
5760 Revision 1.70 2002/04/08 20:53:56 swa
5763 Revision 1.69 2002/04/06 05:07:29 hal9
5764 -Add privoxy-man-page.sgml, for man page.
5765 -Add authors.sgml for AUTHORS (and p-authors.sgml)
5766 -Reworked various aspects of various docs.
5767 -Added additional comments to sub-docs.
5769 Revision 1.68 2002/04/04 18:46:47 swa
5770 consistent look. reuse of copyright, history et. al.
5772 Revision 1.67 2002/04/04 17:27:57 swa
5773 more single file to be included at multiple points. make maintaining easier
5775 Revision 1.66 2002/04/04 06:48:37 hal9
5776 Structural changes to allow for conditional inclusion/exclusion of content
5777 based on entity toggles, e.g. 'entity % p-not-stable "INCLUDE"'. And
5778 definition of internal entities, e.g. 'entity p-version "2.9.13"' that will
5779 eventually be set by Makefile.
5780 More boilerplate text for use across multiple docs.
5782 Revision 1.65 2002/04/03 19:52:07 swa
5783 enhance squid section due to user suggestion
5785 Revision 1.64 2002/04/03 03:53:43 hal9
5786 A few minor bug fixes, and touch ups. Ready for review.
5788 Revision 1.63 2002/04/01 16:24:49 hal9
5789 Define entities to include boilerplate text. See doc/source/*.
5791 Revision 1.62 2002/03/30 04:15:53 hal9
5792 - Fix privoxy.org/config links.
5793 - Paste in Bookmarklets from Toggle page.
5794 - Move Quickstart nearer top, and minor rework.
5796 Revision 1.61 2002/03/29 01:31:08 hal9
5799 Revision 1.60 2002/03/27 01:57:34 hal9
5800 Added more to Anatomy section.
5802 Revision 1.59 2002/03/27 00:54:33 hal9
5803 Touch up intro for new name.
5805 Revision 1.58 2002/03/26 22:29:55 swa
5806 we have a new homepage!
5808 Revision 1.57 2002/03/24 20:33:30 hal9
5809 A few minor catch ups with name change.
5811 Revision 1.56 2002/03/24 16:17:06 swa
5812 configure needs to be generated.
5814 Revision 1.55 2002/03/24 16:08:08 swa
5815 we are too lazy to make a block-built
5816 privoxy logo. hence removed the option.
5818 Revision 1.54 2002/03/24 15:46:20 swa
5819 name change related issue.
5821 Revision 1.53 2002/03/24 11:51:00 swa
5822 name change. changed filenames.
5824 Revision 1.52 2002/03/24 11:01:06 swa
5827 Revision 1.51 2002/03/23 15:13:11 swa
5828 renamed every reference to the old name with foobar.
5829 fixed "application foobar application" tag, fixed
5830 "the foobar" with "foobar". left junkbustser in cvs
5831 comments and remarks to history untouched.
5833 Revision 1.50 2002/03/23 05:06:21 hal9
5836 Revision 1.49 2002/03/21 17:01:05 hal9
5837 New section in Appendix.
5839 Revision 1.48 2002/03/12 06:33:01 hal9
5840 Catching up to Andreas and re_filterfile changes.
5842 Revision 1.47 2002/03/11 13:13:27 swa
5843 correct feedback channels
5845 Revision 1.46 2002/03/10 00:51:08 hal9
5846 Added section on JB internal pages in Appendix.
5848 Revision 1.45 2002/03/09 17:43:53 swa
5851 Revision 1.44 2002/03/09 17:08:48 hal9
5852 New section on Jon's actions file editor, and move some stuff around.
5854 Revision 1.43 2002/03/08 00:47:32 hal9
5855 Added imageblock{pattern}.
5857 Revision 1.42 2002/03/07 18:16:55 swa
5860 Revision 1.41 2002/03/07 16:46:43 hal9
5861 Fix a few markup problems for jade.
5863 Revision 1.40 2002/03/07 16:28:39 swa
5864 provide correct feedback channels
5866 Revision 1.39 2002/03/06 16:19:28 hal9
5867 Note on perceived filtering slowdown per FR.
5869 Revision 1.38 2002/03/05 23:55:14 hal9
5870 Stupid I did it again. Double hyphen in comment breaks jade.
5872 Revision 1.37 2002/03/05 23:53:49 hal9
5873 jade barfs on '- -' embedded in comments. - -user option broke it.
5875 Revision 1.36 2002/03/05 22:53:28 hal9
5876 Add new - - user option.
5878 Revision 1.35 2002/03/05 00:17:27 hal9
5879 Added section on command line options.
5881 Revision 1.34 2002/03/04 19:32:07 oes
5882 Changed default port to 8118
5884 Revision 1.33 2002/03/03 19:46:13 hal9
5885 Emphasis on where/how to report bugs, etc
5887 Revision 1.32 2002/03/03 09:26:06 joergs
5888 AmigaOS changes, config is now loaded from PROGDIR: instead of
5889 AmiTCP:db/junkbuster/ if no configuration file is specified on the
5892 Revision 1.31 2002/03/02 22:45:52 david__schmidt
5895 Revision 1.30 2002/03/02 22:00:14 hal9
5896 Updated 'New Features' list. Ran through spell-checker.
5898 Revision 1.29 2002/03/02 20:34:07 david__schmidt
5899 Update OS/2 build section
5901 Revision 1.28 2002/02/24 14:34:24 jongfoster
5902 Formatting changes. Now changing the doctype to DocBook XML 4.1
5903 will work - no other changes are needed.
5905 Revision 1.27 2002/01/11 14:14:32 hal9
5906 Added a very short section on Templates
5908 Revision 1.26 2002/01/09 20:02:50 hal9
5909 Fix bug re: auto-detect config file changes.
5911 Revision 1.25 2002/01/09 18:20:30 hal9
5912 Touch ups for *.action files.
5914 Revision 1.24 2001/12/02 01:13:42 hal9
5917 Revision 1.23 2001/12/02 00:20:41 hal9
5918 Updates for recent changes.
5920 Revision 1.22 2001/11/05 23:57:51 hal9
5921 Minor update for startup now daemon mode.
5923 Revision 1.21 2001/10/31 21:11:03 hal9
5924 Correct 2 minor errors
5926 Revision 1.18 2001/10/24 18:45:26 hal9
5927 *** empty log message ***
5929 Revision 1.17 2001/10/24 17:10:55 hal9
5930 Catching up with Jon's recent work, and a few other things.
5932 Revision 1.16 2001/10/21 17:19:21 swa
5933 wrong url in documentation
5935 Revision 1.15 2001/10/14 23:46:24 hal9
5936 Various minor changes. Fleshed out SEE ALSO section.
5938 Revision 1.13 2001/10/10 17:28:33 hal9
5941 Revision 1.12 2001/09/28 02:57:04 hal9
5944 Revision 1.11 2001/09/28 02:25:20 hal9
5947 Revision 1.9 2001/09/27 23:50:29 hal9
5948 A few changes. A short section on regular expression in appendix.
5950 Revision 1.8 2001/09/25 00:34:59 hal9
5951 Some additions, and re-arranging.
5953 Revision 1.7 2001/09/24 14:31:36 hal9
5956 Revision 1.6 2001/09/24 14:10:32 hal9
5957 Including David's OS/2 installation instructions.
5959 Revision 1.2 2001/09/13 15:27:40 swa
5962 Revision 1.1 2001/09/12 15:36:41 swa
5963 source files for junkbuster documentation
5965 Revision 1.3 2001/09/10 17:43:59 swa
5966 first proposal of a structure.
5968 Revision 1.2 2001/06/13 14:28:31 swa
5969 docs should have an author.
5971 Revision 1.1 2001/06/13 14:20:37 swa
5972 first import of project's documentation for the webserver.