4 By: Junkbuster Developers
6 $Id: user-manual.sgml,v 1.15 2001/10/14 23:46:24 hal9 Exp $
8 The user manual gives the users information on how to install and
9 configure Internet Junkbuster. Internet Junkbuster is an application
10 that provides privacy and security to users of the World Wide Web.
12 You can find the latest version of the user manual at
13 [1]http://ijbswa.sourceforge.net/user-manual/.
15 Feel free to send a note to the developers at
16 <[2]ijbswa-developers@lists.sourceforge.net>.
17 _________________________________________________________________
33 3. [12]Junkbuster Configuration
35 3.1. [13]The Main Configuration File
36 3.2. [14]The Actions File
37 3.3. [15]The Filter File
39 4. [16]Quickstart to Using Junkbuster
40 5. [17]Contact the Developers
41 6. [18]Copyright and History
49 8.1. [23]Regular Expressions
53 Internet Junkbuster is a web proxy with advanced filtering
54 capabilities for protecting privacy, filtering web page content,
55 managing cookies, controlling access, and removing ads, banners,
56 pop-ups and other obnoxious Internet Junk. Junkbuster has a very
57 flexible configuration and can be customized to suit individual needs
58 and tastes. Internet Junkbuster has application for both stand-alone
59 systems and multi-user networks.
61 This documentation is included with the current development version of
62 Internet Junkbuster and is incomplete at this point. The most up to
63 date reference for the time being is still the comments in the source
64 files and in the individual configuration files. Development of
65 version 3.0 is currently underway, and includes many significant
66 changes and enhancements over earlier versions. The target release date
67 for stable v3.0 is December 2001.
69 Since this is a development version, some features are in the process
70 of being implemented. This documentation may be slightly out of sync
71 as a result. And there are bugs, though hopefully not many!
72 _________________________________________________________________
76 In addition to Junkbuster's traditional features of ad and banner
77 blocking and cookie management, this is a list of new features
78 currently under development:
80 * Modularized configuration that will allow for system wide
81 settings, and individual user settings.
82 * A browser based GUI configuration utility (not finished).
83 * Blocking of annoying pop-up browser windows (previously available
85 * Partial support for HTTP/1.1.
86 * Support for Perl Compatible Regular Expressions in the
87 configuration files, and generally a more sophisticated
88 configuration syntax over previous versions.
89 * Web page content filtering.
91 _________________________________________________________________
95 Junkbuster is available as raw source code, or pre-compiled binaries.
96 See the [24]Junkbuster Home Page for current release info. Junkbuster
97 is also available via [25]CVS. This is the recommended approach at
98 this time. But please be aware that CVS is constantly changing, and it
99 may break in mysterious ways.
100 _________________________________________________________________
104 For gzipped tar archives, unpack the source:
106 tar zxvf ijb_source_2.9*
109 For retrieving the current CVS sources, you'll need the CVS package
110 installed first. To download CVS source:
112 cvs -d:pserver:anonymous@cvs.ijbswa.sourceforge.net:/cvsroot/ijbswa login
113 cvs -z3 -d:pserver:anonymous@cvs.ijbswa.sourceforge.net:/cvsroot/ijbswa co cu
117 This will create a directory named current/, which will contain the
120 Then, in either case, to build from source:
127 For Redhat and SuSE Linux RPM packages, see below.
128 _________________________________________________________________
132 To build Redhat RPM packages, install source as above. Then:
137 This will create both binary and src RPMs in the usual places.
140 /usr/src/redhat/RPMS/i686/junkbuster-2.9.9-1.i686.rpm
142 /usr/src/redhat/SRPMS/junkbuster-2.9.9-1.src.rpm
144 To install, of course:
146 rpm -Uvv /usr/src/redhat/RPMS/i686/junkbuster-2.9.9-1.i686.rpm
148 This will place the Junkbuster configuration files in
149 /etc/junkbuster/, and log files in /var/log/junkbuster/.
150 _________________________________________________________________
154 To build SuSE RPM packages, install source as above. Then:
159 This will create both binary and src RPMs in the usual places.
162 /usr/src/suse/RPMS/i686/junkbuster-2.9.9-1.i686.rpm
164 /usr/src/suse/SRPMS/junkbuster-2.9.9-1.src.rpm
166 To install, of course:
168 rpm -Uvv /usr/src/suse/RPMS/i686/junkbuster-2.9.9-1.i686.rpm
170 This will place the Junkbuster configuration files in
171 /etc/junkbuster/, and log files in /var/log/junkbuster/.
172 _________________________________________________________________
176 The OS/2 version of Junkbuster requires the EMX runtime library to be
177 installed. The EMX runtime library is available on the hobbes OS/2
178 archive, among many other locations:
179 [26]http://hobbes.nmsu.edu/cgi-bin/h-search?sh=1&button=Search&key=emx
180 rt.zip&stype=all&sort=type&dir=%2Fpub%2Fos2%2Fdev%2Femx%2Fv0.9d
182 Junkbuster is packaged in a WarpIN self-installing archive. The
183 self-installing program will be named depending on the release
184 version, something like: ijbos123.exe. In order to install it, simply
185 run this executable or double-click on its icon and follow the WarpIN
186 installation panels. A shadow of the Junkbuster executable will be
187 placed in your startup folder so it will start automatically whenever
190 The directory you choose to install Junkbuster into will contain all
191 of the configuration files.
193 If you would like to build binary images on OS/2 yourself, you will
194 need a working EMX/GCC environment, plus several Unix-like tools. The
195 Hobbes OS/2 archive is a good place to start when building such an
196 environment. A set of Unix-like tools named gnupack is located here:
197 [27]http://hobbes.nmsu.edu/cgi-bin/h-search?sh=1&key=gnupack&stype=all
198 &sort=type&dir=%2Fpub%2Fos2%2Fapps
200 Once you have the source code unpacked as above, you can build the
201 binaries from the current/ directory:
206 _________________________________________________________________
210 Click-click. (I need help on this. Not a clue here. Also for
211 configuration section below. HB.)
212 _________________________________________________________________
216 Some quick notes on other Operating Systems.
218 For FreeBSD (and other *BSDs?), the build will need gmake instead of
219 the included make. gmake is available from [28]http://www.gnu.org. The
220 rest should be the same as above for Linux/Unix.
221 _________________________________________________________________
223 3. Junkbuster Configuration
225 For Unix, *BSD and Linux, all configuration files are located in
226 /etc/junkbuster/ by default. For MS Windows and OS/2, these are all in
227 the same directory as the Junkbuster executable. The name and number
228 of configuration files has changed from previous versions, and is
229 subject to change as development progresses.
231 The installed defaults provide a reasonable starting point. For the
232 time being, there are only three default configuration files (this
233 will change in time):
235 * The main configuration file is named config on Linux, Unix, BSD,
236 and OS/2, and junkbustr.txt on Windows. On Amiga, it is
237 AmiTCP:db/junkbuster/config.
238 * The actionsfile file is used to define various actions relating to
239 images, banners, pop-ups and cookies.
240 * The re_filterfile file can be used to rewrite the raw page
241 content, including text as well as embedded HTML and JavaScript.
243 actionsfile and re_filterfile can use Perl style regular expressions
244 for maximum flexibility. All files use the "#" character to denote a
245 comment. Such lines are not processed by Junkbuster. After making any
246 changes, restart Junkbuster in order for the changes to take effect.
247 _________________________________________________________________
249 3.1. The Main Configuration File
251 Again, the main configuration file is named config on Linux/Unix/BSD
252 and OS/2, and junkbustr.txt on Windows. Configuration lines consist of
253 an initial keyword followed by a list of values, all separated by
254 whitespace (any number of spaces or tabs). For example:
256 blockfile blocklist.ini
258 Indicates that the blockfile is named "blocklist.ini".
260 The "#" indicates a comment. Any part of a line following a "#" is
261 ignored, except if the "#" is preceded by a "\".
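
The keyword/value and comment rules above can be sketched in a few lines of Python. This is only an illustration of the rules as described here, not Junkbuster's actual parser; the function name parse_config_line is hypothetical, and the sketch assumes that an escaped "\#" is kept as a literal "#".

```python
def parse_config_line(line):
    """Return (keyword, [values]) for one config line, or None if nothing remains.

    Illustrative sketch only: everything after an unescaped "#" is dropped,
    and "\\#" is assumed to stand for a literal "#".
    """
    kept = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line) and line[i + 1] == "#":
            kept.append("#")  # escaped "#": keep it as a literal character
            i += 2
            continue
        if ch == "#":
            break  # unescaped "#": the rest of the line is a comment
        kept.append(ch)
        i += 1
    fields = "".join(kept).split()  # any run of spaces or tabs separates fields
    if not fields:
        return None  # blank or comment-only line
    return fields[0], fields[1:]
```

For example, a line such as "confdir /etc/junkbuster # No trailing /, please." yields the keyword "confdir" with the single value "/etc/junkbuster", while a commented-out line yields nothing at all.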
263 Thus, by placing a "#" at the start of an existing configuration line,
264 you can make it a comment and it will be treated as if it weren't
265 there. This is called "commenting out" an option and can be useful to
266 turn off features: If you comment out the "logfile" line, junkbuster
267 will not log to a file at all. Watch for the "default:" section in
268 each explanation to see what happens if the option is left unset (or
271 Long lines can be continued on the next line by using a "\" as the
274 There are various aspects of Junkbuster behavior that can be adjusted.
275 _________________________________________________________________
277 3.1.1. Defining Other Configuration Files
279 Junkbuster can use a number of other files to tell it what ads to
280 block, what cookies to accept, etc. This section of the configuration
281 file tells Junkbuster where to find all those other files.
283 On Windows, Junkbuster looks for these files in the same directory as
284 the executable. On Unix and OS/2, Junkbuster looks for these files in
285 the current working directory. In either case, an absolute path name
286 can be used to avoid problems.
288 When development goes modular and multiuser, the blocker, filter, and
289 per-user config will be stored in subdirectories of "confdir". For
290 now, only confdir/templates is used for storing HTML templates for CGI
293 The location of the configuration files:
295 confdir /etc/junkbuster # No trailing /, please.
297 The directory where all logging (i.e. logfile and jarfile) takes
298 place. No trailing "/", please:
300 logdir /var/log/junkbuster
302 Note that all file specifications below are relative to the above two
305 The "actionsfile" contains patterns to specify the actions to apply to
306 requests for each site. Default: Cookies to and from all destinations
307 are filtered. Popups are disabled for all sites. All sites are
308 filtered if re_filterfile specified. No sites are blocked. An empty
309 image is displayed for filtered ads and other images (formerly
310 "tinygif"). The syntax of this file is explained in detail [29]below.
312 actionsfile actionsfile
314 The "re_filterfile" file contains content modification rules. These
315 rules permit powerful changes on the content of Web pages, e.g., you
316 could disable your favourite JavaScript annoyances, rewrite the actual
317 content, or just have some fun replacing "Microsoft" with "MicroSuck"
318 wherever it appears on a Web page. Default: No content modification,
319 or whatever the developers are playing with :-/
321 re_filterfile re_filterfile
323 The logfile is where all logging and error messages are written. The
324 logfile can be useful for tracking down a problem with Junkbuster
325 (e.g., it's not blocking an ad you think it should block) but in most
326 cases you probably will never look at it.
328 Your logfile will grow indefinitely, and you will probably want to
329 periodically remove it. On Unix systems, you can do this with a cron
330 job (see "man cron"). For Redhat, a logrotate script has been
333 On SuSE Linux systems, you can place a line like
334 "/var/log/junkbuster.* +1024k 644 nobody.nogroup" in /etc/logfiles,
335 with the effect that cron.daily will automatically archive, gzip, and
336 empty the log, when it exceeds 1M size.
338 Default: Log to a file named logfile. Comment out to disable
343 The "jarfile" defines where Junkbuster stores the cookies it
344 intercepts. Note that if you use a "jarfile", it may grow quite large.
345 Default: Don't store intercepted cookies.
349 If you specify a "trustfile", Junkbuster will only allow access to
350 sites that are named in the trustfile. You can also mark sites as
351 trusted referrers, with the effect that access to untrusted sites will
352 be granted, if a link from a trusted referrer was used. The link
353 target will then be added to the "trustfile". This is a very
354 restrictive feature that typical users most probably want to leave
355 disabled. Default: Disabled, don't use the trust mechanism.
359 If you use the trust mechanism, it is a good idea to write up some
360 online documentation about your blocking policy and to specify the
361 URL(s) here. They will appear on the page that your users receive when
362 they try to access untrusted content. Use multiple times for multiple
363 URLs. Default: Don't display links on the "untrusted" info page.
365 trust-info-url http://www.your-site.com/why_we_block.html
366 trust-info-url http://www.your-site.com/what_we_allow.html
367 _________________________________________________________________
369 3.1.2. Other Configuration Options
371 This part of the configuration file contains options that control how
374 "Admin-address" should be set to the email address of the proxy
375 administrator. It is used in many of the proxy-generated pages.
376 Default: fill@me.in.please.
378 #admin-address fill@me.in.please
380 "Proxy-info-url" can be set to a URL that contains more info about
381 this Junkbuster installation, its configuration and policies. It is
382 used in many of the proxy-generated pages and its use is highly
383 recommended in multi-user installations, since your users will want to
384 know why certain content is blocked or modified. Default: Don't show a
385 link to online documentation.
387 proxy-info-url http://www.your-site.com/proxy.html
389 "Listen-address" specifies the address and port where Junkbuster will
390 listen for connections from your Web browser. The default is to listen
391 on the localhost port 8000, and this is suitable for most users. (In
392 your web browser, under proxy configuration, list the proxy server as
393 "localhost" and the port as "8000").
395 If you already have another service running on port 8000, or if you
396 want to serve requests from other machines (e.g. on your local
397 network) as well, you will need to override the default. The syntax is
398 "listen-address [<ip-address>]:<port>". If you leave out the IP
399 address, Junkbuster will bind to all interfaces (addresses) on your
400 machine and may become reachable from the internet. In that case,
401 consider using access control lists (ACLs; see section 3.1.3 below).
403 For example, suppose you are running Junkbuster on a machine which has
404 the address 192.168.0.1 on your local private network (192.168.0.0)
405 and has another outside connection with a different address. You want
406 it to serve requests from inside only:
408 listen-address 192.168.0.1:8000
410 If you want it to listen on all addresses (including the outside
415 If you do this, consider using ACLs (see section 3.1.3 below). Note: you
416 will need to point your browser(s) to the address and port that you
417 have configured here. Default: localhost:8000 (127.0.0.1:8000).
419 The debug option sets the level of debugging information to log in the
420 logfile (and to the console in the Windows version). A debug level of
421 1 is informative because it will show you each request as it happens.
422 Higher levels of debug are probably only of interest to developers.
424 debug 1 # GPC = show each GET/POST/CONNECT request
425 debug 2 # CONN = show each connection status
426 debug 4 # IO = show I/O status
427 debug 8 # HDR = show header parsing
428 debug 16 # LOG = log all data into the logfile
429 debug 32 # FRC = debug force feature
430 debug 64 # REF = debug regular expression filter
431 debug 128 # = debug fast redirects
432 debug 256 # = debug GIF deanimation
433 debug 512 # CLF = Common Log Format
434 debug 1024 # = debug kill popups
435 debug 4096 # INFO = Startup banner and warnings.
436 debug 8192 # ERROR = Non-fatal errors
438 It is highly recommended that you enable ERROR reporting (debug 8192),
439 at least until the next stable release.
441 The reporting of FATAL errors (i.e. ones which crash Junkbuster) is
442 always on and cannot be disabled.
444 If you want to use CLF (Common Log Format), you should set "debug 512"
445 ONLY, do not enable anything else.
447 Multiple "debug" directives are OK - they're logical-OR'd together.
449 debug 15 # same as setting the first 4 listed above
455 debug 8192 # Errors - *we highly recommend enabling this*
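
To see how several "debug" lines combine, here is a small Python sketch of the logical OR described above. The flag values are taken from the list in this section; the names DEBUG_FLAGS, combine_debug and enabled_flags are hypothetical and exist only for this illustration.

```python
# Named debug bits as listed in this section (unnamed bits are omitted).
DEBUG_FLAGS = {
    "GPC": 1, "CONN": 2, "IO": 4, "HDR": 8, "LOG": 16,
    "FRC": 32, "REF": 64, "CLF": 512, "INFO": 4096, "ERROR": 8192,
}

def combine_debug(levels):
    """OR together the values of all "debug" directives."""
    total = 0
    for level in levels:
        total |= level
    return total

def enabled_flags(total):
    """List the named flags that are set in a combined debug value."""
    return [name for name, bit in DEBUG_FLAGS.items() if total & bit]
```

With this, combine_debug([1, 2, 4, 8]) gives 15, which is why "debug 15" is the same as setting the first four levels separately.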
457 Junkbuster normally uses "multi-threading", a software technique that
458 permits it to handle many different requests simultaneously. In some
459 cases you may wish to disable this -- particularly if you're trying to
460 debug a problem. The "single-threaded" option forces Junkbuster to
461 handle requests sequentially. Default: Multi-threaded mode.
465 "toggle" allows you to temporarily disable all of Junkbuster's filtering.
468 The Windows version of Junkbuster puts an icon in the system tray,
469 which allows you to change this option without having to edit this
470 file. If you right-click on that icon (or select the "Options" menu),
471 one choice is "Enable". Clicking on enable toggles Junkbuster on and
472 off. This is useful if you want to temporarily disable Junkbuster,
473 e.g., to access a site that requires cookies which you normally have
476 "toggle 1" means Junkbuster runs normally, "toggle 0" means that
477 Junkbuster becomes a non-anonymizing non-blocking proxy. Default: 1.
480 _________________________________________________________________
482 3.1.3. Access Control List (ACL)
484 Access controls are included at the request of some ISPs and systems
485 administrators, and are not usually needed by individual users. Please
486 note the warnings in the FAQ that this proxy is not intended to be a
487 substitute for a firewall or to encourage anyone to defer addressing
488 basic security weaknesses.
490 If no access settings are specified, the proxy talks to anyone that
491 connects. If any access settings are specified, then the proxy
492 talks only to IP addresses permitted somewhere in this file and not
493 denied later in this file.
495 Summary -- if using an ACL:
497 Client must have permission to receive service.
499 LAST match in ACL wins.
501 Default behavior is to deny service.
503 The syntax for an entry in the Access Control List is:
505 ACTION SRC_ADDR[/SRC_MASKLEN] [ DST_ADDR[/DST_MASKLEN] ]
507 Where the individual fields are:
509 ACTION = "permit-access" or "deny-access"
510 SRC_ADDR = client hostname or dotted IP address
511 SRC_MASKLEN = number of bits in the subnet mask for the source
512 DST_ADDR = server or forwarder hostname or dotted IP address
513 DST_MASKLEN = number of bits in the subnet mask for the target
515 The field separator (FS) is whitespace (space or tab).
517 IMPORTANT NOTE: If Junkbuster is using a forwarder (see below) or
518 a gateway for a particular destination URL, the DST_ADDR that is
519 examined is the address of the forwarder or the gateway and NOT the
520 address of the ultimate target. This is necessary because it may be
521 impossible for the local Junkbuster to determine the address of the
522 ultimate target (that's often what gateways are used for).
524 Here are a few examples to show how the ACL features work:
526 "localhost" is OK -- no DST_ADDR implies that ALL destination
529 permit-access localhost
531 A silly example to illustrate permitting any host on the class-C
532 subnet with Junkbuster to go anywhere:
534 permit-access www.junkbusters.com/24
536 Except deny one particular IP address from using it at all:
538 deny-access ident.junkbusters.com
540 You can also specify an explicit network address and subnet mask.
541 Explicit addresses do not have to be resolved to be used.
543 permit-access 207.153.200.0/24
545 A subnet mask of 0 matches anything, so the next line permits
548 permit-access 0.0.0.0/0
550 Note, you cannot say:
554 to allow all *.org domains. Every IP address listed must resolve
557 An ISP may want to provide a Junkbuster that is accessible by "the
558 world" and yet restrict use of some of their private content to hosts
559 on its internal network (i.e. its own subscribers). Say, for instance
560 the ISP owns the Class-B IP address block 123.124.0.0 (a 16 bit
561 netmask). This is how they could do it:
563 permit-access 0.0.0.0/0 0.0.0.0/0 # other clients can go anywhere
564 # with the following exceptions
567 deny-access 0.0.0.0/0 123.124.0.0/16 # block all external request
569 # sites on the ISP's network
570 permit-access 0.0.0.0/0 www.my_isp.com # except for the ISP's main
572 permit-access 123.124.0.0/16 0.0.0.0/0 # the ISP's clients can go
575 Note that if some hostnames are listed with multiple IP addresses, the
576 primary value returned by DNS (via gethostbyname()) is used. Default:
577 Anyone can access the proxy.
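
The ACL rules summarized above (no ACL means anyone may connect, otherwise the LAST matching entry wins and the default is to deny) can be sketched in Python with the standard ipaddress module. This is a rough model, not Junkbuster's code: the names parse_acl_line and is_permitted are hypothetical, and the sketch assumes dotted IP addresses only, so hostnames would have to be resolved before use.

```python
import ipaddress

ANYWHERE = ipaddress.ip_network("0.0.0.0/0")

def parse_acl_line(line):
    """Parse 'ACTION SRC_ADDR[/SRC_MASKLEN] [DST_ADDR[/DST_MASKLEN]]'.

    Assumption: addresses are dotted IPs, not hostnames.
    """
    fields = line.split("#", 1)[0].split()
    if fields[0] not in ("permit-access", "deny-access"):
        raise ValueError("unknown ACL action: " + fields[0])
    src = ipaddress.ip_network(fields[1], strict=False)
    # No DST_ADDR means that all destinations are covered.
    dst = ipaddress.ip_network(fields[2], strict=False) if len(fields) > 2 else ANYWHERE
    return fields[0] == "permit-access", src, dst

def is_permitted(acl, client_ip, dest_ip):
    """Apply the ACL: the last matching entry wins, the default is to deny."""
    if not acl:
        return True  # no access settings at all: the proxy talks to anyone
    client = ipaddress.ip_address(client_ip)
    dest = ipaddress.ip_address(dest_ip)
    allowed = False
    for permit, src, dst in acl:
        if client in src and dest in dst:
            allowed = permit
    return allowed
```

Feeding the ISP example above into is_permitted shows the intended effect: outside clients are refused for destinations in 123.124.0.0/16, while the ISP's own clients may go anywhere.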
578 _________________________________________________________________
582 This feature allows chaining of HTTP requests via multiple proxies. It
583 can be used to better protect privacy and confidentiality when
584 accessing specific domains by routing requests to those domains to a
585 special purpose filtering proxy such as lpwa.com.
587 It can also be used in an environment with multiple networks to route
588 requests via multiple gateways allowing transparent access to multiple
589 networks without having to modify browser configurations.
591 Also specified here are SOCKS proxies. Junkbuster supports SOCKS 4 and SOCKS
592 4A. The difference is that SOCKS 4A will resolve the target hostname
593 using DNS on the SOCKS server, not our local DNS client.
595 The syntax of each line is:
597 forward target_domain[:port] http_proxy_host[:port]
598 forward-socks4 target_domain[:port] socks_proxy_host[:port]
599 http_proxy_host[:port]
600 forward-socks4a target_domain[:port] socks_proxy_host[:port]
601 http_proxy_host[:port]
603 If http_proxy_host is ".", then requests are not forwarded to an HTTP
604 proxy but are made directly to the web servers.
606 Lines are checked in sequence, and the last match wins.
608 There is an implicit line equivalent to the following, which specifies
609 that anything not finding a match on the list is to go out without
610 forwarding or gateway protocol, like so:
612 forward .* . # implicit
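
The "last match wins" rule and the implicit default can be modelled with a short Python sketch. It is only a simplified model: the names host_matches and choose_forwarder are hypothetical, and the matching itself is an assumption made for the illustration (".*" matches every host, any other target is treated as a domain suffix), which is not necessarily how Junkbuster interprets target_domain.

```python
def host_matches(pattern, host):
    """Assumed matching: ".*" matches all hosts, anything else is a domain suffix."""
    if pattern == ".*":
        return True
    suffix = pattern.lstrip(".")
    return host == suffix or host.endswith("." + suffix)

def choose_forwarder(rules, host):
    """Return the proxy for host; "." means connect directly.

    rules is a list of (target_domain, http_proxy_host) pairs in file order.
    """
    chosen = "."  # the implicit "forward .* ." line
    for pattern, proxy in rules:
        if host_matches(pattern, host):
            chosen = proxy  # later lines override earlier ones
    return chosen
```

With the rules [(".*", "caching.myisp.net:8000"), ("myisp.net", ".")], a request for www.example.com goes through the caching proxy, while a request for www.myisp.net is made directly.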
614 In the following common configuration, everything goes to Lucent's
615 LPWA, except SSL on port 443 (which it doesn't handle):
617 forward .* lpwa.com:8000
620 See the FAQ for instructions on how to automate the login procedure
621 for LPWA. Some users have reported difficulties related to LPWA's use
622 of "." as the last element of the domain, and have said that this can
625 forward lpwa. lpwa.com:8000
627 (NOTE: the syntax for specifying target_domain has changed since the
628 previous paragraph was written -- it will not work now. More
629 information is welcome.)
631 In this fictitious example, everything goes via an ISP's caching
632 proxy, except requests to that ISP:
634 forward .* caching.myisp.net:8000
637 For the @home network, we're told the forwarding configuration is
640 forward .* proxy:8080
642 Also, we're told they insist on getting cookies and JavaScript, so you
643 need to add home.com to the cookie file. We consider JavaScript a
644 security risk. Java need not be enabled.
646 In this example direct connections are made to all "internal" domains,
647 but everything else goes through Lucent's LPWA by way of the company's
648 SOCKS gateway to the Internet.
650 forward-socks4 .* lpwa.com:8000 firewall.my_company.com:1080
651 forward my_company.com .
653 This is how you could set up a site that always uses SOCKS but no
656 forward-socks4a .* . firewall.my_company.com:1080
658 An advanced example for network administrators:
660 If you have links to multiple ISPs that provide various special
661 content to their subscribers, you can configure forwarding to pass
662 requests to the specific host that's connected to that ISP so that
663 everybody can see all of the content on all of the ISPs.
665 This is a bit tricky, but here's an example:
667 host-a has a PPP connection to isp-a.com. And host-b has a PPP
668 connection to isp-b.com. host-a can run a Junkbuster proxy with
669 forwarding like this:
672 forward isp-b.com host-b:8000
674 host-b can run a Junkbuster proxy with forwarding like this:
677 forward isp-a.com host-a:8000
679 Now, anyone on the Internet (including users on host-a and host-b) can
680 set their browser's proxy to either host-a or host-b and be able to
681 browse the content on isp-a or isp-b.
683 Here's another practical example, for University of Kent at Canterbury
684 students with a network connection in their room, who need to use the
685 University's Squid web cache.
687 forward *. ssbcache.ukc.ac.uk:3128 # Use the proxy, except for:
688 forward .ukc.ac.uk . # Anything on the same domain as us
689 forward * . # Host with no domain specified
690 forward 129.12.*.* . # A dotted IP on our /16 network.
691 forward 127.*.*.* . # Loopback address
692 forward localhost.localdomain . # Loopback address
693 forward www.ukc.mirror.ac.uk . # Specific host
695 If you intend to chain Junkbuster and squid locally, the recommended
696 order is browser -> squid -> junkbuster.
698 Your squid configuration could then look like this:
700 # Define junkbuster as parent cache
702 cache_peer 127.0.0.1 parent 8000 0 no-query
704 # Define ACL for protocol FTP
706 # Do not forward ACL FTP to junkbuster
707 always_direct allow FTP
708 # Do not forward ACL CONNECT (https) to junkbuster
709 always_direct allow CONNECT
710 # Forward the rest to junkbuster
711 never_direct allow all
712 _________________________________________________________________
714 3.1.5. Windows GUI Options
716 Junkbuster has a number of options specific to the Windows GUI
719 If "activity-animation" is set to 1, the Junkbuster icon will animate
720 when "Junkbuster" is active. To turn off, set to 0.
724 If "log-messages" is set to 1, Junkbuster will log messages to the
729 If "log-buffer-size" is set to 1, the size of the log buffer, i.e. the
730 amount of memory used for the log messages displayed in the console
731 window, will be limited to "log-max-lines" (see below).
733 Warning: Setting this to 0 will let the buffer grow without limit
734 and eat up all your memory!
738 log-max-lines is the maximum number of lines held in the log buffer.
743 If "log-highlight-messages" is set to 1, Junkbuster will highlight
744 portions of the log messages with a bold-faced font:
746 log-highlight-messages 1
748 The font used in the console window:
750 log-font-name Comic Sans MS
752 Font size used in the console window:
756 "show-on-task-bar" controls whether or not Junkbuster will appear as a
757 button on the Task bar when minimized:
761 If "close-button-minimizes" is set to 1, the Windows close button will
762 minimize Junkbuster instead of closing the program (close with the
763 exit option on the File menu).
765 close-button-minimizes 1
767 The "hide-console" option is specific to the MS-Win console version of
768 Junkbuster. If this option is used, Junkbuster will disconnect from
769 and hide the command console.
772 _________________________________________________________________
774 3.2. The Actions File
776 The "actionsfile" is used to define what actions Junkbuster takes, and
777 thus determines how images, cookies and various other aspects of HTTP
778 content and transactions are handled. Images can be anything you want,
779 including ads, banners, or just some obnoxious image that you would
780 rather not see. Cookies can be accepted or rejected. The default file
781 is in fact named actionsfile.
783 To determine which actions apply to a request, the URL of the request
784 is compared to all patterns in this file. Every time it matches, the
785 list of applicable actions for the URL is incrementally updated. You
786 can trace this process by visiting [30]http://i.j.b/show-url-info.
788 There are four types of lines in this file: comments (begin with a "#"
789 character), actions, aliases and patterns, all of which are explained
791 _________________________________________________________________
793 3.2.1. URL Domain and Path Syntax
795 Generally, a pattern has the form <domain>/<path>, where both the
796 <domain> and <path> part are optional. If you only specify a domain
797 part, the "/" can be left out:
799 www.example.com - is a domain only pattern and will match any request
800 to "www.example.com".
802 www.example.com/ - means exactly the same.
804 www.example.com/index.html - matches only the single document
805 "/index.html" on "www.example.com".
807 /index.html - matches the document "/index.html", regardless of the
810 index.html - matches nothing, since it would be interpreted as a
811 domain name and there is no top-level domain called ".html".
813 The matching of the domain part offers some flexible options: if the
814 domain starts or ends with a dot, it becomes unanchored at that end.
817 .example.com - matches any domain that ENDS in ".example.com".
819 www. - matches any domain that STARTS with "www".
821 Additionally, there are wildcards that you can use in the domain names
822 themselves. They work much like shell wildcards: "*" stands
823 for zero or more arbitrary characters, "?" stands for any single
824 character. You can also define character classes in square brackets,
825 and all of these can be freely mixed:
827 ad*.example.com - matches "adserver.example.com", "ads.example.com",
828 etc., but not "sfads.example.com".
830 *ad*.example.com - matches all of the above, and then some.
832 .?pix.com - matches "www.ipix.com", "pictures.epix.com",
833 "a.b.c.d.e.upix.com", etc.
835 www[1-9a-ez].example.com - matches "www1.example.com",
836 "www4.example.com", "wwwd.example.com", "wwwz.example.com", etc., but
837 not "wwww.example.com".
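
The domain rules above map closely onto shell-style matching, so they can be tried out with Python's standard fnmatch module. This is a sketch for experimenting with patterns, not Junkbuster's matcher; the function name domain_matches is hypothetical, and treating a leading or trailing dot as an implicit "*" is an assumption based on the description of "unanchored" above.

```python
from fnmatch import fnmatchcase

def domain_matches(pattern, host):
    """Match a host name against a domain pattern as described above.

    Assumption: a leading or trailing dot leaves that end unanchored,
    which is modelled here by adding a "*" at that end.
    """
    if pattern.startswith("."):
        pattern = "*" + pattern
    if pattern.endswith("."):
        pattern = pattern + "*"
    # Host names are case insensitive, so compare in lower case.
    return fnmatchcase(host.lower(), pattern.lower())
```

For instance, domain_matches("ad*.example.com", "adserver.example.com") is true, while domain_matches("ad*.example.com", "sfads.example.com") is false.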
839 If Junkbuster was compiled with "pcre" support (default), Perl
840 compatible regular expressions can be used. See the pcre/docs/
841 directory or "man perlre" (also available on
842 [31]http://www.perldoc.com/perl5.6/pod/perlre.html) for details. A
843 brief discussion of regular expressions is in the [32]Appendix. For
846 /.*/advert[0-9]+\.jpe?g - would match a URL from any domain, with any
847 path that includes "advert" followed immediately by one or more
848 digits, then a "." and ending in either "jpeg" or "jpg". So we match
849 "example.com/ads/advert2.jpg", and
850 "www.example.com/ads/banners/advert39.jpeg", but not
851 "www.example.com/ads/banners/advert39.gif" (no gifs in the example
854 Please note that matching in the path is case INSENSITIVE by default,
855 but you can switch to case sensitive at any point in the pattern by
856 using the "(?-i)" switch:
858 www.example.com/(?-i)PaTtErN.* - will match only documents whose path
859 starts with "PaTtErN" in exactly this capitalization.
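
Path patterns can be tested outside Junkbuster with any Perl-style regular expression engine. The Python sketch below uses the standard re module, which is close to but not identical to pcre: Python does not accept a bare "(?-i)" switch in the middle of a pattern, so the scoped form "(?-i:...)" is used instead. The function name path_matches is hypothetical, and anchoring the pattern at the start of the path is an assumption.

```python
import re

def path_matches(pattern, path):
    """Match the path part of a URL, case insensitively by default.

    Assumption: the pattern is anchored at the start of the path.
    """
    return re.match(pattern, path, re.IGNORECASE) is not None

# The example pattern from this section, applied to the path only.
ADVERT = r"/.*/advert[0-9]+\.jpe?g"

# Python's equivalent of switching to case sensitive matching for "PaTtErN".
CASE_SENSITIVE = r"/(?-i:PaTtErN).*"
```

Here path_matches(ADVERT, "/ads/advert2.jpg") is true and path_matches(ADVERT, "/ads/banners/advert39.gif") is false, while CASE_SENSITIVE matches "/PaTtErN/page.html" but not "/pattern/page.html".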
860 _________________________________________________________________
864 Actions are enabled if preceded with a "+", and disabled if preceded
865 with a "-". Actions are invoked by enclosing the action name in curly
866 braces (e.g. {+some_action}), followed by a list of URLs to which the
867 action applies. There are three classes of actions:
869 * Boolean (e.g. "+/-block"):
870 {+name} # enable this action
871 {-name} # disable this action
873 * Parameterized (e.g. "+/-hide-user-agent"):
874 {+name{param}} # enable action and set parameter to "param"
875 {-name} # disable action
877 * Multi-value (e.g. "{+/-add-header{Name: value}}",
878 "{+/-wafer{name=value}}"):
879 {+name{param}} # enable action and add parameter "param"
880 {-name{param}} # remove the parameter "param"
881 {-name} # disable this action totally
If nothing is specified in this file, no "actions" are taken. So in
this case Junkbuster would just be a normal, non-blocking,
non-anonymizing proxy. You must specifically enable the privacy and
blocking features you need (although the provided default actionsfile
will give a good starting point).
Later defined actions always override earlier ones. For multi-valued
actions, the actions are applied in the order they are specified.
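The override rule can be pictured with a small sketch. It is not Junkbuster's parser: the function apply_actions, the state dictionary and the set MULTI_VALUE are hypothetical, and the token syntax is simplified to the three forms shown above. Parameterized actions are modelled as keeping only the last parameter given, which is an assumption.

```python
import re

# Actions treated as multi-value in this sketch (taken from the examples above).
MULTI_VALUE = {"add-header", "wafer"}

# One token: a sign, an action name, and an optional {param}.
TOKEN = re.compile(r"([+-])([a-z-]+)(?:\{(.*)\})?$")

def apply_actions(state, tokens):
    """Apply tokens left to right, so later tokens override earlier ones."""
    for token in tokens:
        match = TOKEN.match(token)
        if match is None:
            raise ValueError("unrecognized action token: " + token)
        sign, name, param = match.groups()
        if name in MULTI_VALUE:
            values = state.setdefault(name, [])
            if sign == "+":
                values.append(param)      # add one more value
            elif param is None:
                values.clear()            # -name: drop all values
            elif param in values:
                values.remove(param)      # -name{param}: drop one value
        elif sign == "+":
            state[name] = param if param is not None else True
        else:
            state[name] = False           # disabled, parameter forgotten
    return state

state = apply_actions({}, ["+block", "+add-header{X-A: 1}", "+add-header{X-B: 2}"])
state = apply_actions(state, ["-block", "-add-header{X-A: 1}", "+hide-user-agent{Foo}"])
print(state)
```

After the second call "block" is off again, only the "X-B: 2" header is left, and the user agent parameter is "Foo".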
The valid Junkbuster "actions" are:
894 * Add the specified HTTP header, which is not checked for validity.
895 You may specify this many times to specify many different headers:
896 +add-header{Name: value}
898 * Block this URL totally.
901 * De-animate all animated GIF images, i.e. reduce them to their last
902 frame. This will also shrink the images considerably (in bytes,
903 not pixels!). If the option "first" is given, the first frame of
904 the animation is used as the replacement. If "last" is given, the
last frame of the animation is used instead, which probably makes
906 more sense for most banner animations, but also has the risk of
907 not showing the entire last frame (if it is only a delta to an
909 +deanimate-gifs{last}
910 +deanimate-gifs{first}
912 * "+downgrade" will downgrade HTTP/1.1 client requests to HTTP/1.0
913 and downgrade the responses as well. Use this action for servers
914 that use HTTP/1.1 protocol features that Junkbuster doesn't handle
915 well yet. HTTP/1.1 is only partially implemented. Default is not
916 to downgrade requests.
919 * Many sites, like yahoo.com, don't just link to other sites.
920 Instead, they will link to some script on their own server, giving
921 the destination as a parameter, which will then redirect you to
922 the final target. URLs resulting from this scheme typically look
923 like: http://some.place/some_script?http://some.where-else.
924 Sometimes, there are even multiple consecutive redirects encoded
in the URL. These redirections via scripts make your web browsing
more traceable, since the server from which you follow such a link
can see where you go to. Apart from that, valuable bandwidth and
time are wasted while your browser asks the server for one redirect
after the other. Plus, it feeds the advertisers.
The "+fast-redirects" option enables interception of these
requests by Junkbuster, which will cut off all but the last valid
932 URL in the request and send a local redirect back to your browser
933 without contacting the remote site.
936 * Filter the website through the re_filterfile:
939 * Block any existing X-Forwarded-for header, and do not add a new
943 * If the browser sends a "From:" header containing your e-mail
944 address, this either completely removes the header ("block"), or
945 changes it to the specified e-mail address.
947 +hide-from{spam@sittingduck.xqq}
949 * Don't send the "Referer:" (sic) header to the web site. You can
950 block it, forge a URL to the same server as the request (which is
951 preferred because some sites will not send images otherwise) or
952 set it to a constant string of your choice.
955 +hide-referer{http://nowhere.com}
* Alternative spelling of "+hide-referer". It takes the same
parameters and can be freely mixed with "+hide-referer".
959 ("referrer" is the correct English spelling, however the HTTP
960 specification has a bug - it requires it to be spelled "referer".)
963 * Change the "User-Agent:" header so web servers can't tell your
964 browser type. Warning! This breaks many web sites. Specify the
965 user-agent value you want. Example, pretend to be using Netscape
967 +hide-user-agent{Mozilla (X11; I; Linux 2.0.32 i586)}
969 * Treat this URL as an image. This only matters if it's also
970 "+block"ed, in which case a "blocked" image can be sent rather
than an HTML page. See "+image-blocker{}" below for control
over what is actually sent.
975 * Decides what to do with URLs that end up tagged with "{+block
+image}". There are 4 options. "-image-blocker" will send an HTML
"blocked" page, usually resulting in a "broken image" icon.
"+image-blocker{logo}" will send a "JunkBuster" image.
"+image-blocker{blank}" will send a 1x1 transparent GIF image. And
finally, "+image-blocker{http://xyz.com}" will send an HTTP
temporary redirect to the specified image. This has the advantage
of the icon being cached by the browser, which will speed up
985 +image-blocker{blank}
986 +image-blocker{http://i.j.b/send-banner}
988 * By default (i.e. in the absence of a "+limit-connect" action),
Junkbuster will, as a precaution, only allow CONNECT requests to
port 443, which is the standard port for https.
The CONNECT method exists in HTTP to allow access to secure
websites (https:// URLs) through proxies. It works very simply:
the proxy connects to the server on the specified port, and then
short-circuits its connections to the client and to the remote
server. This can be a big security hole, since CONNECT-enabled
proxies can be abused as TCP relays very easily.
997 If you want to allow CONNECT for more ports than this, or want to
998 forbid CONNECT altogether, you can specify a comma separated list
999 of ports and port ranges (the latter using dashes, with the
1000 minimum defaulting to 0 and max to 65K):
+limit-connect{443} # This is the default and need not be
1003 +limit-connect{80,443} # Ports 80 and 443 are OK.
1004 +limit-connect{-3, 7, 20-100, 500-} # Port less than 3, 7, 20 to
1006 #and above 500 are OK.
1008 * "+no-compression" prevents the website from compressing the data.
1009 Some websites do this, which can be a problem for Junkbuster,
since "+filter", "+no-popup" and "+deanimate-gifs" will not work on
compressed data. This will slow down connections to those
websites, though. By default, "+no-compression" is turned on.
1015 * Prevent the website from reading cookies:
1018 * Prevent the website from setting cookies:
1021 * Filter the website through a built-in filter to disable those
1022 obnoxious JavaScript pop-up windows via window.open(), etc. The
1023 two alternative spellings are equivalent.
1027 * This action only applies if you are using a jarfile for saving
1028 cookies. It sends a cookie to every site stating that you do not
1029 accept any copyright on cookies sent to you, and asking them not
1030 to track you. Of course, this is a (relatively) unique header they
1031 could use to track you.
1034 * This allows you to add an arbitrary cookie. It can be specified
1035 multiple times in order to add as many cookies as you like.
1038 The meaning of any of the above is reversed by preceding the action
1039 with a "-", in place of the "+".
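The port list accepted by "+limit-connect" above lends itself to a short sketch. The names parse_port_list, port_allowed and MAX_PORT are hypothetical, "65K" is assumed to mean 65535, and range bounds are assumed to be inclusive. Junkbuster's real parser may differ on all three points.

```python
MAX_PORT = 65535  # assumed meaning of "65K" in the text

def parse_port_list(spec):
    """Turn a spec such as "-3, 7, 20-100, 500-" into (low, high) ranges."""
    ranges = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        if "-" in item:
            low_text, high_text = item.split("-", 1)
            # A missing minimum defaults to 0, a missing maximum to MAX_PORT.
            low = int(low_text) if low_text.strip() else 0
            high = int(high_text) if high_text.strip() else MAX_PORT
        else:
            low = high = int(item)
        ranges.append((low, high))
    return ranges

def port_allowed(port, ranges):
    """True if the port falls inside any range (bounds treated as inclusive)."""
    return any(low <= port <= high for low, high in ranges)

ranges = parse_port_list("-3, 7, 20-100, 500-")
print(ranges)
print(port_allowed(443, parse_port_list("443")))
```

With the default list of just "443", a CONNECT request to port 80 would be refused in this model, while one to port 443 would pass.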
1043 Turn off cookies by default, then allow a few through for specified
1046 # Turn off all cookies
1047 { +no-cookies-read }
# Exceptions to the above, sites that need cookies
1050 { -no-cookies-read }
1057 # Alternative way of saying the same thing
1058 {-no-cookies-set -no-cookies-read}
1062 Now turn off "fast redirects", and then we allow two exceptions:
1067 # Reverse it for these two sites, which don't work right without it.
1069 www.ukc.ac.uk/cgi-bin/wac\.cgi\?
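What "+fast-redirects" does to a request, as described in the list of actions, can be approximated in a few lines. This is a rough sketch under stated assumptions: the function name last_embedded_url is invented, only "http://" targets are looked for, and nothing is URL-decoded or checked for validity, which the real code presumably does.

```python
def last_embedded_url(url):
    """Return the last "http://" URL embedded in url, or url itself if none."""
    start = url.rfind("http://")
    # A position of 0 is the request's own scheme, -1 means no match at all.
    return url[start:] if start > 0 else url

print(last_embedded_url("http://some.place/some_script?http://some.where-else"))
```

A URL with several nested redirects is reduced to its innermost target in one step, which is the point of the action.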
1072 Turn on page filtering, with one exception for sourceforge:
1074 # Run everything through the default filter file (re_filterfile):
1077 # But please don't re_filter code from sourceforge!
1079 .cvs.sourceforge.net
Now some URLs that we want "blocked", i.e. we won't see them. Many of
1082 these use regular expressions that will expand to match multiple URLs:
1086 /.*/(.*[-_.])?ads?[0-9]?(/|[-_.].*|\.(gif|jpe?g))
1087 /.*/(.*[-_.])?count(er)?(\.cgi|\.dll|\.exe|[?/])
1088 /.*/(ng)?adclient\.cgi
1089 /.*/(plain|live|rotate)[-_.]?ads?/
1090 /.*/(sponsor)s?[0-9]?/
1091 /.*/_?(plain|live)?ads?(-banners)?/
1093 /.*/ad(sdna_image|gifs?)/
1094 /.*/ad(server|stream|juggler)\.(cgi|pl|dll|exe)
1098 /.*/adv((er)?ts?|ertis(ing|ements?))?/
1102 /.*/cgi-bin/centralad/getimage
1103 /.*/images/addver\.gif
1104 /.*/images/marketing/.*\.(gif|jpe?g)
1108 /.*/sponsors?[0-9]?/
1109 /.*/advert[0-9]+\.jpg
1116 /graphics/defaultAd/
1118 /image\.ng/transactionID
1119 /images/.*/.*_anim\.gif # alvin brattli
1120 /ip_img/.*\.(gif|jpe?g)
1124 /cgi-bin/nph-adclick.exe/
1125 /.*/Image/BannerAdvertising/
1127 /.*/adlib/server\.cgi
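To get a feeling for how far such patterns reach, two of them can be run against a few made-up paths. Python's re module is used here as a stand-in for pcre. The helper path_blocked and the sample paths are hypothetical, and matching is assumed to start at the beginning of the path and to ignore case, as described earlier for paths.

```python
import re

# Two of the path patterns listed above, copied verbatim.
BLOCK_PATTERNS = [
    r"/.*/(.*[-_.])?count(er)?(\.cgi|\.dll|\.exe|[?/])",
    r"/.*/(ng)?adclient\.cgi",
]

# Path matching is described as case insensitive by default.
COMPILED = [re.compile(p, re.IGNORECASE) for p in BLOCK_PATTERNS]

def path_blocked(path):
    """True if any pattern matches from the start of the path (assumed anchoring)."""
    return any(rx.match(path) for rx in COMPILED)

for path in ("/cgi-bin/Counter.cgi", "/stats/page-count.exe",
             "/ads/ngadclient.cgi", "/docs/account.html"):
    print(path, path_blocked(path))
```

The last path contains the letters "count" but is left alone, because the pattern only accepts "count" at the start of a file name or after "-", "_" or ".".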
1129 _________________________________________________________________
Custom "actions", known to Junkbuster as "aliases", can be defined by
combining other "actions". These can in turn be invoked just like the
built-in "actions". Currently, an alias name can contain any character
except space, tab, "=", "{" or "}". But please use only "a"-"z",
"0"-"9", "+", and "-". Alias names are not case sensitive, and must be
defined before anything else in actionsfile! And there can only be one
set of "aliases" defined.
1141 Now let's define a few aliases:
# Useful custom aliases we can use later. These must come first!
1145 +no-cookies = +no-cookies-set +no-cookies-read
1146 -no-cookies = -no-cookies-set -no-cookies-read
1147 fragile = -block -no-cookies -filter -fast-redirects -hide-refere
1149 shop = -no-cookies -filter -fast-redirects
1150 +imageblock = +block +image
1151 #For people who don't like to type too much: ;-)
1154 c2 = -no-cookies-set +no-cookies-read
1155 c3 = +no-cookies-set -no-cookies-read
1156 #... etc. Customize to your heart's content.
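Since aliases may themselves use other aliases (as "shop" uses "-no-cookies" above), expanding one is a small recursive job. The sketch below only illustrates the idea: the ALIASES dictionary holds a subset of the definitions above, the function expand is hypothetical, and an alias that refers to itself would recurse forever here.

```python
ALIASES = {
    "+no-cookies": "+no-cookies-set +no-cookies-read",
    "-no-cookies": "-no-cookies-set -no-cookies-read",
    "shop": "-no-cookies -filter -fast-redirects",
    "+imageblock": "+block +image",
}

def expand(actions, aliases=ALIASES):
    """Replace alias names with their definitions until only built-ins remain."""
    result = []
    for word in actions.split():
        definition = aliases.get(word.lower())   # alias names are not case sensitive
        if definition is None:
            result.append(word)                   # assumed to be a built-in action
        else:
            result.extend(expand(definition, aliases))
    return result

print(expand("shop +imageblock"))
```

The "shop" alias comes out as four built-in actions, with "-no-cookies" resolved in a second step.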
1158 Some examples using our "shop" and "fragile" aliases from above:
1160 # These sites are very complex and require
1161 # minimal interference.
1163 .office.microsoft.com
1164 .windowsupdate.microsoft.com
1166 # Shopping sites - still want to block ads.
1169 .worldpay.com # for quietpc.com
1172 # These shops require pop-ups
1176 _________________________________________________________________
1178 3.3. The Filter File
1180 The filter file defines what filtering of web pages Junkbuster does.
1181 The default filter file is re_filterfile, located in the config
1182 directory. In this file, any document content, whether viewable text
1183 or embedded non-visible content, can be changed.
1185 This file uses regular expressions to alter or remove any string in
1186 the target page. Some examples from the included default
1189 Stop web pages from displaying annoying messages in the status bar by
1190 deleting such references:
1192 # The status bar is for displaying link targets, not pointless buzzwo
1194 # Again, check it out on http://www.airport-cgn.de/.
1195 s/status='.*?';*//ig
1197 Just for kicks, replace any occurrence of "Microsoft" with
1200 s/microsoft(?!.com)/MicroSuck/ig
1202 Kill those auto-refresh tags:
1204 # Kill refresh tags. I like to refresh myself. Manually.
1205 # check it out on http://www.airport-cgn.de/ and go to the arrivals p
1208 s/<meta[^>]*http-equiv[^>]*refresh.*URL=([^>]*?)"?>/<link rev="x-refr
1210 s/<meta[^>]*http-equiv="?page-enter"?[^>]*content=[^>]*>/<!--no page
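The two shorter substitutions above can be reproduced with Python's re.sub to see what they do to a page. This is a sketch, not the filter engine itself: Python's regular expressions stand in for pcre, the function names are made up, and the sample strings are invented. The "i" and "g" flags correspond to re.IGNORECASE and to re.sub's default of replacing every occurrence.

```python
import re

def strip_status(html):
    # Counterpart of s/status='.*?';*//ig : delete status='...' assignments.
    return re.sub(r"status='.*?';*", "", html, flags=re.IGNORECASE)

def rename_vendor(text):
    # Counterpart of s/microsoft(?!.com)/MicroSuck/ig : the negative lookahead
    # (?!.com) skips "microsoft" when any character plus "com" follows it.
    return re.sub(r"microsoft(?!.com)", "MicroSuck", text, flags=re.IGNORECASE)

print(strip_status("window.status='Buy now!';return true"))
print(rename_vendor("Visit Microsoft at www.microsoft.com"))
```

In the second call the host name is left untouched, so links keep working.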
1212 _________________________________________________________________
1214 4. Quickstart to Using Junkbuster
Install the package, then run and enjoy! Junkbuster accepts only one
1217 command line option -- the configuration file to be used. Example Unix
1221 # /usr/sbin/junkbuster /etc/junkbuster/config &
If no configuration file is specified on the command line, Junkbuster
will look for a file named config in the current directory, except on
Amiga, where it will look for AmiTCP:db/junkbuster/config, and on
Win32, where it will try junkbstr.txt. If no file is specified on the
command line and no default configuration file can be found,
Junkbuster will
Be sure your browser is set to use the proxy, which is by default at
localhost, port 8000. With Netscape (and Mozilla), this can be set
under Edit -> Preferences -> Advanced -> Proxies -> HTTP Proxy. For
Internet Explorer: Tools -> Internet Properties -> Connections -> LAN
Setting. Then, check "Use Proxy" and fill in the appropriate info
(Address: localhost, Port: 8000). Include HTTPS as well if you want
HTTPS proxy support too.
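Programs other than browsers can be pointed at the proxy in much the same way. As an illustration only, here is how a Python script might be set up to use the default address. The helper names proxy_url and junkbuster_opener are invented, and the actual request is left out so the sketch does not need a running proxy or a network.

```python
import urllib.request

def proxy_url(host="localhost", port=8000):
    """Build the proxy address from the defaults named in the text."""
    return "http://%s:%d" % (host, port)

def junkbuster_opener(host="localhost", port=8000):
    """Return a urllib opener that sends HTTP and HTTPS requests via the proxy."""
    address = proxy_url(host, port)
    handler = urllib.request.ProxyHandler({"http": address, "https": address})
    return urllib.request.build_opener(handler)

opener = junkbuster_opener()
# opener.open("http://example.com/") would now go through the proxy,
# provided it is running on the default address.
print(proxy_url())
```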
The included default configuration files should give a reasonable
starting point, though they may be somewhat aggressive in blocking
junk. You will probably want to keep an eye out for sites that require
cookies, and add these to actionsfile as needed. By default, most
cookies will be blocked until you add the sites to the configuration.
If you want the browser to handle this instead, you will need to edit
actionsfile and disable this feature. If you use more than one
browser, it would make more sense to let Junkbuster handle this, in
which case the browser(s) should be set to accept all cookies.
1248 If a particular site shows problems loading properly, try adding it to
1249 the {fragile} section of actionsfile. This will turn off most actions
1252 HTTP/1.1 support is not fully implemented. If browsers that support
1253 HTTP/1.1 (like Mozilla or recent versions of I.E.) experience
problems, you might try to force HTTP/1.0 compatibility. For Mozilla,
1255 look under Edit -> Preferences -> Debug -> Networking. Or set the
1256 "+downgrade" config option in actionsfile.
1258 After running Junkbuster for a while, you can start to fine tune the
1259 configuration to suit your personal, or site, preferences and
1260 requirements. There are many, many aspects that can be customized.
If you encounter problems, please verify that it is a Junkbuster bug
by disabling Junkbuster and then trying the same page. Also, try
another browser if possible to eliminate browser or site problems.
Before reporting it as a bug, check whether an enabled configuration
option is causing the page not to load. You can then add an
exception for that page or site. If it is a bug, please report it to
the developers (see below).
1269 _________________________________________________________________
1271 5. Contact the Developers
1273 Feature requests and other questions should be posted to the
1274 [33]Feature request page at SourceForge. There is also an archive
1277 Anyone interested in actively participating in development and related
1278 discussions can join the appropriate mailing list [34]here. Archives
1279 are available here too.
Please report bugs using the form at [35]Sourceforge. Please first try
to verify that it is a Junkbuster bug, and not a browser or site bug.
Also, check to make sure this is not already a known bug.
1284 _________________________________________________________________
1286 6. Copyright and History
1290 Internet Junkbuster is free software; you can redistribute it and/or
1291 modify it under the terms of the GNU General Public License as
1292 published by the Free Software Foundation; either version 2 of the
1293 License, or (at your option) any later version.
1295 This program is distributed in the hope that it will be useful, but
1296 WITHOUT ANY WARRANTY; without even the implied warranty of
1297 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
1298 General Public License for more details, which is available from
1299 [36]the Free Software Foundation, Inc, 59 Temple Place - Suite 330,
1300 Boston, MA 02111-1307, USA.
1301 _________________________________________________________________
1305 Junkbuster was originally written by Anonymous Coders and
1306 [37]JunkBusters Corporation, and was released as free open-source
1307 software under the GNU GPL. [38]Stefan Waldherr made many
1308 improvements, and started the [39]SourceForge project to rekindle
1309 development. The last stable release was v2.0.2, which has now grown
1311 _________________________________________________________________
1315 [40]http://sourceforge.net/projects/ijbswa
1317 [41]http://ijbswa.sourceforge.net/
1319 [42]http://ijbswa.sourceforge.net/config/
1321 [43]http://www.junkbusters.com/ht/en/cookies.html
1323 [44]http://www.waldherr.org/junkbuster/
1325 [45]http://privacy.net/analyze/
1327 [46]http://www.squid-cache.org/
1328 _________________________________________________________________
1332 8.1. Regular Expressions
Junkbuster can use "regular expressions" in various config files,
assuming support for "pcre" (Perl Compatible Regular Expressions) is
compiled in, which is the default. Such configuration directives do
1337 not require regular expressions, but they can be used to increase
1338 flexibility by matching a pattern with wildcards against URLs.
1340 If you are reading this, you probably don't understand what "regular
1341 expressions" are, or what they can do. So this will be a very brief
1342 introduction only. A full explanation would require a book ;-)
"Regular expressions" are a way of matching one character expression
against another to see if it matches or not. One of the "expressions"
is a literal string of readable characters (letters, numbers, etc.), and
1347 the other is a complex string of literal characters combined with
1348 wildcards, and other special characters, called metacharacters. The
1349 "metacharacters" have special meanings and are used to build the
1350 complex pattern to be matched against. Perl Compatible Regular
1351 Expressions is an enhanced form of the regular expression language
1352 with backward compatibility.
1354 To make a simple analogy, we do something similar when we use wildcard
1355 characters when listing files with the dir command in DOS. *.* matches
all filenames. The "special" character here is the asterisk, which
matches any and all characters. We can be more specific and use ? to
match just individual characters. So "dir file?.txt" would match
1359 "file1.txt", "file2.txt", etc. We are pattern matching, using a
1360 similar technique to "regular expressions"!
1362 Regular expressions do essentially the same thing, but are much, much
1363 more powerful. There are many more "special characters" and ways of
1364 building complex patterns however. Let's look at a few of the common
1365 ones, and then some examples:
1367 . - Matches any single character, e.g. "a", "A", "4", ":", or "@".
1369 ? - The preceding character or expression is matched ZERO or ONE
1372 + - The preceding character or expression is matched ONE or MORE
1375 * - The preceding character or expression is matched ZERO or MORE
1378 \ - The "escape" character denotes that the following character should
1379 be taken literally. This is used where one of the special characters
1380 (e.g. ".") needs to be taken literally and not as a special
1383 [] - Characters enclosed in brackets will be matched if any of the
1384 enclosed characters are encountered.
() - Parentheses are used to group a sub-expression, or multiple
1389 | - The "bar" character works like an "or" conditional statement. A
1390 match is successful if the sub-expression on either side of "|"
1393 s/string1/string2/g - This is used to rewrite strings of text.
1394 "string1" is replaced by "string2" in this example.
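Each of the metacharacters above can be tried in isolation. The following sketch uses Python's re module as a stand-in for pcre, and the sample patterns and strings are invented for illustration.

```python
import re

# (pattern, text) pairs, one per metacharacter from the list above.
EXAMPLES = [
    (r"a.c",     "a@c"),    # .  any single character
    (r"jpe?g",   "jpg"),    # ?  zero or one "e"
    (r"[0-9]+",  "2001"),   # +  one or more digits
    (r"ab*",     "a"),      # *  zero or more "b"
    (r"a\.c",    "a.c"),    # \  the dot is taken literally
    (r"[abc]",   "b"),      # [] any one of the enclosed characters
    (r"(ab)+",   "abab"),   # () grouping, repeated as a unit
    (r"gif|jpg", "jpg"),    # |  either side may match
]

for pattern, text in EXAMPLES:
    print(pattern, text, re.fullmatch(pattern, text) is not None)

# s/string1/string2/g corresponds to re.sub, which replaces every occurrence.
print(re.sub(r"string1", "string2", "string1 and string1"))
```

Every pair matches. Swapping the texts around, for instance testing "a@c" against the escaped pattern, is a quick way to see each rule fail.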
1396 These are just some of the ones you are likely to use when matching
URLs with Junkbuster, and are a long way from a definitive list. This
1398 is enough to get us started with a few simple examples which may be
1401 /.*/banners/.* - A simple example that uses the common combination of
1402 "." and "*" to denote any character, zero or more times. In other
1403 words, any string at all. So we start with a literal forward slash,
then our regular expression pattern (".*"), another literal forward
1405 slash, the string "banners", another forward slash, and lastly another
1406 ".*". We are building a directory path here. This will match any file
1407 with the path that has a directory named "banners" in it. The ".*"
1408 matches any characters, and this could conceivably be more forward
1409 slashes, so it might expand into a much longer looking path. For
1410 example, this could match:
1411 "/eye/hate/spammers/banners/annoy_me_please.gif", or just
1412 "/banners/annoying.html", or almost an infinite number of other
1413 possible combinations, just so it has "banners" in the path somewhere.
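The same pattern can be tested with Python's re module, used here as a stand-in for pcre. The helper in_banner_dir is hypothetical and assumes matching starts at the beginning of the path. One difference worth knowing: under these plain regular expression semantics the pattern needs a slash somewhere before "/banners/", so "/banners/annoying.html" on its own is not matched in this sketch. Junkbuster's own matcher may treat the leading slash differently.

```python
import re

BANNERS = re.compile(r"/.*/banners/.*")

def in_banner_dir(path):
    """True if the pattern matches from the start of the path (assumed anchoring)."""
    return BANNERS.match(path) is not None

for path in ("/eye/hate/spammers/banners/annoy_me_please.gif",
             "/ads/banners/",
             "/banners/annoying.html",
             "/images/logo.gif"):
    print(path, in_banner_dir(path))
```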
And now something a little more complex:
1417 /.*/adv((er)?ts?|ertis(ing|ements?))?/ - We have several literal
1418 forward slashes again ("/"), so we are building another expression
1419 that is a file path statement. We have another ".*", so we are
1420 matching against any conceivable sub-path, just so it matches our
1421 expression. The only true literal that must match our pattern is adv,
1422 together with the forward slashes. What comes after the "adv" string
1423 is the interesting part.
1425 Remember the "?" means the preceding expression (either a literal
1426 character or anything grouped with "(...)" in this case) can exist or
1427 not, since this means either zero or one match. So
1428 "((er)?ts?|ertis(ing|ements?))" is optional, as are the individual
1429 sub-expressions: "(er)", "(ing|ements?)", and the "s". The "|" means
1430 "or". We have two of those. For instance, "(ing|ements?)", can expand
1431 to match either "ing" OR "ements?". What is being done here, is an
1432 attempt at matching as many variations of "advertisement", and
1433 similar, as possible. So this would expand to match just "adv", or
1434 "advert", or "adverts", or "advertising", or "advertisement", or
1435 "advertisements". You get the idea. But it would not match
1436 "advertizements" (with a "z"). We could fix that by changing our
1437 regular expression to: "/.*/adv((er)?ts?|erti(s|z)(ing|ements?))?/",
1438 which would then match either spelling.
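Both versions of the expression can be checked against the spellings discussed above. Again Python's re module stands in for pcre, and the "/images/.../banner.gif" paths are hypothetical, made up only to give each word a directory to sit in.

```python
import re

ORIGINAL = re.compile(r"/.*/adv((er)?ts?|ertis(ing|ements?))?/")
WITH_Z = re.compile(r"/.*/adv((er)?ts?|erti(s|z)(ing|ements?))?/")

WORDS = ["adv", "advert", "adverts", "advertising",
         "advertisement", "advertisements", "advertizements"]

def matches(regex, word):
    """Match a made-up path that has the given word as a directory name."""
    path = "/images/%s/banner.gif" % word
    return regex.match(path) is not None

for word in WORDS:
    print(word, matches(ORIGINAL, word), matches(WITH_Z, word))
```

Only the last word tells the two expressions apart: the original rejects the "z" spelling, the extended one accepts it.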
1440 /.*/advert[0-9]+\.(gif|jpe?g) - Again another path statement with
1441 forward slashes. Anything in the square brackets "[]" can be matched.
This is using "0-9" as a shorthand expression to mean any digit zero
through nine. It is the same as saying "0123456789". So any digit
matches. The "+" means one or more of the preceding expression must be
included. The preceding expression here is what is in the square
brackets -- in this case, any digit zero through nine. Then, at the
end, we have a grouping: "(gif|jpe?g)". This includes a "|", so this
needs to match the expression on either side of that bar character
also. A simple "gif" on one side, and the other side will in turn
match either "jpeg" or "jpg", since the "?" means the letter "e" is
optional and can be matched once or not at all. So we are building an
expression here to match a GIF or JPEG image file. It must
1453 include the literal string "advert", then one or more digits, and a
1454 "." (which is now a literal, and not a special character, since it is
1455 escaped with "\"), and lastly either "gif", or "jpeg", or "jpg". Some
1456 possible matches would include: "//advert1.jpg",
1457 "/nasty/ads/advert1234.gif", "/banners/from/hell/advert99.jpg". It
1458 would not match "advert1.gif" (no leading slash), or "/adverts232.jpg"
1459 (the expression does not include an "s"), or "/advert1.jsp" ("jsp" is
1460 not in the expression anywhere).
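The matches and non-matches listed above can be confirmed with a few lines of Python. As before, Python's re module is only a stand-in for pcre, and the pattern is assumed to be applied from the start of the path.

```python
import re

ADVERT = re.compile(r"/.*/advert[0-9]+\.(gif|jpe?g)")

CASES = [
    "//advert1.jpg",
    "/nasty/ads/advert1234.gif",
    "/banners/from/hell/advert99.jpg",
    "advert1.gif",        # no leading slash
    "/adverts232.jpg",    # the expression has no "s"
    "/advert1.jsp",       # "jsp" is not one of the extensions
]

results = {path: ADVERT.match(path) is not None for path in CASES}
for path, matched in results.items():
    print(path, matched)
```

The first three paths match and the last three do not, in line with the explanation above.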
1462 s/microsoft(?!.com)/MicroSuck/i - This is a substitution. "MicroSuck"
will replace any occurrence of "microsoft". The "i" at the end of the
1464 expression means ignore case. The "(?!.com)" means the match should
1465 fail if "microsoft" is followed by ".com". In other words, this acts
1466 like a "NOT" modifier. In case this is a hyperlink, we don't want to
1469 We are barely scratching the surface of regular expressions here so
1470 that you can understand the default Junkbuster configuration files,
1471 and maybe use this knowledge to customize your own installation. There
1472 is much, much more that can be done with regular expressions. Now that
1473 you know enough to get started, you can learn more on your own :/
1475 More reading on Perl Compatible Regular expressions:
1476 [47]http://www.perldoc.com/perl5.6/pod/perlre.html
1480 1. http://ijbswa.sourceforge.net/user-manual/
1481 2. mailto:ijbswa-developers@lists.sourceforge.net
1482 3. file://localhost/home/swa/sf/current/doc/source/tmp.html#INTRODUCTION
1483 4. file://localhost/home/swa/sf/current/doc/source/tmp.html#AEN27
1484 5. file://localhost/home/swa/sf/current/doc/source/tmp.html#INSTALLATION
1485 6. file://localhost/home/swa/sf/current/doc/source/tmp.html#INSTALLATION-SOURCE
1486 7. file://localhost/home/swa/sf/current/doc/source/tmp.html#INSTALLATION-RH
1487 8. file://localhost/home/swa/sf/current/doc/source/tmp.html#INSTALLATION-SUSE
1488 9. file://localhost/home/swa/sf/current/doc/source/tmp.html#INSTALLATION-OS2
1489 10. file://localhost/home/swa/sf/current/doc/source/tmp.html#INSTALLATION-WIN
1490 11. file://localhost/home/swa/sf/current/doc/source/tmp.html#INSTALLATION-OTHER
1491 12. file://localhost/home/swa/sf/current/doc/source/tmp.html#CONFIGURATION
1492 13. file://localhost/home/swa/sf/current/doc/source/tmp.html#AEN152
1493 14. file://localhost/home/swa/sf/current/doc/source/tmp.html#ACTIONSFILE
1494 15. file://localhost/home/swa/sf/current/doc/source/tmp.html#FILTERFILE
1495 16. file://localhost/home/swa/sf/current/doc/source/tmp.html#QUICKSTART
1496 17. file://localhost/home/swa/sf/current/doc/source/tmp.html#CONTACT
1497 18. file://localhost/home/swa/sf/current/doc/source/tmp.html#COPYRIGHT
1498 19. file://localhost/home/swa/sf/current/doc/source/tmp.html#AEN1103
1499 20. file://localhost/home/swa/sf/current/doc/source/tmp.html#AEN1109
1500 21. file://localhost/home/swa/sf/current/doc/source/tmp.html#SEEALSO
1501 22. file://localhost/home/swa/sf/current/doc/source/tmp.html#APPENDIX
1502 23. file://localhost/home/swa/sf/current/doc/source/tmp.html#REGEX
1503 24. http://sourceforge.net/projects/ijbswa/
1504 25. http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/ijbswa/current/
1505 26. http://hobbes.nmsu.edu/cgi-bin/h-search?sh=1&button=Search&key=emxrt.zip&stype=all&sort=type&dir=%2Fpub%2Fos2%2Fdev%2Femx%2Fv0.9d
1506 27. http://hobbes.nmsu.edu/cgi-bin/h-search?sh=1&key=gnupack&stype=all&sort=type&dir=%2Fpub%2Fos2%2Fapps
1507 28. http://www.gnu.org/
1508 29. file://localhost/home/swa/sf/current/doc/source/tmp.html#ACTIONSFILE
1509 30. http://i.j.b/show-url-info
1510 31. http://www.perldoc.com/perl5.6/pod/perlre.html
1511 32. file://localhost/home/swa/sf/current/doc/source/tmp.html#REGEX
1512 33. http://sourceforge.net/tracker/?atid=361118&group_id=11118&func=browse
1513 34. http://sourceforge.net/mail/?group_id=11118
1514 35. http://sourceforge.net/tracker/?group_id=11118&atid=111118
1515 36. http://www.gnu.org/copyleft/gpl.html
1516 37. http://www.junkbusters.com/ht/en/ijbfaq.html
1517 38. http://www.waldherr.org/junkbuster/
1518 39. http://sourceforge.net/projects/ijbswa/
1519 40. http://sourceforge.net/projects/ijbswa
1520 41. http://ijbswa.sourceforge.net/
1521 42. http://ijbswa.sourceforge.net/config/
1522 43. http://www.junkbusters.com/ht/en/cookies.html
1523 44. http://www.waldherr.org/junkbuster/
1524 45. http://privacy.net/analyze/
1525 46. http://www.squid-cache.org/
1526 47. http://www.perldoc.com/perl5.6/pod/perlre.html