Development of JunkBuster is ongoing and this document is no longer current. However, it may provide some assistance. If you have problems, please use the Yahoo Groups mailing list (which includes an archive of mail), the SourceForge.net project page, or see the project's home page. Please also bear in mind that versions 2.9.x of JunkBuster are development releases, and are not production quality.
A copy of this page in standard man macro format is included in the tar archive.
junkbuster - The Internet Junkbuster Proxy™
junkbuster configfile (Unix)
junkbstr.exe [configfile] (Windows)
junkbuster is an instrumentable proxy that filters the HTTP stream between web servers and browsers. Its main purposes are to block adverts and enhance privacy.
It is configured using a configuration file and several files listing URL patterns. The configuration file must be specified on the command line. The Windows version will default to using the configuration file junkbstr.ini if it exists and no argument was given.
All files except the main configuration file are checked for changes before each page is fetched, so they may be edited without restarting the proxy.
blockfile blockfile
Block requests to URLs matching any pattern given in the lines of the blockfile. The junkbuster instead returns status 202, indicating that the request has been accepted (though not completed), and a message identifying itself (though the browser may display only a broken image icon).
The syntax of a pattern is [domain][:port][/path] (the http:// or https:// protocol part is omitted). To decide if a pattern matches a target, the domains are compared first, then the paths.
To compare the domains, the pattern domain and the target domain specified in the URL are each broken into their components. (Components are separated by the . (period) character.) Next, each of the target components is compared with the corresponding pattern component: last with last, next-to-last with next-to-last, and so on. (This is called right-anchored matching.) If all of the pattern components find their match in the target, then the domains are considered a match. Case is irrelevant when comparing domain components.
A successfully matching pattern can be an anchored substring of a target, but not vice versa. Thus if a pattern doesn't specify a domain, it matches all domains. Furthermore, when comparing two components, the components must either match in their entirety or up to a wildcard * (star character) in the pattern. The wildcard feature implements only a "prefix" match capability ("abc*" vs. "abcdefg"), not suffix matching ("*efg" vs. "abcdefg") or infix matching ("abc*efg" vs. "abcdefg"). The feature is restricted to the domain component; it is unrelated to the optional regular expression feature in the path (described below).
If a numeric port is specified in the pattern domain, then the target port must match as well. The default port in a target is port 80.
If the domain and port match, then the target URL path is checked for a match against the path in the pattern. Paths are compared with a simple case-sensitive left-anchored substring comparison. Once again, the pattern can be an anchored substring of the target, but not vice versa. A path of / (slash) would match all paths. Wildcards are not considered in path comparisons.
For example, the target URL
the.yellow-brick-road.com/TinMan/has_no_brain
would be matched (and blocked) by any of the patterns
yellow-brick-road.com
Yellow*.COM
/TinM
but not by
follow.the.yellow-brick-road.com
/tinman
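The matching rules above can be sketched in a few lines of Python. This is only an illustration of the description in this page, not the proxy's actual source: the function names are invented for the example, the optional regular expression support in paths is left out, and any text after a * in a domain component is simply ignored (an assumption, since the page only defines prefix matching).

```python
def split_pattern(pattern):
    """Split '[domain][:port][/path]' into (domain, port, path)."""
    slash = pattern.find("/")
    if slash < 0:
        host, path = pattern, ""
    else:
        host, path = pattern[:slash], pattern[slash:]
    domain, _, port = host.partition(":")
    return domain.lower(), port, path  # domains are case-insensitive


def component_matches(pattern_part, target_part):
    """Whole-component match, or prefix match up to a '*' in the pattern."""
    star = pattern_part.find("*")
    if star >= 0:
        # Assumption: anything after the star is ignored.
        return target_part.startswith(pattern_part[:star])
    return pattern_part == target_part


def url_matches(pattern, target):
    """Return True if the pattern matches the target URL (scheme omitted)."""
    p_domain, p_port, p_path = split_pattern(pattern)
    t_domain, t_port, t_path = split_pattern(target)
    t_port = t_port or "80"   # default port in a target
    t_path = t_path or "/"    # so that a pattern path of "/" matches everything

    if p_port and p_port != t_port:
        return False

    if p_domain:
        p_parts = p_domain.split(".")
        t_parts = t_domain.split(".")
        if len(p_parts) > len(t_parts):
            return False
        # Right-anchored: last with last, next-to-last with next-to-last, ...
        for p_part, t_part in zip(reversed(p_parts), reversed(t_parts)):
            if not component_matches(p_part, t_part):
                return False

    # Paths: case-sensitive, left-anchored substring comparison.
    return t_path.startswith(p_path)
```

Applied to the example target, url_matches("Yellow*.COM", "the.yellow-brick-road.com/TinMan/has_no_brain") is true, while the pattern /tinman fails on the case-sensitive path comparison.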
Comments in a blockfile start with a # (hash) character and end at a new line. Blank lines are also ignored.
Lines beginning with a ~ (tilde) character are taken to be exceptions: a URL blocked by previous patterns that matches the rest of the line is let through. (The last match wins.)
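A rough sketch of how the comment, blank-line and tilde rules combine into a block decision is given below. The names is_blocked and naive_match are made up for this illustration, and naive_match is a deliberately crude stand-in (a plain substring test), not the right-anchored domain matching described earlier.

```python
def is_blocked(blockfile_lines, target, matches):
    """Apply blockfile lines in order; the last matching line decides."""
    blocked = False
    for raw in blockfile_lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue  # blank or comment-only line
        is_exception = line.startswith("~")
        pattern = line[1:] if is_exception else line
        if matches(pattern, target):
            # A tilde line lets the URL through; any other line blocks it.
            blocked = not is_exception
    return blocked


def naive_match(pattern, target):
    """Stand-in matcher for the demo only: plain substring test."""
    return pattern in target


example_blockfile = [
    "# block the ad server ...",
    "ads.example.com",
    "",
    "~ads.example.com/ok   # ... except this directory",
]
```

With this hypothetical blockfile, ads.example.com/banner.gif is blocked, while ads.example.com/ok/logo.gif is let through because the later exception line overrides the earlier match.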
Patterns may contain POSIX regular expressions provided the junkbuster was compiled with this option (the default from Version 2.0 on). The idiom /*.*/ad can then be used to match any URL containing /ad (such as http://nomatterwhere.com/images/advert/g3487.gif for example). These expressions don't work in the domain part.
In version 1.3 and later the blockfile and cookiefile are checked for changes before each request.
wafer NAME=VALUE
Specifies a pair to be sent as a cookie with every request to the server. (Such boring cookies are called wafers.) This option may be called more than once to generate multiple wafers. The original Netscape specification prohibited semi-colons, commas and white space; these characters will be URL-encoded if used in wafers. The Path and Domain attributes are not currently supported.
cookiefile cookiefile
Enforce the cookie management policy specified in the cookiefile. If this option is not used all cookies are silently crunched, so that users who never want cookies aren't bothered by browsers asking whether each cookie should be accepted. However, cookies can still get through via JavaScript and SSL, so alerts should be left on.
In Version 1.2 and later this option must be followed by a filename containing instructions on which sites are allowed to receive and set cookies. By default cookies are dropped in both the browser's request and the server's response, unless the URL requested matches an entry in the cookiefile. The matching algorithm is the same as for the blockfile. A leading > character allows server-bound cookies only; a < allows only browser-bound cookies; a ~ character stops cookies in both directions. Thus a cookiefile containing a single line with the two characters >* will pass on all cookies to servers but not give any new ones to the browser.
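The direction prefixes can be pictured with the following sketch. It assumes that, as in the blockfile, the last matching line wins and that # comments are allowed; neither detail is spelled out for the cookiefile on this page. The function names and the stand-in matcher are invented for the example.

```python
def cookie_policy(cookiefile_lines, target, matches):
    """Return (to_server, to_browser): which cookie directions are allowed."""
    to_server, to_browser = False, False  # default: drop both directions
    for raw in cookiefile_lines:
        line = raw.split("#", 1)[0].strip()  # assumption: blockfile-style comments
        if not line:
            continue
        if line[0] == ">":
            verdict, pattern = (True, False), line[1:]   # server-bound only
        elif line[0] == "<":
            verdict, pattern = (False, True), line[1:]   # browser-bound only
        elif line[0] == "~":
            verdict, pattern = (False, False), line[1:]  # stop both directions
        else:
            verdict, pattern = (True, True), line        # plain entry: both
        if matches(pattern, target):
            to_server, to_browser = verdict  # assumption: last match wins
    return to_server, to_browser


def demo_match(pattern, target):
    """Stand-in matcher for the demo: '*' matches anything, else substring."""
    return pattern == "*" or pattern in target
```

For the single-line cookiefile >* this returns (True, False) for every URL: cookies flow to servers, but none are handed to the browser.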
jarfile jarfile
All Set-cookie attempts by the server are logged to jarfile. If no wafer is specified, one containing a canned notice (the vanilla wafer) is added as an alert to the server unless the suppress-vanilla-wafer option is invoked.
suppress-vanilla-wafer
Suppress the vanilla wafer.
from from
If the browser discloses an email address in the FROM header (most don't), replace it with from. If from is set to . (the period character) the FROM header is passed to the server unchanged. The default is to delete the FROM header.
referer referer
Whenever the browser discloses the URL that led to the current request, replace it with referer. If referer is set to . (period) the URL is passed to the server unchanged. If referer is set to @ (at) the URL is sent in cases where the cookiefile specifies that a cookie would be sent. (No way to send bogus referers selectively is provided.) The default is to delete Referer.
Junkbuster also accepts the spelling referrer, which most dictionaries consider correct.
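The possible referer settings amount to a small decision table, sketched below. The function name and arguments are hypothetical, and the sketch assumes that with @ the header is deleted when no cookie would be sent, which this page implies but does not state outright.

```python
def rewrite_referer(setting, original, cookie_would_be_sent):
    """Return the Referer value to send to the server, or None to delete it.

    setting: the configured referer value, or None if the option is unset.
    original: the Referer sent by the browser, or None if it sent none.
    """
    if original is None:
        return None              # nothing disclosed, nothing to rewrite
    if setting is None:
        return None              # default: delete Referer
    if setting == ".":
        return original          # pass through unchanged
    if setting == "@":
        # Assumption: deleted when the cookiefile would not send a cookie.
        return original if cookie_would_be_sent else None
    return setting               # replace with the configured value
```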
user-agent user-agent
Information disclosed by the browser about itself is replaced with the value user-agent. If user-agent is set to . (period) the User-Agent header is passed to the server unchanged, along with any UA headers produced by MS-IE (which would otherwise be deleted). If user-agent is set to @ (at) these headers are sent unchanged in cases where the cookiefile specifies that a cookie would be sent, otherwise only the default User-Agent header is sent. That default is Mozilla/3.0 (Netscape) with an unremarkable Macintosh configuration. If used with a browser less advanced than Mozilla/3.0 or IE-3, the default may encourage pages containing extensions that confuse the browser.
listen-address [host][:port]
If host is specified, bind the junkbuster to that IP address. If a port is specified, use it. The default port is 8000; the default host is localhost.
This default host setting means that you can only connect to the proxy from the local computer. This is a security measure: if you allow anyone to use the proxy, then hackers or fraudsters could use it to help hide their identity. It also provides a lot of protection against any undiscovered security flaws in JunkBuster: if they can't connect to it, then they can't attack it.
If you change this value, we recommend you either set the host to localhost:
listen-address localhost:8080
or, if you want to share a single internet connection over your internal network, then set it to the address of your internal ethernet card:
listen-address 10.1.1.1:8080
(replace 10.1.1.1 with your internal IP address), or set up an aclfile. To make the proxy accessible from everywhere (e.g. if you're using an access control list or if you just don't care about security), specify just the port number, e.g.:
listen-address :8000
(This binds the proxy to all IP addresses (INADDR_ANY).)
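As a sketch of how a [host][:port] value could be interpreted, the function below follows the defaults given above. The function name is made up for this example, and representing INADDR_ANY as an empty host string is a convention borrowed from Python's socket module, not something the proxy itself defines.

```python
def parse_listen_address(value, default_host="localhost", default_port=8000):
    """Split a '[host][:port]' value into a (host, port) pair.

    An empty host together with an explicit ':port' means "all addresses"
    (INADDR_ANY), returned here as the empty string.
    """
    host, colon, port = value.partition(":")
    if not colon:
        # No port part at all: only the host (if any) was given.
        return host or default_host, default_port
    return host, int(port) if port else default_port
```

For instance, parse_listen_address("10.1.1.1:8080") gives ("10.1.1.1", 8080), and parse_listen_address(":8000") gives ("", 8000), which socket.bind() would treat as every local address.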
forwardfile forwardfile
Junkbuster has a flexible syntax for forwarding HTTP requests. This is used e.g. if you are behind a firewall and need to connect through it, or if you want to use a caching proxy to speed up your web browsing.
Every line in the forwardfile consists of four components, separated by whitespace. These are:
target forward_to via_gateway_type gateway
target is a pattern used to select which line of the forwardfile is used. "*" is the most commonly used value, and matches every URL. As usual, the last matching target wins. (If no pattern matches, a direct connection will be used.)
forward_to specifies the HTTP proxy server to use, or "." for none. This is used to connect to a caching proxy such as Squid, and for most types of firewall. The port number defaults to 8000 if it is not specified.
Here is a typical line.
* lpwa.com:8000 . .
The target domain need not be a fully qualified hostname; it can be a general domain such as com or co.uk or even just a port number. For example, because LPWA does not handle SSL, the line above will typically be followed by a line such as
:443 . . .
to allow SSL transactions to proceed directly. The cautious would also add an entry in their blockfile to stop transactions to port 443 for all but specified trusted sites.
Configure with care: no loop detection is performed. When setting up chains of proxies that might loop back, try adding Squid.
via_gateway_type and gateway are used to support SOCKS proxies. Some firewalls provide this type of proxy. If you do not want to use a SOCKS proxy, specify both of these fields as ".".
Note that JunkBuster is a SOCKS client, not a SOCKS server. The user's browser should not be configured to use SOCKS; the proxy conducts the negotiations, not the browser.
The SOCKS4 protocol may be specified by setting via_gateway_type to socks or socks4. The SOCKS4A protocol is specified as socks4a. The SOCKS5 protocol is not currently supported.
gateway should be the host and port of the SOCKS server. If you just specify a hostname, then the port number defaults to 1080.
The user identification capabilities of SOCKS4 are deliberately not used; the user is always identified to the SOCKS server as userid=anonymous. If the server's policy is to reject requests from anonymous, the proxy will not work. Use a debug value of 3 to see the status returned by the server.
If you specify both an HTTP proxy (with forward_to) and a SOCKS proxy (with gateway) then the SOCKS proxy is used to connect to the HTTP proxy. If you just specify a SOCKS proxy, it is used to connect directly to the websites.
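To make the four-column layout concrete, here is a sketch of a parser for one forwardfile line. ForwardRule, host_port and parse_forward_line are names invented for this illustration; the defaults (port 8000 for forward_to, port 1080 for gateway) and the accepted gateway types come from the text above.

```python
from typing import NamedTuple, Optional, Tuple


class ForwardRule(NamedTuple):
    target: str                            # URL pattern selecting this rule
    forward_to: Optional[Tuple[str, int]]  # HTTP proxy, or None for "."
    gateway_type: Optional[str]            # "socks4", "socks4a", or None
    gateway: Optional[Tuple[str, int]]     # SOCKS server, or None for "."


def host_port(field, default_port):
    """Turn 'host[:port]' into (host, port); '.' means 'not used'."""
    if field == ".":
        return None
    host, _, port = field.partition(":")
    return host, int(port) if port else default_port


def parse_forward_line(line):
    """Parse 'target forward_to via_gateway_type gateway'."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    target, forward_to, gateway_type, gateway = fields
    if gateway_type == ".":
        kind = None
    elif gateway_type in ("socks", "socks4"):
        kind = "socks4"
    elif gateway_type == "socks4a":
        kind = "socks4a"
    else:
        raise ValueError(f"unsupported gateway type: {gateway_type!r}")
    return ForwardRule(target, host_port(forward_to, 8000),
                       kind, host_port(gateway, 1080))
```

Parsing the typical line "* lpwa.com:8000 . ." yields a rule that forwards everything to lpwa.com on port 8000 with no SOCKS gateway. A caller would then walk the rules in file order and keep the last one whose target matches.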
debug N
Set debug mode. The most common value is 1, to pinpoint offensive URLs, so they can be added to the blockfile. The value of N is a bitwise logical-OR of the following values:
1 = URLs (show each URL requested by the browser);
2 = Connections (show each connection to or from the proxy);
4 = I/O (log I/O errors);
8 = Headers (as each header is scanned, show the header and what is done to it);
16 = Log everything (including debugging traces and the contents of the pages);
32 = Record accesses in Common Log Format, as used by most web and proxy servers.
Multiple debug lines are permitted; they are logical OR-ed together.
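The way debug values combine can be illustrated with a small flag type. The member names below are labels chosen for this sketch, not identifiers taken from the proxy; only the numeric values come from the list above.

```python
import enum


class Debug(enum.IntFlag):
    URLS = 1          # show each URL requested by the browser
    CONNECTIONS = 2   # show each connection to or from the proxy
    IO = 4            # log I/O errors
    HEADERS = 8       # show each header and what is done to it
    EVERYTHING = 16   # debugging traces and page contents
    CLF = 32          # Common Log Format access records


def combine(debug_values):
    """OR together the N values of several 'debug N' lines."""
    result = Debug(0)
    for value in debug_values:
        result |= Debug(value)
    return result
```

Two lines "debug 1" and "debug 8" therefore behave like a single "debug 9", and "debug 3" turns on both URL and connection logging.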
Because most browsers send several requests in parallel the debugging output may appear intermingled, so the single-threaded option is recommended when using debug with N greater than 1.
add-forwarded-header
Add X-Forwarded-For headers to the server-bound HTTP stream indicating the client IP address to the server, in the new style of Squid 1.1.4. If you want the traditional HTTP_FORWARDED response header, add it manually with the -x option. This also allows other X-Forwarded-For headers to be transmitted; usually they are discarded.
add-header HeaderText
Add the HeaderText verbatim to requests to the server. Typical uses include adding old-style forwarding notices such as Forwarded: by http://pro-privacy-isp.net and reinstating the Proxy-Connection: Keep-Alive header (which the junkbuster deletes so as not to reveal its existence). No checking is done for correctness or plausibility, so it can be used to throw any old trash into the server-bound HTTP stream. Please don't litter.
single-threaded
Don't fork() a separate process (or create a separate thread) to handle each connection. Useful when debugging, to keep the process single-threaded.
logfile logfile
Write all debugging data into logfile. The default logfile is the standard output.
aclfile aclfile
Unless this option is used, the proxy talks to anyone who can connect to it, and everyone who can has equal permissions on where they can go. An access file allows restrictions to be placed on these two policies, by distinguishing some source IP addresses and/or some destination addresses. (If a forwarder or a gateway is being used, its address is considered the destination address, not the ultimate IP address of the URL requested.)
Each line of the access file begins with either the word permit or deny followed by source and (optionally) destination addresses to be matched against those of the HTTP request. The last matching line specifies the result: if it was a deny line or if no line matched, the request will be refused.
A source or destination can be specified as a single numeric IP address, or with a hostname, provided that the host's name can be resolved to a numeric address: this cannot be used to block all .mil domains for example, because there is no single address associated with that domain name. Either form may be followed by a slash and an integer N, specifying a subnet mask of N bits. For example, permit 207.153.200.72/24 matches the entire Class-C subnet from 207.153.200.0 through 207.153.200.255. (A netmask of 255.255.255.0 corresponds to 24 bits of ones in the netmask, as with *_MASKLEN=24.) A value of 16 would be used for a Class-B subnet. A value of zero for N in the subnet mask length will cause any address to match; this can be used to express a default rule. For more information see the example file provided with the distribution.
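The access rules can be approximated with Python's ipaddress module, as in the sketch below. It handles numeric addresses only (hostname resolution is left out), and the function names are made up for this example rather than taken from the proxy.

```python
import ipaddress


def parse_acl_line(line):
    """Parse 'permit|deny source[/N] [destination[/N]]' into a rule tuple."""
    fields = line.split()
    if len(fields) not in (2, 3) or fields[0].lower() not in ("permit", "deny"):
        raise ValueError(f"bad ACL line: {line!r}")
    permit = fields[0].lower() == "permit"
    # strict=False accepts host bits in the address, as in 207.153.200.72/24.
    source = ipaddress.ip_network(fields[1], strict=False)
    dest = ipaddress.ip_network(fields[2], strict=False) if len(fields) == 3 else None
    return permit, source, dest


def is_permitted(acl_lines, source_ip, dest_ip):
    """Last matching line wins; no matching line means the request is refused."""
    allowed = False
    src = ipaddress.ip_address(source_ip)
    dst = ipaddress.ip_address(dest_ip)
    for line in acl_lines:
        permit, source, dest = parse_acl_line(line)
        if src in source and (dest is None or dst in dest):
            allowed = permit
    return allowed
```

With the two lines "permit 207.153.200.72/24" and "deny 207.153.200.9", every client in 207.153.200.0 through 207.153.200.255 is allowed except 207.153.200.9, and clients outside that subnet are refused because no line matches them.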
If you like these access controls you should probably have a firewall; they are not intended to replace one.
trustfile trustfile
This feature is experimental, has not been fully documented and is very subject to change. The goal is for parents to be able to choose a page or site whose links they regard suitable for their young children and for the proxy to allow access only to sites mentioned there. To do this the proxy examines the referer variable on each page request to check that it resulted from a click on the "trusted referer" site: if so the referred site is added to a list of trusted sites, so that the child can then move around that site. There are several uncertainties in this scheme that experience may be able to iron out; check back in the months ahead.
trust_info_url trust_info_url
When access is denied due to lack of a trusted referer, this URL is displayed with a message pointing the user to it for further information.
hide-console
In the Windows command-line version only, instructs the program to disconnect from and hide the command console after starting.
Browsers must be told where to find the junkbuster (e.g. localhost port 8000). To set the HTTP proxy in Netscape 3.0, go through: Options; Network Preferences; Proxies; Manual Proxy Configuration; View. See the FAQ for other browsers. The Security Proxy should also be set to the same values, otherwise shttp: URLs won't work.
Note the limitations explained in the FAQ.
To allow users to check that a junkbuster is running and how it is configured, it intercepts requests for any URL ending in /show-proxy-args and blocks it, returning instead information on its version number and current configuration including the contents of its blockfile. To get an explicit warning that no junkbuster intervened if the proxy was not configured, it's best to point the browser to a URL that does this, such as http://internet.junkbuster.com/cgi-bin/show-proxy-args on the Junkbusters website.
http://www.junkbusters.com/ht/en/ijbfaq.html
http://www.junkbusters.com/ht/en/cookies.html
http://internet.junkbuster.com/cgi-bin/show-proxy-args
http://www.cis.ohio-state.edu/htbin/rfc/rfc2109.html
http://squid.nlanr.net/Squid/
http://www-math.uni-paderborn.de/~axel/
Written and copyright by the Anonymous Coders and Junkbusters Corporation and made available under the GNU General Public License (GPL). This software comes with NO WARRANTY. Internet Junkbuster Proxy is a trademark of Junkbusters Corporation.
Copyright © 1996-8 Junkbusters ® Corporation. Copyright © 2001 Jon Foster. Copying and distribution permitted under the GNU General Public License.