= Security analysis of irker =

This is an analysis of security and DoS vulnerabilities associated
with irker, exploring and explaining certain design choices. Much of
it derives from a code audit and report by Daniel Franke.

== Assumptions and Goals ==

We begin by stating some assumptions about how irker will be deployed,
and articulating a set of security goals.

Communication flow in an irker deployment will look like this:

-----------------------------------------------------------------------------
            Committers
                 |
                 |
    Version-control repositories
                 |
                 |
           irkerhook.py
                 |
                 |
              irkerd
                 |
                 |
            IRC servers
-----------------------------------------------------------------------------

Here are our assumptions:

1. The repositories are hosted on public forge sites such as
   SourceForge, GitHub, Gitorious, Savannah, or Gna and must be
   accessible to untrusted users.

2. Repository project owners can set properties on their repositories
   (including but not limited to irker.*), and may be able to set custom
   post-commit hooks which can execute arbitrary code on the repository
   server. In particular, these people may be able to modify the local
   copy of irkerhook.py.

3. The machine which hosts irkerd has the same owner as the machine which
   hosts the repo; these machines are possibly but not necessarily
   one and the same.

4. The network is protected by a perimeter firewall, and only a
   trusted group is able to emit arbitrary packets from inside the
   perimeter; committers are not necessarily part of this group.

5. irkerd communicates with IRC servers over the open internet,
   and an IRC server's administrator is assumed to hold no position of
   trust with any other party.

We can, accordingly, identify the following groups of security
principals:

A. irker administrators.
B. Project committers.
C. Project owners.
D. IRC server administrators.
E. Other people on irker's internal network.
F. irkerd-IRC men-in-the-middle (i.e. people who control the network path
   between irkerd and the IRC server).
G. Random people on the internet.

Our security goals for irker can be enumerated as follows:

* Control: We don't want anyone outside group A gaining control of
  the machines which host irkerd or the git repos.

* Availability: Only group A should be able to deny or degrade
  irkerd's ability to receive commit messages and relay them to the
  IRC server. We recognize and accept as inevitable that MITMs (groups
  E and F) can do this too (by ARP spoofing, cable-cutting, etc.).
  But, in particular, we would like irker-mediated services to be
  resilient against DoS (denial of service) attacks.

* Authentication/integrity: Notifications should be truthful, i.e.,
  commit messages sent to IRC channels should actually reflect that a
  corresponding commit has taken place. We accept that groups A, C,
  D, and E can violate this property.

* Secrecy: irker shouldn't aid spammers (group G) in harvesting
  committers' email addresses.

* Auditability: If people abuse irkerd, we want to be able to identify
  the abusive account or IP address.

== Control Issues ==

We have audited the irker and irkerhook.py code for exploitable
vulnerabilities. We have not found any in the code itself, and the
use of Python gives us confidence in the absence of large classes of errors
(such as buffer overruns) that afflict C programs.

However, the fact that irkerhook.py relies on external binaries to
mine data out of its repository opens up a well-known set of
vulnerabilities if a malicious user is able to insert binaries in a
carelessly-set execution path. Normal precautions against this should
be taken.

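One such precaution can be sketched as follows: invoke each external
tool by absolute path, without a shell, and with a small fixed
environment instead of the inherited one. The helper name
`run_vcs_tool`, the `SAFE_ENV` value, and the example paths are
illustrative assumptions, not code taken from irkerhook.py.

```python
import subprocess

# Assumption: only trusted system directories; adjust for the local install.
SAFE_ENV = {"PATH": "/usr/bin:/bin", "LC_ALL": "C"}

def run_vcs_tool(argv):
    """Run an external tool with a fixed environment and no shell.

    argv[0] must be an absolute path, so no PATH lookup decides which
    binary is executed.
    """
    if not argv or not argv[0].startswith("/"):
        raise ValueError("external tools must be given by absolute path")
    result = subprocess.run(argv, env=SAFE_ENV, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, check=True)
    return result.stdout.decode("utf-8", errors="replace")
```

A hook might call this as `run_vcs_tool(["/usr/bin/git", "log", "-1",
"--pretty=%s"])`. The fixed `PATH` still matters, because the tool
itself may run helper programs by name.
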
== Availability ==

=== Solved problems ===

When the original implementation of irkerd saw a nick collision it
generated new nicks in a predictable sequence. A malicious IRC user
could have continuously changed his own nick to the next one that
irkerd was going to try. Some randomness has been added to nick
generation to prevent this.

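The idea can be sketched as below. This is a minimal illustration of
randomized fallback nicks, not irkerd's actual code; the function name
`next_nick` and the `irker%03d` template are assumptions made for the
example.

```python
import random

def next_nick(template="irker%03d", rng=None):
    """Return a fallback nick with a random numeric suffix.

    Drawing the suffix at random, rather than counting upward, means an
    observer cannot tell which nick will be tried next.
    """
    if rng is None:
        # SystemRandom draws from the OS entropy source, so the sequence
        # cannot be reconstructed from earlier nicks.
        rng = random.SystemRandom()
    return template % rng.randint(1, 999)
```

With only 999 possible suffixes an attacker can still squat on some
nicks, but can no longer block every retry by staying one step ahead.
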
=== Unsolved problems ===

DoS attacks on any networked application can never be completely
prevented, only mitigated by forcing attackers to invest more
resources. Here we consider the easiest attack paths against irker,
and possible countermeasures.

irker handles each connection to a particular IRC server in a separate
thread - actually, due to server limits on open channels per
connection, there may be multiple sessions per server. This may not
scale well, especially on 32-bit architectures.

Thread instance overhead, combined with the lack of any restriction on
how many URLs can appear in the 'to' list, is a DoS vulnerability. If
a repository's properties specify that notifications should go to more
than about 500 unique hostnames, then on 32-bit architectures we'll
hit the 4GB cap on virtual memory (even while the resident set size
remains small).

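The figure of about 500 is consistent with simple arithmetic if each
thread reserves 8 MiB of address space for its stack. That per-thread
figure is an assumption (a common Linux default), not something stated
in the analysis itself.

```python
# Assumption: each thread reserves 8 MiB of virtual address space for its
# stack; the real value depends on the platform and its limits.
STACK_BYTES_PER_THREAD = 8 * 1024 * 1024
ADDRESS_SPACE_BYTES = 4 * 1024 ** 3   # the 4GB cap on a 32-bit machine

# Upper bound on threads before stacks alone exhaust the address space.
max_threads = ADDRESS_SPACE_BYTES // STACK_BYTES_PER_THREAD
print(max_threads)  # 512
```

In practice the ceiling is lower, since code, heap, and shared
libraries also occupy address space. Python's `threading.stack_size()`
can request a smaller stack for threads created afterwards, which
raises the ceiling correspondingly.
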
Another ceiling to watch out for is the ulimit on file descriptors,
which defaults to 1024 on many Linux systems but can safely be set
much larger. Each connection instance costs a file descriptor.

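A daemon can inspect this limit, and raise its soft limit up to the
hard limit, from inside Python on Unix-like systems. The sketch below
uses the standard `resource` module; the function name
`raise_fd_limit` is hypothetical.

```python
import resource

def raise_fd_limit():
    """Raise the soft file-descriptor limit to the hard limit.

    Returns the soft limit in effect afterwards. An unprivileged process
    may raise its soft limit up to, but not beyond, the hard limit.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft != hard:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
        except (ValueError, OSError):
            pass  # keep the existing limit if the platform refuses
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]
```

Raising the hard limit itself is an administrative matter (for example
through the shell's `ulimit` or the service manager) and is outside
what the process can do for itself.
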
We consider some possible ways of addressing the problem:

1. Limit the number of URLs in a request. Pretty painless - it will
   be very rare that anyone wants to specify a larger set than a project
   channel plus freenode #commits - but also ineffective. A malicious
   hook could achieve DoS simply by spamming lots of requests.

2. Limit the total number of requests that can be queued. Completely
   ineffective - just sets a target for the DoS attack.

3. Limit the number of requests that can be queued by source IP address.
   This might be worth doing; it would stymie a single-source DoS attack through
   a publicly-exposed irkerd, though not a DDoS by a botnet. But there isn't
   a lot of win here for a properly installed irker (e.g. behind a firewall),
   which is typically going to get all its requests from a single repo host
   anyway.

4. Rate-limit requests by source IP address - that is, after any request
   discard additional ones during some timeout period. Again, good for
   stopping a single-source DoS against an exposed irker, won't stop a
   DDoS. The real problem, though, is that any such rate limit might interfere
   with legitimate high-volume use by a very active repo site.

After this we appear to have run out of easy options, as source IP address
is the only thing irkerd can see that an attacker can't spoof.

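Options 1 and 4 can be sketched together as follows. None of this is
in irkerd; the names `SourceLimiter` and `accept`, and the values of
`MAX_TARGETS` and `MIN_INTERVAL`, are hypothetical and chosen only for
illustration.

```python
import time

MAX_TARGETS = 4      # hypothetical cap on URLs in one request (option 1)
MIN_INTERVAL = 0.5   # hypothetical quiet period per source, seconds (option 4)

class SourceLimiter:
    """Discard requests that arrive too soon after the last accepted one
    from the same source IP address."""

    def __init__(self, min_interval=MIN_INTERVAL, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock
        self.last_accepted = {}   # source IP -> time of last accepted request

    def allow(self, source_ip):
        now = self.clock()
        last = self.last_accepted.get(source_ip)
        if last is not None and now - last < self.min_interval:
            return False          # still inside the timeout period
        self.last_accepted[source_ip] = now
        return True

def accept(limiter, source_ip, targets):
    """Apply the URL cap first, then the per-source rate limit."""
    if len(targets) > MAX_TARGETS:
        return False
    return limiter.allow(source_ip)
```

The sketch also makes the drawback concrete: a burst of legitimate
commits from one busy repository host would be dropped in exactly the
same way as an attack. A real implementation would additionally need
to expire old entries from `last_accepted`.
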
We mitigate some availability risks by reaping old sessions when we're
near resource limits. An ordinary DoS attack would then be prevented
from completely blocking all message traffic; the cost would be a
whole lot of join/leave spam due to connection churn.

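The reaping policy can be illustrated with a least-recently-used
table. This is a sketch of the general technique under the assumption
that "old" means "least recently active"; the class `SessionTable` is
hypothetical and does not reproduce irkerd's own bookkeeping.

```python
from collections import OrderedDict

class SessionTable:
    """Track sessions in least-recently-used order and reap the oldest
    when a new one would exceed the limit."""

    def __init__(self, max_sessions):
        if max_sessions < 1:
            raise ValueError("max_sessions must be at least 1")
        self.max_sessions = max_sessions
        self.sessions = OrderedDict()   # key -> session, oldest first

    def touch(self, key, session=None):
        """Record activity on a session, creating it if needed.

        Returns the list of keys reaped to make room.
        """
        reaped = []
        if key in self.sessions:
            self.sessions.move_to_end(key)   # now the most recent
            return reaped
        while len(self.sessions) >= self.max_sessions:
            old_key, _old = self.sessions.popitem(last=False)
            reaped.append(old_key)   # a real daemon would close it here
        self.sessions[key] = session
        return reaped
```

Under a flood of new targets, every `touch` reaps a session, which is
the connection churn (and join/leave spam) described above.
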
== Authentication/Integrity ==

One way to help prevent DoS attacks would be in-band authentication -
requiring irkerd submitters to present a credential along with each
message submission. In principle this, if it existed, could also be used
to verify that a submitter is authorized to issue notifications with
respect to a given project.

We rejected this approach. The design goal for irker was to make
submissions fast, cheap, and stateless; baking an authentication
system directly into the irkerd codebase would have conflicted with
these objectives, not to mention probably becoming the camel's nose
for a godawful amount of code bloat.

The deployment advice in the installation instructions assumes that
irkerd submitters are "authenticated" by being inside a firewall - that is,
messages are issued from an intranet and it can be trusted that anyone
issuing messages from within a given intranet is authorized to do so.
This fits the assumption that irker instances will run on forge sites
receiving requests from instances of irkerhook.py.

One larger issue (not unique to irker) is that because of the
insecure nature of IRC it is essentially impossible to secure
#commits against commit notifications that are either garbled by
software errors and misconfigurations or maliciously crafted to
confuse anyone attempting to gather statistics from that channel. The
lesson here is that IRC monitoring isn't a good method for that
purpose; going direct to the repositories via a toolkit such as Ohloh
is a far better idea.

When this analysis was originally written, we recommended using spiped
or stunnel to solve the problem of passing notifications from irkerd
to IRC servers over a potentially hostile network that might interfere
with them. Later, SSL/TLS support proved easy to add and is now in
irkerd itself.

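For readers unfamiliar with what such support involves, here is a
minimal sketch of a verified TLS connection using Python's standard
`ssl` module. It is not irkerd's implementation; the function names
are hypothetical, and port 6697 is assumed as the conventional
IRC-over-TLS port.

```python
import socket
import ssl

def make_tls_context():
    """Return a client context that verifies the server certificate
    chain and checks that it matches the requested hostname."""
    return ssl.create_default_context()

def open_irc_tls(host, port=6697, timeout=30):
    """Open a TLS-wrapped TCP connection to an IRC server."""
    context = make_tls_context()
    raw = socket.create_connection((host, port), timeout=timeout)
    try:
        return context.wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()   # do not leak the descriptor on handshake failure
        raise
```

Certificate and hostname verification are what defeat group F; an
encrypted but unverified connection would still let a man-in-the-middle
impersonate the server.
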
== Secrecy ==

irkerd has no inherent secrecy risks.

The distributed version of irkerhook.py removes the host part of
author addresses specifically in order to prevent address harvesting
from the notifications.

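The effect can be sketched as below. This illustrates the idea only;
the function `strip_host` and its handling of the `Name <address>`
form are assumptions, not the code shipped in irkerhook.py.

```python
import re

def strip_host(author):
    """Remove the '@host' part of any address in an author string.

    "jdoe@example.com"            becomes "jdoe"
    "Jane Doe <jdoe@example.com>" becomes "Jane Doe <jdoe>"
    """
    # Delete '@' and everything after it up to whitespace or '>'.
    return re.sub(r"@[^\s>]+", "", author)
```

What remains identifies the committer to people who already know the
project, but is not a deliverable address.
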
== Auditability ==

We previously noted that source IP address is the only thing irker can
see that an attacker can't spoof. This makes auditability difficult
unless we impose conventions on the notifications passing through it.

The irkerhook.py that we ship inherits an auditability property from
the CIA service it was designed to replace: the first field of every
notification (terminated by a colon) is the name of the issuing
project. The only other competitor to replace CIA known to us
(kgb_bot) shares this property.

In the general case we cannot guarantee this property against
groups A and F.

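A channel monitor relying on the project-name convention might extract
the field as in this sketch. The function name `issuing_project` is
hypothetical, and the sample notification in the usage note is made up
for illustration.

```python
def issuing_project(notification):
    """Return the text before the first colon, or None if the line has
    no colon or the field is empty."""
    head, sep, _rest = notification.partition(":")
    head = head.strip()
    if not sep or not head:
        return None
    return head
```

For example, `issuing_project("irker: esr master * r42 / NEWS: typo
fix")` yields `"irker"`. The field is asserted by whoever sends the
notification, so it supports tracing honest mistakes and
misconfigurations rather than proving origin.
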
== Risks relative to centralized services ==

irker and irkerhook.py were written as a replacement for the
now-defunct CIA notification service. The author has written
a critique of that service: "CIA and the perils of overengineering"
at <http://esr.ibiblio.org/?p=4540>. It is thus worth considering how
a risk assessment of CIA compares to this one.

The principal advantages of CIA from a security point of view were (a)
it provided a single point at which spam filtering and source blocking
could be done with benefit to all projects using the service, and (b)
since it had to have a database anyway for routing messages to project
channels, the incremental overhead for an authentication feature would
have been relatively low.

As a matter of fact rather than theory, CIA never fully exploited
either possibility. Anyone could create a CIA project entry with
fanout to any desired set of IRC channels. Notifications were not
authenticated, so anyone could masquerade as a member of any project.
The only check on abuse was human intervention to source-block
spammers, and this was by no means completely effective - spam shipped
via CIA was occasionally seen on the freenode #commits channel.

The principal security disadvantage of CIA was that it meant the
entire notification system was subject to single-point failure due
to software or hosting failures on cia.vc, or to DoS attacks
against the server. While there is no evidence that the site
was ever deliberately DoSed, failures were sufficiently common
that a half-hearted DoS attack might not even have been noticed.

Despite the absence of authentication, irker instances on
properly firewalled intranets do not obviously pose additional
spamming risks beyond those incurred by the CIA service. The
overall robustness of the notification system as a whole should
be greatly improved.

== Conclusions ==

The security and DoS issues irker has are not readily addressable by
changing the irker codebase itself, short of a complete (much more
complex and heavyweight) redesign. They are largely implicit risks of
its operating environment and must be managed by properly controlling
access to irker instances.