Diffstat (limited to 'security.adoc')
-rw-r--r-- security.adoc 268
1 files changed, 268 insertions, 0 deletions
diff --git a/security.adoc b/security.adoc
new file mode 100644
index 0000000..5a652d2
--- /dev/null
+++ b/security.adoc
@@ -0,0 +1,268 @@
+= Security analysis of irker =
+
+This is an analysis of security and DoS vulnerabilities associated
+with irker, exploring and explaining certain design choices. Much of
+it derives from a code audit and report by Daniel Franke.
+
+== Assumptions and Goals ==
+
+We begin by stating some assumptions about how irker will be deployed,
+and articulating a set of security goals.
+
+Communication flow in an irker deployment will look like this:
+
+-----------------------------------------------------------------------------
+ Committers
+ |
+ |
+ Version-control repositories
+ |
+ |
+ irkerhook.py
+ |
+ |
+ irkerd
+ |
+ |
+ IRC servers
+-----------------------------------------------------------------------------
+
+Here are our assumptions:
+
+1. The repositories are hosted on public forge sites such as
+SourceForge, GitHub, Gitorious, Savannah, or Gna and must be
+accessible to untrusted users.
+
+2. Repository project owners can set properties on their repositories
+(including but not limited to irker.*), and may be able to set custom
+post-commit hooks which can execute arbitrary code on the repository
+server. In particular, these people may be able to modify the local
+copy of irkerhook.py.
+
+3. The machine which hosts irkerd has the same owner as the machine which
+hosts the repo; these machines are possibly but not necessarily
+one and the same.
+
+4. The network is protected by a perimeter firewall, and only a
+trusted group is able to emit arbitrary packets from inside the
+perimeter; committers are not necessarily part of this group.
+
+5. irkerd communicates with IRC servers over the open internet,
+and an IRC server's administrator is assumed to hold no position of
+trust with any other party.
+
+We can, accordingly, identify the following groups of security
+principals:
+
+A. irker administrators.
+B. Project committers.
+C. Project owners.
+D. IRC server administrators.
+E. Other people on irker's internal network.
+F. irkerd-IRC men-in-the-middle (i.e. people who control the network path
+ between irkerd and the IRC server).
+G. Random people on the internet.
+
+Our security goals for irker can be enumerated as follows:
+
+* Control: We don't want anyone outside group A gaining control of
+ the machines which host irkerd or the git repos.
+
+* Availability: Only group A should be able to deny or degrade
+ irkerd's ability to receive commit messages and relay them to the
+ IRC server. We recognize and accept as inevitable that MITMs (groups
+ E and F) can do this too (by ARP spoofing, cable-cutting, etc.).
+ But, in particular, we would like irker-mediated services to be
+ resilient against DoS (denial of service) attacks.
+
+* Authentication/integrity: Notifications should be truthful, i.e.,
+ commit messages sent to IRC channels should actually reflect that a
+ corresponding commit has taken place. We accept that groups A, C,
+ D, and E can violate this property.
+
+* Secrecy: irker shouldn't aid spammers (group G) in harvesting
+ committers' email addresses.
+
+* Auditability: If people abuse irkerd, we want to be able to identify
+ the abusive account or IP address.
+
+== Control Issues ==
+
+We have audited the irker and irkerhook.py code for exploitable
+vulnerabilities. We have not found any in the code itself, and the
+use of Python gives us confidence in the absence of large classes of errors
+(such as buffer overruns) that afflict C programs.
+
+However, the fact that irkerhook.py relies on external binaries to
+mine data out of its repository opens up a well-known set of
+vulnerabilities if a malicious user is able to insert binaries in a
+carelessly-set execution path. Normal precautions against this should
+be taken.
+
+== Availability ==
+
+=== Solved problems ===
+
+When the original implementation of irkerd saw a nick collision it
+generated new nicks in a predictable sequence. A malicious IRC user
+could have continuously changed his own nick to the next one that
+irkerd was going to try. Some randomness has been added to nick
+generation to prevent this.
+
+=== Unsolved problems ===
+
+DoS attacks on any networked application can never be completely
+prevented, only mitigated by forcing attackers to invest more
+resources. Here we consider the easiest attack paths against irker,
+and possible countermeasures.
+
+irker handles each connection to a particular IRC server in a separate
+thread - actually, due to server limits on open channels per
+connection, there may be multiple sessions per server. This may not
+scale well, especially on 32-bit architectures.
+
+Thread instance overhead, combined with the lack of any restriction on
+how many URLs can appear in the 'to' list, is a DoS vulnerability. If
+a repository's properties specify that notifications should go to more
+than about 500 unique hostnames, then on 32-bit architectures we'll
+hit the 4GB cap on virtual memory (even while the resident set size
+remains small).
+
+Another ceiling to watch out for is the ulimit on file descriptors,
+which defaults to 1024 on many Linux systems but can safely be set
+much larger. Each connection instance costs a file descriptor.
+
+We consider some possible ways of addressing the problem:
+
+1. Limit the number of URLs in a request. Pretty painless - it will
+be very rare that anyone wants to specify a larger set than a project
+channel plus freenode #commits - but also ineffective. A malicious
+hook could achieve DoS simply by spamming lots of requests.
+
+2. Limit the total number of requests that can be queued. Completely
+ineffective - just sets a target for the DoS attack.
+
+3. Limit the number of requests that can be queued by source IP address.
+This might be worth doing; it would stymie a single-source DoS attack through
+a publicly-exposed irkerd, though not a DDoS by a botnet. But there isn't
+a lot of win here for a properly installed irker (e.g. behind a firewall),
+which is typically going to get all its requests from a single repo host
+anyway.
+
+4. Rate-limit requests by source IP address - that is, after any request
+discard additional ones during some timeout period. Again, good for
+stopping a single-source DoS against an exposed irker, won't stop a
+DDoS. The real problem, though, is that any such rate limit might interfere
+with legitimate high-volume use by a very active repo site.
+
+After this we appear to have run out of easy options, as source IP address
+is the only thing irkerd can see that an attacker can't spoof.
+
+We mitigate some availability risks by reaping old sessions when we're
+near resource limits. An ordinary DoS attack would then be prevented
+from completely blocking all message traffic; the cost would be a
+whole lot of join/leave spam due to connection churn.
+
+== Authentication/Integrity ==
+
+One way to help prevent DoS attacks would be in-band authentication -
+requiring irkerd submitters to present a credential along with each
+message submission. In principle this, if it existed, could also be used
+to verify that a submitter is authorized to issue notifications with
+respect to a given project.
+
+We rejected this approach. The design goal for irker was to make
+submissions fast, cheap, and stateless; baking an authentication
+system directly into the irkerd codebase would have conflicted with
+these objectives, not to mention probably becoming the camel's nose
+for a godawful amount of code bloat.
+
+The deployment advice in the installation instructions assumes that
+irkerd submitters are "authenticated" by being inside a firewall - that is,
+messages are issued from an intranet and it can be trusted that anyone
+issuing messages from within a given intranet is authorized to do so.
+This fits the assumption that irker instances will run on forge sites
+receiving requests from instances of irkerhook.py.
+
+One larger issue (not unique to irker) is that because of the
+insecure nature of IRC it is essentially impossible to secure
+#commits against commit notifications that are either garbled by
+software errors and misconfigurations or maliciously crafted to
+confuse anyone attempting to gather statistics from that channel. The
+lesson here is that IRC monitoring isn't a good method for that
+purpose; going direct to the repositories via a toolkit such as Ohloh
+is a far better idea.
+
+When this analysis was originally written, we recommended using spiped
+or stunnel to solve the problem of passing notifications from irkerd
+to IRC servers over a potentially hostile network that might interfere
+with them. Later, SSL/TLS support proved easy to add and is now in
+irkerd itself.
+
+== Secrecy ==
+
+irkerd has no inherent secrecy risks.
+
+The distributed version of irkerhook.py removes the host part of
+author addresses specifically in order to prevent address harvesting
+from the notifications.
+
+== Auditability ==
+
+We previously noted that source IP address is the only thing irker can
+see that an attacker can't spoof. This makes auditability difficult
+unless we impose conventions on the notifications passing through it.
+
+The irkerhook.py that we ship inherits an auditability property from
+the CIA service it was designed to replace: the first field of every
+notification (terminated by a colon) is the name of the issuing
+project. The only other competitor to replace CIA known to us
+(kgb_bot) shares this property.
+
+In the general case we cannot guarantee this property against
+groups A and F.
+
+== Risks relative to centralized services ==
+
+irker and irkerhook.py were written as a replacement for the
+now-defunct CIA notification service. The author has written
+a critique of that service: "CIA and the perils of overengineering"
+at <http://esr.ibiblio.org/?p=4540>. It is thus worth considering how
+a risk assessment of CIA compares to this one.
+
+The principal advantages of CIA from a security point of view were (a)
+it provided a single point at which spam filtering and source blocking
+could be done with benefit to all projects using the service, and (b)
+since it had to have a database anyway for routing messages to project
+channels, the incremental overhead for an authentication feature would
+have been relatively low.
+
+As a matter of fact rather than theory, CIA never fully exploited
+either possibility. Anyone could create a CIA project entry with
+fanout to any desired set of IRC channels. Notifications were not
+authenticated, so anyone could masquerade as a member of any project.
+The only check on abuse was human intervention to source-block
+spammers, and this was by no means completely effective - spam shipped
+via CIA was occasionally seen on the freenode #commits channel.
+
+The principal security disadvantage of CIA was that it meant the
+entire notification system was subject to single-point failure due
+to software or hosting failures on cia.vc, or to DoS attacks
+against the server. While there is no evidence that the site
+was ever deliberately DoSed, failures were sufficiently common
+that a half-hearted DoS attack might not even have been noticed.
+
+Despite the absence of authentication, irker instances on
+properly firewalled intranets do not obviously pose additional
+spamming risks beyond those incurred by the CIA service. The
+overall robustness of the notification system as a whole should
+be greatly improved.
+
+== Conclusions ==
+
+The security and DoS issues irker has are not readily addressable by
+changing the irker codebase itself, short of a complete (much more
+complex and heavyweight) redesign. They are largely implicit risks of
+its operating environment and must be managed by properly controlling
+access to irker instances.
+
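Note on option 4 under "Unsolved problems" (rate-limiting requests by source IP address): the idea of discarding further requests from a source during a timeout period after an accepted one could look roughly like the Python sketch below. This is an illustration only, not code from irkerd; the class name `SourceRateLimiter`, its `allow` method, and the `timeout` and `clock` parameters are hypothetical names chosen for the example.

```python
import time


class SourceRateLimiter:
    """Hypothetical helper: after accepting a request from a source IP,
    discard further requests from that IP until `timeout` seconds pass."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout      # quiet period in seconds (assumed tunable)
        self.clock = clock          # injectable clock, which makes testing easy
        self.last_accepted = {}     # source IP -> time of last accepted request

    def allow(self, source_ip):
        """Return True if the request should be processed, False if discarded."""
        now = self.clock()
        last = self.last_accepted.get(source_ip)
        if last is not None and now - last < self.timeout:
            return False            # still inside the quiet period: discard
        self.last_accepted[source_ip] = now
        return True
```

A sketch like this also shows the drawback the analysis points out: a busy repository host sending legitimate notifications faster than one per `timeout` would have most of them dropped. The table of last-accepted times also grows with the number of distinct sources, so a real implementation would have to expire old entries.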