What is a VPN?
The Internet is an incredible thing. Computers around the planet can communicate with each other through it. However, the Internet is public space. And in public spaces, bad things can sometimes happen. That’s why we have private space, which we can secure and trust. But how can private space be created among widely separated computers, which can only reach each other through the Internet?
Virtual private network (VPN) connections make that possible. Superficially, they serve as private wormhole tunnels through the Internet. We’ll unpack that description shortly. In any case, whatever VPNs actually are, they are used primarily in two ways. First, organizations and groups use VPNs to securely interconnect widespread locations. Second, they use VPNs to enable secure access by remote staff and customers.
This article focuses on VPN services, which provide enhanced security and privacy to their users. By default, users reach the Internet directly through Internet Service Provider (ISP) gateways, which they typically connect to via dialup, DSL, cable, fiber, LTE or WiFi.
ISPs know what sites users are accessing. They can see and modify all content that’s not end-to-end encrypted. For example, they can add tracking supercookies. They can also block or throttle traffic, based on destination, traffic type, aggregate bandwidth usage, or whatever.
Those concerns are not problematic, as long as ISPs are serving their users’ interests, respecting their privacy, and adequately securing their networks. But they do become problematic when ISPs act against users’ interests. For example, governments may pressure ISPs to block access to certain sites. They may require ISPs to log and report online activity. ISPs may even provide full-traffic intercept capability.
VPN services route their users’ Internet traffic through private tunnels to remote exit servers.
There’s nothing private beyond the exit server, just the Internet. Still, VPN services protect their users in three main ways. First, they protect users from voyeurs, trackers, hackers, censors and other adversaries who can access networks between the user and the VPN service. Such adversaries can detect the VPN tunnel, and they can measure traffic volume. They can block the tunnel, but it’s all or none. And they can’t see, modify or specifically block any traffic inside the tunnel, whether it’s end-to-end encrypted or not.
Second, VPN services may allow users to bypass geographic access restrictions imposed by some websites. Websites normally see traffic coming from a user’s Internet Protocol (IP) address, which is assigned by their ISP. And they can get the geographical location of that IP address from services such as MaxMind.1 You can see your current IP address using What Is My IP Address. While you’re using a VPN service, websites instead see the VPN service’s IP address. And so you just pick a VPN exit server with an IP address that’s acceptable to the website that you want to access.
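The geographic check described above can be sketched roughly as follows. This is only an illustration: the prefix-to-country table, the country codes and the function names are hypothetical, and the addresses come from ranges reserved for documentation. Real geolocation databases are far larger and less exact.

```python
import ipaddress

# Hypothetical prefix-to-country table (documentation address ranges, not real data).
GEO_TABLE = {
    ipaddress.ip_network("192.0.2.0/24"): "AA",     # pretend: the user's ISP
    ipaddress.ip_network("198.51.100.0/24"): "BB",  # pretend: a VPN exit server
}

def country_of(ip):
    """Return the country code for an IP address, or None if it is unknown."""
    addr = ipaddress.ip_address(ip)
    for net, country in GEO_TABLE.items():
        if addr in net:
            return country
    return None

def site_allows(ip, allowed_countries):
    """A website only sees the source IP address, so that is all it can check."""
    return country_of(ip) in allowed_countries

# The same user, first connecting directly and then through a VPN exit in "BB".
print(site_allows("192.0.2.10", {"BB"}))
print(site_allows("198.51.100.7", {"BB"}))
```

The website never learns anything about the user's real location in the second case. It only evaluates the address of the exit server.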
Third, VPN services allow users to be more anonymous. That’s because users are typically sharing a particular VPN exit server with many other users. And they can easily switch to a different VPN exit server. However, websites can identify and track users in many ways.2 The Wall Street Journal published an excellent series on tracking in 2010-2012. You may have seen a Do Not Track option in your browser. However, “by and large, the advertising industry ignores them”. The W3C Technical Architecture Group (TAG) has published its opinion that “unsanctioned tracking” is harmful to the Web.
Of course, users are vulnerable to VPN services in the same ways that they’re vulnerable to ISPs. But there’s a crucial distinction: People have far more freedom and discretion in choosing VPN services. Let’s say that your government censors and/or monitors Internet access. And let’s say that it has compromised all available ISPs. Even if that’s so, you can choose a VPN service in another jurisdiction. And you can choose one where it’s much harder for your government to compromise things and obtain information.
What Are VPNs?
OK, so what is a virtual private network? As you probably discovered before finding this page, there’s a lot out there about VPNs. Unfortunately, most of it is either highly technical, or highly simplistic. Worse, much of the technical material is dated and/or misguided, and much of the simplistic material merely promotes a particular VPN service. This article takes a middle course. It does mention various technical issues, but for the most part leaves that to linked resources.
First, what is a network? In this context, a network is a system of computers and other devices that are interconnected by communications links. Those links may be wires, coaxial cables, optical fibers, microwave beams, and so on. Most simply, one may consider networks to include just the communications links and terminating gateways. The gateways mediate and regulate connections by other devices.
The first computer networks were all private, comprising devices located in private space, such as a building or an institutional campus. For such Local Area Networks (LANs) in private spaces, physical access control may provide sufficient security. But private LANs are nontrivial for geographically widespread devices. Dedicated connections are expensive, and they don’t scale well. And so it’s generally necessary to share long-distance communications links. Today, that pretty much means connecting through the Internet.
If you need secure and private connections, that’s a serious problem. In devising network communication protocols, engineers at first assumed that connected devices (and their users) could trust each other, and could also trust the network itself. That was an acceptable assumption for private LANs, operated by the military. But it becomes iffy for shared networks. And it fails utterly for the Internet. The Internet is an utterly public network, and it cannot prudently be trusted.3
The solution was virtual private network (VPN) connections through the untrusted Internet. Efforts in the 80s to secure government and commercial networks culminated in the Internet Protocol Security suite (IPsec). It was the first secure VPN technology. IPsec and other VPNs rely on encapsulation.
In the mid-90s, Netscape spearheaded development of the Secure Sockets Layer (SSL) protocol for secure (authenticated and encrypted) web browsing. It’s been largely replaced by the more-secure Transport Layer Security (TLS) protocol. Three notable open-source VPN packages now implement network tunneling with SSL/TLS for security: OpenVPN, OpenConnect and SoftEther. Many VPN services provide IPsec combined with a tunneling protocol (L2TP) that simplifies setup. L2TP/IPsec works best on iOS and Android. However, it’s apparently more vulnerable than OpenVPN to exploitation by the NSA and friends. Microsoft introduced its Point-to-Point Tunneling Protocol (PPTP) in Windows NT and Windows 95. It is not very secure. Please see this comparison of PPTP, L2TP/IPsec and OpenVPN.
To reiterate, encrypted traffic between a VPN server and a client creates a virtual armored cable between them. Intermediaries (and adversaries with access) can see the virtual cable, but they can’t see the data that it carries. VPNs are actually more like very tough yet elastic hoses, which change “diameter” depending on how much data is flowing through them. That provides adversaries with some information about online activity, but not actual traffic data.
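Encapsulation, and what an observer can and cannot see, might be sketched like this. Everything here is a toy: the `Packet` type, the function names and the addresses are hypothetical, and the "cipher" is a simple hash-based keystream that is not secure. Real VPNs use vetted ciphers and also authenticate each packet.

```python
import hashlib
from dataclasses import dataclass

@dataclass
class Packet:
    src: str
    dst: str
    payload: bytes

def toy_keystream_xor(key, data):
    """Toy stream cipher (SHA-256 in counter mode). NOT secure; illustration only."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def encapsulate(inner, key, client_ip, server_ip):
    """Wrap the whole inner packet, addresses included, inside an outer packet."""
    blob = f"{inner.src}|{inner.dst}|".encode() + inner.payload
    return Packet(client_ip, server_ip, toy_keystream_xor(key, blob))

def decapsulate(outer, key):
    """Recover the inner packet at the other end of the tunnel."""
    src, dst, payload = toy_keystream_xor(key, outer.payload).split(b"|", 2)
    return Packet(src.decode(), dst.decode(), payload)

key = b"shared-secret-for-illustration"
inner = Packet("10.8.0.2", "203.0.113.80", b"GET /private-page")
outer = encapsulate(inner, key, "192.0.2.10", "198.51.100.7")

# An observer on the path sees only the tunnel endpoints and the size.
print(outer.src, outer.dst, len(outer.payload))
```

The observer learns that a tunnel exists between the client and the VPN server, and how much data flows through it, which matches the "elastic hose" picture. The real destination and the content stay hidden inside the encrypted payload.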
Why Do We Need VPNs When We Have HTTPS?
The TLS protocol in Secure HTTP (HTTPS) provides solid transport security. That is, it protects ongoing connections from adversaries. But otherwise, HTTPS is fatally flawed, because server authentication depends on hierarchical systems of certificate authorities, starting with trusted root certificates bundled in browsers. That’s a problem. Consider the Superfish adware that Lenovo included on consumer notebooks. By adding its own root certificate to browsers, Superfish could intercept HTTPS connections, and replace websites’ ads with its own. In other words, it carried out man-in-the-middle (MitM) attacks on Lenovo customers.
But it’s far from the worst problem. Let’s say that you visit https://search.disconnect.me/. How does your browser know that it’s connected directly to that site, and that the connection hasn’t been intercepted in a MitM attack? Supposedly, the browser knows because it can follow a chain of trust from the site’s certificate through various intermediate certificate authorities, back to one of the root certificates that it trusts. But trust chains are typically very long and complex. And if one of those intermediate certificate authorities has done something foolish or been compromised, websites can be spoofed or MitM’ed.
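The chain-of-trust walk can be sketched with a toy model. The certificate names and the `Cert` type are hypothetical, and the sketch only follows issuer names. It does not verify signatures, validity dates or revocation, all of which real browsers check.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str  # subject name of the CA that signed this certificate

def chain_to_root(leaf, certs_by_subject, trusted_roots, max_depth=10):
    """Return the list of names from the leaf up to a trusted root, or None."""
    chain = [leaf.subject]
    current = leaf
    for _ in range(max_depth):
        if current.issuer in trusted_roots:
            return chain + [current.issuer]
        parent = certs_by_subject.get(current.issuer)
        if parent is None:
            return None  # chain is broken: unknown issuer
        chain.append(parent.subject)
        current = parent
    return None

trusted_roots = {"Root CA X"}
intermediates = {
    "Intermediate CA Y": Cert("Intermediate CA Y", "Root CA X"),
    "Intermediate CA Z": Cert("Intermediate CA Z", "Root CA X"),  # pretend: compromised
}

genuine = Cert("www.example.org", "Intermediate CA Y")
forged = Cert("www.example.org", "Intermediate CA Z")

print(chain_to_root(genuine, intermediates, trusted_roots))
print(chain_to_root(forged, intermediates, trusted_roots))
```

Both certificates chain back to the same trusted root, so in this model the client has no way to prefer the genuine one. That is why a single careless or compromised intermediate is enough to spoof a site.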
Using a VPN service, you get certificates from the provider. Once you’ve securely obtained them, there is no ambiguity when client apps authenticate the provider’s VPN servers. A client won’t connect unless a server proves that it has the requisite certificate authority (CA) certificate. There are no intermediate certificate authorities that must be trusted. And so MitM attacks are much harder. Even so, VPNs only protect against adversaries between a user and a VPN server.
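As an analogy only, here is how a client that trusts a single provider-supplied CA, and nothing else, might be set up with Python's standard `ssl` module. VPN clients such as OpenVPN have their own configuration for this, so treat the function name and the idea of a `ca_file` path as assumptions made for illustration.

```python
import ssl

def make_pinned_context(ca_file=None):
    """Build a TLS client context that trusts only the given CA certificate.

    ca_file is a hypothetical path to the CA certificate obtained from the
    provider. The system and browser root stores are deliberately not loaded.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)  # certificate and hostname checks are on
    if ca_file is not None:
        ctx.load_verify_locations(cafile=ca_file)
    return ctx

ctx = make_pinned_context()
# Nothing is trusted until the provider's CA certificate is loaded.
print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.cert_store_stats()["x509_ca"])
```

With only one CA in the trust store, a certificate issued by any other authority fails verification, however reputable that authority may be.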
There Are Bigger Problems
Even after decades of security hardening, the Internet remains vulnerable in fundamental ways through unwarranted trust. There are two key vulnerabilities. First, let’s say that you want to use Google. In order to load the page, your browser must translate www.google.com into a suitable IP address. Google has many server clusters, in data centers around the world. The name servers specified in Google’s domain registration are the best source for the IP address of a nearby Google server that’s not too busy. But if everyone hit Google’s primary name servers directly, they would crash and burn. And so there is a hierarchical global network of name servers, known as the Domain Name System (DNS), which forward and temporarily cache that information.
The process begins with name servers that your computer knows about. By default, those typically belong to your ISP. Google being so popular, those name servers will likely have the answer. But if they didn’t, they would ask their ISP’s name servers. And so on up the hierarchy to Google’s primary name servers. Although the system works well for the most part, it is vulnerable to spoofing and denial of service (DoS) attacks by adversaries.
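A toy model of that forward-and-cache behavior follows. It mirrors the simplified picture above rather than the full DNS protocol, and the class, host name, address and TTL value are all hypothetical. Time is passed in explicitly so that the example is deterministic.

```python
class NameServer:
    def __init__(self, name, upstream=None, records=None, ttl=300):
        self.name = name
        self.upstream = upstream            # next server up the hierarchy
        self.records = dict(records or {})  # data this server is the source for
        self.cache = {}                     # hostname -> (ip, expires_at)
        self.ttl = ttl
        self.upstream_queries = 0

    def resolve(self, hostname, now):
        if hostname in self.records:
            return self.records[hostname]
        cached = self.cache.get(hostname)
        if cached and cached[1] > now:
            return cached[0]                # still fresh: no upstream traffic
        if self.upstream is None:
            return None
        self.upstream_queries += 1
        ip = self.upstream.resolve(hostname, now)
        if ip is not None:
            self.cache[hostname] = (ip, now + self.ttl)
        return ip

primary = NameServer("primary", records={"www.example.com": "192.0.2.1"})
upstream = NameServer("upstream-isp", upstream=primary)
isp = NameServer("your-isp", upstream=upstream)

isp.resolve("www.example.com", now=0)    # travels up the hierarchy
isp.resolve("www.example.com", now=10)   # answered from the ISP's cache
print(isp.upstream_queries, upstream.upstream_queries)
```

Caching is what keeps load off the primary name servers. It is also why a wrong answer, once cached, keeps being served until it expires.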
For example, let’s say that your government doesn’t want you to use Google. And so it requires all domestic ISPs to point www.google.com at some non-Google IP address. That’s called DNS spoofing (or cache poisoning). And it’s a common practice.4 There is an easy workaround: just configure your computer to use third party DNS servers.5 However, that isn’t always sufficient, because traffic to those DNS servers can also be blocked or misdirected.
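One crude way to notice such spoofing is to compare answers from several resolvers. The sketch below uses two hypothetical in-memory resolvers instead of real network lookups, and the names and addresses are invented. A mismatch is only a hint: large sites legitimately return different addresses to different users.

```python
TRUE_RECORDS = {"www.example.com": "192.0.2.1"}

def third_party_resolver(hostname):
    """Hypothetical resolver outside the censor's control."""
    return TRUE_RECORDS.get(hostname)

def spoofing_isp_resolver(hostname):
    """Hypothetical ISP resolver that overrides selected names."""
    overrides = {"www.example.com": "203.0.113.99"}
    return overrides.get(hostname, TRUE_RECORDS.get(hostname))

def compare_resolvers(hostname, resolvers):
    """Return each resolver's answer and whether they disagree."""
    answers = {label: lookup(hostname) for label, lookup in resolvers.items()}
    return answers, len(set(answers.values())) > 1

resolvers = {"isp": spoofing_isp_resolver, "third-party": third_party_resolver}
answers, disagree = compare_resolvers("www.example.com", resolvers)
print(answers, disagree)
```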
Second, there is a fundamental vulnerability in the Internet Border Gateway Protocol (BGP). Once your browser knows a website’s IP address, BGP enables your ISP (and other intervening ISPs) to properly route your traffic to that destination. What’s problematic is that BGP foolishly assumes that Internet routers can trust each other. But that doesn’t always work out.
Sometimes it’s just mistakes. In June 2015, Telekom Malaysia announced routes to much of southeast Asia and Australia, and then it promptly choked on the massive traffic that ensued. That is, Telekom Malaysia’s mistake prevented people in London (for example) from accessing sites in Singapore, Hong Kong, Sydney and so on. But sometimes one wonders. In 2010, China Telecom “hijacked” a large chunk of the Internet. Although there’s no proof, the Chinese might have monitored and logged on a massive scale. Or instead, they could have just null routed everything.
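To see why unverified route announcements are dangerous, consider a toy router that simply prefers the most specific matching prefix. This is not a description of either incident above, and real BGP route selection weighs several other attributes. The prefixes and AS numbers come from ranges reserved for documentation, and the function name is hypothetical.

```python
import ipaddress

def best_route(dest_ip, announcements):
    """Pick the origin of the longest matching prefix (toy model, no validation)."""
    addr = ipaddress.ip_address(dest_ip)
    matches = [(net, origin) for net, origin in announcements if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]

announcements = [
    (ipaddress.ip_network("203.0.113.0/24"), "AS64500"),  # legitimate origin
]
print(best_route("203.0.113.10", announcements))

# Another network announces a more specific prefix, by mistake or on purpose.
announcements.append((ipaddress.ip_network("203.0.113.0/25"), "AS64501"))
print(best_route("203.0.113.10", announcements))
```

Nothing in this model asks whether the second network is entitled to announce that prefix. The router trusts the announcement and traffic for the affected addresses changes direction.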
VPN services can mitigate at least some DNS vulnerabilities, by tunneling beyond the area controlled by an adversary. Most countries use DNS spoofing (cache poisoning) to deny access to forbidden websites. But most countries can’t poison the entire DNS hierarchy. For example, in 2014 the Turkish government banned Twitter and YouTube through DNS poisoning. And then, as users started using Google’s DNS servers to get around the ban, it blocked access to them as well. However, all of those blocks were implemented through Turkish ISPs. So VPN users could reach routes and DNS servers that were not under Turkish control.
But VPNs Aren’t Perfect
ISPs can also block VPN connections. Iran and China notoriously do. It’s not hard to detect VPNs. The OpenVPN and IPsec protocols are both distinctive. ISPs can just look at packet types, sequences and patterns. That’s known as deep packet inspection. Also, their systems test suspected VPN servers for VPN-specific response patterns.
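A very rough sketch of such pattern matching follows. It assumes, as an unverified reading of the OpenVPN wire format over UDP, that the top five bits of the first byte carry an opcode, with 7 marking a client reset and 8 the server's reply. Treat those values, and the function names, as assumptions. Real inspection systems use many more signals, and obfuscated or wrapped traffic would not match.

```python
# Assumed opcode values (check against the OpenVPN protocol documentation).
HARD_RESET_CLIENT_V2 = 7
HARD_RESET_SERVER_V2 = 8

def opcode_of(payload):
    """Return the opcode in the top five bits of the first byte, or None."""
    if not payload:
        return None
    return payload[0] >> 3

def looks_like_openvpn_handshake(first_client_payload, first_server_payload):
    """Flag a flow whose first two packets match the assumed opening exchange."""
    return (opcode_of(first_client_payload) == HARD_RESET_CLIENT_V2
            and opcode_of(first_server_payload) == HARD_RESET_SERVER_V2)

# Hypothetical captured payloads: only the first byte matters to this toy check.
client_pkt = bytes([HARD_RESET_CLIENT_V2 << 3]) + bytes(13)
server_pkt = bytes([HARD_RESET_SERVER_V2 << 3]) + bytes(25)
other_pkt = b"\x16\x03\x01\x02\x00"  # starts like a TLS record instead

print(looks_like_openvpn_handshake(client_pkt, server_pkt))
print(looks_like_openvpn_handshake(other_pkt, server_pkt))
```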
One can hide (encapsulate) VPN traffic in other tunnels. There are good introductions here and here. Open-source tools include SSH, SSL (e.g., stunnel) and obfsproxy (developed by the Tor Project). There is also a patch for OpenVPN. And some VPN services use various methods that are proprietary and closed-source. However, the shape of the initial connection dialog between client and server is distinctive. And that’s hard to obfuscate without padding. But padding wastes bandwidth, so there’s a trade-off.6
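The padding trade-off can be put in numbers. In this sketch every packet is padded up to a multiple of a fixed bucket size, which hides exact lengths at the cost of extra bytes. The packet sizes and the bucket size are made-up values, and the function names are hypothetical.

```python
def padded_size(size, bucket):
    """Round a packet size up to the next multiple of the bucket size."""
    return -(-size // bucket) * bucket

def padding_overhead(sizes, bucket):
    """Fraction of extra bytes sent because of padding (sizes must be non-empty)."""
    original = sum(sizes)
    padded = sum(padded_size(s, bucket) for s in sizes)
    return (padded - original) / original

# Hypothetical packet sizes from the start of a connection, in bytes.
handshake_sizes = [14, 26, 1200, 54, 1400]

for bucket in (64, 256, 1024):
    padded = [padded_size(s, bucket) for s in handshake_sizes]
    print(bucket, padded, round(padding_overhead(handshake_sizes, bucket), 3))
```

Larger buckets make different packets look more alike to an observer, but the wasted bandwidth grows with the bucket size.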
If your ISP is hijacking BGP, you can bypass it using VPNs, as long as they’re not blocked. More generally, that’s the case whenever you’re inside of some controlled space (e.g., corporate and university networks) or subject to a national firewall. As long as the VPN exit is outside the controlled space, it doesn’t see the BGP hijacking.
Otherwise, it’s hard to get around BGP hijacking. Consider Telekom Malaysia’s mistake. Let’s say that there’s a VPN provider with servers in London and Singapore. If another route existed from London to Singapore that didn’t pass through Telekom Malaysia, that VPN provider could hard-code it into their servers. Even though Telekom Malaysia was hijacking BGP to Singapore, traffic through the London-Singapore VPN tunnel would ignore it. However, unless such problems persisted, it’s unlikely that VPN providers would route around them manually. But corporate, academic and government VPNs might.
1. Geolocation based on IP address isn’t perfect. That’s because services like MaxMind typically report central addresses of ISPs, rather than the actual addresses of ISP customers. But they get the country right, and that’s enough to enforce geographic access restrictions.
2. Smartphones are especially vulnerable to tracking. Users have far less control over app behavior on smartphones. And there are multiple data sources for accurate geolocation, including GPS, cell towers and WiFi hotspots.
3. Indeed, not even fundamental Internet links can be reliably secured over long distances. Cables are cut on land and under oceans. And they are tapped.
4. The US FBI uses DNS poisoning for so-called domain name seizures, and the Motion Picture Association of America (MPAA) wants to take down sites hosting pirated content. Various countries use DNS poisoning to ban Interpol’s “worst of the worst” list. The Cyberspace Administration of China (CAC) does one better: it redirects users from banned sites to other sites that it wants to attack.
5. Some malware also does that for ad injection or fraud.
6. See Chapter 5 of Sambuddho Chakravarty’s thesis.