Earlier this year, I came across Oracle Cloud's generous free ARM compute resources (up to 4 CPUs and 24 GB of RAM). It sounded too good to be true, but I tried running a couple of services on an instance and it seemed to work pretty well. In fact, I was wondering why I should pay $5/month for a Vultr VPS when this is free...
So I took the plunge and tried to migrate my tunnel server from Vultr to Oracle Cloud. I created a new instance, ran my Ansible provisioning script on it, updated my DNS to point at it, and then... things were broken. My hosted services weren't accessible from outside my local network, and I couldn't SSH into any of the machines on my local network even with the VPN active.
Ingress Firewall Rules
Traditional cloud services give you a lot more control over your VM's network, and Oracle Cloud is no exception. By default, all ports except 22 (SSH) are closed, and you need to explicitly specify which ports to open for the VM.
Updating iptables rules to the correct physical interface
So I decided it was time to dig deeper and look into the firewall rules on the VM instance itself. Running the following command lists all the active rules:
$ sudo iptables -S
It turned out that Oracle Cloud defines a couple of rules by default. I disabled them by editing /etc/iptables/rules.v4 and commenting them out, leaving only the following rules enabled:
Pretty much everything is enabled, which you might think is bad for security, but I had ufw configured so it should be fine...
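When the rule list is long, it can help to narrow it down to the rules that actually block traffic before deciding what to comment out. The snippet below is a minimal sketch: blocking_rules is a made-up helper name, and it only looks at rules whose target is REJECT or DROP (it ignores chain policies set with -P).

```shell
# Keep only the rules whose target is REJECT or DROP.
# Expects `iptables -S` style output on stdin.
# Note: chain policies (lines starting with -P) are not matched.
blocking_rules() {
    grep -E -- '-j (REJECT|DROP)( |$)'
}

# Usage on the VM:
#   sudo iptables -S | blocking_rules
```

If this prints nothing, the rules themselves are probably not what is rejecting the connection, and it is worth looking at the chain policies or the cloud-side ingress rules instead.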
Defining WireGuard's Interface MTU
After disabling the iptables rules, curl no longer returned Connection Refused, but it hung forever. Time to inspect packets with tcpdump:
$ sudo tcpdump -i wg0 port 80
I ran the above command on both the Oracle Cloud instance and the port-forward destination, and here are the logs:
The connection was being forwarded properly, but some packets never reached the Oracle Cloud instance. On closer inspection, the packets with large lengths (1000+) were the ones missing on the Oracle Cloud side. Weird.
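A quick way to confirm a size-dependent drop like this is to send pings with the don't-fragment bit set and see which sizes get through. The sketch below only prints the commands to run. ping_payload is a made-up helper, 10.0.0.2 is a placeholder for a peer address inside the tunnel, and the -M do flag assumes the Linux iputils version of ping.

```shell
# Payload size for a ping that exactly fills a given IPv4 MTU:
# subtract the IPv4 header (20 bytes) and the ICMP header (8 bytes).
ping_payload() {
    echo $(( $1 - 28 ))
}

# Print a don't-fragment ping command for each candidate MTU.
# 10.0.0.2 is a placeholder; replace it with a real peer address.
for mtu in 1280 1420 1500; do
    echo "ping -c 1 -M do -s $(ping_payload "$mtu") 10.0.0.2"
done
```

If the smaller sizes get replies and the larger ones time out or report that the message is too long, the path cannot carry packets of that size.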
After comparing against the old VM instance on Vultr, I noticed that the MTU of the WireGuard interface was different: on Oracle Cloud it is 9000 (!).
$ ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 02:00:17:00:8f:64 brd ff:ff:ff:ff:ff:ff
7: wg0: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 9000 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/none
Googling for "WireGuard MTU" suggested that the MTU was indeed the problem here. This GitHub gist states that the optimal MTU is 1420 for the server and 1384 for the peer, so I tried it out by adding the MTU to my WireGuard config files. curl finally worked!
...or so I thought.
curl to an HTTPS address, however, still hung. This blog post recommends an MTU of 1280 throughout the network, and with this value HTTPS finally worked as well.
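For reference, the 1420 figure is not arbitrary: it is what remains of a standard 1500-byte Ethernet MTU after WireGuard's per-packet overhead when the tunnel runs over IPv6. The sketch below works through that arithmetic. wg_mtu is a made-up helper, and the 1500-byte path MTU is an assumption, since the real limit between two peers can be lower.

```shell
# WireGuard adds a fixed overhead to every packet it tunnels:
#   outer IP header (20 bytes for IPv4, 40 bytes for IPv6)
#   + UDP header (8 bytes)
#   + WireGuard data header and auth tag (32 bytes)
wg_mtu() {
    link_mtu=$1      # MTU of the path that carries the tunnel
    ip_version=$2    # 4 or 6: outer IP version used between the peers
    case "$ip_version" in
        4) overhead=60 ;;
        6) overhead=80 ;;
        *) echo "ip_version must be 4 or 6" >&2; return 1 ;;
    esac
    echo $(( link_mtu - overhead ))
}

wg_mtu 1500 6   # prints 1420
wg_mtu 1500 4   # prints 1440
```

1280 is the smallest MTU that every IPv6 link is required to support, which makes it a conservative choice when something along the path cannot carry full-size packets. With wg-quick style config files, the value goes in the MTU key of the [Interface] section.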